The hidden killer of your user experience? Inference latency. 👈
When scaling your open-source LLM to serve millions, that momentary pause quickly compounds into an operational nightmare. The core of the problem is inefficient memory management across concurrent requests.
In the newest episode of AI Lab, we dive deep into paged attention, the optimization technique that eliminates this bottleneck and fundamentally reshapes open-source LLM inference.
For any engineer building scalable AI applications, you’ll learn:
✦ Why the traditional key-value (KV) cache creates crippling memory fragmentation and low GPU utilization.
✦ How paged attention (the secret sauce behind vLLM and SGLang) borrows a page from operating systems to achieve massive throughput gains.
✦ The trade-offs between vLLM and SGLang.
✦ How Crusoe Managed Inference removes this complex operational burden.
➡️ Watch the full episode now: https://lnkd.in/gQxVEP7h
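The operating-systems analogy can be sketched in a few lines: a shared pool of fixed-size physical blocks plus a per-request block table, so no request ever reserves a large contiguous KV region up front. This is a minimal illustration of the idea only; the block size, class, and method names below are our own, not vLLM's or SGLang's actual API.

```python
# Minimal sketch of paged KV-cache allocation (illustrative only).
BLOCK_SIZE = 16  # tokens per block, analogous to an OS page

class PagedKVCache:
    def __init__(self, num_blocks: int):
        self.free_blocks = list(range(num_blocks))  # shared pool of physical blocks
        self.block_tables = {}  # request id -> list of physical block ids

    def append_token(self, req_id: str, pos: int) -> int:
        """Map the token at logical position `pos` to a physical block,
        allocating a new block only when the current one fills up."""
        table = self.block_tables.setdefault(req_id, [])
        if pos % BLOCK_SIZE == 0:  # current block is full (or first token)
            table.append(self.free_blocks.pop())
        return table[-1]  # physical block holding this token's KV entries

    def free(self, req_id: str):
        """Return a finished request's blocks to the shared pool."""
        self.free_blocks.extend(self.block_tables.pop(req_id, []))

cache = PagedKVCache(num_blocks=1024)
for t in range(40):  # a 40-token request touches ceil(40/16) = 3 blocks
    cache.append_token("req-0", t)
print(len(cache.block_tables["req-0"]))  # 3
```

Because blocks are allocated on demand and returned to a global pool, memory that a contiguous KV layout would waste as fragmentation stays available for other concurrent requests.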
Crusoe
Technology, Information and Internet
Denver, Colorado 50,844 followers
The AI factory company. We are on a mission to accelerate the abundance of energy and intelligence.
About us
Crusoe is the industry’s first vertically integrated, purpose-built AI cloud platform. The company is redefining AI cloud infrastructure and its platform is recognized as the "gold standard" among builders for its reliability and performance in developing, training, and deploying AI models. Powered by clean, renewable energy, Crusoe aligns the future of computing with the future of the climate. Leading Fortune 500 companies trust Crusoe’s advanced, AI-optimized cloud to support their most demanding AI applications.
- Website: https://crusoe.ai/
- Industry: Technology, Information and Internet
- Company size: 501-1,000 employees
- Headquarters: Denver, Colorado
- Type: Privately Held
- Founded: 2018
- Specialties: AI, Cloud, Data Centers, Energy, Infrastructure, Manufacturing, Environment, and Sustainability
Locations
Primary: Denver, Colorado 80202, US
Updates
-
Crusoe is excited to share that we’re integrating NVIDIA’s full-stack, co-designed AI platform, including NVIDIA Blackwell NVL72 systems, into our infrastructure.
Visit the blog to learn:
🔀 How Crusoe Cloud delivers up to 9.9x inference performance gains for complex MoE models like DeepSeek-R
🚀 Why MoE models are becoming the new industry standard
🧩 How the NVIDIA GB200 NVL72 distributes experts across up to 72 GPUs
➡️ Discover a faster path to production, scale, and real-world outcomes:
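The expert-distribution idea can be illustrated with a toy placement function: spread the experts of an MoE layer evenly across the GPUs of the NVLink domain, so each GPU hosts only a small slice of the parameters and a routed token needs at most one fast intra-domain hop. This is our simplified round-robin sketch, not NVIDIA's or DeepSeek's actual placement scheme.

```python
# Toy expert-parallel placement across a 72-GPU NVLink domain (illustrative).
def place_experts(num_experts: int, num_gpus: int) -> dict:
    """Round-robin map from expert id to the GPU that hosts it."""
    return {e: e % num_gpus for e in range(num_experts)}

placement = place_experts(num_experts=256, num_gpus=72)

# With 256 experts on 72 GPUs, each GPU hosts 3 or 4 experts, so the
# per-GPU parameter footprint shrinks by roughly the domain size.
per_gpu = {}
for gpu in placement.values():
    per_gpu[gpu] = per_gpu.get(gpu, 0) + 1
print(min(per_gpu.values()), max(per_gpu.values()))  # 3 4
```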
-
Last week, Crusoe hosted an exclusive tech talk and happy hour in London with a special guest: generative video pioneer Decart. CTOs, technical teams, researchers, and ML/LLM specialists from across the industry joined us to explore how Crusoe is powering some of the world’s largest AI factories and navigating the next wave of innovation.
We kicked the evening off with a keynote by Ido Dan, Decart’s VP of R&D, followed by an AI builders’ panel with Guillaume Lebedel (co-founder and CTO, StackOne), John Torr (founder and CEO, Inephany), and Jonathan Scholz (founder and CEO, Reimagine Robotics).
We also raised a glass to celebrate Crusoe’s recent $1.375B funding round, which will help us continue supporting AI-native startups in the UK and across EMEA. 🇬🇧
A huge thank you to everyone who joined us in London — and to our generous panelists for sharing their founder journeys and advice for AI builders.
-
NVIDIA GB200 NVL72 performance unlocked. 👇🔑
The NVIDIA GB200 NVL72 is a rack-scale powerhouse with a 72-GPU NVLink-coherent domain and Arm-based Grace CPUs. We teamed up with the PyTorch team, using TorchTitan for Llama 4 Scout 17B pretraining, to validate its true potential.
Our finding is critical for builders: using SLURM block scheduling to respect the physical NVLink topology yielded a 13% tokens-per-second throughput improvement in topology-aligned runs. At 4-rack/256-GPU scale, we still achieved a massive 97% scaling efficiency!
🛑 Stop leaving performance on the table. See the full technical breakdown, SLURM configuration, and PyTorch setup now. ➡️
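As a back-of-envelope check of what 97% scaling efficiency means, here is the standard formula (achieved speedup divided by ideal linear speedup). The tokens-per-second values below are illustrative placeholders, not the measured figures from the run above.

```python
# Scaling efficiency = achieved throughput / ideal linear-scaling throughput.
def scaling_efficiency(tps_base: float, gpus_base: int,
                       tps_scaled: float, gpus_scaled: int) -> float:
    """Fraction of perfect linear scaling achieved when growing the cluster."""
    ideal = tps_base * (gpus_scaled / gpus_base)  # what perfect scaling would give
    return tps_scaled / ideal

# Illustrative numbers: if 64 GPUs deliver 1,000,000 tok/s, perfect scaling
# to 256 GPUs would give 4,000,000 tok/s; 3,880,000 tok/s is 97% efficiency.
print(round(scaling_efficiency(1_000_000, 64, 3_880_000, 256), 2))  # 0.97
```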
-
We had a great time at NVIDIA GTC in Washington, D.C. last month! One highlight? Hearing our VP of Public Affairs Sara Axelrod's keynote on how Crusoe is powering the AI race. 🎤 Sara drew a direct line from America’s historic infrastructure investments — the interstate highway system, the Hoover Dam — to the industrialization of AI unfolding today. In her words, “AI is driving the largest capital investment in human history.” Catch the replay 📺: https://lnkd.in/g57x7uqe We’re already looking forward to GTC 2026 March 16–19 in San Jose! Come find us there to connect with our team.
-
As we near the end of 2025, we're feeling nothing but gratitude. ⛰️ We're thankful for our incredible Crusoe teammates, who embody resilience and commitment daily, tackling hard things and accelerating our mission. 🚀 And we’re deeply thankful for our customers, whose ambition and trust fuel the next era of abundant, purpose-built AI. Your partnership drives us forward. Wishing everyone a happy, restful, and well-deserved break!
-
Crusoe is teaming up with Bain Capital Ventures (BCV), Tilde Research, and Actioners Residency for Compute Poker Night, an evening of cards, compute, and good company.
⌚️ When: Wednesday, December 10 at 6:30 p.m. PST
📍 Where: San Francisco, CA
Come trade compute and play for your share of $30K in Crusoe Cloud credits over beverages and light snacks. AI infrastructure founders, researchers, and investors — bring your best poker face. ♠️
Space is limited to ensure a quality experience. Register today ➡️ https://luma.com/h3ysgxs0
-
Your high-performance GPUs shouldn't sit idle. ⌛
We've published a technical guide detailing the full deployment of NVIDIA Run:ai on Crusoe Managed Kubernetes (CMK), transforming your capacity into a pooled, elastic resource. This isn't just a basic install. We guide you through the complete stack for an end-to-end solution:
💥 Secure cluster setup: configuration of firewall rules and cert-manager to ensure encrypted, trusted communication between all Run:ai control plane and cluster components.
💥 MLOps ready: step-by-step integration of Kubeflow, MPI, and Knative to handle the full MLOps lifecycle, from distributed training to highly scalable, serverless inference.
💥 Results: Run:ai's intelligent scheduler maximizes utilization, combining with CMK's robust infrastructure to accelerate your entire AI lifecycle.
If you run distributed training or need dynamic GPU resource sharing, this guide is a must.
➡️ Read the step-by-step guide and deploy Run:ai on CMK today: https://lnkd.in/giJpYhBV
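As a rough intuition for what pooled, elastic GPU scheduling buys you, here is a toy fair-share allocator: idle GPUs inside one team's quota can be borrowed by another team's pending jobs. This is our own simplification for illustration, not Run:ai's actual scheduling algorithm.

```python
# Toy fair-share GPU pooling (illustrative; not Run:ai's real scheduler).
def schedule(quota: dict, demand: dict, total_gpus: int) -> dict:
    """Give each team min(demand, quota) first, then lend leftover GPUs
    to teams whose demand exceeds their quota."""
    alloc = {t: min(demand[t], quota[t]) for t in quota}
    spare = total_gpus - sum(alloc.values())
    # Hand spare capacity to the teams with the largest unmet demand first.
    for t in sorted(quota, key=lambda t: demand[t] - alloc[t], reverse=True):
        extra = min(spare, demand[t] - alloc[t])
        alloc[t] += extra
        spare -= extra
    return alloc

# Team A is idle, team B is bursting: B borrows A's unused GPUs
# instead of leaving them stranded behind a static partition.
print(schedule({"a": 8, "b": 8}, {"a": 2, "b": 14}, total_gpus=16))
# {'a': 2, 'b': 14}
```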
-
“What does an AI factory mean to you?”
That was the first question NVIDIA’s Satej D. posed to panelists on the main stage at NVIDIA GTC in Washington, D.C. last month. 🎤
Here’s how our CEO and co-founder Chase Lochmiller answered: “It’s purposeful infrastructure that’s meant to produce intelligence. Because of that, you make a lot of different design decisions, ranging from where you locate the facility to how dense the configurations are to how the cooling architecture works to how the networking fabric works. It all comes together to produce intelligent results for customers.”
Chase joined Christopher James (CEO, Engine No. 1), Giordano Albertazzi (CEO, Vertiv), Dana Adams (President, North America, Vantage Data Centers), and Jacob DeWitte (co-founder and CEO, Oklo Inc.) for a dynamic discussion on:
💰 Managing the total cost of AI factory ownership, from silicon to labor
⚡️ Optimizing factory performance, energy use, scalability, and sustainability
🛠️ Making the right design decisions to build efficient, future-ready AI infrastructure
🎥 Watch the panel, “How to build and right-size AI Factories for the age of intelligence”:
-
We’re hiring! Crusoe is looking for a Sr. Software Engineer to join our Managed Platform Services (MaPS) team. Ready to make an impact on the future of AI? Join us: 🔗 https://lnkd.in/gNEJwBWN
At Crusoe, we move at a rapid pace to deliver the next generation of AI infrastructure. But we also recognize that doing our best work requires taking time to disconnect, unplug, and recharge together.
The Managed Platform Services (MaPS) team took that to heart this November, wrapping up a season of hard work with an afternoon sail across the SF Bay. It was the perfect opportunity to bond, catch some Golden Gate views, and say hello to the famous Pier 39 sea lions 🦭. Huge thanks to the team for all the hard work and for making it a great afternoon on the water!
Join a team that values high performance and balance. We're hiring Senior Software Engineers in the Bay Area to design and scale Crusoe Cloud's customer-facing platforms. Apply here: 🔗 https://lnkd.in/gaskC2GY