Product Website | 🤗 Hugging Face | Paper | Paper Website | Cosmos Cookbook
NVIDIA Cosmos™ is a platform purpose-built for physical AI, featuring state-of-the-art generative world foundation models (WFMs), robust guardrails, and an accelerated data processing and curation pipeline. Designed specifically for real-world systems, Cosmos enables developers to rapidly advance physical AI applications such as autonomous vehicles (AVs), robots, and video analytics AI agents.
Cosmos World Foundation Models come in three model types, all of which can be customized through post-training: cosmos-predict, cosmos-transfer, and cosmos-reason.
- [November 25, 2025] Added Blackwell + ARM inference support and Auto/Multiview code fixes, along with fixes for the help menu and CLI overrides, improved guardrail offloading, and LFS enablement for large assets.
- [November 11, 2025] Refactored the Cosmos-Transfer2.5-2B Auto/Multiview code, and updated the Auto/Multiview checkpoints in Hugging Face.
- [November 7, 2025] We added autoregressive sliding window generation mode for generating longer videos. We also added a new multiview cross-attention module, upgraded dependencies to improve support for Blackwell, and updated inference examples and documentation.
- [November 6, 2025] As part of the Cosmos family, we released the recipe, a reference diffusion model, and a tokenizer for synthetic LiDAR point cloud generation from RGB images!
- [October 28, 2025] We added Cosmos Cookbook, a collection of step-by-step recipes and post-training scripts to quickly build, customize, and deploy NVIDIA’s Cosmos world foundation models for robotics and autonomous systems.
- [October 28, 2025] We added automatic generation of spatiotemporal masks for control inputs when a prompt is given, added cosmos-oss and new pyrefly annotations, introduced a multi-storage backend in easyio, reorganized internal packages, and boosted Transfer2 speed with Torch Compile tokenizer optimizations.
- [October 21, 2025] We added on-the-fly computation support for depth and segmentation, and fixed multicontrol experiments in inference. We also updated the Docker base image version and Gradio-related documentation.
- [October 13, 2025] Updated Transfer2.5 Auto Multiview post-training datasets and set up dependencies to support NVIDIA Blackwell.
- [October 6, 2025] We released Cosmos-Transfer2.5 and Cosmos-Predict2.5 - the next generation of our world simulation models!
- [June 12, 2025] As part of the Cosmos family, we released Cosmos-Transfer1-DiffusionRenderer
Cosmos-Transfer2.5 is a multi-ControlNet model designed to accept structured input across multiple video modalities, including RGB, depth, segmentation, and more. Users can configure generation using JSON-based controlnet_specs and run inference with just a few commands. It supports single-video inference, automatic control map generation, and multi-GPU setups.
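As a rough illustration of the JSON-based configuration idea, the sketch below builds a spec with one entry per control modality and writes it to disk. The key names (`edge`, `depth`, `seg`) and the `control_weight` field are assumptions for this sketch, not the authoritative schema; consult the inference documentation for the exact format.

```python
import json

# Illustrative controlnet_spec: one entry per control modality.
# Key names and "control_weight" are assumptions for this sketch,
# not the authoritative Cosmos-Transfer schema.
controlnet_spec = {
    "edge": {"control_weight": 0.5},   # edge-map conditioning
    "depth": {"control_weight": 0.5},  # depth-map conditioning
    "seg": {"control_weight": 0.3},    # segmentation conditioning
}

# Persist the spec so it can be passed to an inference command.
with open("controlnet_spec.json", "w") as f:
    json.dump(controlnet_spec, f, indent=2)

print(json.dumps(controlnet_spec, indent=2))
```

Weights like these typically trade off how strongly each control signal constrains the generated video relative to the text prompt.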
Physical AI models train on data generated through two important augmentation workflows.

Minimizing the need for high-fidelity 3D simulation.
Input prompt:
A contemporary luxury kitchen with marble tabletops. A window with a beautiful sunset outside. There is an espresso coffee maker on the table in front of the white robot arm. The robot arm interacts with a coffee cup and coffee maker on the kitchen table.
| Input Video | Computed Control | Output Video |
|---|---|---|
| robot_cg.mp4 | robot_seg.mp4 (more computed controls: robot_depth.mp4, robot_edge.mp4) | robot_output.mp4 |
Leveraging augmentation of sensor-captured RGB.
Input prompt:
Dashcam video, driving through a modern urban environment, winter with heavy snow storm, trees and sidewalks covered in snow.
| Input Video | Computed Control | Output Video |
|---|---|---|
| car_input.mp4 | car_edge.mp4 (more computed controls: car_blur.mp4, car_depth.mp4, car_seg.mp4) | car_output.mp4 |
Robotic Matrix Diversity Example
Robot_Matrix_Diversity_Sample.mp4
AV Matrix Diversity Example
AV_Matrix_Diversity_Sample.mp4
For an example demonstrating how to augment synthetic data with Cosmos Transfer on robotics navigation tasks to improve Sim2Real performance, see Cosmos Transfer Sim2Real for Robotics Navigation Tasks in the Cosmos Cookbook.
Cosmos-Transfer supports data generation in multiple industry verticals, outlined below. Please check back as we continue to add more specialized models to the Transfer family!
Cosmos-Transfer2.5-2B: General checkpoints, trained from the ground up for Physical AI and robotics.
Cosmos-Transfer2.5-2B/auto: Specialized checkpoints, post-trained for autonomous vehicle applications, including Multiview checkpoints. For an example demonstrating how to augment synthetic data with Cosmos Transfer for autonomous vehicles, see Cosmos Transfer 2.5 Sim2Real for Simulator Videos in the Cosmos Cookbook.
We thrive on community collaboration! NVIDIA-Cosmos wouldn't be where it is without contributions from developers like you. Check out our Contributing Guide to get started, and share your feedback through issues.
Big thanks 🙏 to everyone helping us push the boundaries of open-source physical AI!
This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.
NVIDIA Cosmos source code is released under the Apache License 2.0.
NVIDIA Cosmos models are released under the NVIDIA Open Model License. For a custom license, please contact cosmos-license@nvidia.com.
