
nvidia-cosmos/cosmos-transfer2.5


NVIDIA Cosmos

Product Website  | 🤗 Hugging Face  | Paper | Paper Website | Cosmos Cookbook

NVIDIA Cosmos™ is a platform purpose-built for physical AI, featuring state-of-the-art generative world foundation models (WFMs), robust guardrails, and an accelerated data processing and curation pipeline. Designed specifically for real-world systems, Cosmos enables developers to rapidly advance physical AI applications such as autonomous vehicles (AVs), robots, and video analytics AI agents.

Cosmos World Foundation Models come in three model types which can all be customized in post-training: cosmos-predict, cosmos-transfer, and cosmos-reason.

News

  • [November 25, 2025] Added Blackwell + ARM inference support, Auto/Multiview code fixes, along with fixes for the help menu and CLI overrides, improved guardrail offloading, and LFS enablement for large assets.
  • [November 11, 2025] Refactored the Cosmos-Transfer2.5-2B Auto/Multiview code, and updated the Auto/Multiview checkpoints in Hugging Face.
  • [November 7, 2025] We added an autoregressive sliding-window generation mode for generating longer videos. We also added a new multiview cross-attention module, upgraded dependencies to improve support for Blackwell, and updated inference examples and documentation.
  • [November 6, 2025] As part of the Cosmos family, we released the recipe, a reference diffusion model, and a tokenizer for synthetic LiDAR point cloud generation from RGB images.
  • [October 28, 2025] We added Cosmos Cookbook, a collection of step-by-step recipes and post-training scripts to quickly build, customize, and deploy NVIDIA’s Cosmos world foundation models for robotics and autonomous systems.
  • [October 28, 2025] We added automatic generation of spatiotemporal masks for control inputs when a prompt is given, added cosmos-oss and new pyrefly annotations, introduced a multi-storage backend in easyio, reorganized internal packages, and boosted Transfer2 speed with Torch Compile tokenizer optimizations.
  • [October 21, 2025] We added on-the-fly computation support for depth and segmentation, and fixed multicontrol experiments in inference. We also updated the Docker base image version and the Gradio-related documentation.
  • [October 13, 2025] Updated Transfer2.5 Auto Multiview post-training datasets, and setup dependencies to support NVIDIA Blackwell.
  • [October 6, 2025] We released Cosmos-Transfer2.5 and Cosmos-Predict2.5 - the next generation of our world simulation models!
  • [June 12, 2025] As part of the Cosmos family, we released Cosmos-Transfer1-DiffusionRenderer

Cosmos-Transfer2.5

Cosmos-Transfer2.5 is a multi-controlnet designed to accept structured input from multiple video modalities, including RGB, depth, segmentation, and more. Users can configure generation using JSON-based controlnet_specs and run inference with just a few commands. It supports single-video inference, automatic control map generation, and multi-GPU setups.
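
As a rough illustration of what a JSON-based multi-control spec could look like, the sketch below assembles one in Python and serializes it. The key names ("prompt", "video_path", "control_weight"), the helper build_spec, and the weights are assumptions made for this example only, not the repository's confirmed schema; the modality names and file name are taken from the examples in this README. Consult the User Guide for the actual format.

```python
import json

# Control modalities that appear in this README's examples.
KNOWN_CONTROLS = ("edge", "depth", "seg", "blur")


def build_spec(prompt, video_path, controls):
    """Assemble a hypothetical multi-control spec as a plain dict.

    `controls` maps a modality name to a weight. The key names used here
    ("prompt", "video_path", "control_weight") are illustrative assumptions,
    not the repository's confirmed schema.
    """
    unknown = sorted(set(controls) - set(KNOWN_CONTROLS))
    if unknown:
        raise ValueError(f"unknown control modalities: {unknown}")
    spec = {"prompt": prompt, "video_path": video_path}
    for name, weight in controls.items():
        if weight < 0:
            raise ValueError(f"{name}: control weight must be non-negative")
        # One nested entry per control modality (assumed layout).
        spec[name] = {"control_weight": float(weight)}
    return spec


spec = build_spec(
    prompt=(
        "Dashcam video, driving through a modern urban environment, "
        "winter with heavy snow storm, trees and sidewalks covered in snow."
    ),
    video_path="car_input.mp4",
    controls={"edge": 1.0, "depth": 0.5},  # example weights, chosen arbitrarily
)
spec_json = json.dumps(spec, indent=2)
print(spec_json)
```

Building the spec in code rather than by hand makes it easy to catch a misspelled modality before a long inference run starts, and to generate many variants of the same spec programmatically.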

Physical AI models are trained on data generated by two important data augmentation workflows.

Simulation 2 Real Augmentation

Minimizing the need for achieving high fidelity in 3D simulation.

Input prompt:

A contemporary luxury kitchen with marble tabletops. window with beautiful sunset outside. There is an esspresso coffee maker on the table in front of the white robot arm. Robot arm interacts with a coffee cup and coffee maker on the kitchen table.

Input Video: robot_cg.mp4
Computed Control: robot_seg.mp4 (more computed controls: robot_depth.mp4, robot_edge.mp4)
Output Video: robot_output.mp4

Real 2 Real Augmentation

Leveraging augmentation of sensor-captured RGB video.

Input prompt:

Dashcam video, driving through a modern urban environment, winter with heavy snow storm, trees and sidewalks covered in snow.

Input Video: car_input.mp4
Computed Control: car_edge.mp4 (more computed controls: car_blur.mp4, car_depth.mp4, car_seg.mp4)
Output Video: car_output.mp4

Scaling World State Diversity Examples

Robotic Matrix Diversity Example

Robot_Matrix_Diversity_Sample.mp4

AV Matrix Diversity Example

AV_Matrix_Diversity_Sample.mp4

For an example demonstrating how to augment synthetic data with Cosmos Transfer on robotics navigation tasks to improve Sim2Real performance, see Cosmos Transfer Sim2Real for Robotics Navigation Tasks in the Cosmos Cookbook.

Cosmos-Transfer2.5 Model Family

Cosmos-Transfer supports data generation in multiple industry verticals, outlined below. Please check back as we continue to add more specialized models to the Transfer family!

Cosmos-Transfer2.5-2B: General checkpoints, trained from the ground up for Physical AI and robotics.

Cosmos-Transfer2.5-2B/auto: Specialized multiview checkpoints, post-trained for autonomous vehicle applications. For an example demonstrating how to augment synthetic data with Cosmos Transfer for autonomous vehicles, see Cosmos Transfer 2.5 Sim2Real for Simulator Videos in the Cosmos Cookbook.

User Guide

Contributing

We thrive on community collaboration! NVIDIA-Cosmos wouldn't be where it is without contributions from developers like you. Check out our Contributing Guide to get started, and share your feedback through issues.

Big thanks 🙏 to everyone helping us push the boundaries of open-source physical AI!

License and Contact

This project will download and install additional third-party open source software projects. Review the license terms of these open source projects before use.

NVIDIA Cosmos source code is released under the Apache 2 License.

NVIDIA Cosmos models are released under the NVIDIA Open Model License. For a custom license, please contact cosmos-license@nvidia.com.

About

Cosmos-Transfer2.5, built on top of Cosmos-Predict2.5, produces high-quality world simulations conditioned on multiple spatial control inputs.
