
SAN JOSE, CA (GTC), Mar 21, 2025 – NVIDIA has unveiled its new NVIDIA Cosmos world foundation models (WFMs), offering developers an open and adaptable system designed to advance physical AI. The release gives developers greater control over how virtual worlds are generated.
NVIDIA is also introducing two advanced blueprints built on the NVIDIA Omniverse and Cosmos platforms. These tools act as powerful synthetic data engines, enabling developers to produce customizable datasets to train robots and autonomous vehicles with precision and efficiency.
Leading companies like 1X, Agility Robotics, Figure AI, Foretellix, Skild AI, and Uber are already turning to Cosmos to transform how they generate training data. With this technology, they can scale up data creation while enhancing accuracy, speeding progress in physical AI development.
“Just as large language models revolutionized generative and agentic AI, Cosmos world foundation models are a breakthrough for physical AI,” said Jensen Huang, founder and CEO of NVIDIA. “Cosmos introduces an open and fully customizable reasoning model for physical AI and unlocks opportunities for step-function advances in robotics and the physical industries.”
Cosmos Transfer for Synthetic Data Generation
Cosmos Transfer WFMs ingest structured video inputs such as segmentation maps, depth maps, lidar scans, pose estimation maps and trajectory maps to generate controllable photoreal video outputs.
Cosmos Transfer streamlines perception AI training, transforming 3D simulations or ground truth created in Omniverse into photorealistic videos for large-scale, controllable synthetic data generation.
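As a rough illustration of how the structured inputs listed above might be grouped for a single clip, the following Python sketch bundles whichever control videos are available into one request. The `ControlInputs` class, the `build_request` function, the file names and the dictionary layout are all hypothetical and are not part of any Cosmos interface described in this announcement.

```python
from dataclasses import dataclass, fields
from typing import Dict, List, Optional


@dataclass
class ControlInputs:
    """Hypothetical bundle of control videos for one clip (not a Cosmos API)."""
    segmentation: Optional[str] = None
    depth: Optional[str] = None
    lidar: Optional[str] = None
    pose: Optional[str] = None
    trajectory: Optional[str] = None

    def active_modalities(self) -> List[str]:
        # Return the names of the inputs that were actually provided,
        # in the order the fields are declared above.
        return [f.name for f in fields(self) if getattr(self, f.name) is not None]


def build_request(prompt: str, controls: ControlInputs) -> Dict[str, object]:
    """Assemble an illustrative request; at least one control input is required."""
    modalities = controls.active_modalities()
    if not modalities:
        raise ValueError("at least one control input is required")
    return {
        "prompt": prompt,
        "controls": {name: getattr(controls, name) for name in modalities},
    }


# Example: a clip conditioned on depth and segmentation only.
request = build_request(
    "rainy warehouse at dusk",
    ControlInputs(depth="depth.mp4", segmentation="seg.mp4"),
)
```

Keeping every modality optional mirrors the idea that any subset of segmentation, depth, lidar, pose and trajectory inputs could steer the output.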
Agility Robotics will be an early adopter of Cosmos Transfer and Omniverse for large-scale synthetic data generation to train its robot models.
“Cosmos offers us an opportunity to scale our photorealistic training data beyond what we can feasibly collect in the real world,” said Pras Velagapudi, chief technology officer of Agility Robotics. “We’re excited to see what new performance we can unlock with the platform, while making the most use of the physics-based simulation data we already have.”
The NVIDIA Omniverse Blueprint for autonomous vehicle simulation uses Cosmos Transfer to amplify variations of physically based sensor data. With the blueprint, Foretellix can enhance behavioral scenarios by varying conditions like weather and lighting for diverse driving datasets. Parallel Domain is also using the blueprint to apply similar variation to its sensor simulation.
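To make the idea of varying conditions concrete, here is a minimal Python sketch that expands one base driving scenario into every combination of weather and lighting. It is only an illustration of the combinatorial step: the `expand_scenario` function, the condition names and the prompt format are assumptions, not the blueprint's actual interface.

```python
from itertools import product
from typing import Dict, List


def expand_scenario(base: str, variations: Dict[str, List[str]]) -> List[dict]:
    """Return one entry per combination of the supplied condition values."""
    keys = list(variations)
    expanded = []
    # product() walks the Cartesian product of all condition lists.
    for combo in product(*(variations[k] for k in keys)):
        conditions = dict(zip(keys, combo))
        details = ", ".join(f"{k}: {v}" for k, v in conditions.items())
        expanded.append({"conditions": conditions, "prompt": f"{base}, {details}"})
    return expanded


# Hypothetical base scenario and condition values.
scenarios = expand_scenario(
    "pedestrian crossing during an unprotected left turn",
    {"weather": ["clear", "rain", "fog"], "lighting": ["noon", "dusk"]},
)
```

With three weather values and two lighting values, the single base scenario becomes six variants, each of which could then be paired with the same underlying sensor simulation.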
The NVIDIA GR00T Blueprint for synthetic manipulation motion generation combines Omniverse and Cosmos Transfer to generate diverse datasets at scale, benefiting from OpenUSD-powered simulations and reducing data collection and augmentation time from days to hours.
Cosmos Predict for Intelligent World Generation
Announced at the CES trade show in January 2025, Cosmos Predict WFMs generate virtual world states from multimodal inputs like text, images and video. New Cosmos Predict models will enable multi-frame generation, predicting intermediate actions or motion trajectories when given start and end input images. Purpose-built for post-training, these models can be customized using NVIDIA’s openly available physical AI dataset.
With the inference compute power of NVIDIA Grace Blackwell NVL72 systems and their large NVIDIA NVLink domain, developers can achieve real-time world generation.
1X is using Cosmos Predict and Cosmos Transfer to train its new humanoid robot NEO Gamma. Robot brain developer Skild AI is tapping into Cosmos Transfer to augment synthetic datasets for its robots. Plus, Nexar and Oxa are using Cosmos Predict to advance their autonomous driving systems.
Multimodal Reasoning for Physical AI
Cosmos Reason is an open, customizable WFM with spatiotemporal awareness that uses chain-of-thought reasoning to understand video data and predict the outcomes of interactions – such as a person stepping into a crosswalk or a box falling from a shelf – in natural language.
Developers can use Cosmos Reason to improve physical AI data annotation and curation, enhance existing world foundation models or create new vision language action models. They can also post-train it to build high-level planners to tell the physical AI what it needs to do to complete a task.
Accelerating Data Curation and Post-Training for Physical AI
Based on their downstream task, developers can post-train Cosmos WFMs using native PyTorch scripts or the NVIDIA NeMo framework on NVIDIA DGX Cloud.
Cosmos developers can also use NVIDIA NeMo Curator on DGX Cloud for accelerated data processing and curation. Linker Vision and Milestone Systems are using it to curate large amounts of video data to train large vision language models for visual agents built on the NVIDIA AI Blueprint for video search and summarization. Virtual Incision is exploring its deployment in future surgical robots, while Uber and Waabi are advancing autonomous vehicle development.
Driving Responsible AI and Content Transparency
In line with NVIDIA’s trustworthy AI principles, NVIDIA enforces open guardrails across all Cosmos WFMs. In addition, NVIDIA is collaborating with Google DeepMind to integrate SynthID to watermark and help identify AI-generated outputs from the Cosmos WFM NVIDIA NIM microservice featured on the build platform.
Availability
Cosmos WFMs are available for preview in the NVIDIA API catalog and are now listed in the Vertex AI Model Garden on Google Cloud. Cosmos Predict and Cosmos Transfer are openly available on Hugging Face and GitHub. Cosmos Reason is available in early access.
Source: NVIDIA
About NVIDIA
NVIDIA Corp. is an American tech company headquartered in Santa Clara, CA. Renowned for designing and manufacturing graphics processing units (GPUs), NVIDIA’s innovations have significantly impacted various sectors. The company’s products and services cater to industries such as gaming, where its GPUs enhance visual experiences; artificial intelligence (AI), providing high-performance computing solutions; automotive, contributing to autonomous vehicle technologies; and robotics, offering advanced AI perception and simulation tools. Over its more than three decades in business, NVIDIA has experienced substantial growth. In the fiscal quarter ending January 2025, the company reported record revenue of $39.3 billion and a net income of $22.1 billion. NVIDIA’s headquarters, designed to facilitate a flat organizational structure, emphasizes information flow and harmony between leadership and employees.