
NVIDIA Launches Cosmos 3 Open Model for Physical AI

Jun 8, 2026

Open world model combines vision reasoning, generation and action prediction for robots, AVs and vision systems
Cosmos 3. Image: NVIDIA

TAIPEI, Taiwan (NVIDIA GTC), June 8, 2026 – NVIDIA launched Cosmos 3, an open world foundation model for physical AI that combines vision reasoning, world generation and action prediction in one system. The model uses a mixture-of-transformers architecture for robots, autonomous vehicles and vision AI systems.

Cosmos 3 is an open omnimodel that can process and generate text, images, video, ambient sound and actions with physics-based accuracy. According to NVIDIA, the model can reduce physical AI training and evaluation cycles from months to days.

NVIDIA also launched the NVIDIA Cosmos Coalition, a collaboration between world model builders and AI developers. Members include Agile Robots, Black Forest Labs, Generalist, LTX, Runway and Skild AI, which are working on next-generation world models.

“The big bang of physical AI is just around the corner thanks to breakthroughs in multimodal reasoning language, vision and world models,” said Jensen Huang, founder and CEO of NVIDIA. “The Cosmos 3 family of open, frontier omnimodels gives developers a generational leap in ability to build robots, autonomous vehicles and vision AI that perceive, reason, plan and act in the physical world.”

A New Architecture for Physical AI

Cosmos 3 addresses a core challenge in physical AI: helping robots, autonomous vehicles and vision agents generalize in real-world settings with limited training data and fragmented simulation stacks.

The model’s mixture-of-transformers architecture pairs a reasoning transformer with an expert generation transformer. This structure allows Cosmos 3 to process object interactions, motion and spatial-temporal relationships before generating video and action trajectories.
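The announcement gives no implementation details, so the following is only an illustrative sketch of the general mixture-of-transformers idea, not Cosmos 3's actual design. It assumes one common formulation: every token attends jointly over the whole sequence, while the projection and feed-forward weights it passes through belong to its own expert (here, index 0 for reasoning tokens and 1 for generation tokens). The names `ExpertParams`, `mot_layer` and `expert_ids` are hypothetical, and layer normalization, multiple heads and causal masking are omitted for brevity.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


class ExpertParams:
    """Weights owned by one expert (hypothetical layout)."""

    def __init__(self, d_model, d_ff, rng):
        s = 1.0 / np.sqrt(d_model)
        self.wq = rng.normal(0.0, s, (d_model, d_model))
        self.wk = rng.normal(0.0, s, (d_model, d_model))
        self.wv = rng.normal(0.0, s, (d_model, d_model))
        self.wo = rng.normal(0.0, s, (d_model, d_model))
        self.w1 = rng.normal(0.0, s, (d_model, d_ff))
        self.w2 = rng.normal(0.0, 1.0 / np.sqrt(d_ff), (d_ff, d_model))


def mot_layer(x, expert_ids, experts):
    """One single-head layer.

    x:          (seq_len, d_model) token embeddings
    expert_ids: (seq_len,) integer index of the expert that owns each token
    experts:    list of ExpertParams
    """
    q, k, v = np.empty_like(x), np.empty_like(x), np.empty_like(x)
    # Each token is projected with its own expert's Q/K/V weights.
    for i, p in enumerate(experts):
        m = expert_ids == i
        q[m], k[m], v[m] = x[m] @ p.wq, x[m] @ p.wk, x[m] @ p.wv

    # Attention is shared: every token can attend to every other token,
    # regardless of which expert owns it.
    scores = q @ k.T / np.sqrt(x.shape[1])
    attn = softmax(scores, axis=-1) @ v

    out = np.empty_like(x)
    # Output projection and feed-forward block are again per expert.
    for i, p in enumerate(experts):
        m = expert_ids == i
        h = x[m] + attn[m] @ p.wo                       # residual after attention
        out[m] = h + np.maximum(h @ p.w1, 0.0) @ p.w2   # residual after ReLU FFN
    return out


rng = np.random.default_rng(0)
experts = [ExpertParams(8, 16, rng), ExpertParams(8, 16, rng)]
x = rng.normal(size=(6, 8))
expert_ids = np.array([0, 0, 0, 1, 1, 1])  # 3 reasoning tokens, 3 generation tokens
y = mot_layer(x, expert_ids, experts)
```

In this sketch the generation tokens can read what the reasoning tokens computed through the shared attention step, while each expert keeps its own weights. That is one way a model could reason about a scene before producing video or action outputs.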

Cosmos 3 was trained on one of the largest multimodal physical AI datasets. The training gives developers a pretrained foundation for building physical AI systems with less data and lower training costs.

Developers can use Cosmos 3 as:

  • A world model or video foundation model that simulates physical environments and predicts future world states for training and evaluation.

Cosmos 3 models ranked first among open models across several physical AI benchmarks. These include Artificial Analysis, Physics-IQ, PAI-Bench and R-Bench for world generation accuracy, RoboLab and RoboArena for action policy, and the VANTAGE-Bench and TAR leaderboards for vision understanding.

The Cosmos 3 lineup gives developers options for different stages of physical AI development:

  • Cosmos 3 Super for post-training robotics and autonomous vehicle models that need high physics accuracy and generation quality.
  • Cosmos 3 Nano for high-quality video and action reasoning in fractions of a second.
  • Cosmos 3 Edge, coming soon, for real-time inference at the edge.

Cosmos Coalition Supports Open World Model Development

The Cosmos Coalition brings together world model builders, AI developers and physical AI companies to advance open world models across industries. Members can contribute models, research and evaluation techniques while using Cosmos 3 technologies, training tools and NVIDIA DGX Cloud infrastructure for large-scale training.

Founding coalition members include Agile Robots, Black Forest Labs, Generalist, LTX, Runway and Skild AI. The coalition is designed to support open development, broader interoperability and shared technical progress in physical AI.

Developers Build on Cosmos

Cosmos software underpins training and evaluation workflows in NVIDIA’s physical AI stack across industries. It now includes datasets for robotics, physics, human motion, autonomous driving, warehouse safety and spatial reasoning, along with physical AI agent skills for neural scene reconstruction, defect-image generation and video augmentation.

Physical AI developers are building on Cosmos across multiple industries. Users include Agile Robots, Doosan Robotics, LG Electronics, Samsung Electronics and Skild AI for robotics, Li Auto for autonomous vehicles, and Centific, Fogsphere, Linker Vision, Milestone Systems and Yuan for vision AI agents in industrial AI and smart spaces applications.

Availability

Cosmos 3 Super and Cosmos 3 Nano are available now, with Cosmos 3 Edge coming soon. Developers can try Cosmos 3 on build.nvidia.com, download open models from Hugging Face, customize models and generate synthetic data with Hugging Face Diffusers and resources on GitHub, and deploy the models as NVIDIA NIM microservices.

Model builders and software providers can use physical AI agent skills on GitHub for reasoning and synthetic data generation workloads. Access, customization and deployment are supported through inference services and cloud infrastructure partners, including Baseten, CoreWeave, Microsoft Azure, Nebius, Deep Infra and Classmethod.

Source: NVIDIA

About NVIDIA

NVIDIA, founded in 1993 and headquartered in Santa Clara, CA, designs and manufactures graphics processing units, systems on chips and networking hardware, and develops AI software and computing platforms such as CUDA. Its products serve industries including gaming, data centers, autonomous vehicles, professional visualization, robotics, health care, and energy. The company introduced the GPU in 1999 and later expanded into accelerated computing and AI infrastructure. In gaming, its GPUs support high-performance rendering, while in AI and high-performance computing, its systems provide the infrastructure for training and deploying large-scale models. NVIDIA also develops tools for robotics and autonomous driving.