> Cosmos 3 Nano is the compact version with 16B parameters and optimized for efficient inference. It’s designed to run on workstation-grade compute, like the NVIDIA RTX PRO 6000 GPU for real-time robotics inference and physical AI applications.
Looking forward to trying this out on my $10,000+ workstation-grade GPU, which needs an equally expensive setup to run.
As I understand it, they mean both computer vision and video gen, linked by a pretty robust world model. One of their hosted examples purely analyses an existing video; the other predicts a video from a static image (i.e. video gen).
Still impressive, given its artificially generated training sets.
Beats Nano Banana 1 but not yet competitive with 2 or Seedance 2, Grok Imagine, etc.
> Generates future observations and action sequences.
Is that just a complicated way of saying video gen?