Google DeepMind announces Genie 3 general purpose world model



 Google DeepMind Blog:

Today we are announcing Genie 3, a general purpose world model that can generate an unprecedented diversity of interactive environments.

Given a text prompt, Genie 3 can generate dynamic worlds that you can navigate in real time at 24 frames per second, retaining consistency for a few minutes at a resolution of 720p.
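To put those figures in perspective, the short sketch below works out what 24 frames per second implies: how much time is available to produce each frame, and how many frames a session spans. Only the 24 fps figure comes from the announcement. The session lengths are assumed examples of "a few minutes", and the function names are purely illustrative, not part of any Genie 3 interface.

```python
# Illustrative arithmetic only. The 24 fps figure comes from the announcement;
# the session lengths below are assumed examples of "a few minutes".
FPS = 24


def frame_budget_ms(fps: int) -> float:
    """Time available to produce each frame, in milliseconds."""
    return 1000.0 / fps


def total_frames(fps: int, minutes: float) -> int:
    """Number of frames generated over a session of the given length."""
    return round(fps * minutes * 60)


if __name__ == "__main__":
    print(f"Per-frame budget at {FPS} fps: {frame_budget_ms(FPS):.1f} ms")
    for minutes in (1, 2, 3):  # assumed session lengths
        print(f"{minutes} min -> {total_frames(FPS, minutes)} frames")
```

In other words, real-time interaction at this rate leaves roughly 42 ms to respond to each user action, and staying consistent for a few minutes means staying consistent across thousands of generated frames.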


Towards world simulation

At Google DeepMind, we have been pioneering research in simulated environments for over a decade, from training agents to master real-time strategy games to developing simulated environments for open-ended learning and robotics. This work motivated our development of world models, which are AI systems that can use their understanding of the world to simulate aspects of it, enabling agents to predict both how an environment will evolve and how their actions will affect it.

World models are also a key stepping stone on the path to AGI, since they make it possible to train AI agents in an unlimited curriculum of rich simulation environments. Last year we introduced the first foundation world models with Genie 1 and Genie 2, which could generate new environments for agents. We have also continued to push the state of the art in video generation with our models Veo 2 and Veo 3, which exhibit a deep understanding of intuitive physics.

Each of these models marks progress on a different capability of world simulation. Genie 3 is our first world model to allow interaction in real time, while also improving consistency and realism compared to Genie 2.



Limitations

While Genie 3 pushes the boundaries of what world models can accomplish, it's important to acknowledge its current limitations:
  • Limited action space. Although promptable world events allow for a wide range of environmental interventions, they are not necessarily performed by the agent itself. The range of actions agents can perform directly is currently constrained.
  • Interaction and simulation of other agents. Accurately modeling complex interactions between multiple independent agents in shared environments is still an ongoing research challenge.
  • Accurate representation of real-world locations. Genie 3 is currently unable to simulate real-world locations with perfect geographic accuracy.
  • Text rendering. Clear and legible text is often only generated when provided in the input world description.
  • Limited interaction duration. The model can currently support a few minutes of continuous interaction, rather than extended hours.

Responsibility

We believe foundational technologies require a deep commitment to responsibility from the very beginning. The technical innovations in Genie 3, particularly its open-ended and real-time capabilities, introduce new challenges for safety and responsibility. To address these unique risks while aiming to maximize the benefits, we have worked closely with our Responsible Development & Innovation Team.

At Google DeepMind, we're dedicated to developing our best-in-class models in a way that amplifies human creativity, while limiting unintended impacts. As we continue to explore the potential applications for Genie, we are announcing Genie 3 as a limited research preview, providing early access to a small cohort of academics and creators. This approach allows us to gather crucial feedback and interdisciplinary perspectives as we explore this new frontier and continue to build our understanding of risks and their appropriate mitigations. We look forward to working further with the community to develop this technology in a responsible way.

Next steps

We believe Genie 3 is a significant moment for world models, where they will begin to have an impact on many areas of both AI research and generative media. To that end, we're exploring how we can make Genie 3 available to additional testers in the future.

Genie 3 could create new opportunities for education and training, helping students learn and experts gain experience. Not only can it provide a vast space in which to train agents such as robots and autonomous systems, but it can also make it possible to evaluate agents' performance and explore their weaknesses.

At every step, we’re exploring the implications of our work and developing it for the benefit of humanity, safely and responsibly.


