
I built marshmallow castles in Google’s new AI world generator

Google DeepMind opens access to Project Genie, its AI tool for creating interactive game worlds from text or image prompts.

Starting Thursday, Google AI Ultra subscribers in the US can play with an experimental research prototype powered by a combination of Google’s latest world model, Genie 3; its image generation model, Nano Banana Pro; and Gemini.

Coming five months after the Genie 3 research preview, the move is part of a wider effort to collect user feedback and training data as DeepMind races to develop more capable world models.

World models are AI systems that generate an internal representation of an environment, which can be used to predict future outcomes and plan actions. Many AI leaders, including at DeepMind, believe that world models are an important step toward achieving artificial general intelligence (AGI). But in the near term, labs like DeepMind envision a go-to-market path that starts with video games and other forms of entertainment and branches out to training embodied agents, like robots, through simulation.

DeepMind’s release of Project Genie comes as the race to build world models heats up. Fei-Fei Li’s World Labs released its first commercial product, called Marble, late last year. Runway, an AI video generation startup, also launched a world model recently. And former Meta chief AI scientist Yann LeCun is starting AMI Labs to focus on developing world models.

“I think it’s exciting to be in a place where we can reach a lot of people and get feedback,” Shlomi Fruchter, director of research at DeepMind, told TechCrunch in a video interview, smiling from ear to ear, clearly delighted by the release of Project Genie.

The DeepMind researchers TechCrunch spoke to were upfront about the experimental nature of the tool. It can be inconsistent, sometimes producing impressively playable worlds, other times producing confusing results that miss the mark. Here’s how it works.


A clay-style castle in the sky made of marshmallows and candy. Image Credits: TechCrunch

You start with a “world sketch” by giving text instructions for both the environment and the main character, which you’ll later be able to guide around the world from a first- or third-person perspective. Nano Banana Pro generates a preview image based on your prompt, which you can, in theory, modify before Genie uses it as a jumping-off point for an interactive world. The editing mostly worked, but the model occasionally tripped up, giving my character purple hair when I asked for green.

You can also use real-life photos as the basis for a world, which, again, is hit or miss. (More on that later.)

Once you’re satisfied with the image, it takes only a few seconds for Project Genie to create a world for you to explore. You can also remix existing worlds into new ones by building on their prompts, explore featured worlds in the gallery, or use the randomizer tool for inspiration. You can then download videos of the worlds you’ve explored.

DeepMind only offers 60 seconds of world generation and navigation at the moment, partly due to budget and compute constraints. Because Genie 3 is an autoregressive model, it requires a lot of dedicated compute, which puts a tight cap on how much DeepMind can offer users.

“The reason we limited it to 60 seconds is because we wanted to bring it to more users,” Fruchter said. “Basically when you use it, there’s a chip somewhere that’s only yours and dedicated to your session.”

He added that extending sessions beyond 60 seconds would reduce the number of users who could test it.

“The environments are interesting, but sometimes, because of their level of interaction with natural forces, they are somewhat limited. However, we see that as a limitation that we hope to improve on.”

Whimsy works, reality doesn’t

Google received a cease and desist from Disney last year, so it won’t be building any Disney-related worlds. Image Credits: TechCrunch

By the time I got my hands on the model, the safety guardrails were already in place. I couldn’t produce anything resembling nudity, and I couldn’t produce worlds that even remotely resembled Disney or other copyrighted material. (In December, Disney hit Google with a cease and desist, accusing the company’s AI models of copyright infringement for, among other things, training on Disney characters and IP and producing unauthorized content.) I couldn’t even get Genie to generate a mermaid exploring an underwater fantasy world or a world with flying elephants.

Still, the demo was deeply impressive. The first world I built was an attempt to live out a little fairy tale: a castle in the clouds made of marshmallows, with a river of chocolate sauce and trees made of candy. (Yes, I was a chubby kid.) I asked the model to render it in a clay style, and it brought to life a whimsical world I would have devoured as a child, with pastel-and-white spires and turrets that looked puffy and tasty enough to tear off a piece and dip in the chocolate moat. (Video above.)

A “Game of Thrones”-inspired world that failed to look as photorealistic as I wanted. Image Credits: TechCrunch

That said, Project Genie still has some kinks to work out.

The model succeeds at creating worlds with artistic sensibilities, such as watercolor, anime, or classic cartoon aesthetics. But it tends to falter on photorealistic or cinematic worlds, which often come off looking more like a video game than real people in a real-life setting.

It also didn’t respond well when given real images to work with. When I gave it a picture of my office and asked it to create a world based on the picture as-is, it gave me a world with the same furniture as my office – a wooden desk, plants, a gray sofa – arranged in a different way. And it looked sterile and digital, not lifelike.

When I fed it a picture of my desk with a stuffed toy on it, Project Genie showed the toy moving through the space, and even had other objects reacting occasionally as it passed by them.

That interaction is something DeepMind is working on improving. There were several times when my characters walked through walls or other solid objects.

I asked Project Genie to animate a stuffed toy (Bingo Bronson) exploring my desk. Image Credits: TechCrunch

When DeepMind first released Genie 3, the researchers highlighted that the model’s autoregressive generation meant it could remember what it had already generated, so I wanted to test that by going back to previously generated parts of the environment to see if they would match. For the most part, the model succeeded. In one instance, though, when I had a cat exploring another desk, the model produced a second cup once I went back to the right side of the desk.

The part I found most frustrating was navigating the space, using the arrow keys to look around, the spacebar to jump or climb, and the WASD keys to move. I’m not a gamer, so this didn’t come naturally to me, but the keys were often unresponsive or sent me in the wrong direction. Trying to walk from one side of a room to a door on the other side was often a tortuous task, like trying to steer a cart with a broken wheel.

Fruchter assured me that his team is aware of these flaws and reminded me that Project Genie is an experimental prototype. In the future, he said, the team hopes to improve realism and interaction capabilities, including giving users more control over actions and positioning.

“We don’t think of [Project Genie] as a finished product that people can come back to every day, but we think there is a glimpse of something interesting and unique that can’t be done otherwise,” he said.
