Is the so-called “real thing” really that important?

Jay, reporting from Aofeisi
QbitAI | WeChat public account

Killed by AI?

There’s video evidence. I got taken out in one hit by something I couldn’t tell was human or AI.

And it happened inside a world created by a world model.

Yes, that shabby-looking browser FPS with the awful, pixelated graphics.

There’s no game engine, no physics engine, and no rendering code behind it.

Everything you see is generated in real time by a world model called Agora-1.

Humans and AI clash in the same arena.

First, let’s watch this official product launch video.

I watched it several times, and it still felt strangely unreal.

It’s nothing like the demos we’ve seen before. The company has a very distinctive aesthetic, and the launch video feels straight out of Black Mirror.

And that “fake” vibe is incredibly strong. I kept suspecting that even the humans in the video were AI-generated.

I couldn’t focus, so I had Codex transcribe it verbatim. In short, this is what it says: they built a world-model-driven multiplayer game where up to four players, a mix of humans and AI, fight inside the same AI-generated world.

Honestly, that got my hands itching.

Cut the fluff and just show me the game already.

Then I looked closer at the post and found the game link.

The team added this in the comments:

Beat those losers down!

Alright.

I’ve got nothing else to say.

My life’s dream has finally come true.

I now get to play games at my desk during work hours, openly and with a clear conscience (just kidding).

The very first moment I opened it, I knew my hunch from the video was right.

This is not normal.

Not an insult. The moment I opened the browser, even the BGM felt off.

That melody is still looping in my head now…

The UI is weird too. Dark tones, low saturation… it feels like you’ve stepped into an episode of Black Mirror.

Even the smallest details match the setting.

Hover your mouse over a button, and it plays a sound effect.

The audio quality is like an old radio, with a faint crackling hiss mixed in. It feels like a horror game.

Anyway, enough setup. Let’s get into it.

Before the game starts, you’re forced to choose a name.

I picked my pen name and went straight into the lobby.

Three players missing. Come on, hurry up.

P.S. I later found out you don’t actually have to fill all four slots; if you wait long enough, the game can start with just two players.

That was a little odd. Aren’t there AI players? If there aren’t enough, why not just add AI?

No idea.

Right, the rules of this game are actually quite simple. Once you add a bit of background, it makes perfect sense.

Its inspiration is GoldenEye 007.

This was a Nintendo 64 classic from 1997, based on the James Bond film GoldenEye, and it is considered one of the foundational titles for multiplayer FPS on consoles.

The rules are extremely simple. A group of players fight in split-screen mode with pistols, submachine guns, rocket launchers, the Golden Gun, and more, and whoever kills the others wins.

A pure kill-or-be-killed deathmatch. No story, no missions. Just chasing each other around and fighting.

Agora’s game design basically follows that formula.

Alright, time to play.

Once inside, I found a Backrooms-style map spread out before me, like this:

What’s even creepier are the players… sometimes another character flickers into view, gliding by with no footsteps at all, as if they’re skating.

Seriously, the character movement looks unnatural. Everyone looks fake at first glance. I genuinely couldn’t tell which ones were AI and which ones were human.

And I have to complain: the controls are terrible!!

You can’t move the view with the mouse—you have to use the left and right arrow keys instead.

On top of that, there’s awful latency, plus a strange sensation of being tugged backward, so moving feels like sliding on ice.

I have no idea what they were thinking.

And you can’t aim!

Nothing ever comes to a clean stop, so you can’t line the crosshair up precisely on an enemy.

Then I died.

I didn’t land a single shot…

The other side has to be AI. How are they so accurate over there?!

So annoying!! Stop laughing!! It’s not my fault!!

This death screen is pretty brutal too. Just a solid blood-red wash.

At the end, your results are shown.

Still, not bad. It really does feel like two bad players swinging wildly at each other.

(Though maybe the opponent was just a bot…)

Beyond the match itself, this game hides a few interesting tricks.

For example, if you press the information button, you can view Odyssey’s company introduction.

Also, according to players, there’s a bug that lets you clip through the brick walls and end up inside them.

After that, the world model automatically fills in the empty space.

No crash, no black screen. It improvises the space you were never supposed to see on the spot.

That’s way too cool.

In traditional games, the outside of the map is just void. It’s nothing more than an area the programmer didn’t write.

But a world model doesn’t think in terms of boundaries.

That said, what really matters isn’t the game itself.

Think back to the controls I just described. In a traditional game, they would be “just a simple mechanism.”

But don’t forget: this is an AI-generated world.

There’s no hard-coded physics, no prebuilt terrain textures. Every frame you see, and even the out-of-bounds scenes that shouldn’t exist, are computed by the model in real time.

Choosing GoldenEye as the testbed is also quite a flex.

A chaotic split-screen game is hard because synchronization issues and discontinuities show up immediately.

To make a multiplayer FPS work, everyone has to be looking at the same world. The continuously simulated environment has to stay consistent.

More importantly, because players interact with the game space in real time, the generated world can easily spiral out of control.

Balancing complexity and playability is extremely hard.

So who made this thing?

Odyssey, Going All In on General-Purpose World Models

The company behind this game is Odyssey, founded in 2023.

Yes, “Odyssey,” named after the ancient Greek hero Odysseus.

The name fits the whole company vibe really well. You can tell just by looking at the visual design.

It’s an AI lab focused on general-purpose world models, and almost every product they make is a world model.

The founders’ backgrounds are interesting too. Oliver Cameron and Jeff Hawke are both from the self-driving industry.

In July 2024, the company made its public debut with $9 million in seed funding, led by GV.

A few months later, Odyssey closed an $18 million Series A, bringing total funding to $27 million.

That said, their original business had nothing to do with games. Back then, AI video was hot, so that’s what they focused on. Now, the story is shifting toward active interaction.

Agora-1 is their latest result.

Its biggest feature is:

Multiplayer.

Traditional world models, no matter how beautiful, usually only had one person inside them.

You’d wander around an AI-generated world alone, looking at the scenery and exploring. No matter how good the visuals were, it was still just a single-player experience.

With Agora-1, though, up to four players can enter the same generated world and interact with each other in real time.

(Though it’s not exactly gentle.)

So why is multiplayer so hard?

This part is really interesting, so let’s dig into it a bit.

It’s not like no one has tried before.

Two prior references are Multiverse and Solaris.

Multiverse’s approach is relatively intuitive: it stitches together all player states into a single split-screen-like image and processes that as one input.

It works, but it’s pretty brute-force and not especially elegant.
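To make the stitching idea concrete, here is a toy sketch. It is not Multiverse’s actual code: each player’s frame is just a small grid of pixel values, and four frames are tiled into one 2x2 split-screen image that a single model would then treat as one input. The names `stitch_split_screen` and `make_frame`, and the tiny frame size, are assumptions made up for illustration.

```python
from typing import List

Frame = List[List[int]]  # one player's view as rows of pixel values


def stitch_split_screen(frames: List[Frame]) -> Frame:
    """Tile exactly four equally sized frames into a 2x2 split-screen image."""
    if len(frames) != 4:
        raise ValueError("this toy sketch expects exactly four player frames")
    height, width = len(frames[0]), len(frames[0][0])
    for frame in frames:
        if len(frame) != height or any(len(row) != width for row in frame):
            raise ValueError("all frames must share the same size")
    # Players 0 and 1 sit side by side on top, players 2 and 3 below them.
    top = [frames[0][y] + frames[1][y] for y in range(height)]
    bottom = [frames[2][y] + frames[3][y] for y in range(height)]
    return top + bottom


def make_frame(value: int, height: int = 2, width: int = 3) -> Frame:
    """Hypothetical helper: a flat frame filled with a single pixel value."""
    return [[value] * width for _ in range(height)]


if __name__ == "__main__":
    stitched = stitch_split_screen([make_frame(p) for p in range(4)])
    for row in stitched:
        print(row)
```

The layout is hard-wired to four players, and the combined image doubles in each dimension, which hints at why the approach feels brute-force.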

Solaris, on the other hand, concatenates all participants along the sequence dimension of a single autoregressive diffusion Transformer to produce a more robust shared simulation.

But there’s an obvious problem: as the number of players grows, the context balloons, and scalability suffers.
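A bit of back-of-the-envelope arithmetic shows why this scales poorly. The sketch below assumes every player contributes the same number of tokens and that the Transformer uses full self-attention over the concatenated sequence, so the number of token pairs grows with the square of the sequence length. Both assumptions, and the figure of 256 tokens per player, are illustrative and not taken from Solaris.

```python
TOKENS_PER_PLAYER = 256  # hypothetical value, for illustration only


def joint_sequence_length(players: int, tokens_per_player: int) -> int:
    """Length of the sequence after concatenating every player's tokens."""
    return players * tokens_per_player


def attention_pairs(sequence_length: int) -> int:
    """Token pairs compared by full self-attention (quadratic in length)."""
    return sequence_length * sequence_length


if __name__ == "__main__":
    baseline = attention_pairs(joint_sequence_length(1, TOKENS_PER_PLAYER))
    for players in (1, 2, 4, 8):
        length = joint_sequence_length(players, TOKENS_PER_PLAYER)
        pairs = attention_pairs(length)
        print(f"{players} players: {length} tokens, {pairs // baseline}x attention cost")
```

Under these assumptions, going from one player to four multiplies the sequence length by 4 and the attention cost by 16.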

These two also share a common pain point:

When players are out of each other’s view, it’s hard to maintain consistency reliably.

In other words, the model’s “brain” just runs out of capacity.

To reduce the load, Agora-1 took a different path:

separating simulation from rendering.

Agora-1 learns two different functions.

1. Simulation.

It learns how the world state changes over time, and how those changes respond to player actions.

To do this, the team trained the model directly on the internal state of one or more games.

In Agora-1’s case, that game is GoldenEye. The model learns the game’s basic mechanics and how player actions trigger state transitions.

2. Rendering.

Here, Agora-1 learns how to turn that shared state into visual output.

This stage is implemented with a DiT-based world model. It doesn’t rely on prompts, images, or other traditional conditioning signals; instead, it takes the shared game state directly as input.

Broadly speaking, this separation is similar to the structure of a modern game engine.

The difference is that the model itself learns both components. It doesn’t depend on hand-written game logic or rendering rules.

As a result, it can directly manipulate the underlying game state.

This means Agora-1 can generate new levels while preserving the same dynamics as the original game.

That’s the key to maintaining multiplayer consistency.
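The split between simulation and rendering can be sketched as a toy program. Everything here is a stand-in: in Agora-1 both functions are learned by models, while this sketch uses hand-written rules purely to show the data flow. The names `PlayerState`, `simulate`, and `render`, the grid world, and the move set are all assumptions for illustration.

```python
from dataclasses import dataclass, replace
from typing import Dict

MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0), "stay": (0, 0)}


@dataclass(frozen=True)
class PlayerState:
    x: int
    y: int
    health: int = 100


WorldState = Dict[str, PlayerState]  # one shared state for every player


def simulate(state: WorldState, actions: Dict[str, str]) -> WorldState:
    """Stand-in for the learned simulator: shared state + actions -> next state."""
    next_state = {}
    for name, player in state.items():
        dx, dy = MOVES[actions.get(name, "stay")]
        next_state[name] = replace(player, x=player.x + dx, y=player.y + dy)
    return next_state


def render(state: WorldState, viewer: str, radius: int = 2) -> str:
    """Stand-in for the learned renderer: draw the shared state around one viewer."""
    me = state[viewer]
    rows = []
    for y in range(me.y - radius, me.y + radius + 1):
        row = ""
        for x in range(me.x - radius, me.x + radius + 1):
            occupant = next((n for n, p in state.items() if (p.x, p.y) == (x, y)), None)
            if occupant == viewer:
                row += "@"
            else:
                row += occupant[0] if occupant else "."
        rows.append(row)
    return "\n".join(rows)


if __name__ == "__main__":
    state = {"alice": PlayerState(0, 0), "bob": PlayerState(5, 0)}
    for _ in range(3):
        state = simulate(state, {"bob": "left"})  # bob moves while out of alice's view
    print(render(state, "alice"))
    print()
    print(render(state, "bob"))
```

Because every view is drawn from the same shared state, bob’s position keeps updating while he is outside alice’s view, and he shows up in the right place once he walks into it.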

By the way, the day before Agora-1 was announced, they also released something else.

And honestly, that one hit even harder.

It’s called Starchild-1, which they describe as the “first real-time multimodal world model.”

It generates vision and audio simultaneously, in real time, and the two can interact.

You can even make it play the piano, with the keys moving down at the exact same time the notes are played.

Or you can revisit a warm memory, reconstructed by AI—for example:

a wedding.

That really opens up the imagination.

Maybe AIGC content can be used as material to fill in the gaps of memories we can’t quite recall.

Does the real thing matter?

Suddenly, I feel a little dazed.

I know these products are still in the early stages. The graphics are rough, the controls are bad, and the latency is high. It’s nowhere near a great experience, and it’s still far from the kind of instant shock ordinary users felt when GPT-Image-2 arrived.

Even so, while playing Agora-1, there was one moment when I genuinely felt my mind drift.

I aimed at a character and fired. The character went down.

I couldn’t tell whether it was human or AI.

I also had no idea how the world I was looking at was being rendered.

Let alone whether the world the other side was seeing was the same as mine.

At that moment, I suddenly remembered:

everything I was seeing was the result of model computation.

That feeling is pretty strange.

With recent GPT updates, people have been worrying that AI will fabricate fake chat histories, and some have even argued that images may no longer count as evidence.

But now I honestly think images are still better, because at least they’re just still frames.

World models are different.

They simulate continuously moving, multi-user, real-time evolving environments.

They simulate time itself, and subjective experience itself.

Honestly, watching world models keep evolving this year—from blurry visuals to sharp visuals, from one person to multiple people, from image-only worlds to audio, touch, and full-sensory experiences—sometimes gives me goosebumps.

How can we be sure that the world we’re in right now wasn’t generated by a higher-level world model?

In 1997, young people played split-screen chase-and-shoot in a tiny N64 window and thought it was the coolest thing ever.

In 2026, AI learned how to generate worlds on its own, and its creator “lured” me into one.

At the current pace of AI progress, what on earth will be happening in 2035?