What exactly is “the real thing”… does it even matter?

Jay, from Aofeisi
QbitAI | WeChat official account: QbitAI

Did AI kill me?

I have video proof: an opponent I couldn’t identify as human or AI took me out with a single shot.

And it happened in a world created by a world model.

Yeah, in this browser-based FPS with graphics blurred into a mosaic.

No game engine behind it, no physics rules, no rendering code.

Everything you see is being generated in real time by a world model called Agora-1.

Humans and AI are both fighting inside it.

Let’s watch the official product launch video first.

I watched it several times, and it gave me a very strange feeling.

It was unlike any demo I’d seen before. This is a company with a very distinctive aesthetic; the launch video feels like it was shot straight out of Black Mirror.

The uncanny-valley vibe is especially strong. I kept wondering whether the people in the video were AI-generated too.

I really couldn’t focus, so I had Codex turn it into a transcript. Roughly speaking, it says:

They built a world-model-driven multiplayer game, with up to four players, where humans and AI are mixed together, battling in the same AI-generated world.

Honestly, by this point I was getting pretty itchy to try it.

Enough talk—let me see what this game is about.

Then I took a closer look at the post, and sure enough, there was a game link.

The team even added this in the comments:

Go crush those noobs!

Fine.

What else is there to say.

A lifelong dream, finally realized.

During work hours, I got to play games at my desk, openly and shamelessly (just kidding).

The very first second I clicked in, I knew my initial instinct from the video was right.

This thing is not normal.

I’m not insulting it—the moment the page opened, the BGM was just off.

Even now that melody is still looping in my head…

The UI is weird too: dark, low-saturation… it constantly gives off a Black Mirror vibe.

The details are equally on point.

Hovering over buttons triggers simulated sound effects.

That kind of old radio texture—scratchy, grainy, like a horror game.

Anyway, let’s get serious and start.

Before the game begins, you have to choose a name.

I picked my pen name, then entered the waiting room.

Three short of a full squad. Hurry up, hurry up.

PS: I later found out it doesn’t actually have to be exactly four people; if you wait long enough, you can start with just two.

That’s a bit odd. Didn’t they say there are AI players? If you can’t fill the room, why not just let AI take the slot?

Beats me.

By the way, the rules of this game are actually very simple. Let me give you some background so it makes sense.

It pays homage to GoldenEye 007.

That was the classic 1997 Nintendo 64 game based on the James Bond film GoldenEye, and it’s widely considered one of the starting points of console FPS multiplayer.

The rules are extremely simple: a few people share a split screen and shoot each other with pistols, SMGs, rocket launchers, and the Golden Gun. Just kill the other players.

A pure deathmatch. No story, no objectives—just chase and shoot.

Agora’s game design is basically following that template.

Alright, the game begins.

Once inside, the scene has a Backrooms-style look, like this.

Even creepier are the players… other characters occasionally flicker into view, with no footsteps, sliding past like they’re ice skating.

Seriously, the character movement is absurd. Everyone looks pretty uncanny. I genuinely couldn’t tell which ones were AI and which ones were real players.

And I have to complain: the controls are infuriating!!

You can’t use the mouse to adjust the view; instead, you have to use the left and right keys.

There’s input lag, plus a strange lingering delay, so moving feels like drifting on ice.

Who thought this was a good idea?

And the aiming is impossible!

You can’t stop in time at all; the cursor just won’t stay on the enemy.

Then I died.

I didn’t hit a single shot…

The other side must be AI—why the hell can you aim that well?!

I’m so pissed!! Don’t laugh!! It’s not my fault!!

Even the death screen is maddening, a deep blood-red.

At the end, it shows you your stats.

Not bad, honestly. Basically noobs beating up noobs.

(Or maybe it was just because the other side was bots…)

Beyond the matches themselves, the game hides some pretty interesting stuff.

For example, if you click the information button, you can see Odyssey’s company intro.

And players say you can glitch into the brick blocks in the game.

After that, the world model automatically fills in the missing part.

It doesn’t crash, it doesn’t black out, it improvises a space you were never supposed to see.

That’s so interesting.

In traditional games, everything beyond the map boundary is void—the place the programmers never wrote.

But a world model has no concept of boundaries.

Still, the real point isn’t the game itself.

Think back to the controls I just complained about. By the standards of a traditional game, that logic sounds simple enough.

But don’t forget: this is an AI-generated world.

There are no hard-coded physics rules, no pre-made map textures. Every frame you see, including the out-of-bounds scenes you were never meant to see, is computed by the model in real time.

Using GoldenEye as a testbed is also a great flex.

Chaotic split-screen play is hard precisely because it so easily exposes desynchronization and incoherence.

For a multiplayer FPS, you have to make sure everyone is seeing the same world, and this continuously simulated environment has to stay consistent at all times.

More importantly, the game world is interactive in real time, so it’s easy for things to spiral out of control.

Striking a balance between complexity and playability is incredibly hard.

So, who built this thing?

Odyssey, fully focused on general world models

The company behind this game was founded in 2023 and is called Odyssey.

Yes, the same name as the ancient Greek epic about the hero Odysseus.

The name actually fits the company’s overall vibe pretty well—you can tell just by looking at the visual design.

It’s an AI lab focused on general world models. Basically all of its products are world models.

The founders’ backgrounds are also interesting: Oliver Cameron and Jeff Hawke, both of whom came from self-driving.

In July 2024, they announced their first outside funding: a $9 million seed round led by GV.

A few months later, Odyssey closed another $18 million Series A, bringing total funding to $27 million.

Originally, though, their business wasn’t about games at all. Back then, AI video was the trend. But now the story is shifting toward active interaction.

Agora-1 is their latest achievement.

Its biggest feature is:

multiplayer.

Previously, no matter how impressive world models were, there was only one person inside them.

You’d wander around an AI-generated world alone, taking in the scenery and exploring, but no matter how detailed the visuals were, it was still a single-player experience.

Agora-1 can bring in up to four players, all interacting in real time inside the same generated world.

(Though it’s not exactly friendly.)

So, why is multiplayer so hard?

This is actually pretty interesting and worth unpacking.

It’s not like nobody has tried before.

Two relevant reference points are Multiverse and Solaris.

Multiverse took a pretty intuitive approach: it stitches all players’ states into a split-screen image and treats that as a single picture to process.

It works, but it’s crude—not really fundamental.
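To make the split-screen idea concrete, here is a minimal sketch of tiling per-player frames into one image and cutting it apart again. It illustrates only the general idea described above; it is not Multiverse's actual code, and the function names and tiny 2x2 frames are hypothetical.

```python
# Illustrative only: tile per-player frames into one split-screen image.
# Frames are tiny 2D lists of pixel values; a real system would use tensors.

def stitch_side_by_side(frames):
    """Concatenate equally sized frames left to right into one image."""
    height = len(frames[0])
    if any(len(f) != height for f in frames):
        raise ValueError("all frames must have the same height")
    return [sum((f[row] for f in frames), []) for row in range(height)]


def split_views(image, num_players):
    """Cut the stitched image back into one frame per player."""
    width = len(image[0]) // num_players
    return [
        [row[p * width:(p + 1) * width] for row in image]
        for p in range(num_players)
    ]


player_a = [[1, 1], [1, 1]]  # hypothetical 2x2 frame for player A
player_b = [[2, 2], [2, 2]]  # hypothetical 2x2 frame for player B

joint = stitch_side_by_side([player_a, player_b])
print(joint)                  # one image that a single-view model could process
print(split_views(joint, 2))  # the per-player views recovered afterwards
```

The model only ever sees `joint`, which is why the approach needs no multiplayer-specific machinery, and also why it feels crude.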

Solaris, meanwhile, concatenates each participant along the sequence dimension of a single autoregressive diffusion Transformer, creating a more robust shared simulation.

But the problem is obvious too: once too many people join, the context blows up and scalability gets bad.
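A rough back-of-the-envelope sketch shows why the context blows up. It assumes plain full self-attention, where every token attends to every other token; the tokens-per-frame and frames-in-context figures are made-up illustrative values, not numbers from Solaris.

```python
# Assumed, illustrative sizes; not taken from any real model.
TOKENS_PER_FRAME = 256   # tokens used to encode one player's frame
FRAMES_IN_CONTEXT = 16   # how many past frames the model keeps in context


def context_tokens(players, tokens_per_frame, frames_in_context):
    """Sequence length when every player's frames share one sequence."""
    return players * tokens_per_frame * frames_in_context


def attention_pairs(num_tokens):
    """Token pairs compared by full self-attention (grows quadratically)."""
    return num_tokens * num_tokens


for players in (1, 2, 4, 8):
    n = context_tokens(players, TOKENS_PER_FRAME, FRAMES_IN_CONTEXT)
    print(f"{players} players: {n} tokens, {attention_pairs(n)} attention pairs")
```

Under these assumptions the sequence length grows linearly with the player count, but the attention work grows with its square: four players cost sixteen times as much as one.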

And both of them share one pain point:

When players move out of each other’s view, it becomes very hard to maintain consistency.

In plain terms, the model’s brain runs out of room.

To reduce the load, Agora-1 explores a different path—

decoupling simulation from rendering.

Agora-1 learns two different functions.

1. Simulation.

It learns how the world state changes over time and how those changes respond to player interactions.

To do this, the team trains the model directly on the internal state of one or more games.

In Agora-1’s case, that game is GoldenEye. The model learns the underlying game dynamics and how player actions trigger state transitions.

2. Rendering.

Here, Agora-1 learns how to turn that shared state into visual output.

This is done through a DiT-based world model. It conditions directly on the shared game state rather than on prompts, images, or other traditional conditioning signals.

You can loosely think of this split as similar to the structure of a modern game engine.

The difference is that both components are learned by the model itself, without relying on hand-written game logic or rendering rules.

The result is that the underlying game state can be manipulated directly.

In other words, Agora-1 can generate brand-new levels while preserving the same game dynamics as the source game.

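That shared state is also the secret to keeping multiplayer consistent: every player’s view is rendered from the same underlying state, not generated independently.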
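The simulation/rendering split can be sketched in a few lines. This is a toy illustration under loud assumptions: `simulate` and `render` are hand-written stand-ins for what Agora-1 learns as models, and the `Player` fields, action format, and player names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Player:
    x: int              # position on a 1D toy map
    alive: bool = True


def simulate(state, actions):
    """Stand-in for the learned simulation: one shared state update per tick."""
    # Apply each player's movement first.
    moved = {
        name: Player(p.x + actions.get(name, {}).get("move", 0), p.alive)
        for name, p in state.items()
    }
    # Then resolve shots fired by players who were alive at the start of the tick.
    for name, action in actions.items():
        target = action.get("shoot")
        if name in state and state[name].alive and target in moved:
            moved[target] = Player(moved[target].x, alive=False)
    return moved


def render(state, viewer):
    """Stand-in for the learned renderer: one player's view of the shared state."""
    me = state[viewer]
    return {
        name: ("down" if not p.alive else f"offset {p.x - me.x:+d}")
        for name, p in state.items()
        if name != viewer
    }


state = {"jay": Player(0), "bot": Player(5)}
state = simulate(state, {"jay": {"move": 1, "shoot": "bot"}, "bot": {"move": -1}})

print(render(state, "jay"))  # jay's view
print(render(state, "bot"))  # bot's view, derived from the same state
```

Because both views come from a single `state`, the two players cannot disagree about who was hit, even when they are out of each other’s sight. Editing `state` directly, for example to place players on a different map, leaves the update rule in `simulate` untouched.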

By the way, the day before Agora-1 was released, they also launched another thing.

And honestly, that one hit me even harder.

It’s called Starchild-1, which they describe as the first real-time multimodal world model.

Visual plus audio, generated in real time, and interactive.

You can make it play the piano, and as the keys go down, the sound comes out.

You can also use AI reconstruction to revisit a warm memory, like a wedding.

That opens up so much room for imagination.

AIGC content could perhaps be used to fill in the gaps in memory—those pieces you can’t manage to recall no matter how hard you try.

Does reality really matter?

I suddenly felt a little dazed.

I know these products are all still very early. The graphics are blurry, the controls are bad, the latency is high, and the experience is absolutely not good. They’re nowhere near the point that something like GPT-Image-2 has reached, where ordinary people feel an immediate, visceral sense of awe.

But while playing Agora-1, there was a moment when I genuinely spaced out.

I aimed at a character and shot. It fell.

I didn’t know whether it was a human or AI.

I didn’t know how the world I was looking at was being rendered.

I didn’t even know whether the world my opponent was seeing was the same as mine.

Then I suddenly remembered:

Everything I see is being computed by a model.

That feeling is very strange.

With the recent GPT update, everyone’s worried about AI-generated fake chat logs, as if the age of photo evidence is coming to an end.

But honestly, I think images are still manageable—they’re just static, after all.

World models are different.

They simulate a continuously operating, multiplayer-shared, real-time evolving environment.

What they simulate is time and subjective experience itself.

To be honest, as I’ve watched world models keep evolving this year—from blurry to clear, from single-player to multiplayer, from visuals only to sound, touch, and full sensory experience…

I sometimes get goosebumps out of nowhere.

How do I know the world I’m in right now isn’t itself generated by some higher-level world model?

In 1997, kids chased each other around on the tiny split-screen of the N64 and thought that was the coolest thing ever.

In 2026, AI learned to generate worlds on its own, and its creators “lured” me into one.

At today’s pace of AI development, what will 2035 look like?