On 520 Day, 4 Million AI Professionals Watched These Nearly 20 Talks and Dialogues on QbitAI | The 4th China AIGC Industry Summit
Here are the core takeaways.
Organizing Committee | Reporting from Aofeisi
QbitAI | Official Account QbitAI
Lobster, Harness… breakout products have kept arriving one after another, and Agent has become the buzzword everyone reaches for when talking about the next breakthrough.
AI in 2026 is surging forward, diverging, and moving toward real-world deployment, transforming from a “tool” into a “productivity system.” It is shifting from merely “generating content” to actually “completing tasks.”
@Everyone, it’s time to seriously get good at using AI!
Nearly 20 AI leaders gathered at the 4th China AIGC Industry Summit and confronted the industry’s sharpest questions head-on.
- Will Agent become the next-generation super entry point?
- Where is the real tipping point for AI applications?
- How will multimodality and spatial intelligence reshape future interaction?
- As models become increasingly homogenized, where do the real non-consensus insights lie?
Those answers were explored again and again on site.
The atmosphere was electric from start to finish. The venue was packed: every seat was taken, and attendees spilled into the aisles and along the walls. The online stream stayed just as lively, with comments, reactions, and discussion running nonstop.
On stage stood frontline practitioners and academic authorities who have spent years working deeply in AI, sharing hands-on industry insights and technical judgments. In the audience sat attendees and explorers tracking the pulse of the industry, all rushing to this once-a-year intersection of AIGC ideas.
So, let’s join QbitAI’s “carbon-based editors” to see what important signals this high-energy “let’s AI-ify everything right now” conference revealed.
The China AIGC Industry Summit is hosted by QbitAI. This year it brought together nearly 20 industry representatives. In-person attendance exceeded 1,000, online viewership reached around 4 million, and the event drew broad coverage from major media outlets.
Fang Han, Chairman and CEO of Kunlun Wanwei
Fang Han, Chairman and CEO of Kunlun Wanwei, opened the summit with a talk titled “How Should Individuals and Companies Respond to the Impact of Agent?”
Key takeaways:
- An industry or skill that operates in a closed loop and can tolerate failure is easier to replace. People with judgment and taste, however, can stay in the game for a long time.
- Token consumption is becoming an increasingly telling metric. Ordinary employees consume millions to tens of millions of Tokens per month; AI coding and technical roles consume hundreds of millions to tens of billions; and heavy Agent users can easily exceed a hundred billion per month. Tokens are already the “power consumption” of the AI era.
- With AI intervention, the growth ladder for individuals is compressed. In the past, there was a clear and orderly growth path, with employees gradually stepping up from entry-level. But now the trend is splitting into two extremes: either you are a beginner or a master, making it hard for the middle layer to exist. Once the intermediate “steps” disappear, how will ordinary people’s growth change? That is a phenomenon worth watching.
- There are five types of people AI can never replace: those who tell stories, those who generate ideas, those who define beauty, those who build systems, and those who rebuild paradigms.
- In most industries, a company adopting AI should aim to be No. 2. The No. 1 player bears extremely high trial-and-error and exploration costs, while a No. 3 player cannot capture the industry’s rewards and gets left behind. IT is the brutal exception: there, you can only compete for first place. But AI resets everyone to the same starting line.
Yi Zhengchao, CEO of Fengxing Online
Yi Zhengchao, CEO of Fengxing Online, shared his views under the theme “From AI Programming to AI Video: Co-creation Is the Core Lever of AI Productivity.”
Key takeaways:
- AI has brought major changes to the broader entertainment industry. First, costs and entry barriers have dropped sharply, supply has become abundant, and competition has grown fiercer. Second, formats such as web novels, IP characters, video, interactive content, and games have become more diverse. Third, AI brings the video industry not only new ways to create, but also support for running and managing entertainment companies. Fourth, in content creation, selecting for imagination matters more than before, and co-creation is the inevitable choice. Finally, AI creation can be immersive, and it also breaks down the boundary between creation and consumption.
- As an AI application company, Fengxing Online follows five principles: (1) trust AI, but do not meddle with models; (2) AI animated short dramas are popular, but they are not the whole of AI video; (3) although it is an AI video company, its success actually comes more from AI programming; (4) expanding individual value is important, but expanding the organization is even more important; (5) Agent is powerful, but co-creation is the lever.
- Co-creation is the social structure of the AI era. Today’s companies are no longer simply containers for “super employees” and “super Agents.” Instead, they are places that organize intellectual resources to tap external collective intelligence and generate more value.
- The co-creation network formed by employees, digital employees, and external partners is, in essence, a socialized ecosystem-style organizational structure.
- AI amplifies execution, but at the same time it also amplifies the state of “only making yourself feel good.” This is a common side effect in coding and content production alike, and the antidote is simple: produce results.
Lin Dahua, Executive Vice President and Chief Scientist of SenseTime
Lin Dahua, Executive Vice President and Chief Scientist of SenseTime, spoke on the theme “From Multimodal Unification to Spatial Intelligence: Toward a New Frontier of AI That Can Perceive, Generate, and Act.”
Key takeaways:
- No matter how fast the era changes, long-term vision is always what determines how far we can go. AI is a long-distance race, and only with the support of long-termism can we truly reach the future.
- In enterprise AI adoption, the large model itself is not the most important factor. The real bottleneck is connecting diverse forms of data (tables, Excel files, images, videos, web pages, knowledge bases, and more) into a single AI system. This alone often accounts for more than 70% of a company’s AI usage cost.
- Agent is the engine of this era, but the key to making that engine work in real scenarios lies in its ability to handle multiple modalities. The reason SenseTime Xiaohuanyuan has continued to achieve high growth is that it connects everything end to end, from messy data to deliverable outputs, truly delivering value to users.
- Beyond digital space lies an even larger world: physical space. Even today’s most advanced multimodal models remain highly fragile when they enter the real physical world, and this is the core bottleneck preventing robots from becoming general-purpose. To open the door to physical space, we must rethink the world from first principles.
- To truly break through spatial intelligence, language models and visual understanding/generation must be unified into a single model. In other words, one model must be able to handle linguistic expression while also generating elements of the visual world.
- SenseTime’s new-generation model SenseNova U1 unifies understanding, reasoning, and generation on a new foundation, allowing seamless movement between language and vision. It expresses understanding through language and imagination through vision, achieving truly coherent mixed text-and-image creation.
- “Unification” itself opens up new expressive space and new possibilities. Give an image generation model the ability to think, and give a thinking model the ability to imagine.
- The true agent of the future should be able to simultaneously analyze digital space and act in physical space within one “big brain.” It must integrate multimodal information for decision-making and also move nimbly in the physical world. The fusion of digital and physical space is the true destination AI should be heading toward.
Deng Yafeng, Vice President of Shanda Group and CEO of EverMind
Deng Yafeng, Vice President of Shanda Group and CEO of EverMind, spoke on the theme “Self-Evolution Driven by Long-Term Memory: From Tool-Based AI to a Digital Productivity System.”
Key takeaways:
- Lobster is like the iPhone 4 of the Agent era. It defined a product paradigm and gave people their first real sense that they now had an AI Jarvis that could work 72 hours straight. But it is not the final form; it needs to be continuously updated and surpassed.
- Claude 4 is an important milestone in Agent’s move toward autonomy. The paradigm shift from Chat to Agent ultimately helped Anthropic overtake OpenAI, while also bringing major changes to SaaS. In the past, SaaS delivered processes and UI; now, it is increasingly delivered through messages.
- Agent has two important characteristics: autonomy and self-evolution. The key supporting mechanism is long-term memory, which solves three problems. First, it abstracts and summarizes rapidly expanding context. Second, it remembers who the user is, what they like, and what goals and values they have. Third, based on that, it anticipates what the user needs.
- The stronger models become, the more memory becomes the most differentiating asset in business processes.
- If AI truly knows you inside out, it becomes a brand-new entry point for intent distribution. At that point, memory accumulation becomes crucial. Memory belongs to the individual and should be synchronizable across different Agents such as Codex, Claude Code, and Lobster.
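The three roles of long-term memory described above (condensing growing context, remembering the user, anticipating needs) can be illustrated with a toy sketch. This is not EverMind’s implementation: the class `LongTermMemory`, its fields, and the first-five-words “summarizer” are all hypothetical stand-ins, and a real system would use a model for summarization and prediction.

```python
from dataclasses import dataclass, field


@dataclass
class LongTermMemory:
    """Toy long-term memory; every name here is hypothetical."""
    max_turns: int = 4                            # raw turns kept verbatim
    summary: list = field(default_factory=list)   # condensed older context
    recent: list = field(default_factory=list)    # most recent raw turns
    profile: dict = field(default_factory=dict)   # who the user is, likes, goals

    def add_turn(self, text: str) -> None:
        # Role 1: keep context bounded by folding the oldest turns into a summary.
        self.recent.append(text)
        while len(self.recent) > self.max_turns:
            oldest = self.recent.pop(0)
            # Placeholder summarizer (assumption): keep only the first five words.
            self.summary.append(" ".join(oldest.split()[:5]))

    def remember(self, key: str, value: str) -> None:
        # Role 2: persist stable facts about the user.
        self.profile[key] = value

    def anticipate(self) -> list:
        # Role 3: derive one suggestion per stored goal (a deliberately naive rule).
        return [f"Prepare next step for: {value}"
                for key, value in self.profile.items()
                if key.startswith("goal")]

    def export(self) -> dict:
        # Memory belongs to the user, so expose it as plain data that a
        # different agent could load.
        return {"summary": list(self.summary), "profile": dict(self.profile)}
```

Exporting memory as plain data reflects the point that memory should be portable across Agents rather than locked inside one of them.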
Wang Xiaoye, Technical Director, Product Technology Division, Amazon Web Services
Wang Xiaoye, Technical Director in the Product Technology Division of Amazon Web Services, spoke on the theme “Breaking Through the Adoption Barrier for Agent: From the Most Powerful Models to Enterprise AI Agents.”
Key takeaways:
- Using Lobster as an individual is not the same as operating Lobster in an enterprise. For enterprises to run Agents safely, reliably, and stably, there are many barriers to overcome. Agents like Lobster show what lies on the other side, but enterprises still need a bridge to move toward production deployment.
- Amazon Web Services focuses on five layers when helping enterprises build Agentic AI. From the bottom up, they are inference compute, multi-model selection, enterprise data and knowledge, the Agent-building platform, and ready-to-use Agent applications.
- In enterprises, Coding Agents are already fairly mature, and the next explosion point is Working Agents. Amazon Web Services’ answer is Amazon Quick, which enables enterprise employees to use Agents safely, flexibly, and freely.
- Agents bring new challenges to data management. Memory needs to be shared, isolated, and able to coexist. Incorrect knowledge, outdated information, and contradictory content all affect an Agent’s judgment. People complain that Tokens are expensive, but in many cases the issue is not the unit price. It is that too much useless information is being fed into the model.
- In the context of Agents, Harness refers to all the software infrastructure other than the model itself. If the model is the CPU, the Harness is the operating system that makes it usable, and the Agent is the complete application that finally emerges. Amazon Bedrock AgentCore is that Harness, and its greatest value is letting users spend less effort on the Harness and more on their own business value.
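The model/Harness/Agent split can be made concrete with a minimal sketch. It does not use or imitate the Amazon Bedrock AgentCore API: the `Harness` class, the `CALL <tool> <arg>` convention, and `stub_model` are all hypothetical, chosen only to show that the Harness is the code around the model (tool registry, run loop, logging).

```python
from typing import Callable, Dict, List

# The "model" is just a function from prompt to text (the CPU in the analogy).
Model = Callable[[str], str]


class Harness:
    """Everything around the model: tool registry, run loop, logging."""

    def __init__(self, model: Model, tools: Dict[str, Callable[[str], str]]):
        self.model = model
        self.tools = tools
        self.log: List[str] = []

    def run(self, task: str, max_steps: int = 3) -> str:
        context = task
        for _ in range(max_steps):
            reply = self.model(context)
            self.log.append(reply)
            # Hypothetical convention: "CALL <tool> <arg>" requests a tool.
            if reply.startswith("CALL "):
                _, name, arg = reply.split(" ", 2)
                result = self.tools[name](arg)
                context = f"{task}\nTool {name} returned: {result}"
            else:
                return reply
        return "step limit reached"


def stub_model(prompt: str) -> str:
    # Stand-in for a real model: requests a tool once, then answers.
    if "returned:" in prompt:
        return "Answer: " + prompt.rsplit("returned: ", 1)[1]
    return "CALL upper quarterly report"


# The "Agent" is the finished application: a model plus a configured harness.
agent = Harness(stub_model, {"upper": str.upper})
```

Swapping `stub_model` for a different model leaves the harness untouched, which is the practical meaning of treating the model as a replaceable component.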
GenAI Talk: A Conversation with Shen Yujun, Chief Scientist of Ant Lingbo Technology
Shen Yujun, Chief Scientist of Ant Lingbo Technology, was among the first to introduce the concept of AIGA, at public venues such as the Zhongguancun Forum. He has argued that in the second half of AI 2.0, artificial intelligence should move from “enjoying itself” in the digital world to “working” in the physical world, shifting from AI-driven Content generation to Action generation.
In this morning’s GenAI Talk, Shen Yujun and Li Gen, Co-founder and Editor-in-Chief of QbitAI, held an in-depth dialogue on this topic under the theme “The Second Half of AI 2.0: From AIGC to AIGA.”
Key points from Shen Yujun:
- Large models have effectively captured the data dividend accumulated by the internet over the past decades. By contrast, there is still a major gap in the physical-world data available for robots. In the second half of AI, the more important question is how to move data from the digital world into the physical world.
- To build a brain for general-purpose robots in the physical world, spatial cognition is a crucial step. The key lies in how to convert sensor inputs into information and communicate that information more effectively to the model, as well as how to understand the world from the sensor-input stage itself.
- As for the technical path debate between VLA and world models, first, no matter how the technology evolves, data is indispensable. Second, neither path will be the final form. Once robot data accumulates to a certain scale, the two will inevitably merge, giving rise to models native to the physical world.
- As a forecast, within 1 to 2 years, several benchmark cases will emerge in which models can truly be deployed in production. Within 2 to 3 years, these will spread widely, enabling models to be used across more industries. Within 5 to 10 years, robots will begin to enter the consumer side in some form. Beyond 10 years, robots will truly become household devices.
- When robots enter every home, that will be the ChatGPT moment for embodied intelligence.