Special Topic: China AIGC Innovation Application Forum 2025 at the Fair for Trade in Services
The China International Fair for Trade in Services - 2025 China AIGC Innovation Application Forum was held in Beijing on September 12, 2025, with the theme "From Large Models to Intelligent Agents, Driving AI's New Ecosystem." Zhou Weiwei, Chairman and CEO of Weiye Zhisuan, Vice President of SenseTime's Large Infrastructure Business Group, and Deputy Director of the Computing Power Industry Professional Committee of the China Electronics Chamber of Commerce, attended and delivered a speech.
The following is the speech transcript:
Distinguished guests, good morning! I am Zoe Zhou Weiwei. Standing on the podium of the Fair for Trade in Services again today, I feel particularly excited. Three years ago, when I first came here as a keynote speaker, "computing power" was still a foreign and distant concept to many people. At that time, my team and I had just completed the establishment of NVIDIA's Beijing AI Innovation Empowerment Center and delivered China's first professional AI training platform based on NVIDIA's SuperPOD architecture. It was from that moment that the Fair for Trade in Services became my annual "examination hall" for reporting progress and sharing insights.
Looking back over the past three years, computing power has evolved from being ignored to becoming highly sought after, from a technical topic to an industry core. However, behind the popularity, problems have emerged: Is there really a shortage of computing power? Can domestic chips actually be used? Can intelligent computing centers actually be profitable? These questions linger in the minds of almost every industry professional.
Today, I want to share some thoughts and practices from my team and me when facing these questions, including our efforts in promoting the construction of next-generation AIDC.
First, is there a shortage of computing power? Definitely yes, and it's a structural shortage! Many people might ask: if computing power is so scarce, why are there still idle intelligent computing centers? Financial industry professionals even tell us that computing power has become a non-performing asset in first-tier cities like Beijing, Shanghai, Guangzhou, and Shenzhen. The truth behind this is that computing power resources are seriously misallocated and regional development is uneven.
We can compare computing power to electrical power. The arrival of the AI era has accelerated the productivity revolution across all industries. In the future, AI tools will penetrate all industries like water, electricity, and the internet. Our national power grid infrastructure is already highly developed, but we still see surplus wind, solar, and hydroelectric power in central and western regions while the Yangtze River Delta faces power shortages.
Computing power is similar - it's not absolutely scarce, but structurally scarce.
Many traditional data centers haven't kept up with AI workload demands, while newly built intelligent computing centers lack technical and operational capabilities, resulting in "computing power without scenarios, scenarios without computing power." This is precisely the core reason why many computing power platforms cannot achieve profitability.
Second, can domestic chips be used? SenseTime has long been tracking almost all domestic chips on the market and testing compatibility with them. Our answer is that they can not only be used but are becoming the main force of the inference era. To be honest, more than a year ago domestic chips did struggle to meet AI developers' needs, but today we can say with certainty that they are not only usable but are becoming a key force driving AI application breakthroughs.
This is because AI is transitioning from the "training era" to the "inference era." Training requires extremely concentrated computing power and a mature ecosystem, while inference relies more on distributed computing power and energy efficiency, which is precisely where domestic chips have the advantage.
Our domestic chips still show obvious gaps against leading international flagship products in training scenarios and in overall performance, but each chip is superior, or even far ahead, in specific single-task scenarios. A cloud management platform can identify tasks and distribute them to the most suitable GPU clusters for execution. Even in training scenarios, heterogeneous fusion scheduling technology lets us combine domestic chips with international high-performance chips, so we can meet demand while promoting autonomous control.
At SenseTime's Large Infrastructure and in many facilities managed by Weiye Zhisuan, we have achieved large-scale deployment of multiple domestic chip projects through this technical approach. This is not a compromise, but a productivity upgrade achieved through technical optimization.
Today is the Fair for Trade in Services, and we focus on "service" and "trade." For computing power to truly become a tradable and inclusive service, it depends on two major innovations: technological innovation to improve service quality, and business innovation to expand the user base.
What characteristics should next-generation AIDC have? Cloud delivery, heterogeneous scheduling, and token billing are three key terms.
Cloud delivery: Computing power is essentially a service, not hardware. The bare metal era is over; customers need a more convenient and stable cloud experience. SenseTime's Large Infrastructure Cloud Management Platform 2.0 achieves full-process cloud-native delivery from training to inference, with stability reaching 99.998%.
Customer service data provided by Weiye Zhisuan to Beijing Telecom over the past three months shows that training task interruption rates have significantly decreased, and R&D efficiency has improved threefold. Please remember: the cost of cluster "jitters" far exceeds limited rental discounts.
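For a sense of scale, the 99.998% figure can be turned into a downtime budget. The short Python sketch below assumes the figure is time-based availability measured over a 365-day year, which the speech does not specify; the function name is illustrative only.

```python
def downtime_minutes_per_year(availability_percent: float, days: int = 365) -> float:
    """Minutes of downtime per year allowed by a given availability percentage."""
    minutes_per_year = days * 24 * 60
    return minutes_per_year * (1 - availability_percent / 100)


# Assumption: "stability" means time-based availability over one year.
budget = downtime_minutes_per_year(99.998)
print(f"{budget:.1f} minutes of downtime per year")
```

Under that assumption the budget works out to roughly ten and a half minutes a year.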
Second is heterogeneous fusion scheduling. Future computing power architecture must fuse multiple kinds of chips. Through system-level, software-defined computing power, we schedule chips of different architectures (NVIDIA, domestic chips, ASIC, etc.) in a unified way to maximize resource efficiency. In large multimodal model training projects, we used intelligent scheduling systems to decompose tasks across four different chip architectures, improving overall efficiency by 65% and reducing costs by 42%. This is what we call "letting professional chips do professional work."
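The idea of routing each task to the chip that suits it best can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scheduler described in the speech: the cluster names, task kinds, and throughput scores are all hypothetical, and a simple greedy rule stands in for whatever policy the real platform uses.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Cluster:
    name: str
    free_cards: int
    # Relative throughput per task kind (hypothetical benchmark scores).
    scores: dict[str, float] = field(default_factory=dict)


@dataclass
class Task:
    name: str
    kind: str
    cards: int


def schedule(tasks: list[Task], clusters: list[Cluster]) -> dict[str, Optional[str]]:
    """Greedy placement: each task goes to the best-scoring cluster with room."""
    placement: dict[str, Optional[str]] = {}
    for task in tasks:
        candidates = [
            c for c in clusters
            if c.free_cards >= task.cards and task.kind in c.scores
        ]
        if not candidates:
            placement[task.name] = None  # nothing fits; the task stays queued
            continue
        best = max(candidates, key=lambda c: c.scores[task.kind])
        best.free_cards -= task.cards  # reserve the cards on the chosen cluster
        placement[task.name] = best.name
    return placement


# Hypothetical clusters, one per chip architecture.
clusters = [
    Cluster("chip-a", 8, {"llm_training": 1.0, "llm_inference": 0.7}),
    Cluster("chip-b", 16, {"llm_inference": 0.9, "vision_inference": 0.8}),
    Cluster("chip-c", 4, {"vision_inference": 1.0}),
]
tasks = [
    Task("pretrain", "llm_training", 8),
    Task("chatbot", "llm_inference", 4),
    Task("ocr", "vision_inference", 4),
    Task("video", "vision_inference", 4),
]
plan = schedule(tasks, clusters)
print(plan)
```

In this toy run the second vision task falls back to "chip-b" once "chip-c" is full, which is the behavior a unified scheduler is meant to provide: the best-suited chip when it is free, another usable one when it is not.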
The last point is token billing, which is a key breakthrough in computing power commercialization. OpenAI recently announced significant reductions in token costs, while domestically, platforms that can truly provide computing power services in token output mode remain rare.
Based on SenseTime's Large Infrastructure Cloud Management Platform 2.0, Weiye Zhisuan has launched token billing services, which remain scarce in China. Users pay based on actual usage without bearing fixed asset investment risks, which greatly lowers the barrier to starting a business. Token sales combined with integrated training and inference improve GPU utilization and, compared to traditional node-based sales, can increase investor returns by more than 3 times. This is also a key to improving AIDC profitability.
If our technological innovation is about helping developers use computing power well, our business innovation is about making it affordable for everyone. Computing power is hard to sell fundamentally because it is too expensive, especially for startups. Yet if AI's moat is the ecosystem, these startups are the core seed users who build that moat. Only by optimizing their cash flow and lowering industry entry barriers can we truly bring the AI ecosystem to life.
Here I can share a small story. Many years ago, NVIDIA built Cambridge-1, a 600P intelligent computing center in Cambridge. Compared to current domestic scales it was very small, but computing power was extremely scarce at the time. So NVIDIA proposed letting enterprises that needed computing power into the center at a slight discount, on two conditions: first, the enterprise had to develop new projects, and second, those new projects had to give NVIDIA 1% equity. Among them, a company called AstraZeneca chose to move into Cambridge-1, and everyone knows what happened next. AstraZeneca developed a COVID vaccine, and this product's emergence allowed Cambridge-1 investors to recover, within that same year, costs originally expected to take 10 years.
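The difference between node-based rental and token billing, seen from the customer's side, comes down to simple arithmetic. The Python sketch below is illustrative only: every price and volume in it is a placeholder chosen for the example, not a figure from Weiye Zhisuan or SenseTime.

```python
def node_rental_cost(nodes: int, hours: float, price_per_node_hour: float) -> float:
    """Node-based rental: the customer pays for reserved time, used or idle."""
    return nodes * hours * price_per_node_hour


def token_cost(tokens: int, price_per_million_tokens: float) -> float:
    """Token billing: the customer pays only for tokens actually consumed."""
    return tokens / 1_000_000 * price_per_million_tokens


# Placeholder numbers: one node reserved for a 30-day month,
# versus 200 million tokens actually consumed in that month.
rental = node_rental_cost(nodes=1, hours=30 * 24, price_per_node_hour=50.0)
metered = token_cost(tokens=200_000_000, price_per_million_tokens=8.0)

# Monthly token volume at which the two bills would be equal.
break_even = rental / 8.0 * 1_000_000

print(f"node rental: {rental:.0f}, token billing: {metered:.0f}")
print(f"break-even volume: {break_even:.0f} tokens per month")
```

With these placeholder prices, a light user pays far less under token billing, and reserving a whole node only pays off once monthly volume passes the break-even point.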
Inspired by this, we launched the "Computing Power Bank" model, borrowing from Silicon Valley Bank's innovative spirit to build new "deposit, withdrawal, investment, and lending" mechanisms:
We know many computing power enterprises and listed companies have very strong capabilities, and traditional state-owned enterprises have also invested in building intelligent computing centers, but they face the challenge of operating outside their own industry. Enterprises can deposit their idle computing power into the computing power bank we manage, and we provide management and maintenance to improve asset utilization. Take one of our partners as an example: a traditional manufacturing enterprise with a 200-server cluster. Previously it could not find customers on its own, and while it tried to expand its market it could only watch these assets depreciate on its balance sheet. After it made a computing power bank deposit with us, we installed our cloud management platform, brought in external customers for shared scheduling, and added the cluster to our computing power pool. The cluster's utilization then rose to 75%, and we can offer different types of customers annual returns of at least 4%, while customers retain asset disposal rights and can withdraw their computing power at any time.
Another aspect is computing power investment. We can use computing power to take equity stakes in startups. The entire project evaluation for startups is managed through blockchain, with AI agents assisting our decisions for rapid disbursement. We can also adopt a joint operation plus revenue sharing approach. One of our customers, for example, is an AI pharmaceutical enterprise that would otherwise have had to buy hardware itself. This approach saved it 20 million yuan in initial hardware investment, money it could use to attract core AI talent for project development, and shortened its R&D cycle by more than 6 months.
Another is the computing power credit model. We can offer startups flexible rental loans that work like a computing power credit card, supporting rentals as short as one hour, with flexible settlement such as paying interest monthly and repaying principal at maturity. For example, we have a customer doing AI visual creative development and AI agent development. Through this model, they only need to pay 30% of traditional computing power rental costs during the testing phase to complete the entire development process. They pay us after launching their product and settling with their own customers, which significantly reduces cash flow pressure and helps developers accelerate their revenue and financing progress. When their financial situation improves, they return higher rental income than under traditional rental models.
Of course, the innovative models just described all depend on the technological service innovations mentioned earlier, particularly a physical cluster sharing plus logical resource isolation architecture that makes scheduling in the intelligent computing center more flexible. Over a development cycle, a customer's cluster usage is rarely a straight line; compared to large companies it rises and falls, with pronounced peaks and valleys. If a cluster and its service provider can offer good elastic scheduling, they can meet customer needs effectively while helping customers reduce development costs.
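The cash flow effect of the credit model can be illustrated with a small calculation. In the Python sketch below, only the 30% testing-phase share comes from the speech; the list price and the 10% premium on the deferred part are hypothetical values chosen for illustration, and the function name is made up.

```python
def credit_schedule(list_price: float, upfront_fraction: float,
                    premium_rate: float) -> dict[str, float]:
    """Split a rental bill into a testing-phase payment and a deferred payment.

    premium_rate is the extra charged on the deferred part (hypothetical).
    """
    upfront = list_price * upfront_fraction
    deferred = (list_price - upfront) * (1 + premium_rate)
    return {"upfront": upfront, "deferred": deferred, "total": upfront + deferred}


# 30% during testing is from the speech; the other numbers are placeholders.
plan = credit_schedule(list_price=100_000.0, upfront_fraction=0.30, premium_rate=0.10)
for label, amount in plan.items():
    print(f"{label}: {amount:,.0f}")
```

The startup's outlay before launch drops to under a third of the traditional bill, while the provider's total income ends up higher than a plain rental would have produced, which matches the trade described above.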
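Why sharing one physical cluster helps when demand rises and falls can be shown with a toy calculation. The Python sketch below compares the cards needed when every tenant reserves its own peak with the cards needed when tenants share a pool. The tenant names and demand figures are invented for illustration, and logical isolation between tenants is assumed rather than modeled.

```python
def dedicated_cards(demand: dict[str, list[int]]) -> int:
    """Cards needed if every tenant reserves enough for its own peak."""
    return sum(max(series) for series in demand.values())


def pooled_cards(demand: dict[str, list[int]]) -> int:
    """Cards needed if all tenants share one physical cluster."""
    return max(sum(step) for step in zip(*demand.values()))


def utilization(demand: dict[str, list[int]], capacity: int) -> float:
    """Average share of the capacity that is actually in use."""
    steps = len(next(iter(demand.values())))
    used = sum(sum(series) for series in demand.values())
    return used / (capacity * steps)


# Invented demand per tenant over four periods, in accelerator cards.
demand = {
    "tenant_a": [8, 64, 16, 8],
    "tenant_b": [32, 8, 8, 48],
    "tenant_c": [8, 8, 56, 16],
}
reserved = dedicated_cards(demand)
shared = pooled_cards(demand)
print(reserved, shared)
print(round(utilization(demand, reserved), 3), round(utilization(demand, shared), 3))
```

Because the tenants' peaks fall in different periods, the shared pool needs fewer than half the cards that separate reservations would, and its average utilization is correspondingly higher.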
So far, through the computing power bank model, we have helped numerous customers obtain high-quality computing power support at approximately 30% below market prices. For cultural tourism and consumer agent developers, overall resource utilization has improved from the previous industry average of 40% to over 75%, and our partner developers have seen average valuation growth of more than 2 times within the year.
Achieving computing power inclusiveness relies not only on infrastructure cost reduction and efficiency improvement, but also on model optimization, integrated base models, computing-power collaboration, an energy revolution, and the introduction of various financial tools. Domestic substitution is no longer just at the hardware level: with the step-by-step development of RISC-V, we can see that in the near future autonomous control at the instruction set level will definitely become another standard promoted by the country.
In our market, there is never a shortage of creative financial solutions; what we lack are planners and executors who understand both the computing power industry and finance, and who can string financial tools together into chains truly adapted to the computing power service industry. Our computing power should not become a high wall hindering innovation, but a cornerstone supporting dreams. Being a truly positive and constructive builder of the AGI era, selling computing power, managing it well, using it quickly, and using computing power to empower AI ecosystem development: I think this is our future development direction.
Finally, I want to share some professional insights from my three years in the computing power industry. The concept of artificial intelligence has existed for over 70 years, and now it has entered a fast development track. We see many practitioners, developers, and enterprises, some going overseas amid praise, others repeatedly falling and standing up again under difficult circumstances. This is a time when heroes and ambitious figures emerge, ideals and desires intertwine, and controversy and praise coexist. But in this era, talent and gifts can be realized, small and beautiful enterprises can overtake on the bend, and long-term oriented entrepreneurs will ultimately receive market attention and recognition. We usually call such times a golden age. Computing power is like the breathing and survival rights of this AGI golden age. We often say that if computing power is not free, data is meaningless, and all development is out of the question.
We hope to hold such breathing and survival rights in our own hands, gripping them a bit tighter, and hope that the computing power service industry can be somewhat different before and after my team and I arrived.
Finally, I want to thank this platform of the Fair for Trade in Services for witnessing my growth from what was once called the operator of "China's first computing power concept stock" to now having some industry voice as a practitioner. I also hope that next year's examination can once again report and share with everyone here. Thank you all, I am Zhou Weiwei.