Huawei Cloud's new generation of Ascend AI cloud services is fully launched: the first to fully interconnect 384 Ascend NPUs and 192 Kunpeng CPUs peer-to-peer

Team Passionategeekz

Passionategeekz reported on June 20 that, at the Huawei Developer Conference 2025 (HDC 2025) held today, Huawei Executive Director and Huawei Cloud Computing CEO Zhang Ping'an announced that the new generation of Ascend AI cloud services based on CloudMatrix384 supernodes is fully launched, providing "surging computing power" for large-model applications.

With demand for computing power in large-model training and inference growing explosively, traditional computing architectures can no longer support the generational leap in AI technology. Huawei Cloud's new generation of Ascend AI cloud services is based on the CloudMatrix384 supernode, which is the first to connect 384 Ascend NPUs and 192 Kunpeng CPUs in a fully peer-to-peer fashion over the new high-speed MatrixLink network, forming one super "AI server". Single-card inference throughput has jumped to 2,300 tokens/s.

The supernode architecture better supports inference for mixture-of-experts (MoE) large models and enables "one card, one expert": a single supernode can run 384 experts in parallel, improving efficiency. Supernodes also support "one card, one computing task", flexibly allocating resources, improving parallel task processing, reducing waiting, and raising effective computing power utilization (MFU) by more than 50%.
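The "one card, one expert" idea can be illustrated with a minimal Python sketch. This is not Huawei's implementation: the function names `place_experts` and `dispatch`, the identity placement, and the example routing decisions are all hypothetical, chosen only to show how 384 experts could each occupy their own NPU so that tokens routed to different experts never share a card.

```python
from collections import defaultdict

NUM_NPUS = 384  # NPUs per supernode, as stated in the article


def place_experts(num_experts: int, num_npus: int = NUM_NPUS) -> dict[int, int]:
    """Return a hypothetical expert-id -> NPU-index map with one expert per card."""
    if num_experts > num_npus:
        raise ValueError("one card, one expert needs at least as many cards as experts")
    # Identity placement: expert i lives alone on NPU i (an illustrative assumption).
    return {expert: expert for expert in range(num_experts)}


def dispatch(token_experts: list[int], placement: dict[int, int]) -> dict[int, list[int]]:
    """Group token indices by the NPU that hosts each token's routed expert."""
    per_npu: dict[int, list[int]] = defaultdict(list)
    for token_idx, expert in enumerate(token_experts):
        per_npu[placement[expert]].append(token_idx)
    return dict(per_npu)


placement = place_experts(384)
# Four tokens routed (by some MoE gate, not modeled here) to experts 0, 383, 0 and 7.
batches = dispatch([0, 383, 0, 7], placement)
print(batches)
```

In this sketch, each NPU receives only the tokens for its single resident expert, so no card has to switch between experts.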

For training large models with trillions or tens of trillions of parameters, 432 supernodes can be cascaded in a cloud data center into an ultra-large cluster of up to 160,000 cards. Supernodes also support integrated deployment of training and inference computing power, such as "inference by day, training by night", so that computing power can be allocated flexibly between the two to help customers optimize resource use.
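The cluster figure can be checked with simple arithmetic. The sketch below assumes every cascaded supernode has the same configuration described in the article (384 NPUs and 192 CPUs); the variable names are illustrative.

```python
NPUS_PER_SUPERNODE = 384  # from the article
CPUS_PER_SUPERNODE = 192  # from the article
MAX_SUPERNODES = 432      # from the article

# Assumption: all supernodes in the cluster are identically configured.
total_npus = NPUS_PER_SUPERNODE * MAX_SUPERNODES
total_cpus = CPUS_PER_SUPERNODE * MAX_SUPERNODES

print(total_npus)  # 165888
print(total_cpus)  # 82944
```

The product, 165,888 NPU cards, is consistent with the rounded "160,000 cards" figure quoted above.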

Sina has worked closely with Huawei Cloud to build a unified inference platform for its "Smart Xiaolang" intelligent service system on the CloudMatrix384 Ascend AI cloud service, with Ascend AI computing power providing the underlying support. Inference delivery efficiency has improved by more than 50%, and model launch speed has multiplied; through software-hardware co-tuning, NPU utilization has improved by more than 40%.

SiliconFlow is using CloudMatrix384 supernodes to efficiently provide DeepSeek V3 and R1 inference services to millions of users. ModelBest uses CloudMatrix384 supernodes, which has improved the inference performance of its MiniCPM ("small cannon") model by 2.7 times.

In scientific research, the Chinese Academy of Sciences has built its own model training framework on CloudMatrix384 supernodes, quickly built its AI for Science research model, and ended its dependence on foreign high-performance AI computing platforms.

In the Internet sector, 360's Nano AI Search, which provides users with super AI search services, has also begun testing CloudMatrix384 supernodes.

Passionategeekz learned from the conference that Ascend AI cloud services currently provide AI computing power to more than 1,300 customers.
