Z.ai

Review Cycle: March 2026

Source: Huawei

Executive Summary

Z.ai, formerly Zhipu AI, is a Chinese artificial intelligence (AI) company founded in 2019. It is one of China's 'AI Tiger' companies and is considered the third-largest large language model (LLM) player in China's AI industry. Its main product line is the General Language Model (GLM) series of pre-trained dialogue models. The company has also released the Ying text-to-video model and GLM-4.0, an open-source end-to-end speech large language model.

Alongside the company profile, this review covers Huawei's Atlas 350 AI accelerator, which is powered by the Ascend 950PR chip. The Atlas 350 targets the prefill stage of inference and delivers 1.56 PFLOPS of FP4 throughput, 2.87 times that of Nvidia's China-only H20. It carries 112 GB of Huawei's proprietary high-bandwidth memory (HBM), branded 'HiBL 1.0', and supports 2 TB/s of interconnect bandwidth over the new LingQu protocol.

Architecture & Design

The Atlas 350 is built on the Ascend 950PR chip, a significant upgrade over the previous-generation Ascend 910-class silicon. The chip itself is specified with 128 GB of memory at 1.6 TB/s of bandwidth, while the Atlas 350 card ships with 112 GB and tops out at 1.4 TB/s. Memory access granularity drops from 512 bytes to 128 bytes, and the chip supports 2 TB/s of interconnect bandwidth over the new LingQu protocol.
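To make the granularity change concrete, the sketch below estimates what fraction of each memory transfer is useful data when small records are fetched one at a time, as in embedding lookups for recommendation. The record sizes and the assumption of aligned, isolated accesses are illustrative and do not come from Huawei's specifications.

```python
import math


def bytes_fetched(useful_bytes: int, granule: int) -> int:
    """Bytes transferred when an aligned read is rounded up to whole granules."""
    return math.ceil(useful_bytes / granule) * granule


def fetch_efficiency(useful_bytes: int, granule: int) -> float:
    """Fraction of the transferred bytes that the caller actually asked for."""
    return useful_bytes / bytes_fetched(useful_bytes, granule)


# Hypothetical record sizes in bytes (illustrative, not from the spec sheet).
RECORD_SIZES = (64, 128, 256, 512)

for size in RECORD_SIZES:
    old = fetch_efficiency(size, 512)  # previous 512-byte granularity
    new = fetch_efficiency(size, 128)  # 128-byte granularity quoted above
    print(f"{size:>4} B record: 512 B granule {old:.1%}, 128 B granule {new:.1%}")
```

Under these assumptions, a 64-byte record uses 12.5% of a 512-byte transfer but 50% of a 128-byte one, which is why finer granularity matters most for sparse, small-record access patterns. Records of 512 bytes or more see no difference.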

The Atlas 350 is rated at 600 W, 200 W more than the H20. It is a server card aimed at search and recommendation, multimodal generation, and large language model inference. The Ascend 950PR is likewise designed for core inference workloads such as prefill and recommendation, with double the vector compute and finer-grained cache-line memory access than previous models.
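One way to see why a card with these numbers suits prefill is a simple roofline estimate. The sketch below takes the 1.56 PFLOPS FP4 peak and the 1.4 TB/s memory bandwidth quoted in this review and computes the arithmetic intensity (FLOPs per byte moved) at which a kernel stops being bandwidth-bound. Treating both figures as sustained rates is a simplifying assumption, and the function names are illustrative.

```python
PEAK_FP4_FLOPS = 1.56e15        # 1.56 PFLOPS, as quoted in this review
MEM_BW_BYTES_PER_S = 1.4e12     # 1.4 TB/s, as quoted in this review


def ridge_point(peak_flops: float, bandwidth: float) -> float:
    """Arithmetic intensity (FLOP/byte) where compute and memory limits meet."""
    return peak_flops / bandwidth


def attainable_flops(intensity: float, peak_flops: float, bandwidth: float) -> float:
    """Classic roofline bound: the lower of the compute and bandwidth ceilings."""
    return min(peak_flops, intensity * bandwidth)


ridge = ridge_point(PEAK_FP4_FLOPS, MEM_BW_BYTES_PER_S)
print(f"Ridge point: {ridge:.0f} FLOP/byte")

# Compare a low-intensity kernel with a high-intensity one (illustrative values).
for intensity in (1.0, 100.0, 10_000.0):
    bound = attainable_flops(intensity, PEAK_FP4_FLOPS, MEM_BW_BYTES_PER_S)
    print(f"{intensity:>8.0f} FLOP/byte -> at most {bound / 1e15:.4f} PFLOPS")
```

The ridge point comes out at roughly 1,114 FLOP/byte. Prefill processes many tokens per weight read and so tends toward high arithmetic intensity, whereas token-by-token decoding typically sits far below such a ridge and is limited by memory bandwidth instead.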

Performance & Thermal

The Atlas 350 delivers 1.56 PFLOPS of FP4 throughput, 2.87 times that of Nvidia's China-only H20. Its 112 GB of Huawei's proprietary 'HiBL 1.0' HBM, combined with FP4's smaller per-parameter footprint, allows larger models to fit on a single card than higher-precision formats would. On the thermal side, the card is rated at 600 W, 200 W more than the H20.
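Since the card draws more power than the H20 while also delivering more throughput, it is worth checking how the two effects net out. The sketch below combines the 2.87x throughput ratio with the rated power figures from this review. It assumes rated power is a fair proxy for actual draw under load, which may not hold in practice.

```python
ATLAS_350_POWER_W = 600                     # rated power, from this review
H20_POWER_W = ATLAS_350_POWER_W - 200       # "200 W more than the H20" -> 400 W
FP4_SPEEDUP_VS_H20 = 2.87                   # throughput ratio, from this review


def relative_perf_per_watt(speedup: float, power_w: float, baseline_power_w: float) -> float:
    """Throughput-per-watt of a device relative to a baseline device."""
    return speedup * baseline_power_w / power_w


power_ratio = ATLAS_350_POWER_W / H20_POWER_W
efficiency_ratio = relative_perf_per_watt(
    FP4_SPEEDUP_VS_H20, ATLAS_350_POWER_W, H20_POWER_W
)

print(f"Power ratio vs H20:          {power_ratio:.2f}x")
print(f"Throughput per watt vs H20:  {efficiency_ratio:.2f}x")
```

On these figures the Atlas 350 draws 1.5 times the power for 2.87 times the throughput, or about 1.91 times the FP4 throughput per rated watt.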

The Atlas 350's advantage over the H20 is substantial: nearly three times the FP4 compute. The card is optimized for FP4 precision, which raises throughput in inference workloads by reducing the amount of data moved per operation. Because it is built on an Ascend processor, the Atlas 350 integrates with the rest of Huawei's Ascend-based AI infrastructure.
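As a rough illustration of what FP4 means for model size, the sketch below computes an upper bound on the parameter count that would fit in 112 GB at several precisions. It assumes decimal gigabytes and counts weights only, ignoring KV cache, activations, and any per-block scaling metadata that a real FP4 format may carry, so practical limits will be lower.

```python
MEMORY_GB = 112  # Atlas 350 memory capacity, from this review

# Bits per parameter for common formats (weights only, no scaling metadata).
BITS_PER_PARAM = {"FP16": 16, "FP8": 8, "FP4": 4}


def max_params_billions(memory_gb: float, bits_per_param: int) -> float:
    """Upper bound on parameters (in billions) if weights alone filled memory."""
    total_bits = memory_gb * 1e9 * 8
    return total_bits / bits_per_param / 1e9


for name, bits in BITS_PER_PARAM.items():
    limit = max_params_billions(MEMORY_GB, bits)
    print(f"{name}: up to {limit:.0f}B parameters (weights only)")
```

Halving the bits per parameter doubles the bound: 56B at FP16, 112B at FP8, and 224B at FP4 under these assumptions.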

Market Positioning

Z.ai is one of China's 'AI Tiger' companies and is considered the third-largest LLM player in China's AI industry. Its core offering is the General Language Model (GLM) series. The Atlas 350 AI accelerator discussed in this review is Huawei hardware built on the Ascend 950PR chip.

The Atlas 350 is positioned against Nvidia's H20 and is claimed to deliver nearly three times the H20's compute. Because it is built on an Ascend processor, it integrates with the rest of Huawei's AI infrastructure. The company has also announced plans to integrate closely with national data centers and cloud service providers to bring AI solutions to a wider range of customers.
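The "nearly three times" claim can be turned around to see what H20 baseline it implies. This review does not state the H20 figure directly, so the value below is only a back-calculation from the two numbers it does quote (1.56 PFLOPS and 2.87x), not an independent specification.

```python
ATLAS_350_FP4_PFLOPS = 1.56   # from this review
CLAIMED_RATIO_VS_H20 = 2.87   # from this review


def implied_baseline(throughput: float, ratio: float) -> float:
    """Baseline throughput implied by a claimed speedup ratio."""
    return throughput / ratio


h20_implied_pflops = implied_baseline(ATLAS_350_FP4_PFLOPS, CLAIMED_RATIO_VS_H20)
shortfall_from_3x = 3.0 - CLAIMED_RATIO_VS_H20

print(f"Implied H20 baseline: {h20_implied_pflops * 1000:.0f} TFLOPS")
print(f"Gap between the claimed ratio and a full 3x: {shortfall_from_3x:.2f}")
```

The implied baseline is roughly 544 TFLOPS, and the quoted ratio sits 0.13 short of a full 3x, so "nearly three times" is a fair rounding of the stated figure.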

Verdict

The Atlas 350 is a significant entry in AI inference hardware and is positioned against Nvidia's H20. It delivers 1.56 PFLOPS of FP4 throughput, 2.87 times that of the H20, and carries 112 GB of Huawei's proprietary 'HiBL 1.0' HBM. It is a server card aimed at search and recommendation, multimodal generation, and large language model inference.

Z.ai remains a company to watch, with the General Language Model (GLM) series at the center of its products and services. The Atlas 350, for its part, positions Huawei's Ascend platform to compete with Nvidia's H20 in AI inference.

Specifications

Ascend 950PR Chip: 1.56 PFLOPS of FP4 throughput
HiBL 1.0 Memory: 112 GB
Interconnect Bandwidth: 2 TB/s
Power Consumption: 600 W
Compatibility: Huawei's Ascend processors

Frequently Asked Questions

What is the Atlas 350 AI accelerator?

The Atlas 350 is a server AI accelerator card powered by the Ascend 950PR chip. It is aimed at search and recommendation, multimodal generation, and large language model inference.

What is the Ascend 950PR chip?

The Ascend 950PR is the chip that powers the Atlas 350. It is a significant upgrade over the previous-generation Ascend 910-class silicon, with 128 GB of memory at 1.6 TB/s of bandwidth.

What is the performance advantage of the Atlas 350 over the H20?

The Atlas 350 delivers 1.56 PFLOPS of FP4 throughput, 2.87 times that of the H20, or nearly three times its compute.