Vastai Technologies Releases First High-Performance, Ultra-Low-Latency General-Purpose AI Inference Chip for Data Centers


SHANGHAI, CHINA, July 7, 2021 – Vastai Technologies (Shanghai) Inc. (“Vastai Technologies” or “Vastai”), a provider of high-performance AI and video processing semiconductors, announced its first general-purpose AI inference chips, the SV100 series, and the VA1 general-purpose inference accelerator add-in card for data centers on the first day of the World Artificial Intelligence Conference (WAIC) 2021. The SV100 series and VA1 are expected to reach mass production in the fourth quarter of this year, delivering ultra-high-performance, ultra-low-latency deep learning inference and significantly reducing the deployment cost of data center and edge intelligence applications.

With continued technological progress and breakthroughs in the accuracy, performance, and other key metrics of algorithm models across many areas, the artificial intelligence industry has entered the stage of large-scale deployment of deep learning algorithms. Demand for inference processing power in data centers is growing rapidly worldwide, and the diverse application scenarios of enterprise customers have given rise to equally diverse requirements for AI acceleration chips; yet beyond mainstream GPUs, few strong alternatives exist on the market. Based on this insight into the industry's technical requirements and the clear market opportunity, and after more than two years of technical validation and integrated hardware-software development, Vastai Technologies has launched the SV100 series chips and the VA1 inference accelerator card, designed specifically for cloud data centers.

The SV100 series chips deliver excellent performance, with peak INT8 processing power of over 200 TOPS on a single chip and deep learning inference performance several times that of existing mainstream data center GPUs, combining ultra-high throughput with ultra-low latency. Vastai's in-house general-purpose architecture is heavily optimized for a wide range of deep learning inference workloads. The SV100 series supports the FP16, BF16, and INT8 data formats and enables rapid deployment of many mainstream neural networks across diverse inference scenarios such as computer vision, video processing, natural language processing, search, and recommender systems. It also integrates up to 64 channels of H.264/H.265/AVS2 1080p video decoding, allowing broad use in cloud and edge intelligence applications to improve customers' equipment utilization and reduce operating costs.
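To illustrate why INT8 support matters for inference throughput, the sketch below shows symmetric INT8 quantization in plain Python. This is a generic toy example, not Vastai's toolchain: converting FP32 model weights to 8-bit integers (roughly the step an inference stack performs before running models on INT8 hardware) shrinks data by 4x and lets the chip use its higher INT8 peak throughput.

```python
# Toy illustration of symmetric INT8 quantization (not Vastai's actual
# software stack): FP32 weights are mapped to the range [-127, 127]
# via a per-tensor scale, so INT8 hardware can execute the model.

def quantize_int8(values):
    """Map a list of floats to INT8 codes using a symmetric scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs else 1.0
    codes = [max(-128, min(127, round(v / scale))) for v in values]
    return codes, scale

def dequantize(codes, scale):
    """Recover approximate floats from the INT8 codes."""
    return [c * scale for c in codes]

weights = [0.5, -1.27, 0.02, 1.0]
codes, scale = quantize_int8(weights)
approx = dequantize(codes, scale)
```

The reconstruction error is bounded by half the scale step, which is why well-conditioned networks typically lose little accuracy when deployed in INT8.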

The VA1 inference accelerator card, based on the SV100 series chip, is a half-height, half-length, single-slot, 75-watt PCIe x16 card with 32 GB of memory and support for the PCIe 4.0 high-speed interface. It requires no auxiliary power connector and can be used in AI servers from any manufacturer, enabling high-density, high-performance deployment in data centers.

John Qian, Founder and CEO of Vastai, said, "The SV100 series is the achievement of all of us at Vastai, and I couldn't be prouder of our team. We have a deep understanding of our target customers' needs for throughput, latency, versatility, and cost. At the same time, forward compatibility is particularly important: our software stack is extremely flexible and scalable, supporting future emerging algorithm models and user-defined operator extensions. Moreover, computer vision applications, which account for more than half of China's AI application market, require high-density video decoding alongside AI processing to achieve end-to-end acceleration, and we did extensive work to balance the two in our design. Through core technology development and forward-looking planning, Vastai is officially releasing the SV100 series general-purpose AI inference chips and VA1 inference cards for data centers, which effectively address the industry's pain points in low latency, versatility, and video processing, and accelerate the deployment of intelligent applications in the cloud and at the edge."

Louis Zhang, Founder and CTO of Vastai Technologies, said, "The SV100 series chips are based on an advanced DSA architecture and achieve deep learning inference performance several times that of data center GPUs at the same power consumption. The VA1 inference card's 75-watt, half-height, half-length, single-slot design adapts seamlessly to a variety of AI servers, maximizing the density of deployed processing power. Our VastStream software platform supports models from common deep learning frameworks such as TensorFlow, PyTorch, and Caffe2, as well as the ONNX format, with highly customizable AI compilers that fully optimize model execution on Vastai hardware. We offer a complete software stack and tools that align with developers' established workflows, making it easy for users to migrate and deploy existing algorithms and applications to the Vastai hardware platform at very low cost."
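As a purely illustrative sketch of what an AI compiler's graph optimizer does (VastStream's internals are not public, so the pass, operator names, and representation below are invented for illustration), one classic optimization is fusing adjacent operators, such as a convolution followed by a ReLU, into a single kernel to eliminate an intermediate memory round trip:

```python
# Toy graph-optimization pass (illustrative only, not VastStream):
# fuse each adjacent conv+relu pair into a single "conv_relu" op,
# a common optimization that reduces intermediate memory traffic.

def fuse_conv_relu(ops):
    """Collapse every adjacent ("conv", "relu") pair into "conv_relu"."""
    fused = []
    i = 0
    while i < len(ops):
        if ops[i] == "conv" and i + 1 < len(ops) and ops[i + 1] == "relu":
            fused.append("conv_relu")
            i += 2  # consume both ops of the pair
        else:
            fused.append(ops[i])
            i += 1
    return fused

graph = ["conv", "relu", "pool", "conv", "relu", "fc"]
optimized = fuse_conv_relu(graph)
# optimized == ["conv_relu", "pool", "conv_relu", "fc"]
```

Real compilers perform this kind of rewriting on a full dataflow graph with shape and layout information, but the principle is the same: fewer kernel launches and less intermediate data written to memory means lower latency.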

About Vastai Technologies

Founded in December 2018 in Shanghai, Vastai Technologies has R&D branches in Beijing, Shenzhen, and Toronto. The company's core team comes from the world's top high-tech companies, with an average of more than 15 years of chip and software design experience. The company currently has a senior team of over 200 people and is growing rapidly. Vastai Technologies is dedicated to becoming a powerhouse of heterogeneous cloud and edge computing and one of the leading chip design companies in the world.