
Deep learning inference throughput several times that of mainstream data center GPUs, with very low latency.
General-purpose architecture optimized for diverse inference workloads, including Computer Vision, Natural Language Processing, and Recommender Systems.
Integrated high-density video decoders reduce both capital expenditures (CAPEX) and operating expenses (OPEX) for cloud and edge deployments.