Features
- 502 INT8 TOPs
- 204MB on-chip SRAM
- 75W TDP, 40W typical
- 8 TOPs/W
- At-Memory Architecture
- Scalable voltage and frequency
- Low latency, native batch = 1
- PCIe Gen4 x16
Applications
The runAI200 devices are designed to accelerate a wide range of AI inference and HPC workloads, such as vision-based convolutional networks, transformer networks for natural language processing, time-series analysis for financial applications, and general-purpose linear algebra for high-performance computing applications.
| Markets | Applications | Networks |
|---------|--------------|----------|
| Vision | Classification, object detection, semantic segmentation | ResNets, YOLO, SSD, U-Nets, pose estimation |
| Natural language processing | Text-to-speech, speech-to-text, chatbots | RNNs, Transformers, BERT |
| Financial technology | X-value adjustments, credit risk, portfolio balancing | TCNs, LSTMs |
| HPC | Climate modelling, deep packet inspection, simulations | FFTs, BLAS, arbitrary computation |
imAIgine Software Development Kit
The imAIgine SDK gives developers powerful automated tools and supporting software to go quickly from pilot model to production. It is organized into three parts:
- The imAIgine Compiler
- Import TensorFlow, PyTorch, or ONNX graphs directly
- Automated quantization that extracts performance without sacrificing accuracy
- Specify performance levels, silicon utilization, and power consumption targets
- The imAIgine Toolkit
- Evaluate functionality and performance using the extensive profiling and simulation tools
- The imAIgine Runtime
- Provides a C-based API for integration into your deep learning environment
- Monitors the health and temperature of the tsunAImi® acceleration cards to ensure proper operation and prevent thermal damage
Resources
- Webinar: Introduction to tsunAImi – Accelerating AI Inference
- Datasheet and whitepaper
- How to access the tsunAImi tsn200ES Accelerator: GitHub