The Jotunn8 is engineered to redefine performance standards for AI datacenter inference, supporting prominent large language models. Standing as a fully programmable and algorithm-agnostic tool, it supports any algorithm, any host processor, and can execute generative AI like GPT-4 or Llama3 with unparalleled efficiency. The system excels in delivering cost-effective solutions, offering high throughput up to 3.2 petaflops (dense) without relying on CUDA, thus simplifying scalability and deployment.
Optimized for cloud and on-premise configurations, Jotunn8 ensures maximum utility by integrating 16 cores and a high-level programming interface. Its innovative architecture addresses conventional processing bottlenecks, allowing constant data availability at each processing unit. With the potential to operate large and complex models at reduced query costs, this accelerator maintains performance while consuming less power, making it the preferred choice for advanced AI tasks.
The Jotunn8's hardware extends beyond AI-specific applications to general processing (GP) functionalities, showcasing its agility. By automatically selecting the most suitable processing paths layer-by-layer, it optimizes both latency and power consumption. This provides its users with a flexible platform that supports the deployment of vast AI models under efficient resource utilization strategies.
This product's configuration includes power peak consumption of 180W and an impressive 192 GB on-chip memory, accommodating sophisticated AI workloads with ease. It aligns closely with theoretical limits for implementation efficiency, accentuating VSORA's commitment to high-performance computational capabilities.