The Jotunn8 AI accelerator is designed to outperform existing market offerings in AI inference and processing. It targets the massive compute requirements of large language models (LLMs) while keeping deployment costs low. The chip delivers 6,400 TFLOPS of fp8 floating-point performance through its tensor cores and is fully programmable: it can run arbitrary algorithms and interface with any host processor. This high-level programmability enables rapid algorithm deployment and operational flexibility. A standout feature is its ability to maintain algorithmic efficiency even with very large models such as GPT-4, allowing such applications to be served with minimal latency and power consumption. The device includes 192 GB of on-chip memory and has a peak power draw of 180 W, making it suitable for both cloud and on-premise deployments while minimizing operational cost.
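To see why 192 GB of on-chip memory matters for large models, here is a back-of-envelope sketch of whether a model's weights fit entirely on-chip at fp8 precision (one byte per parameter). The parameter counts used below are illustrative assumptions, not vendor figures, and the calculation ignores activations, KV caches, and runtime overhead.

```python
# Rough capacity check: weight footprint at fp8 vs. on-chip memory.
# Parameter counts are hypothetical examples, not vendor data.

ON_CHIP_MEMORY_GB = 192   # Jotunn8 on-chip memory (from the text)
BYTES_PER_PARAM_FP8 = 1   # fp8 stores one byte per weight

def weights_footprint_gb(num_params: float) -> float:
    """Return the model weight footprint in GB at fp8 precision."""
    return num_params * BYTES_PER_PARAM_FP8 / 1e9

# Illustrative model sizes
for name, params in [("70B model", 70e9), ("175B model", 175e9)]:
    gb = weights_footprint_gb(params)
    fits = gb <= ON_CHIP_MEMORY_GB
    print(f"{name}: {gb:.0f} GB of weights -> fits on-chip: {fits}")
```

Under these assumptions, even a 175-billion-parameter model's weights (175 GB at fp8) fit within the 192 GB of on-chip memory, which is one reason fp8 quantization pairs well with large on-chip capacity.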