The AI Processor category within our semiconductor IP catalog is dedicated to state-of-the-art technologies that empower artificial intelligence applications across various industries. AI processors are specialized computing engines designed to accelerate machine learning tasks and perform complex algorithms efficiently. This category includes a diverse collection of semiconductor IPs that are built to enhance both performance and power efficiency in AI-driven devices.
AI processors play a critical role in the emerging world of AI and machine learning, where fast processing of vast datasets is crucial. These processors can be found in a range of applications from consumer electronics like smartphones and smart home devices to advanced robotics and autonomous vehicles. By facilitating rapid computations necessary for AI tasks such as neural network training and inference, these IP cores enable smarter, more responsive, and capable systems.
In this category, developers and designers will find semiconductor IPs that provide various levels of processing power and architectural designs to suit different AI applications, including neural processing units (NPUs), tensor processing units (TPUs), and other AI accelerators. The availability of such highly specialized IPs ensures that developers can integrate AI functionalities into their products swiftly and efficiently, reducing development time and costs.
As AI technology continues to evolve, the demand for robust and scalable AI processors increases. Our semiconductor IP offerings in this category are designed to meet the challenges of rapidly advancing AI technologies, ensuring that products are future-ready and equipped to handle the complexities of tomorrow’s intelligence-driven tasks. Explore this category to find cutting-edge solutions that drive innovation in artificial intelligence systems today.
Continuing the evolution of AI at the edge, the 2nd Generation Akida provides enhanced capabilities for modern applications. This upgrade implements 8-bit quantization for increased precision and introduces support for Vision Transformers and temporal event-based neural networks. The platform handles advanced cognitive tasks seamlessly with heightened accuracy and significantly reduced energy consumption. Designed for high-performance AI tasks, it supports complex network models and utilizes skip connections to enhance speed and efficiency.
The NMP-750 serves as a high-performance accelerator IP for edge computing solutions across various sectors, including automotive, smart cities, and telecommunications. It supports sophisticated applications such as mobility control, factory automation, and energy management, making it a versatile choice for complex computational tasks. With a high throughput of up to 16 TOPS and a memory capacity scaling up to 16 MB, this IP ensures substantial computing power for edge devices. It is configured with a RISC-V or Arm Cortex-R/A 32-bit CPU and incorporates multiple AXI4 interfaces, optimizing data exchanges between Host, CPU, and peripherals. Optimized for edge environments, the NMP-750 enhances spectral efficiency and supports multi-camera stream processing, paving the way for innovation in smart infrastructure management. Its scalable architecture and energy-efficient design make it an ideal component for next-generation smart technologies.
KPIT offers a comprehensive solution for Autonomous Driving and Advanced Driver Assistance Systems. This suite facilitates the widespread adoption of Level 3 and above autonomy in vehicles, providing high safety standards through robust testing and validation frameworks. The integration of AI-driven decision-making extends beyond perception to enhance the intelligence of autonomous systems. With a commitment to addressing existing challenges such as localization issues, AI limitations, and validation fragmentation, KPIT empowers automakers to produce vehicles that are both highly autonomous and reliable.
The KL730 AI SoC is equipped with a state-of-the-art third-generation reconfigurable NPU architecture, delivering up to 8 TOPS of computational power. This innovative architecture enhances computational efficiency, particularly with the latest CNN networks and transformer applications, while reducing DDR bandwidth demands. The KL730 excels in video processing, offering support for 4K 60FPS output and boasts capabilities like noise reduction, wide dynamic range, and low-light imaging. It is ideal for applications such as intelligent security, autonomous driving, and video conferencing.
The NMP-350 is designed to offer exceptional efficiency in AI processing, specifically targeting endpoint accelerations. This IP is well-suited for markets that require minimal power consumption and cost-effectiveness, such as automotive, AIoT/Sensors, Industry 4.0, smart appliances, and wearables. It enables a wide variety of applications, including driver authentication, digital mirrors, machine automation, and health monitoring. Technically, it delivers up to 1 TOPS and supports up to 1 MB local memory. The architecture is based on the RISC-V or Arm Cortex-M 32-bit CPU, ensuring effective processing capabilities for diverse tasks. Communication is managed via three AXI4 interfaces, each 128 bits wide, to handle Host, CPU, and Data interactions efficiently. The NMP-350 provides a robust foundation for developing advanced AI applications at the edge. Designed for ultimate flexibility, it aids in predictive maintenance and personalization processes in smart environments. With its streamlined architecture, it provides unmatched performance for embedded solutions, enabling seamless integration into existing hardware ecosystems.
KPIT's digital solutions harness cloud and edge analytics to modernize vehicle data management, optimizing efficiency and security in connected mobility. With a focus on overcoming data overload and ensuring compliance with regulatory standards, these solutions enable secure and scalable cloud environments for vehicle connectivity. The edge computing aspect enhances system responsiveness by processing data within vehicles, promoting innovation and dynamic feature development.
The Metis AIPU PCIe AI Accelerator Card boosts performance for AI tasks by combining multiple Metis AIPUs on a single card. It delivers up to 856 TOPS, supporting complex AI workloads such as computer vision applications that require rapid and efficient data processing. Its design handles both small-scale and extensive applications, ensuring versatility across different scenarios. The card runs a range of deep learning models, including YOLOv5, ResNet-50, and MobileNet V2, and processes up to 12,800 FPS for ResNet-50 and 38,884 FPS for MobileNet V2-1.0. The card’s architecture enables high throughput, making it particularly suited for video analytics tasks where speed is crucial. The card also suits scenarios that demand high energy efficiency, providing strong performance at a reduced operational cost. Coupled with the Voyager SDK, the Metis PCIe card integrates into existing AI systems, speeding up development and deployment.
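To put the throughput figures above in perspective, the following minimal sketch estimates how many camera streams a given inference rate could serve. The 30 FPS per-stream rate, the one-inference-per-frame model, and the helper name `max_streams` are illustrative assumptions, not vendor specifications; a real pipeline also spends time on decoding and pre-processing.

```python
def max_streams(model_fps: float, stream_fps: float = 30.0) -> int:
    """Return how many video streams fit into a given inference rate.

    Assumes (hypothetically) one inference per frame and no other overhead.
    """
    if stream_fps <= 0:
        raise ValueError("stream_fps must be positive")
    # Floor division: a partially served stream does not count.
    return int(model_fps // stream_fps)


if __name__ == "__main__":
    # Peak figures quoted in the description above.
    print("ResNet-50 streams:", max_streams(12_800))
    print("MobileNet V2-1.0 streams:", max_streams(38_884))
```

Because the quoted rates are "up to" figures, the result is best read as an upper bound rather than a sizing guarantee.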
Origin E1 neural engines are tuned for the networks typically used in always-on applications, in devices such as home appliances, smartphones, and edge nodes that require around 1 TOPS of performance. This focused optimization makes the E1 LittleNPU processors particularly suitable for cost- and area-sensitive applications, using energy efficiently and keeping processing latency low. The design also incorporates a power-efficient architecture that maintains low power consumption while handling always-sensing data operations. This enables continuous sampling and analysis of visual information without compromising on efficiency or user privacy. Additionally, the architecture is rooted in Expedera's packet-based design, which allows for parallel execution across layers, optimizing performance and resource utilization. Efficiency of up to 18 TOPS/W further underlines Origin E1's capacity to deliver AI performance with minimal resources. The processor supports standard and proprietary neural network operations, ensuring versatility in its applications. It also comes with a comprehensive software stack that includes tools such as compilers and quantizers to facilitate deployment in diverse use cases without requiring extensive redesigns. It has already been deployed in over 10 million devices worldwide across a range of consumer products.
Designed for high-performance environments such as data centers and automotive systems, the Origin E8 NPU cores push the limits of AI inference, achieving up to 128 TOPS on a single core. Its architecture supports concurrent running of multiple neural networks without context switching lag, making it a top choice for performance-intensive tasks like computer vision and large-scale model deployments. The E8's flexibility in deployment ensures that AI applications can be optimized post-silicon, bringing performance efficiencies previously unattainable in its category. The E8's architecture and sustained performance, alongside its ability to operate within strict power envelopes (18 TOPS/W), make it suitable for passive cooling environments, which is crucial for cutting-edge AI applications. It stands out by offering PetaOps performance scaling through its customizable design that avoids penalties typically faced by tiled architectures. The E8 maintains exemplary determinism and resource utilization, essential for running advanced neural models like LLMs and intricate ADAS tasks. Furthermore, this core integrates easily with existing development frameworks and supports a full TVM-based software stack, allowing for seamless deployment of trained models. The expansive support for both current and emerging AI workloads makes the Origin E8 a robust solution for the most demanding computational challenges in AI.
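As a rough sanity check on the passive-cooling claim, dividing peak throughput by peak efficiency gives an implied power figure. This sketch assumes that both "up to" numbers apply at the same operating point, which the description does not state; `implied_power_watts` is a hypothetical helper, not part of any Expedera tool.

```python
def implied_power_watts(tops: float, tops_per_watt: float) -> float:
    """Power implied by a throughput figure and an efficiency figure.

    Assumption: both numbers describe the same operating point.
    """
    if tops_per_watt <= 0:
        raise ValueError("tops_per_watt must be positive")
    return tops / tops_per_watt


if __name__ == "__main__":
    # 128 TOPS and 18 TOPS/W are the figures quoted above for the Origin E8.
    watts = implied_power_watts(128, 18)
    print(f"Implied power: {watts:.1f} W")
```

Under that assumption the result lands in the single-digit watt range, which is consistent with the passive-cooling positioning, though actual power depends on the workload and process node.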
The Metis AIPU M.2 Accelerator Module is a powerful AI processing solution designed for edge devices. It offers a compact design tailored for applications requiring efficient AI computations with minimized power consumption. With a focus on video analytics and other high-demand tasks, this module transforms edge devices into AI-capable systems. Equipped with the Metis AIPU, the M.2 module can achieve up to 3,200 FPS for ResNet-50, providing remarkable performance metrics for its size. This makes it ideal for deployment in environments where space and power availability are limited but computational demands are high. It features an NGFF (Next Generation Form Factor) socket, ensuring it can be easily integrated into a variety of systems. The module leverages Axelera's Digital-In-Memory-Computing technology to enhance neural network inference speed while maintaining power efficiency. It's particularly well-suited for applications such as multi-channel video analytics, offering robust support for various machine learning frameworks, including PyTorch, ONNX, and TensorFlow.
BrainChip's Akida is an advanced neuromorphic processor that processes data in a manner similar to the human brain, focusing on essential sensory inputs. Keeping AI local to the chip drastically reduces power consumption and latency compared to conventional methods. Akida’s architecture, which scales to support up to 256 nodes, allows for high efficiency with a small footprint. Nodes in the Akida system integrate Neural Network Layer Engines configurable as either convolutional or fully connected, and they exploit data sparsity through event-based operations to make the most of the available processing power.
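The idea of handling sparsity through event-based operations can be illustrated with a small sketch. It compares a dense fully connected layer with an event-driven one that only does work for non-zero inputs. This is a generic illustration, not BrainChip's implementation; the function names and the MAC counters are hypothetical.

```python
def dense_layer(inputs, weights):
    """Fully connected layer; weights[i][j] links input i to output j."""
    n_out = len(weights[0])
    outputs = [0] * n_out
    macs = 0
    for i, x in enumerate(inputs):
        for j in range(n_out):
            outputs[j] += x * weights[i][j]
            macs += 1  # every input costs work, even when it is zero
    return outputs, macs


def event_layer(inputs, weights):
    """Same layer, but only non-zero inputs ("events") trigger work."""
    n_out = len(weights[0])
    outputs = [0] * n_out
    macs = 0
    for i, x in enumerate(inputs):
        if x == 0:
            continue  # no event, so no computation
        for j in range(n_out):
            outputs[j] += x * weights[i][j]
            macs += 1
    return outputs, macs


if __name__ == "__main__":
    inputs = [0, 2, 0, 0]  # sparse activations: one event out of four
    weights = [[1, 2], [3, 4], [5, 6], [7, 8]]
    print(dense_layer(inputs, weights))
    print(event_layer(inputs, weights))
```

Both functions return the same outputs, while the event-driven version performs work in proportion to the number of non-zero inputs.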
SCR9 is tailored for entry-level server-class applications and high-performance computing. This 64-bit RISC-V core supports a range of extensions, including vector operations and scalar cryptography. Utilizing a dual-issue 12-stage pipeline, SCR9 excels in environments requiring Linux-based operations, enabling advanced data processing capabilities like those needed in AI and personal computing devices.
The Origin E2 family of NPU cores is tailored for power-sensitive devices like smartphones and edge nodes that seek to balance power, performance, and area efficiency. These cores are engineered to handle video resolutions up to 4K, as well as audio and text-based neural networks. Utilizing Expedera’s packet-based architecture, the Origin E2 ensures efficient parallel processing, reducing the need for device-specific optimizations, thus maintaining high model accuracy and adaptability. The E2 is flexible and can be customized to fit specific use cases, aiding in mitigating dark silicon and enhancing power efficiency. Its performance capacity ranges from 1 to 20 TOPS and supports an extensive array of neural network types including CNNs, RNNs, DNNs, and LSTMs. With impressive power efficiency rated at up to 18 TOPS/W, this NPU core keeps power consumption low while delivering high performance that suits a variety of applications. As part of a full TVM-based software stack, it provides developers with tools to efficiently implement their neural networks across different hardware configurations, supporting frameworks such as TensorFlow and ONNX. Successfully applied in smartphones and other consumer electronics, the E2 has proved its capabilities in real-world scenarios, significantly enhancing the functionality and feature set of devices.
Cortus's High Performance RISC-V Processor is designed for demanding applications that require high-speed computing and efficient task handling. It is billed as the world’s fastest 64-bit RISC-V implementation and is built on an Out-of-Order (OoO) execution core that supports both single-core and multi-core configurations for high processing throughput. This processor is particularly suited for high-end computing tasks in environments ranging from desktop computing to artificial intelligence workloads. With integrated features such as a multi-socket cache coherent system and an on-chip vector plus AI accelerator, it delivers the computation power needed for tasks such as bioinformatics and complex machine learning models. Moreover, the processor supports coherent off-chip accelerators, such as CNN accelerators, enhancing its utility in AI-driven applications. The design flexibility extends its application from consumer electronics such as laptops to supercomputers, positioning the High Performance RISC-V Processor as an integral part of next-gen technology solutions across multiple domains.
The AI Camera Module from Altek integrates image sensor technology with intelligent processing to address the growing use of AI in imaging. It combines optical design capabilities with hardware-software integration expertise, delivering multiple AI camera models that help clients meet differentiated AI + IoT needs. This flexible camera module supports edge computing at high resolutions such as 2K and 4K, making it well suited to environments that demand detailed image analysis. The AI Camera Module adapts to functions such as facial detection and edge computation, broadening its applicability across industries. Altek's collaboration with major global brands strengthens the AI Camera Module's position in the market, ensuring it meets diverse client specifications. Whether used in security, industrial, or home automation applications, this module integrates into various systems to deliver enhanced visual processing capabilities.
The KL630 AI SoC embodies next-generation AI chip technology with a pioneering NPU architecture. It supports Int4 precision and transformer networks, combining high computational efficiency with low power consumption. Built around an Arm Cortex-A5 CPU, it supports a range of AI frameworks and handles scenarios from smart security to automotive applications, performing robustly in both bright and low-light conditions.
Aimed at performance-driven environments, the NMP-550 is an efficient accelerator IP optimized for diverse markets, including automotive, mobile, AR/VR, drones, and medical devices. This IP is crucial for applications such as driver monitoring, fleet management, image and video analytics, and compliance in security systems. The NMP-550 boasts a processing power of up to 6 TOPS and integrates up to 6 MB of local memory, empowering it to handle complex tasks with ease. It runs on a RISC-V or Arm Cortex-M/A 32-bit CPU and supports multiple high-speed interfaces, specifically three AXI4, 128-bit connections that manage Host, CPU, and Data traffic. This IP is engineered for environments demanding high performance with efficient power use, addressing modern technological challenges in real-time analytics and surveillance. The NMP-550 is adept at improving system intelligence, allowing for enhanced decision-making processes in connected devices.
The Automotive AI Inference SoC by Cortus is a cutting-edge chip designed to revolutionize image processing and artificial intelligence applications in advanced driver-assistance systems (ADAS). Leveraging RISC-V expertise, this SoC is engineered for low power and high performance, particularly suited to the rigorous demands of autonomous driving and smart city infrastructures. Built to support Level 2 to Level 4 autonomous driving standards, this AI Inference SoC features powerful processing capabilities, enabling complex image processing algorithms akin to those used in advanced visual recognition tasks. Designed for mid to high-end automotive markets, it offers adaptability and precision, key to enhancing the safety and efficiency of driver support systems. The chip's architecture allows it to handle a tremendous amount of data throughput, crucial for real-time decision-making required in dynamic automotive environments. With its advanced processing efficiency and low power consumption, the Automotive AI Inference SoC stands as a pivotal component in the evolution of intelligent transportation systems.
Origin E6 NPU cores are cutting-edge solutions designed to handle the complex demands of modern AI models, specializing in generative and traditional networks such as RNN, CNN, and LSTM. Ranging from 16 to 32 TOPS, these cores offer an optimal balance of performance, power efficiency, and feature set, making them particularly suitable for premium edge inference applications. Utilizing Expedera’s innovative packet-based architecture, the Origin E6 allows for streamlined multi-layer parallel processing, ensuring sustained performance and reduced hardware load. This helps developers maintain network adaptability without incurring latency penalties or the need for hardware-specific optimizations. Additionally, the Origin E6 provides a fully scalable solution perfect for demanding environments like next-generation smartphones, automotive systems, and consumer electronics. Thanks to a comprehensive software suite based around TVM, the E6 supports a broad span of AI models, including transformers and large language models, offering unparalleled scalability and efficiency. Whether for use in AR/VR platforms or advanced driver assistance systems, the E6 NPU cores provide robust solutions for high-performance computing needs, facilitating numerous real-world applications.
The xcore.ai platform stands as an economical and high-performance solution for intelligent IoT applications. Designed with a unique multi-threaded micro-architecture, it supports applications requiring deterministic performance with low latency. The architecture features 16 logical cores, split between two multi-threaded processor tiles, which are equipped with 512 kB of SRAM and a vector unit for both integer and floating-point computations. This platform excels in enabling high-speed interprocessor communications, allowing tight integration among processors and across multiple xcore.ai SoCs. The xcore.ai offers scalable performance, adapting the tile clock frequency to meet specific application requirements, which optimizes power consumption. Its ability to handle DSP, AI/ML, and I/O processing within a single development environment makes it a versatile choice for creating smart, connected products. The adaptability of the xcore.ai extends to various market applications such as voice and audio processing. It supports embedded PHYs for MIPI, USB, and LPDDR control processing, and utilizes FreeRTOS across multiple threads for robust multi-threading performance. On the AI and ML front, the platform includes a 256-bit vector processing unit that supports 8-bit to 32-bit operations, delivering up to 51.2 GMACC/s. All these features are packaged within a development environment that simplifies the integration of multiple application-specific components. This makes xcore.ai a strong platform for developers aiming to build intelligent IoT solutions that scale with application needs.
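The relationship between the 256-bit vector width and the quoted 51.2 GMACC/s can be sketched as follows. The assumptions here (one multiply-accumulate per lane per cycle, both tiles active, and the peak figure referring to 8-bit operands) are not stated in the description above, so the derived clock is only an implied value.

```python
VECTOR_BITS = 256         # vector unit width quoted above
TILES = 2                 # two processor tiles quoted above
PEAK_MACS_PER_S = 51.2e9  # 51.2 GMACC/s quoted above


def lanes(operand_bits: int) -> int:
    """Number of operands that fit in one vector register."""
    if operand_bits <= 0 or VECTOR_BITS % operand_bits:
        raise ValueError("operand width must divide the vector width")
    return VECTOR_BITS // operand_bits


def implied_clock_hz(operand_bits: int = 8) -> float:
    """Tile clock implied by the peak rate.

    Assumption: one MAC per lane per cycle on both tiles.
    """
    return PEAK_MACS_PER_S / (lanes(operand_bits) * TILES)


if __name__ == "__main__":
    for bits in (8, 16, 32):
        print(f"{bits}-bit operands: {lanes(bits)} lanes per vector")
    print(f"Implied tile clock: {implied_clock_hz() / 1e6:.0f} MHz")
```

Wider operands reduce the lane count proportionally, which is why peak MAC rates are usually quoted for the narrowest supported data type.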
The Chimera GPNPU series stands as a pivotal innovation in the realm of on-device artificial intelligence computing. These processors are engineered to address the challenges faced in machine learning inference deployment, offering a unified architecture that integrates matrix, vector, and scalar operations seamlessly. By consolidating what traditionally required multiple processors, such as NPUs, DSPs, and real-time CPUs, into a single processing core, Chimera GPNPU reduces system complexity and optimizes performance. This series is designed with a focus on handling diverse, data-parallel workloads, including traditional C++ code and the latest machine learning models like vision transformers and large language models. The fully programmable nature of Chimera GPNPUs allows developers to adapt and optimize model performance continuously, providing a significant uplift in productivity and flexibility. This capability ensures that as new neural network models emerge, they can be supported without the necessity of hardware redesign. A notable feature of these processors is their scalability, accommodating intensive workloads up to 864 TOPS and being particularly suited for high-demand applications like automotive safety systems. The integration of ASIL-ready cores allows them to meet stringent automotive safety standards, positioning Chimera GPNPU as a fit for ADAS and other automotive use cases. The architecture's emphasis on reducing memory bandwidth constraints and energy consumption further enhances its suitability for a wide range of high-performance, power-sensitive applications, making it a versatile solution for modern automotive and edge computing.
The Ultra-Low-Power 64-Bit RISC-V Core by Micro Magic, Inc. is a groundbreaking processor core designed for efficiency in both power consumption and performance. Operating at a mere 10 mW at 1 GHz, this core leverages advanced design techniques to run at reduced voltages without sacrificing performance, achieving clock speeds up to 5 GHz. This innovation is particularly valuable for applications requiring high-speed processing while maintaining low power usage, making it ideal for portable and battery-operated devices. Micro Magic's 64-bit RISC-V architecture embraces a streamlined design that minimizes energy consumption and maximizes processing throughput. The core's architecture is optimized for high performance under low-power conditions, which is essential for modern electronics that require prolonged battery life and environmental sustainability. This core supports a wide range of applications from consumer electronics to automotive systems where energy efficiency and computational power are paramount. The RISC-V core also benefits from Micro Magic's suite of integrated design tools, which streamline the development process and enable seamless integration into larger systems. With a focus on reducing total ownership costs and enhancing product life cycle, Micro Magic's RISC-V core stands out as a versatile and eco-friendly solution in the semiconductor market.
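The power figure quoted above translates into an average energy per clock cycle, which is a convenient way to compare cores that run at different frequencies. The sketch below is plain arithmetic (energy equals power divided by frequency); the helper name `energy_per_cycle_pj` is hypothetical, and the description gives no power figure for the 5 GHz operating point, so only the 1 GHz point is evaluated.

```python
def energy_per_cycle_pj(power_watts: float, clock_hz: float) -> float:
    """Average energy per clock cycle, in picojoules."""
    if clock_hz <= 0:
        raise ValueError("clock_hz must be positive")
    joules_per_cycle = power_watts / clock_hz
    return joules_per_cycle * 1e12  # convert joules to picojoules


if __name__ == "__main__":
    # 10 mW at 1 GHz, as quoted in the description above.
    print(f"{energy_per_cycle_pj(10e-3, 1e9):.1f} pJ per cycle")
```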
The Matchstiq™ X40 is a high-performance software-defined radio (SDR) engineered for low size, weight, and power (SWaP). Optimized for edge computing, it is particularly suited to artificial intelligence (AI) and machine learning (ML) applications. It integrates a robust RF front end with multi-channel digital transceivers, providing access to frequencies up to 18 GHz and a bandwidth of 450 MHz. It features an NVIDIA Orin NX 16G for advanced data processing and an AMD Zynq UltraScale+ FPGA for signal integration. The small form factor and modest weight offer advantages in space-constrained deployments such as unmanned aerial systems (UxS) and ALE payloads. The Matchstiq™ X40 offers strong frequency agility, making it well suited to spectrum management and complex signal detection tasks. The SDR's architecture supports numerous interface options including USB 3.0 and 1 GbE networking, complemented by serial port connectivity. The device is programmed through the libsidekiq API, which simplifies software integration and supports rapid prototyping, testing, and field deployment.
The Dynamic Neural Accelerator II Architecture (DNA-II) by EdgeCortix is a sophisticated neural network IP core structured for extensive parallelism and efficiency enhancement. Distinguished by its run-time reconfigurable interconnects between computing elements, DNA-II supports a broad spectrum of AI models, including both convolutional and transformer networks, making it suitable for diverse edge AI applications. With its scalable performance starting from 1K MACs, the DNA-II architecture integrates easily with many SoC and FPGA applications. This architecture provides a foundation for the SAKURA-II AI Accelerator, supporting up to 240 TOPS in processing capacity. The unique aspect of DNA-II is its utilization of advanced data path configurations to optimize processing parallelism and resource allocation, thereby minimizing on-chip memory bandwidth limitations. The DNA-II is particularly noted for its superior computational capabilities, ensuring that AI models operate with maximum efficiency and speed. Leveraging its patented run-time reconfigurable data paths, it significantly increases hardware performance metrics and energy efficiency. This capability not only enhances the compute power available for complex inference tasks but also reduces the power footprint, which is critical for edge-based deployments.
Topaz FPGAs from Efinix are designed for volume applications where performance and cost-effectiveness are paramount. Built on their distinctive Quantum® compute fabric, Topaz devices offer an efficient architecture that balances logic resource availability with power minimization. Suitable for a plethora of applications from machine vision to wireless communication, these FPGAs are characterized by their robust protocol support, including PCIe Gen3, MIPI D-PHY, and various Ethernet configurations. One of the standout features of Topaz FPGAs is their flexibility. These devices can be effortlessly adapted into systems requiring seamless high-speed data management and integration. This adaptability is further enhanced by the extensive logic resource options, which allow increased innovation and the ability to add new features without extensive redesigns. Topaz FPGAs also offer product longevity, thriving in industries where extended lifecycle support is necessary. Efinix ensures ongoing support until at least 2045, making these FPGAs a reliable choice for projects aiming for enduring market presence. Among the key sectors benefiting from Topaz's flexibility are medical imaging and industrial control, where precision and reliability are critical. Moreover, Efinix facilitates migration from Topaz to Titanium for projects requiring enhanced performance, ensuring scalability and minimizing redesign efforts. With varying BGA packages available, Topaz FPGAs provide comprehensive solutions that cater to both the technological needs and strategic goals of enterprises.
The RWM6050 Baseband Modem from Blu Wireless is a high-performance component designed for mmWave communications. It supports gigabit-level data rates through its advanced modulation and channelization technologies, making it well suited to various access and backhaul applications. The modem's flexibility comes from its compatibility with multiple RF chipsets and a design shaped by collaboration with Renesas, ensuring robust, scalable wireless connectivity. This modem features dual integrated modems and an adaptable digital front end, including PHY, MAC, and ADC/DAC functionalities. It supports beamforming with phased array antennas, facilitating efficient signal processing and network synchronization for enhanced performance. These attributes make the RWM6050 a key enabler for deploying next-generation wireless communication systems. Built to optimize cost efficiency and power consumption, the RWM6050 offers versatile options in channelization and modulation coding, scaling bandwidth to match multi-gigabit requirements. It meets the growing demands of modern data networks by balancing performance, adaptability, and ease of integration.
The Low Power RISC-V CPU IP from SkyeChip is crafted to deliver efficient computation with minimal power consumption. Featuring the RISC-V RV32 instruction set, it supports a range of functions with full standard compliance for instruction sets and partial support where necessary. Designed exclusively for machine mode, it incorporates multiple vectorized interrupts and includes comprehensive debugging capabilities. This CPU IP is well-suited for integration into embedded systems where power efficiency and processing capability are crucial.
iModeler is Xpeedic's innovative solution for automated PDK model generation. This tool streamlines the process of creating Process Design Kits, which are foundational for semiconductor manufacturing processes. iModeler’s capabilities in automating PDK generation reduce time and resources required, providing a significant advancement over traditional manual methods. By utilizing sophisticated algorithms, iModeler enhances accuracy in developing intricate models that are essential for advanced semiconductor fabrication. The tool supports a broad range of semiconductor processes, ensuring cross-compatibility and robustness in model output. This level of precision supports engineers in achieving optimal results in both design and manufacturing stages. With iModeler, companies can significantly boost their development productivity, enabling quicker turnarounds in the semiconductor lifecycle. For organizations looking to maintain cutting-edge competitiveness, iModeler is a game-changer, providing the necessary infrastructure to support rapid advancements in chip manufacturing technologies.
The NPU by OPENEDGES, known as ENLIGHT, is a neural processing unit tailored for deep learning applications requiring high computational efficiency. It uses mixed-precision computation (4/8-bit) to make better use of processing power and reduces DRAM traffic through advanced scheduling and layer partitioning techniques. This NPU offers high energy efficiency and compute density compared with competing alternatives. It is highly customizable, accommodating varying core sizes to meet specific market demands, with applications ranging from AI and ML tasks to edge computing. ENLIGHT enhances performance with its DNN-optimized vector engine and support for operations such as convolution and non-linear activation functions. Its toolkit supports popular formats like ONNX and TFLite, simplifying integration and accelerating the development process for complex neural network models in high-performance environments.
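The trade-off behind 4/8-bit mixed precision can be shown with a generic symmetric quantization sketch. This is not the ENLIGHT toolkit or its algorithm; `quantize` and `dequantize` are hypothetical helpers that use a single per-tensor scale, one of several common schemes.

```python
def quantize(values, bits):
    """Symmetric per-tensor quantization to signed integers of `bits` width."""
    qmax = 2 ** (bits - 1) - 1            # 127 for 8-bit, 7 for 4-bit
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs else 1.0
    # Round to the nearest level and clamp to the representable range.
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Map integer levels back to approximate real values."""
    return [x * scale for x in q]


if __name__ == "__main__":
    weights = [-1.0, -0.4, 0.0, 0.3, 1.0]
    for bits in (8, 4):
        q, scale = quantize(weights, bits)
        restored = dequantize(q, scale)
        worst = max(abs(w - r) for w, r in zip(weights, restored))
        print(f"{bits}-bit: levels={q}, worst-case error={worst:.4f}")
```

Halving the bit width halves the storage and memory traffic per value but enlarges the quantization step, which is why mixed-precision schemes typically reserve 4-bit precision for layers that tolerate the extra error.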
The Talamo SDK is a powerful development toolkit engineered to advance the creation of sophisticated spiking neural network-based applications. It melds seamlessly with PyTorch, offering developers an accessible workflow for model building and deployment. This SDK extends the PyTorch ecosystem by providing the necessary infrastructure to construct, train, and implement spiking neural networks effectively. A distinguishing feature of Talamo SDK lies in its ability to map trained neural models onto the diverse computing layers inherent in the spiking neural processor hardware. This is complemented by an architecture simulator enabling fast validation, which accelerates the iterative design process by simulating hardware behavior and helping optimize power and performance metrics. Developers will appreciate the end-to-end application support within Talamo SDK, including the integration of standard neural network operations alongside spiking models, allowing for a comprehensive application pipeline. With ready-to-use models, even those without detailed SNN knowledge can develop powerful AI-driven applications swiftly, benefiting from high-level profiling and optimization tools.
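For readers new to spiking neural networks, the following minimal sketch shows a leaky integrate-and-fire neuron, the kind of unit such networks are commonly built from. It is plain Python and does not use or represent the Talamo SDK API; the function name, threshold, and leak factor are illustrative assumptions.

```python
def lif_spikes(input_current, threshold=1.0, leak=0.9):
    """Simulate one leaky integrate-and-fire neuron over discrete time steps.

    The membrane potential decays by `leak` each step, accumulates the input,
    and emits a spike (1) and resets to zero when it reaches `threshold`.
    """
    membrane = 0.0
    spikes = []
    for current in input_current:
        membrane = leak * membrane + current
        if membrane >= threshold:
            spikes.append(1)
            membrane = 0.0
        else:
            spikes.append(0)
    return spikes


print(lif_spikes([0.4, 0.4, 0.4, 0.0, 0.9, 0.3]))  # prints [0, 0, 1, 0, 0, 1]
```

The output is a sparse train of events and not a dense activation value, which is what lets spiking hardware skip work when nothing happens.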
The AON1020 offers a sophisticated blend of edge AI processing capabilities designed for environments demanding continuous interactions and low power consumption. Specializing in voice and sound applications, this processor ensures that devices such as smart speakers and wearable technologies can run smoothly and efficiently, providing excellent functionality without draining resources. Engineered to deliver outstanding accuracy, the AON1020 focuses on maintaining exceptional performance in acoustically complex settings, surpassing traditional capabilities with its innovative approach to data processing. Its capacity to execute real-time event and scene detection, alongside speaker identification, highlights its versatility in enhancing user experience across multiple applications. By integrating AI models that are fine-tuned for power efficiency, the AON1020 stands out as a viable solution for developers aiming to incorporate cutting-edge AI technologies into small, resource-constrained devices. Its prowess in managing battery life while sustaining operational readiness makes it uniquely positioned to revolutionize the interactions between users and their smart devices.
The NeuroSense AI Chip, an ultra-low power neuromorphic frontend, is engineered for wearables to address the challenges of power efficiency and data accuracy in health monitoring applications. This tiny AI chip is designed to process data directly at the sensor level, which includes tasks like heart rate measurement and human activity recognition. By performing computations locally, NeuroSense minimizes the need for cloud connections, thereby ensuring privacy and prolonging battery life. The chip excels in accuracy, offering three times better heart rate accuracy than conventional algorithm-based solutions. It does so while keeping power consumption below 100µW, allowing users to experience extended device operation without frequent recharging. The NeuroSense supports a simple configuration setup, making it suitable for integration into a variety of wearable devices such as fitness trackers, smartwatches, and health monitors. Its capabilities extend to advanced features like activity matrices, enabling devices to learn new human activities and classify tasks according to intensity levels. Additional functions include monitoring parameters like oxygen saturation and arrhythmia, enhancing the utility of wearable devices in providing comprehensive health insights. The chip's integration leads to reduced manufacturing costs, a smaller IC footprint, and a rapid time-to-market for new products.
The Tyr Superchip is engineered to facilitate high performance computing in AI and data processing domains, with a focus on scalability and power efficiency. Designed around a revolutionary multi-core architecture, it features fully programmable cores that are suitable for any AI or general-purpose algorithms, ensuring high flexibility and adaptability. This product is crucial for industries requiring cutting-edge processing capabilities without the overhead of traditional systems, thanks to its support for CUDA-free operations and efficient algorithm execution that minimizes energy consumption.
The ULYSS MCU range from Cortus is a powerful suite of automotive microcontrollers designed to address the complex demands of modern automotive applications. These MCUs are anchored by a highly optimized 32/64-bit RISC-V architecture, delivering impressive performance levels from 120MHz to 1.5GHz, making them suitable for a variety of automotive functions such as body control, safety systems, and infotainment. ULYSS MCUs are engineered to accommodate extensive application domains, providing reliability and efficiency within harsh automotive environments. They feature advanced processing capabilities and are designed to integrate seamlessly into various automotive systems, offering developers a versatile platform for building next-generation automotive solutions. The ULYSS MCU family stands out for its scalability and adaptability, enabling manufacturers to design robust automotive electronics tailored to specific needs while ensuring cost-effectiveness. With their support for a wide range of automotive networking and control applications, ULYSS MCUs are pivotal in the development of reliable, state-of-the-art automotive systems.
T-Head's Hanguang 800 AI Accelerator is a leading-edge chip developed to boost artificial intelligence processing capabilities. It is built on a 12nm process and integrates 17 billion transistors to achieve outstanding computational performance. The chip delivers a peak computing power of 820 TOPS, making it one of the most powerful AI chips in the industry. Optimized for deep learning tasks, the Hanguang 800 excels in implementing AI models with minimal latency, which is ideal for data centers and edge computing. In industry benchmarks, such as ResNet-50, it achieves inference speeds of 78,563 IPS, coupled with an efficiency of 500 IPS/W, setting a high standard for AI technology. The Hanguang 800 also supports extensive neural network frameworks, ensuring broad applicability and integration into existing AI infrastructures. Its performance is enhanced further by T-Head's proprietary HGAI software stack, which offers robust support for model development and deployment, allowing users to maximize the chip's capabilities in various AI scenarios.
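The two ResNet-50 figures quoted for the Hanguang 800 can be combined into a rough power estimate. The sketch below assumes both numbers were measured at the same operating point, which the description does not state; the function name is illustrative.

```python
RESNET50_IPS = 78_563   # inferences per second, from the description
IPS_PER_WATT = 500      # efficiency, from the description


def implied_power_watts(throughput_ips, efficiency_ips_per_watt):
    """Power implied by a throughput and an efficiency figure."""
    return throughput_ips / efficiency_ips_per_watt


power_w = implied_power_watts(RESNET50_IPS, IPS_PER_WATT)
print(f"Implied power at the quoted throughput: {power_w:.1f} W")
```

Under that assumption the chip would draw roughly 157 W at the quoted throughput. If the efficiency figure was measured at a lower-throughput operating point, the actual power would differ.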
aiWare is a cutting-edge hardware solution dedicated to facilitating neural processing for automotive AI applications. As part of aiMotive’s advanced offerings, the aiWare NPU (Neural Processing Unit) provides a scalable AI inference platform optimized for cost-sensitive and multi-sensor automotive applications ranging from Level 2 to Level 4 driving automation. With its unique SDK focused on neural network optimization, aiWare offers up to 256 Effective TOPS per core, on par with leading industry efficiency benchmarks. The aiWare hardware IP integrates smoothly into automotive systems due to its ISO 26262 ASIL B certification, making it suitable for production environments requiring rigorous safety standards. Its innovative architecture utilizes both on-chip local memory and dense on-chip RAM for efficient data handling, significantly reducing external memory needs. This focus on minimizing off-chip traffic enhances the overall performance while adhering to stringent automotive requirements. Optimized for high-speed operation, aiWare can reach up to 1024 TOPS, providing flexibility across a wide range of AI workloads including CNNs, LSTMs, and RNNs. Designed for easy layout and software integration, aiWare supports essential activation and pooling functions natively, allowing maximum processing efficiency for neural networks without host CPU intervention. This makes it an exemplary choice for automotive-grade AI, supporting various advanced driving capabilities and applications.
The HUMMINGBIRD by Lightelligence is an innovative optical Network-on-Chip processor that integrates photonic and electronic dies through advanced vertically stacked packaging technologies. This architecture provides a pathway to overcome conventional digital network limitations, particularly the 'memory wall.' With a 64-core domain-specific AI processor, HUMMINGBIRD uses a cutting-edge waveguide system to propagate light-speed signals, drastically reducing latency and power requirements compared to traditional electronic networks. This high-performance device serves as the communication backbone for data centers, facilitating data management and interconnect topology innovations. HUMMINGBIRD exploits the power of silicon photonics to offer a dense all-to-all data broadcast network that enhances the performance and scalability of AI workloads. HUMMINGBIRD's robust integration into PCIe form factors allows easy deployment onto industry-standard servers, and when paired with the Lightelligence Software Development Kit, it can significantly optimize AI and machine learning processes. This integration fosters a higher utilization of computing power and alleviates complexities associated with mapping workloads to hardware.
The SCR7 is a 64-bit high-performance RISC-V core designed for intensive data processing applications. With support for vector operations and various RISC-V extensions, it is equipped with a 12-stage dual-issue pipeline. It caters to fields such as AI, ML, and high-performance computing, benefitting from its robust multicore support and advanced interrupt management systems.
The GenAI v1-Q represents an enhancement over the basic GenAI v1 core, with added support for quantization capabilities, specifically 4-bit and 5-bit quantization. This significantly reduces memory requirements, potentially by as much as 75%, facilitating the execution of large language models within smaller, more cost-effective systems without sacrificing speed or accuracy. The reduced memory usage translates to lower overall costs and diminished energy consumption while maintaining the integrity and intelligence of the models. Designed for seamless integration into various devices, the GenAI v1-Q also ensures compatibility with diverse memory technologies, making it a versatile choice for applications demanding efficient AI performance.
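The "as much as 75%" figure for GenAI v1-Q is consistent with simple bit-width arithmetic if the baseline is 16-bit weights. That baseline is an assumption here, as is the 8-billion-parameter model size; the sketch counts weight storage only and ignores the scale factors that real quantized formats also store.

```python
def weight_memory_gib(num_params, bits_per_weight):
    """Storage needed for the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


def reduction_vs_baseline(bits, baseline_bits=16):
    """Fraction of weight memory saved relative to the baseline width."""
    return 1 - bits / baseline_bits


NUM_PARAMS = 8_000_000_000  # hypothetical model size, for illustration only

for bits in (16, 5, 4):
    size = weight_memory_gib(NUM_PARAMS, bits)
    saved = reduction_vs_baseline(bits)
    print(f"{bits:>2}-bit: {size:6.2f} GiB, {saved:.2%} smaller than 16-bit")
```

With these assumptions, 4-bit weights save exactly 75% and 5-bit weights save 68.75%, so a model that needs about 14.9 GiB at 16 bits fits in about 3.7 GiB at 4 bits.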
Arasan's MIPI CSI-2 Receiver IP is designed to meet the needs of modern image processing applications. This IP core facilitates the integration of high-resolution cameras by supporting advanced MIPI protocols compliant with the latest standards. It offers versatile data formats and high data throughput capabilities, essential for applications in smartphones, tablets, and other mobile devices. Leveraging a robust architecture, the CSI-2 Receiver IP enhances data integrity and transmission efficiency. Its low power consumption and compact design are tailored for space-constrained environments in consumer electronics and automotive sectors. Advanced error correction features ensure reliability and robustness, critical for real-time video processing and streaming applications. The IP supports a wide range of MIPI protocol features, offering designers flexibility and scalability. Its integration ease ensures that product design cycles are shortened, enabling quicker time-to-market for innovative new products. As such, Arasan's MIPI CSI-2 Receiver IP is a preferred choice for engineers seeking high-performance, dependable camera interface solutions.
The Universal DSP Library is engineered to seamlessly integrate with the AMD Vivado ML Design Suite, offering a collection of digital signal processing components. This library includes essential elements such as FIR and CIC filters, mixers, and CORDIC function approximations, along with multiplexers and converters for a streamlined development experience. Provided in both raw VHDL and as design suite blocks, this library enables rapid construction of signal processing chains, accompanied by bit-true software models for evaluation and integration.
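As an illustration of what a bit-true software model of a CORDIC block can look like, here is a minimal integer-only sketch of rotation-mode CORDIC for sine and cosine. It is not taken from the Universal DSP Library; the Q16 fixed-point format, iteration count, and all names are assumptions chosen for the example.

```python
import math

FRAC_BITS = 16                 # assumed Q16 fixed-point format
ONE = 1 << FRAC_BITS
ITERATIONS = 16

# Precomputed arctangent table in Q16 radians, as a hardware ROM would hold.
ATAN_TABLE = [round(math.atan(2.0 ** -i) * ONE) for i in range(ITERATIONS)]

# The rotations scale the vector by a known gain; start from 1/gain to cancel it.
_gain = 1.0
for _i in range(ITERATIONS):
    _gain *= math.sqrt(1.0 + 2.0 ** (-2 * _i))
INV_GAIN = round(ONE / _gain)


def cordic_sin_cos(angle_q16):
    """Return (sin, cos) in Q16 for an angle in Q16 radians, |angle| <= pi/2.

    Uses only integer adds, subtracts, and arithmetic shifts.
    """
    x, y, z = INV_GAIN, 0, angle_q16
    for i in range(ITERATIONS):
        if z >= 0:
            x, y, z = x - (y >> i), y + (x >> i), z - ATAN_TABLE[i]
        else:
            x, y, z = x + (y >> i), y - (x >> i), z + ATAN_TABLE[i]
    return y, x


sin_q, cos_q = cordic_sin_cos(round(math.pi / 6 * ONE))
print(f"sin(pi/6) ~ {sin_q / ONE:.4f}, cos(pi/6) ~ {cos_q / ONE:.4f}")
```

Because the model uses the same integer operations a hardware implementation would, its outputs can be compared bit for bit against simulation results, which is the purpose of a bit-true model.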
The 1000 Series by Akeana offers high-performance processing capabilities designed to tackle data-heavy computational tasks. With features aimed at enhancing throughput and power efficiency, these processors are suitable for environments that require robust processing power like AI on the edge, industrial automation, and automotive sensing. Supporting rich operating systems like Android and Linux, the 1000 Series is crafted to ensure superior performance with efficient power management, making it a prime choice for modern high-demand applications.
The KL520 AI SoC by Kneron marked a significant breakthrough in edge AI technology, offering a well-rounded solution with notable power efficiency and performance. This chip can function as a host or as a supplementary co-processor to enable advanced AI features in diverse smart devices. It is highly compatible with a range of 3D sensor technologies and is perfectly suited for smart home innovations, facilitating long battery life and enhanced user control without reliance on external cloud services.
The Altera Agilex 7 F-Series SoC offers unparalleled flexibility and high-performance capabilities, enabling the seamless implementation of complex algorithms into a single chip. Targeting sectors like bioscience and radar systems, this SoC optimizes system performance while minimizing power consumption. Its design includes a heatsink with an integrated fan for effective thermal management, ensuring reliable operation in demanding applications. The integration capabilities of the Agilex 7 F-Series make it a versatile choice for developers seeking efficient system solutions.
Built around the Intel Stratix 10 FPGA, the Altera Stratix 10 SoC delivers robust transceiver bandwidth ideal for applications requiring high-performance processing. Suitable for complex computing environments, such as analytic and video processing, this SoC ensures enhanced control and integration through its internal system-on-chip structure. The potent combination of FPGA architecture and integrated circuits makes it a prime choice for agile deployments across various high-demand sectors.
The CTAccel Image Processor (CIP) on Intel Agilex FPGA offers a high-performance image processing solution that shifts workload from CPUs to FPGA technology, significantly enhancing data center efficiency. Using the Intel Agilex 7 FPGAs and SoCs F-Series, which are built on the 10 nm SuperFin process, the CIP can boost image processing speed by 5 to 20 times while reducing latency by the same factor. This enhancement is crucial for accommodating the explosive growth of image data in data centers due to smartphone proliferation and extensive use of cloud storage. The Agilex FPGA's advanced features include transceiver rates up to 58 Gbps, versatile DSP blocks supporting both fixed-point and floating-point operations, and high-performance cryptographic capabilities. These features facilitate substantial performance improvements in image transcoding, thumbnail generation, and image recognition tasks, reducing total cost of ownership by enabling data centers to maintain higher compute densities with lower operational costs. Moreover, the CIP's support for mainstream image processing software such as ImageMagick and OpenCV ensures seamless integration and deployment. The FPGA's capability for remote reconfiguration allows it to adapt swiftly to custom usage scenarios without server downtime, enhancing maintenance and operational flexibility.
The GenAI v1 is a cutting-edge hardware core developed by RaiderChip, specifically engineered to meet the rigorous demands of generative AI workloads, which are often considered among the most challenging. This IP core excels in optimizing efficiency for AI inference, breaking through traditional limitations by improving memory utilization and processing speed. Designed for deployment across a wide range of FPGA devices, particularly the AMD Versal series, it offers impressive speed in AI processing while maintaining low power consumption. The GenAI v1 has been proven effective in various cloud environments, notably on AWS F1 instances, where it demonstrates superior capabilities running complex LLMs like Meta's Llama series. Its architecture, which incorporates advanced parallel processing and optimized memory bandwidth utilization, is designed to deliver higher performance than competing solutions.
The RISC-V CPU IP U Class offers advanced capabilities suited for Linux and edge computing applications with its enhanced 32-bit architecture and MMU support. This IP extends its flexibility and configurability, making it a strategic fit for environments that require robust data processing and management. The U Class IP is designed to cater to computationally demanding tasks, delivering exceptional performance and efficiency. It integrates security features and functional safety options, aligning with industry standards to ensure secure and reliable operations. Developers utilizing the U Class IP benefit from access to a rich ecosystem of tools, including comprehensive SDKs and operating systems like Linux, which aid in expedited development and deployment processes. This makes it an optimal choice for projects aiming to leverage the full potential of edge computing.
The Yitian 710 Processor, a hallmark product of T-Head, is a high-performance Arm-based server chip. It features 128 Armv9 CPU cores running at a clock frequency of 2.75 GHz. The processor is engineered with cutting-edge 2.5D packaging, integrating 60 billion transistors. This makes it capable of exceptional computational tasks, catering to demands in AI inference, big data, and cloud computing applications. Beyond its core processing prowess, the Yitian 710 boasts a comprehensive I/O subsystem that includes 96 PCIe 5.0 lanes for high-speed data transfer, enhancing its utility for data-intensive applications. It supports up to eight DDR5 memory channels, delivering a peak bandwidth of 281 GB/s, thus ensuring fast and reliable performance for modern data centers. This processor stands out for its ability to handle high-throughput workloads efficiently, leading to superior performance in distributed computing environments. It supports scalable cloud service applications, making it ideal for complex computations and large datasets, which are fundamental to many modern enterprises.
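The 281 GB/s peak figure for the Yitian 710 matches the usual formula of channels times transfer rate times bus width. The sketch below assumes DDR5-4400 memory and a 64-bit (8-byte) data path per channel; neither value is stated in the description, so treat both as assumptions.

```python
def peak_bandwidth_gb_s(channels, transfers_per_second, bus_bytes=8):
    """Theoretical peak memory bandwidth in GB/s (1 GB = 1e9 bytes)."""
    return channels * transfers_per_second * bus_bytes / 1e9


# Assumed configuration: eight channels of DDR5-4400, 8 bytes per transfer.
peak = peak_bandwidth_gb_s(channels=8, transfers_per_second=4400e6)
print(f"Theoretical peak: {peak:.1f} GB/s")
```

This gives 281.6 GB/s, which agrees with the quoted 281 GB/s when rounded down. Sustained bandwidth in practice is lower than this theoretical peak.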
CMNP is a dedicated image processing NPU from Chips&Media, engineered to provide superior image enhancement through advanced processing algorithms. Targeted at improving image quality across numerous applications such as mobile, drones, and automotive systems, CMNP is built to accommodate the rigorous demands of contemporary image processing tasks. This solution is especially effective in achieving notable image clarity enhancements, leveraging proprietary instruction set architectures for optimal performance. CMNP's state-of-the-art architecture supports high-efficiency super-resolution and noise reduction features, capable of converting 2K resolution visuals to 4K with enhanced clarity and minimal resource consumption. It utilizes CNN-based processing engines that are fully programmable, ensuring flexibility and precision in complex image operations. The focus on low-bandwidth consumption makes it ideal for resource-sensitive devices while maximizing computational efficiency. Incorporating features like extensive bit-depth processing and the ability to handle expansive color formats, CMNP adapts seamlessly to varying media requirements, upholding image quality with reduced latency. The NPU's adaptability and performance make it valuable for developers looking to integrate robust image-processing capabilities into their designs, be it in high-performance consumer electronics or sophisticated surveillance equipment.