Vision Processors
Vision processors are a specialized subset of semiconductor IPs designed to efficiently handle and process visual data. These processors are pivotal in applications that require intensive image analysis and computer vision capabilities, such as artificial intelligence, augmented reality, virtual reality, and autonomous systems. The primary purpose of vision processor IPs is to accelerate the performance of vision processing tasks while minimizing power consumption and maximizing throughput.
In the world of semiconductor IP, vision processors stand out due to their ability to integrate advanced functionalities such as object recognition, image stabilization, and real-time analytics. These processors often leverage parallel processing, machine learning algorithms, and specialized hardware accelerators to perform complex visual computations efficiently. As a result, products ranging from high-end smartphones to advanced driver-assistance systems (ADAS) and industrial robots benefit from improved visual understanding and processing capabilities.
The semiconductor IPs for vision processors can be found in a wide array of products. In consumer electronics, they enhance the capabilities of cameras, enabling features like face and gesture recognition. In the automotive industry, vision processors are crucial for delivering real-time data processing needed for safety systems and autonomous navigation. Additionally, in sectors such as healthcare and manufacturing, vision processor IPs facilitate advanced imaging and diagnostic tools, improving both precision and efficiency.
As technology advances, the demand for vision processor IPs continues to grow. Developers and designers seek IPs that offer scalable architectures and can be customized to meet specific application requirements. By providing enhanced performance and reducing development time, vision processor semiconductor IPs are integral to pushing the boundaries of what's possible with visual data processing and expanding the capabilities of next-generation products.
Continuing the evolution of AI at the edge, the 2nd Generation Akida provides enhanced capabilities for modern applications. This upgrade implements 8-bit quantization for increased precision and introduces support for Vision Transformers and temporal event-based neural networks. The platform handles advanced cognitive tasks seamlessly with heightened accuracy and significantly reduced energy consumption. Designed for high-performance AI tasks, it supports complex network models and utilizes skip connections to enhance speed and efficiency.
The KL730 AI SoC is equipped with a state-of-the-art third-generation reconfigurable NPU architecture, delivering up to 8 TOPS of computational power. This innovative architecture enhances computational efficiency, particularly with the latest CNN networks and transformer applications, while reducing DDR bandwidth demands. The KL730 excels in video processing, supporting 4K 60 FPS output along with noise reduction, wide dynamic range, and low-light imaging. It is ideal for applications such as intelligent security, autonomous driving, and video conferencing.
The Metis AIPU PCIe AI Accelerator Card provides an unparalleled performance boost for AI tasks by leveraging multiple Metis AIPUs within a single setup. The card delivers up to 856 TOPS, supporting complex AI workloads such as computer vision applications that require rapid and efficient data processing. Its design handles both small-scale and extensive applications with ease, ensuring versatility across different scenarios. Running standard deep learning models such as YOLOv5, ResNet-50, and MobileNet, the card processes up to 12,800 FPS on ResNet-50 and an impressive 38,884 FPS on MobileNet V2-1.0. Its architecture enables high throughput, making it particularly suited to video analytics tasks where speed is crucial. The card also excels in scenarios that demand high energy efficiency, providing best-in-class performance at a significantly reduced operational cost. Coupled with the Voyager SDK, the Metis PCIe card integrates seamlessly into existing AI systems, enhancing development speed and deployment efficiency.
Origin E1 neural engines are expertly tuned for the networks typically employed in always-on applications, including home appliances, smartphones, and edge nodes requiring around 1 TOPS of performance. This focused optimization makes the E1 LittleNPU processors particularly well suited to cost- and area-sensitive applications, using energy efficiently and keeping processing latency negligible. The design also incorporates a power-efficient architecture that maintains low power consumption while handling always-sensing data operations, enabling continuous sampling and analysis of visual information without compromising efficiency or user privacy. The architecture is rooted in Expedera's packet-based design, which allows for parallel execution across layers, optimizing performance and resource utilization. Market-leading efficiency of up to 18 TOPS/W further underlines the Origin E1's capacity to deliver outstanding AI performance with minimal resources. The processor supports standard and proprietary neural network operations, ensuring versatility in its applications. Importantly, it comes with a comprehensive software stack, including compilers and quantizers, to facilitate deployment in diverse use cases without extensive redesign. The E1 has already been deployed in over 10 million devices worldwide across various consumer technology formats.
Designed for high-performance environments such as data centers and automotive systems, the Origin E8 NPU cores push the limits of AI inference, achieving up to 128 TOPS on a single core. Its architecture supports concurrent running of multiple neural networks without context switching lag, making it a top choice for performance-intensive tasks like computer vision and large-scale model deployments. The E8's flexibility in deployment ensures that AI applications can be optimized post-silicon, bringing performance efficiencies previously unattainable in its category. The E8's architecture and sustained performance, alongside its ability to operate within strict power envelopes (18 TOPS/W), make it suitable for passive cooling environments, which is crucial for cutting-edge AI applications. It stands out by offering PetaOps performance scaling through its customizable design that avoids penalties typically faced by tiled architectures. The E8 maintains exemplary determinism and resource utilization, essential for running advanced neural models like LLMs and intricate ADAS tasks. Furthermore, this core integrates easily with existing development frameworks and supports a full TVM-based software stack, allowing for seamless deployment of trained models. The expansive support for both current and emerging AI workloads makes the Origin E8 a robust solution for the most demanding computational challenges in AI.
The Origin E2 family of NPU cores is tailored for power-sensitive devices like smartphones and edge nodes that seek to balance power, performance, and area efficiency. These cores are engineered to handle video resolutions up to 4K, as well as audio and text-based neural networks. Utilizing Expedera’s packet-based architecture, the Origin E2 ensures efficient parallel processing, reducing the need for device-specific optimizations, thus maintaining high model accuracy and adaptability. The E2 is flexible and can be customized to fit specific use cases, aiding in mitigating dark silicon and enhancing power efficiency. Its performance capacity ranges from 1 to 20 TOPS and supports an extensive array of neural network types including CNNs, RNNs, DNNs, and LSTMs. With impressive power efficiency rated at up to 18 TOPS/W, this NPU core keeps power consumption low while delivering high performance that suits a variety of applications. As part of a full TVM-based software stack, it provides developers with tools to efficiently implement their neural networks across different hardware configurations, supporting frameworks such as TensorFlow and ONNX. Successfully applied in smartphones and other consumer electronics, the E2 has proved its capabilities in real-world scenarios, significantly enhancing the functionality and feature set of devices.
TimbreAI T3 is engineered as an ultra-low-power AI inference engine optimized for audio processing tasks such as noise reduction in devices like wireless headsets. The core executes 3.2 billion operations per second while consuming just 300 µW, making it an ideal choice for portable devices where battery efficiency is paramount. It utilizes Expedera's packet-based architecture to achieve significant power efficiency and performance within the stringent power and silicon-area constraints typical of consumer audio devices. The T3's design requires no external memory, further reducing system power and chip footprint while allowing quick deployment across platforms. Pre-configured to support commonly used audio neural networks, TimbreAI T3 integrates into existing product architectures without hardware alterations or loss of model accuracy. Its user-friendly software stack further simplifies deployment, providing the essential tools for AI integration in mass-market audio devices.
BrainChip's Akida is an advanced neuromorphic processor that excels in efficiency and performance, processing data much as the human brain does by focusing on essential sensory inputs. This approach drastically reduces power consumption and latency compared to conventional methods by keeping AI local to the chip. Akida's architecture, which scales to support up to 256 nodes, allows for high efficiency with a small footprint. Nodes in the Akida system integrate Neural Network Layer Engines configurable as either convolutional or fully connected, maximizing processing power by exploiting data sparsity through event-based operations.
The AX45MP processor is a multi-core, 64-bit CPU core designed for high-performance computing environments. It supports vector processing and includes features like a level-2 cache controller to enhance data handling and processing speeds. This makes it ideal for rigorous computational tasks including scientific computing and large-scale data processing environments.
The Automotive AI Inference SoC by Cortus is a cutting-edge chip designed to revolutionize image processing and artificial intelligence applications in advanced driver-assistance systems (ADAS). Leveraging RISC-V expertise, this SoC is engineered for low power and high performance, particularly suited to the rigorous demands of autonomous driving and smart city infrastructures. Built to support Level 2 to Level 4 autonomous driving standards, this AI Inference SoC features powerful processing capabilities, enabling complex image processing algorithms akin to those used in advanced visual recognition tasks. Designed for mid to high-end automotive markets, it offers adaptability and precision, key to enhancing the safety and efficiency of driver support systems. The chip's architecture allows it to handle a tremendous amount of data throughput, crucial for real-time decision-making required in dynamic automotive environments. With its advanced processing efficiency and low power consumption, the Automotive AI Inference SoC stands as a pivotal component in the evolution of intelligent transportation systems.
The Metis AIPU M.2 Accelerator Module is a powerful AI processing solution designed for edge devices. It offers a compact design tailored for applications requiring efficient AI computations with minimized power consumption. With a focus on video analytics and other high-demand tasks, this module transforms edge devices into AI-capable systems. Equipped with the Metis AIPU, the M.2 module can achieve up to 3,200 FPS for ResNet-50, providing remarkable performance metrics for its size. This makes it ideal for deployment in environments where space and power availability are limited but computational demands are high. It features an NGFF (Next Generation Form Factor) socket, ensuring it can be easily integrated into a variety of systems. The module leverages Axelera's Digital-In-Memory-Computing technology to enhance neural network inference speed while maintaining power efficiency. It's particularly well-suited for applications such as multi-channel video analytics, offering robust support for various machine learning frameworks, including PyTorch, ONNX, and TensorFlow.
The xcore.ai platform stands as an economical, high-performance solution for intelligent IoT applications. Designed around a unique multi-threaded micro-architecture, it supports applications requiring deterministic, low-latency performance. The architecture features 16 logical cores split between two multi-threaded processor tiles, each equipped with 512 kB of SRAM and a vector unit for both integer and floating-point computation. The platform excels at high-speed interprocessor communication, allowing tight integration among processors and across multiple xcore.ai SoCs. It offers scalable performance, adapting the tile clock frequency to specific application requirements to optimize power consumption. Its ability to handle DSP, AI/ML, and I/O processing within a single development environment makes it a versatile choice for creating smart, connected products. The platform's adaptability extends to market applications such as voice and audio processing. It supports embedded PHYs for MIPI, USB, and LPDDR control processing, and runs FreeRTOS across multiple threads for robust multi-threaded operation. On the AI and ML front, the platform includes a 256-bit vector processing unit supporting 8-bit to 32-bit operations, delivering exceptional AI performance of up to 51.2 GMACC/s. All of these features are packaged within a development environment that simplifies the integration of multiple application-specific components, making xcore.ai a compelling platform for developers building intelligent IoT solutions that scale with application needs.
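As a rough sanity check on the throughput figure above, the arithmetic below reproduces 51.2 GMACC/s from plausible micro-architectural assumptions; the 800 MHz tile clock and one-MAC-per-lane-per-cycle issue rate are our assumptions, not figures stated here.

```python
# Back-of-envelope check of the 51.2 GMACC/s figure quoted for xcore.ai.
# Assumed (not vendor-stated here): 800 MHz tile clock, one 8-bit MAC per
# vector lane per cycle, both tiles active.
lanes = 256 // 8            # 32 parallel 8-bit lanes in a 256-bit vector unit
tiles = 2                   # two multi-threaded processor tiles
clock_hz = 800e6            # assumed tile clock frequency
gmacc = lanes * tiles * clock_hz / 1e9
print(f"{gmacc:.1f} GMACC/s")   # -> 51.2 GMACC/s
```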
The AI Camera Module from Altek is an innovative integration of image sensor technology and intelligent processing, designed to meet the growing needs of AI in imaging. It combines rich optical design capability with deep software and hardware integration expertise, delivering multiple AI camera models that help clients meet differentiated AI + IoT requirements. This flexible camera module excels in edge computing and supports high resolutions such as 2K and 4K, making it an indispensable tool in environments demanding detailed image analysis. It adapts readily to functions such as facial detection and edge computation, broadening its applicability across industries. Altek's collaboration with major global brands reinforces the AI Camera Module's market position, ensuring it meets diverse client specifications. Whether used in security, industrial, or home automation applications, the module integrates effectively into various systems to deliver enhanced visual processing capabilities.
Origin E6 NPU cores are cutting-edge solutions designed to handle the complex demands of modern AI models, specializing in generative and traditional networks such as RNN, CNN, and LSTM. Ranging from 16 to 32 TOPS, these cores offer an optimal balance of performance, power efficiency, and feature set, making them particularly suitable for premium edge inference applications. Utilizing Expedera’s innovative packet-based architecture, the Origin E6 allows for streamlined multi-layer parallel processing, ensuring sustained performance and reduced hardware load. This helps developers maintain network adaptability without incurring latency penalties or the need for hardware-specific optimizations. Additionally, the Origin E6 provides a fully scalable solution perfect for demanding environments like next-generation smartphones, automotive systems, and consumer electronics. Thanks to a comprehensive software suite based around TVM, the E6 supports a broad span of AI models, including transformers and large language models, offering unparalleled scalability and efficiency. Whether for use in AR/VR platforms or advanced driver assistance systems, the E6 NPU cores provide robust solutions for high-performance computing needs, facilitating numerous real-world applications.
The Chimera GPNPU series stands as a pivotal innovation in the realm of on-device artificial intelligence computing. These processors are engineered to address the challenges faced in machine learning inference deployment, offering a unified architecture that integrates matrix, vector, and scalar operations seamlessly. By consolidating what traditionally required multiple processors, such as NPUs, DSPs, and real-time CPUs, into a single processing core, Chimera GPNPU reduces system complexity and optimizes performance. This series is designed with a focus on handling diverse, data-parallel workloads, including traditional C++ code and the latest machine learning models like vision transformers and large language models. The fully programmable nature of Chimera GPNPUs allows developers to adapt and optimize model performance continuously, providing a significant uplift in productivity and flexibility. This capability ensures that as new neural network models emerge, they can be supported without the necessity of hardware redesign. A remarkable feature of these processors is their scalability, accommodating intensive workloads up to 864 TOPS and being particularly suited for high-demand applications like automotive safety systems. The integration of ASIL-ready cores allows them to meet stringent automotive safety standards, positioning Chimera GPNPU as an ideal solution for ADAS and other automotive use cases. The architecture's emphasis on reducing memory bandwidth constraints and energy consumption further enhances its suitability for a wide range of high-performance, power-sensitive applications, making it a versatile solution for modern automotive and edge computing.
DMP’s ZIA Stereo Vision solution is engineered for depth perception and environmental sensing, leveraging stereo image inputs to compute real-time distance maps. This technology applies stereo matching techniques such as Semi-Global Matching (SGM) to accurately deduce depth from 4K resolution images, paving the way for precision applications in autonomous vehicles and robotic systems. The system employs pre- and post-processing techniques to optimize image alignment and refine depth calculations, achieving high accuracy with low latency. By interfacing through the AMBA AXI4 protocol, it ensures easy integration into existing processing chains, requiring minimal reconfiguration for operation. DMP’s expertise in small footprint, high-performance IP allows the ZIA Stereo Vision to deliver industry-leading depth perception capabilities while maintaining a compact profile, suitable for embedded applications needing robust environmental mapping.
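For readers who want to experiment with this class of algorithm in software, the sketch below uses OpenCV's StereoSGBM (a software relative of the SGM technique the core implements in hardware) and the standard disparity-to-depth relation depth = f·B/d. The input filenames and calibration constants are placeholders; this illustrates the technique, not DMP's implementation.

```python
import cv2
import numpy as np

# Software sketch of SGM-style stereo depth; "left.png"/"right.png" are
# placeholder rectified stereo images.
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

sgbm = cv2.StereoSGBM_create(minDisparity=0, numDisparities=128, blockSize=5)
disparity = sgbm.compute(left, right).astype(np.float32) / 16.0  # fixed-point -> pixels

# Depth from disparity: depth = focal_length_px * baseline_m / disparity_px.
# The calibration values below are illustrative placeholders.
focal_length_px, baseline_m = 1000.0, 0.12
valid = disparity > 0
depth_m = np.zeros_like(disparity)
depth_m[valid] = focal_length_px * baseline_m / disparity[valid]
```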
The KL630 AI SoC embodies next-generation AI chip technology with a pioneering NPU architecture. It uniquely supports Int4 precision and transformer networks, offering superb computational efficiency combined with low power consumption. Built around an Arm Cortex-A5 CPU, it supports a range of AI frameworks and handles scenarios from smart security to automotive applications, providing robust capability in both bright and low-light conditions.
The MIPI™ V-NLM-01 is specialized for efficient image noise reduction using the non-local means (NLM) algorithm. This hard core supports parameterized search-window sizes and a configurable number of bits per pixel to markedly enhance visual output quality. Designed to drive HDMI outputs at resolutions up to 2048×1080 at frame rates from 30 to 60 fps, it offers flexibility for numerous imaging applications. Its efficient implementation suits tasks demanding high-speed processing and precise noise reduction in video outputs. By averaging similar patches drawn from across the search window rather than only local neighbors, the V-NLM-01 suppresses noise while preserving detail, making it well suited to high-definition video processing environments. Its adaptability to varying processing requirements makes it a robust solution for current and future video standards.
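As a software reference for this family of filters, the sketch below applies OpenCV's non-local means denoiser to a single frame; its searchWindowSize parameter loosely parallels the parameterized search window described above. Values and filenames are illustrative, and this is not the core's implementation.

```python
import cv2

# Software sketch of non-local means (NLM) denoising; "frame.png" is a
# placeholder noisy input frame.
noisy = cv2.imread("frame.png", cv2.IMREAD_GRAYSCALE)

# h sets filter strength; templateWindowSize / searchWindowSize mirror the
# patch and search-window sizing that NLM hardware parameterizes.
denoised = cv2.fastNlMeansDenoising(noisy, None, h=10,
                                    templateWindowSize=7, searchWindowSize=21)
cv2.imwrite("frame_denoised.png", denoised)
```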
The NeuroSense AI Chip, an ultra-low power neuromorphic frontend, is engineered for wearables to address the challenges of power efficiency and data accuracy in health monitoring applications. This tiny AI chip processes data directly at the sensor level, handling tasks like heart rate measurement and human activity recognition. By performing computations locally, NeuroSense minimizes the need for cloud connections, thereby ensuring privacy and prolonging battery life. The chip excels in accuracy, offering three times better heart rate accuracy than conventional algorithm-based solutions. At the same time, it keeps power consumption below 100 µW, allowing extended device operation without frequent recharging. NeuroSense supports a simple configuration setup, making it suitable for integration into a variety of wearable devices such as fitness trackers, smartwatches, and health monitors. Its capabilities extend to advanced features like activity matrices, enabling devices to learn new human activities and classify tasks by intensity level. Additional functions include monitoring parameters like oxygen saturation and arrhythmia, enhancing the utility of wearable devices in providing comprehensive health insights. The chip's integration leads to reduced manufacturing costs, a smaller IC footprint, and rapid time-to-market for new products.
Polar ID is a comprehensive biometric security solution designed for smartphones and beyond, featuring the unique ability to sense the full polarization state of light. This system enhances biometric recognition by using advanced meta-optic technology, allowing it to capture the distinctive 'polarization signature' of a human face. This capability provides a robust defense against sophisticated 3D masks and spoofing attempts, making it a leading choice for secure facial authentication. Polar ID eliminates the need for complex optical modules typical in structured light systems and does not rely on expensive time-of-flight sensors. By delivering all necessary information from a single image, Polar ID streamlines the authentication process, significantly reducing the size and cost of the system. It provides high-resolution recognition across various lighting conditions, from bright daylight to total darkness, ensuring a reliable and secure experience for users. Employing Polar ID, manufacturers can enable secure digital transactions and access controls without compromising on user experience or device aesthetics. By operating seamlessly even when users wear sunglasses or face masks, this system sets a new standard in facial recognition technology. Its compact design fits easily into the most challenging form factors, making it accessible for a wide range of mobile devices.
The AON1020 offers a sophisticated blend of edge AI processing capabilities designed for environments demanding continuous interactions and low power consumption. Specializing in voice and sound applications, this processor ensures that devices such as smart speakers and wearable technologies can run smoothly and efficiently, providing excellent functionality without draining resources. Engineered to deliver outstanding accuracy, the AON1020 focuses on maintaining exceptional performance in acoustically complex settings, surpassing traditional capabilities with its innovative approach to data processing. Its capacity to execute real-time event and scene detection, alongside speaker identification, highlights its versatility in enhancing user experience across multiple applications. By integrating AI models that are fine-tuned for power efficiency, the AON1020 stands out as a viable solution for developers aiming to incorporate cutting-edge AI technologies into small, resource-constrained devices. Its prowess in managing battery life while sustaining operational readiness makes it uniquely positioned to revolutionize the interactions between users and their smart devices.
The SiFive Performance family is designed for superior compute density and performance efficiency, particularly for datacenter and AI workloads. The family includes 64-bit out-of-order cores in configurations ranging from three-wide to six-wide, supported by dedicated vector engines for AI tasks. This design ensures a blend of energy efficiency and area optimization, making these cores ideal for handling complex, data-intensive tasks while maintaining a compact footprint.
Optimized for power efficiency and high-frequency operations, the Tianqiao-80 CPU core is crafted for applications requiring fast processing speeds with minimal power consumption. It targets mobile, automotive, and intelligent computing applications with its 64-bit RISC-V architecture. This core supports scalable performance and features efficient processing ideal for workload-intensive environments.
aiWare is a cutting-edge hardware solution dedicated to facilitating neural processing for automotive AI applications. As part of aiMotive's advanced offerings, the aiWare NPU (Neural Processing Unit) provides a scalable AI inference platform optimized for cost-sensitive, multi-sensor automotive applications ranging from Level 2 to Level 4 driving automation. With its SDK focused on neural network optimization, aiWare offers up to 256 Effective TOPS per core, on par with leading industry efficiency benchmarks. The aiWare hardware IP integrates smoothly into automotive systems thanks to its ISO 26262 ASIL B certification, making it suitable for production environments that demand rigorous safety standards. Its architecture uses both on-chip local memory and dense on-chip RAM for efficient data handling, significantly reducing external memory needs; this focus on minimizing off-chip traffic enhances overall performance while meeting stringent automotive requirements. Optimized for high-speed operation, aiWare scales up to 1024 TOPS, providing flexibility across a wide range of AI workloads including CNNs, LSTMs, and RNNs. Designed for easy layout and software integration, aiWare supports essential activation and pooling functions natively, allowing maximum processing efficiency for neural networks without host CPU intervention. This makes it an exemplary choice for automotive-grade AI, supporting advanced driving capabilities and applications.
The AI Inference Platform by SEMIFIVE is designed to facilitate cutting-edge AI computations within custom SoC designs. This platform is particularly structured to meet the high-performance needs of AI-driven applications, powered by a quad-core SiFive U74 RISC-V CPU, supported by a high-speed LPDDR4x memory interface. It integrates PCIe Gen4 for enhanced data transfer capabilities and comes equipped with dedicated vision processing and DC acceleration functionalities. This platform offers a robust infrastructure for developing AI applications, with a significant emphasis on delivering rapid deployment and efficient operations. It makes AI integration seamless through pre-verified components and a flexible architecture, reducing the overall time from concept to deployment. The platform promises a configuration that minimizes risks while maintaining the potential for scalability and future enhancements. SEMIFIVE’s AI Inference Platform is ideally suited for applications like big data analytics and high-performance AI workloads. By leveraging a finely tuned ecosystem and a constellation of pre-integrated IPs, the platform not only speeds up the development process but also ensures reliability and robustness in delivering AI-based solutions.
The CTAccel Image Processor (CIP) on Intel Agilex FPGA offers a high-performance image processing solution that shifts workload from CPUs to FPGA technology, significantly enhancing data center efficiency. Built on Intel Agilex 7 F-Series FPGAs and SoCs, fabricated on the 10 nm SuperFin process, the CIP can boost image processing speed by 5 to 20 times while reducing latency by a similar factor. This improvement is crucial for accommodating the explosive growth of image data in data centers driven by smartphone proliferation and extensive use of cloud storage. The Agilex FPGA's advanced features include transceiver rates up to 58 Gbps, versatile DSP blocks supporting both fixed-point and floating-point operations, and high-performance cryptographic capabilities. These features enable substantial performance improvements in image transcoding, thumbnail generation, and image recognition, reducing total cost of ownership by letting data centers maintain higher compute density at lower operational cost. Moreover, the CIP's support for mainstream image processing software such as ImageMagick and OpenCV ensures seamless integration and deployment. The FPGA's capability for remote reconfiguration allows it to adapt swiftly to custom usage scenarios without server downtime, improving maintenance and operational flexibility.
The KL520 AI SoC by Kneron marked a significant breakthrough in edge AI technology, offering a well-rounded solution with notable power efficiency and performance. This chip can function as a host or as a supplementary co-processor to enable advanced AI features in diverse smart devices. It is highly compatible with a range of 3D sensor technologies and is perfectly suited for smart home innovations, facilitating long battery life and enhanced user control without reliance on external cloud services.
Neuropixels is a groundbreaking fully integrated digital neural probe designed for advanced in vivo neuroscience research in smaller animals. These probes deliver unparalleled precision by capturing detailed electrical signals from the brain, allowing researchers to monitor hundreds of neural activities simultaneously. With its compact design, the Neuropixels probe includes arrays of densely packed electrodes that provide comprehensive insights into neural structures and functions. Favored by neuroscientists globally, Neuropixels promises to unveil complex brain dynamics, thereby enhancing our understanding of neurobiological processes.
CMNP is a dedicated image processing NPU from Chips&Media, engineered to provide superior image enhancement through advanced processing algorithms. Targeted at improving image quality across numerous applications such as mobile, drones, and automotive systems, CMNP is built to accommodate the rigorous demands of contemporary image processing tasks. This solution is especially effective in achieving notable image clarity enhancements, leveraging proprietary instruction set architectures for optimal performance. CMNP's state-of-the-art architecture supports high-efficiency super-resolution and noise reduction features, capable of converting 2K resolution visuals to 4K with enhanced clarity and minimal resource consumption. It utilizes CNN-based processing engines that are fully programmable, ensuring flexibility and precision in complex image operations. The focus on low-bandwidth consumption makes it ideal for resource-sensitive devices while maximizing computational efficiency. Incorporating features like extensive bit-depth processing and the ability to handle expansive color formats, CMNP adapts seamlessly to varying media requirements, upholding image quality with reduced latency. The NPU's adaptability and performance make it valuable for developers looking to integrate robust image-processing capabilities into their designs, be it in high-performance consumer electronics or sophisticated surveillance equipment.
The CVC Verilog Simulator from Tachyon Design Automation is a comprehensive solution for simulating electronic hardware models written to the IEEE 1364-2005 Verilog HDL standard. The simulator distinguishes itself by compiling Verilog into native x86_64 machine instructions, allowing rapid execution as a simple Linux binary. It supports both compiled and interpreted simulation modes, enabling efficient elaboration of designs and quick iteration cycles during the design phase. It offers large gate-level and RTL capacity, and its 64-bit support enables faster simulation than traditional 32-bit systems. To complement its speed, CVC integrates features like toggle coverage with per-instance and tick-period controls, allowing designers to track signal changes and states throughout the simulation. Additionally, CVC provides full PLI (Programming Language Interface) and DPI (Direct Programming Interface) support, ensuring seamless, high-speed interaction with external C/C++ applications. The simulator also supports several design-state dump formats compatible with GTKWave, a common waveform viewer.
T-Head's Hanguang 800 AI Accelerator is a leading-edge chip developed to boost artificial intelligence processing capabilities. Designed on a 12 nm process and integrating 17 billion transistors, it achieves outstanding computational performance, delivering a peak computing power of 820 TOPS that places it among the most powerful AI chips in the industry.

Optimized for deep learning tasks, the Hanguang 800 excels at running AI models with minimal latency, making it ideal for data centers and edge computing. In industry benchmarks such as ResNet-50, it achieves inference speeds of 78,563 IPS at an efficiency of 500 IPS/W, setting a high standard for AI technology.

The Hanguang 800 also supports extensive neural network frameworks, ensuring broad applicability and integration into existing AI infrastructures. Its performance is enhanced further by T-Head's proprietary HGAI software stack, which offers robust support for model development and deployment, allowing users to maximize the chip's capabilities in various AI scenarios.
The RayCore MC is a state-of-the-art real-time path and ray-tracing GPU that delivers high-definition, photo-realistic graphics with exceptional energy efficiency. Utilizing advanced path tracing technology, this GPU excels in rendering complex 3D images by simulating natural light behaviors such as global illumination and soft shadows. Its small form factor and low-power architecture make it ideal for mobile and embedded devices, supporting a broad range of high-end applications from gaming to augmented reality. Optimized with a MIMD (Multiple Instruction, Multiple Data) architecture, the RayCore MC supports independent parallel computation, enabling effective real-time path and ray tracing regardless of the graphic complexity. As a fully hardwired solution, it ensures linear scalability, enhancing graphics performance as it scales up in multi-core configurations. This GPU is designed to cater to the high demands of photo-realistic rendering in movies, education, simulations, and more. The RayCore MC uniquely supports immersive game environments and high-intensity virtual applications. Its sophisticated hardware design and support for advanced features facilitate cost-effective, low-power graphics solutions, making it an industry leader in cutting-edge GPU technology.
The Yitian 710 Processor, a hallmark product of T-Head, is a high-performance Arm-based server chip. It features an advanced architecture with 128 Armv9 CPU cores running at a clock frequency of 2.75 GHz, and is engineered with cutting-edge 2.5D packaging that integrates 60 billion transistors. This makes it capable of exceptional computational throughput, catering to demands in AI inference, big data, and cloud computing applications.

Beyond its core processing prowess, the Yitian 710 offers a comprehensive I/O subsystem that includes 96 PCIe 5.0 lanes for high-speed data transfer, enhancing its utility for data-intensive applications. It supports up to eight DDR5 memory channels, delivering a peak bandwidth of 281 GB/s and ensuring fast, reliable performance for modern data centers.

The processor stands out for its ability to handle high-throughput workloads efficiently, leading to superior performance in distributed computing environments. It supports scalable cloud service applications, making it ideal for the complex computations and large datasets fundamental to many modern enterprises.
Kneron's KL530 introduces a modern heterogeneous AI chip design featuring a cutting-edge NPU architecture with support for INT4 precision. The chip stands out for its high computational efficiency and minimal power usage, making it ideal for a variety of AIoT and other applications. The KL530 utilizes an Arm Cortex-M4 CPU, providing powerful image processing and multimedia compression capabilities while maintaining a low power footprint, fitting well with energy-conscious devices.
The PB8051 Microcontroller, crafted by Roman-Jones, is an exemplary implementation of the famed 8051 family of microcontrollers. Designed for compatibility with Xilinx FPGAs, it duplicates the 8031 model, including key features such as timers and a serial port, and provides seamless execution of existing 8051 object code. The core leverages the Xilinx PicoBlaze soft core, prioritizing a compact design while delivering performance comparable to a 12 MHz 8051 at a fraction of the conventional core size. Optimized for small-scale applications, the PB8051 occupies only around 300 slices, making it extremely efficient in FPGA utilization. With versatile configurability, it caters to custom 8051 setups and supports varying system configurations. Its compatibility with VHDL and Verilog ensures easy integration within standard design flows, while support for simulation tools like ModelSim aids verification. The core is built with a strong focus on ease of use, supporting a broad spectrum of applications across different FPGA families, from Spartan-II upwards. This makes it a versatile choice for designers seeking a powerful yet flexible microcontroller solution for agile, adaptive deployment within FPGA environments.
Designed for high power efficiency, the KL720 AI SoC achieves a superior performance-per-watt ratio, positioning it as a leader in energy-efficient edge AI solutions. Built for use cases prioritizing processing power and reduced costs, it delivers outstanding capabilities for flagship devices. The KL720 is particularly well-suited for IP cameras, smart TVs, and AI glasses, accommodating high-resolution images and videos along with advanced 3D sensing and language processing tasks.
The Tianqiao-90 core is designed as a high-performance commercial-grade RISC-V CPU. It features superscalar out-of-order execution, a twelve-stage pipeline, and robust support for the RISC-V RV64GCBH extensions. Engineered for high-performance computing, this CPU core is optimized for data centers, PCs, and high-end mobile applications. It delivers a SPECint2006 score of 9.4 per GHz while maintaining high power and area efficiency, making it suitable for demanding environments.
The InferX AI platform from Flex Logix stands at the forefront of artificial intelligence processing, offering robust capabilities for AI inference at the edge. This technology is meticulously engineered to handle machine learning models efficiently, catering to applications that require real-time data processing and decision-making without the need for cloud connectivity. InferX AI excels in optimizing compute performance per watt, making it an ideal choice for applications in autonomous vehicles, smart surveillance systems, and industrial automation. Its innovative architecture is tailored to deliver high throughput and low latency, ensuring rapid analysis and action based on incoming data streams. By utilizing a scalable design, InferX AI can be deployed across a wide array of industries, empowering edge devices with advanced AI functionalities while reducing dependency on centralized data centers. It supports seamless model integration and adaptation, allowing it to keep pace with the latest advancements in AI technology, ensuring future-proof solutions for developers and businesses.
Catalyst-GPU is a line of NVIDIA-based PXIe/CPCIe GPU modules designed for cost-effective compute acceleration and advanced graphics in signal processing and ML/DL AI applications. The Catalyst-GPU leverages the powerful NVIDIA Quadro T600 and T1000 GPUs, offering compute capabilities previously unavailable on PXIe/CPCIe platforms. With multi-teraflop performance, it enhances the processing of complex algorithms in real-time data analysis directly within test systems. The GPU's integration facilitates exceptional performance improvements for applications like signal classification, geolocation, and sophisticated semiconductor and PCB testing. Catalyst-GPU supports popular programming frameworks, including MATLAB, Python, and C/C++, offering ease-of-use across Windows and Linux platforms. Additionally, the Catalyst-GPU's comprehensive support for arbitrary length FFT and DSP algorithms enhances its suitability for signal detection and classification tasks. It's available with dual-slot configurations, providing flexibility and high adaptability in various chassis environments, ensuring extensive applicability to a wide range of modern testing and measurement challenges.
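The sketch below illustrates, on the CPU with NumPy, the kind of FFT-based signal-detection workload such modules accelerate; on Catalyst-GPU the equivalent math would typically run through CUDA-backed libraries. The sample rate, tone frequency, and capture length are illustrative.

```python
import numpy as np

# Synthetic capture: a 125 kHz tone in noise, sampled at 1 MS/s.
fs = 1e6
t = np.arange(262_144) / fs
sig = np.sin(2 * np.pi * 125e3 * t)
sig += 0.5 * np.random.default_rng(0).standard_normal(t.size)

# Windowed FFT and power spectrum, the core of many detection pipelines.
spectrum = np.fft.rfft(sig * np.hanning(t.size))
freqs = np.fft.rfftfreq(t.size, d=1 / fs)
power_db = 20 * np.log10(np.abs(spectrum) + 1e-12)

print(f"strongest tone near {freqs[np.argmax(power_db)] / 1e3:.1f} kHz")  # ~125.0
```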
The SiFive Intelligence X280 processor is engineered for high-performance AI dataflow applications, providing scalable vector computing abilities tailored for AI, data management, and inference operations. This RISC-V-based solution integrates a high-performance control processor with vector compute resources, making it ideal for managing AI workloads efficiently. The X280 is equipped with SiFive Intelligence Extensions that enable it to address machine learning operations with enhanced performance and minimal power consumption.
The upgraded JH7110 model improves upon its predecessor with enhanced graphics and AI processing capabilities. With a quad-core processor and integrated GPU, it supports a wide range of high-speed interfaces, making it ideal for modern AI and multimedia applications. Its optimized architecture ensures efficient resource management and low power consumption while delivering high performance.
The CTAccel Image Processor tailored for AWS takes advantage of FPGA technology to offer superior image processing capabilities on the cloud platform. Available as an Amazon Machine Image, the CIP for AWS offloads CPU tasks to FPGA, thereby boosting image processing speed by 10 times and reducing computational latency by a similar factor. This performance leap is particularly beneficial for cloud-based applications that demand fast, efficient image processing. By utilizing FPGA's reconfigurable architecture, the CIP for AWS enhances real-time processing tasks such as JPEG thumbnail generation, watermarking, and brightness-contrast adjustments. These functions are crucial in managing the vast image data that cloud services frequently encounter, optimizing both service delivery and resource allocation. The CTAccel solution's seamless integration within the AWS environment allows for immediate deployment and simplification of maintenance tasks. Users can reconfigure the FPGA remotely, enabling a flexible response to varying workloads without disrupting application services. This adaptability, combined with the CIP's high efficiency and low operational cost, makes it a compelling choice for enterprises relying on cloud infrastructure for high-data workloads.
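For context, the sketch below shows the CPU-side pipeline (JPEG decode, thumbnail, brightness/contrast, re-encode) that the CIP offloads to the FPGA, expressed with OpenCV; filenames and parameters are illustrative.

```python
import cv2

# Reference CPU pipeline for the image tasks the CIP accelerates.
img = cv2.imread("upload.jpg")                                     # JPEG decode
thumb = cv2.resize(img, (256, 256), interpolation=cv2.INTER_AREA)  # thumbnail
adjusted = cv2.convertScaleAbs(thumb, alpha=1.2, beta=10)          # contrast/brightness
cv2.imwrite("thumb.jpg", adjusted, [cv2.IMWRITE_JPEG_QUALITY, 85])  # re-encode
```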
The CTAccel Image Processor (CIP) on the Intel PAC platform leverages FPGA technology to offload image processing workloads from CPUs, significantly boosting data center efficiency. By moving tasks such as JPEG transcoding and thumbnail generation onto the FPGA, the CIP increases image processing speed by up to 5 times and cuts latency by a factor of 2 to 3, promoting higher throughput and dramatically reducing total cost. The Intel PAC enables this swift processing through advanced FPGA capabilities that support massively data-parallel processing, addressing the limitations of traditional CPU and GPU architectures on image processing tasks that require high parallelism. Additionally, the CIP is fully compatible with leading image processing libraries, including ImageMagick, OpenCV, and GraphicsMagick, facilitating hassle-free integration into existing workflows. Partial Reconfiguration technology lets users reconfigure FPGA processing tasks dynamically, delivering maximum performance adaptability without server reboots and enhancing operational ease and efficiency.
Lightelligence's PACE is an advanced photonic computing platform that integrates a 64x64 optical matrix multiplier into a silicon photonic chip alongside a CMOS microelectronic chip. This fully integrated system employs sophisticated 3D packaging technology and contains over 12,000 discrete photonic devices. The PACE platform is designed to operate at a system clock of 1GHz, making it ideal for ultra-low latency and energy-efficient applications. The platform's architecture is powered by the Optical Multiply Accumulate (oMAC) technology, which is essential for performing optical matrix multiplications. Input vectors are initially extracted from on-chip SRAM and converted into analog values, which are then modulated optically. The resulting optical vector propagates through an optical matrix to generate an output vector, which undergoes conversion back to the digital domain after being detected by an array of photodetectors. PACE aims to tackle computational challenges, particularly in scenarios like solving Ising problems, where interactions are encoded in an adjacency matrix. The photonic processing capabilities of PACE are geared towards speeding up numerous applications, including bioinformatics, route planning, and materials research, promising significant breakthroughs in chip design and computational efficiency.
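To make the Ising connection concrete, the sketch below encodes couplings in a 64×64 adjacency matrix and evaluates a candidate spin state's energy; the matrix-vector product at its center is the operation an optical matrix multiplier evaluates in a single pass. The random problem instance is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64                                   # matches PACE's 64x64 optical matrix

# Symmetric coupling (adjacency) matrix J encoding the Ising interactions.
J = rng.choice([-1.0, 0.0, 1.0], size=(n, n))
J = np.triu(J, 1)
J = J + J.T

spins = rng.choice([-1.0, 1.0], size=n)  # candidate spin configuration

# Ising energy E = -1/2 * s^T J s; the (J @ spins) product is the step an
# optical matrix multiplier performs in one pass.
energy = -0.5 * spins @ (J @ spins)
print(f"energy of candidate state: {energy:.1f}")
```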
The RecAccel N3000 PCIe card is an advanced AI recommendation solution, designed to deliver unparalleled performance in data inference. This state-of-the-art hardware is tailored for high-demand elastic data centers, offering exceptional reliability and acceleration capabilities for AI-based recommendation systems. The card leverages NEUCHIPS' proprietary RecAccel series AI chips to optimize AI tasks with impressive speed and efficiency. One of the standout features of the RecAccel N3000 is its deep integration with cloud recommendation architectures. The card boasts supreme algorithmic optimizations, data caching, and dynamic power management to meet the complex needs of compute-bound, latency-bound, memory-bound, and energy-bound applications. By advancing these functionalities, the N3000 ensures seamless scalability, enhancing performance linearly with the addition of extra cards. A key achievement of RecAccel N3000 is its success in industry benchmarks, particularly its performance in the MLPerf v3.0 DLRM Inference Benchmark. The card achieved an industry-leading performance-per-watt metric, evidencing its efficiency and sustainability in practical applications. Its patented FFP8 technology further enhances recommendation accuracy, achieving 99.99% precision, a crucial advancement for AI recommendation engines.
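As background for the benchmark above, the sketch below walks through a schematic DLRM-style forward pass (embedding lookups, pairwise feature interactions, a small MLP), the workload class that recommendation accelerators target; table sizes, dimensions, and random weights are illustrative, not NEUCHIPS' implementation.

```python
import numpy as np

rng = np.random.default_rng(3)

# Schematic DLRM-style inference for one sample.
emb_tables = [rng.normal(size=(1000, 16)) for _ in range(4)]  # categorical features
dense = rng.normal(size=13)                                   # dense features

# Bottom MLP projects dense features to the embedding dimension.
W_bot = rng.normal(size=(13, 16))
x_dense = np.maximum(dense @ W_bot, 0)                        # ReLU

# Embedding lookups for the sample's categorical IDs.
ids = rng.integers(0, 1000, size=4)
vecs = [tbl[i] for tbl, i in zip(emb_tables, ids)] + [x_dense]

# Pairwise dot-product interactions, concatenated with the dense output.
V = np.stack(vecs)                        # (5, 16)
inter = (V @ V.T)[np.triu_indices(5, 1)]  # 10 pairwise interaction terms
top_in = np.concatenate([x_dense, inter])

# Top MLP produces a click-probability estimate.
W_top = rng.normal(size=(top_in.size, 1))
score = 1 / (1 + np.exp(-(top_in @ W_top)))
print(f"predicted CTR: {score.item():.3f}")
```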
The CTAccel Image Processor on Alveo U200 provides a robust image processing solution by shifting demanding computational workflows from the CPU to FPGA. Specifically designed to handle massive data throughput efficiently, the CIP elevates server performance by up to 6 times while simultaneously reducing latency fourfold. This jump in performance is critical for managing the vast influx of mobile-generated image data within Internet Data Centers (IDCs). Utilizing the FPGA as a heterogeneous coprocessor, the CIP leverages the Alveo U200 platform to enhance tasks such as JPEG decoding, resizing, and color adjustments. The technology removes bottlenecks associated with conventional processing architectures, making it ideal for environments where quick data processing and minimal latency are imperative. The FPGA's ability to undergo remote reconfiguration supports flexible deployment and is designed to maximize operational uptime. The CIP is compatible with popular software libraries like OpenCV and ImageMagick, ensuring an easy transition from traditional software-based image processing to this high-performance alternative. By deploying CIP, data centers can drastically increase compute density, which translates into lower hardware, power, and maintenance costs.
Edge AI Neural Network Fabric 2.0 by Uniquify is a next-generation architectural framework tailored for implementing contemporary neural networks. This innovative fabric supports numerous neural network variants, including Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Feedforward Neural Networks (FNNs), Generative Adversarial Networks (GANs), Autoencoders (AEs), and Batch Normalization techniques. A primary focus of this advanced neural network fabric is to deliver high efficiency in terms of area and power consumption, making it a cost-effective solution for edge AI applications. With the ability to adapt to various algorithmic demands, it enhances not just performance but also scalability for developers looking to harness the power of AI at the edge. The Edge AI Fabric 2.0 is designed to integrate seamlessly into diverse systems while ensuring minimal power draw and footprint. Its architectural advantages make it suitable for deployment in applications where compact size and energy efficiency are crucial, such as in IoT devices and mobile platforms.
Heimdall Toolbox is crafted for low-power image processing applications, providing tools that simplify the development of devices requiring detailed image analysis and processing capabilities. This toolbox is essential for engineers working on products where power efficiency is a priority, such as portable medical devices and smart cameras used in automation. The toolbox is tailored towards enabling rapid image interpretation through custom algorithms, ensuring low power consumption without compromising on image quality. It is equipped with interfaces that support a 64x64 pixel matrix, ideal for applications demanding high efficiency in processing images while maintaining minimal power usage. Heimdall Toolbox supports quick proof of concept, allowing developers to swiftly navigate the design phase and optimize their image processing solutions. This facilitates a smoother transition from design to production, ensuring the final product is both energy-efficient and highly functional, meeting the exacting standards of today's smart technology applications.
The Spiking Neural Processor T1 is a groundbreaking ultra-low-power microcontroller designed for sensing applications that require continuous monitoring and rapid data processing at minimal energy consumption. At its core, it fuses an event-driven spiking neural network engine with a RISC-V processor, creating a hybrid chip that processes sensor inputs in real time. By boosting power-performance efficiency on intricate AI tasks, the T1 enables a wide range of applications even in battery-limited environments. The T1 is equipped with a 32-bit RISC-V core and a substantial 384 KB of embedded SRAM, which together enable fast recognition of patterns in sensor data such as audio signals. The processor draws on the inherent advantages of spiking neural networks, which handle tasks through time-sensitive events, allowing them to operate at impressive speed with significantly lower power requirements than conventional architectures. Additional features include interfaces such as QSPI, I2C, UART, and JTAG, providing versatile connectivity for various sensors. Housed in a compact 2.16 mm × 3 mm package, the T1 is an ideal candidate for space-constrained applications. It stands out with its ability to execute both spiking and classical neural network models, facilitating complex signal processing tasks ranging from audio processing to inertial measurement unit data handling.
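A minimal leaky integrate-and-fire (LIF) neuron, sketched below, illustrates the event-driven principle behind spiking networks: state leaks between inputs, and a spike is emitted only when the membrane potential crosses a threshold. The parameters are illustrative and do not describe the T1's actual neuron model.

```python
import numpy as np

# Minimal LIF neuron; all constants are illustrative.
dt, tau = 1e-3, 20e-3         # time step and membrane time constant (s)
v_thresh, v_reset = 1.0, 0.0  # spike threshold and post-spike reset
v, spikes = 0.0, []

inputs = np.random.default_rng(1).uniform(0.0, 0.12, size=1000)  # input drive
for step, i_in in enumerate(inputs):
    v += (dt / tau) * (-v) + i_in   # leak toward rest, then integrate input
    if v >= v_thresh:               # threshold crossing emits an event (spike)
        spikes.append(step)
        v = v_reset
print(f"{len(spikes)} spikes in {len(inputs)} steps")
```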
The Evo Gen 5 PCIe card is a specialized AI inferencing solution crafted to maximize computational efficiency and precision. Designed with AI-driven tasks in mind, this PCIe card delivers high inference accuracy while minimizing energy expenditure, yielding meaningful operational savings for enterprises. Built on advanced 7 nm ASIC technology, it integrates seamlessly with PCIe Gen 5 and LPDDR5, delivering cutting-edge performance at minimal power consumption. The card specifically addresses large language model (LLM) memory challenges through Neuchips' proprietary Flexible Floating Point 8 (FFP8) architecture, which approaches the accuracy of 16-bit computation while using only 8-bit data. This approach effectively halves the required memory bandwidth, ensuring efficient processing of AI workloads. By combining custom ASIC technology with a host CPU, the card offers a balanced solution for data-intensive applications. With its focus on task-specific computation, the Evo Gen 5 PCIe card delivers strong performance for AI workloads without excessive computational overhead. It is particularly appealing to businesses looking to lower total cost of ownership while maintaining high energy-efficiency standards in their AI operations. A strong and flexible supply chain ensures the product remains available through varying market conditions.
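FFP8's exact encoding is proprietary, so as a stand-in the sketch below rounds values onto a generic 8-bit float grid (4-bit exponent, 3-bit mantissa, E4M3-style) to show why 8-bit storage halves memory traffic relative to FP16; this is not Neuchips' algorithm.

```python
import numpy as np

def quantize_fp8_e4m3(x):
    """Round values onto a generic 4-bit-exponent / 3-bit-mantissa float grid.
    Illustrative stand-in for the proprietary FFP8 format."""
    x = np.asarray(x, dtype=np.float64)
    sign, mag = np.sign(x), np.abs(x)
    out = np.zeros_like(mag)
    nz = mag > 0
    e = np.clip(np.floor(np.log2(mag[nz])), -6, 8)          # exponent range
    out[nz] = np.round(mag[nz] / 2.0**e * 8) / 8 * 2.0**e   # 3 mantissa bits
    return sign * np.clip(out, 0.0, 448.0)                  # saturate at max

w = np.random.default_rng(2).normal(size=8)
print(w)
print(quantize_fp8_e4m3(w))  # 8-bit storage: half the bandwidth of FP16
```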