In the realm of semiconductor IP, the Processor Core Dependent category encompasses a variety of intellectual properties specifically designed to enhance and support processor cores. These IPs are tailored to work in harmony with processor cores to optimize their performance, adding value by reducing time-to-market and improving efficiency in modern integrated circuits. This category is crucial for the customization and adaptation of processors to meet specific application needs, addressing both performance optimization and system complexity management.
Processor Core Dependent IPs are integral components, typically found in applications that require robust data processing capabilities such as smartphones, tablets, and high-performance computing systems. They can also be implemented in embedded systems for automotive, industrial, and IoT applications, where precision and reliability are paramount. By providing foundational building blocks that are pre-verified and configurable, these semiconductor IPs significantly simplify the integration process within larger digital systems, enabling a seamless enhancement of processor capabilities.
Products in this category may include cache controllers, memory management units, security hardware, and specialized processing units, all designed to complement and extend the functionality of processor cores. These solutions enable system architects to leverage existing processor designs while incorporating cutting-edge features and optimizations tailored to specific application demands. Such customizations can significantly boost the performance, energy efficiency, and functionality of end-user devices, translating into better user experiences and competitive advantages.
In essence, Processor Core Dependent semiconductor IPs represent a strategic approach to processor design, providing a toolkit for customization and optimization. By focusing on interdependencies within processing units, these IPs allow for the creation of specialized solutions that cater to the needs of various industries, ensuring the delivery of high-performance, reliable, and efficient computing solutions. As the demand for sophisticated digital systems continues to grow, the importance of these IPs in maintaining competitive edge cannot be overstated.
The NMP-750 serves as a high-performance accelerator IP for edge computing solutions across various sectors, including automotive, smart cities, and telecommunications. It supports sophisticated applications such as mobility control, factory automation, and energy management, making it a versatile choice for complex computational tasks. With a high throughput of up to 16 TOPS and a memory capacity scaling up to 16 MB, this IP ensures substantial computing power for edge devices. It is configured with a RISC-V or Arm Cortex-R/A 32-bit CPU and incorporates multiple AXI4 interfaces, optimizing data exchanges between Host, CPU, and peripherals. Optimized for edge environments, the NMP-750 enhances spectral efficiency and supports multi-camera stream processing, paving the way for innovation in smart infrastructure management. Its scalable architecture and energy-efficient design make it an ideal component for next-generation smart technologies.
The Connected Vehicle Solutions by KPIT focus on integrating in-vehicle systems with the broader connected world, transforming the cockpit experience. Utilizing high-resolution displays, augmented reality, and AI-driven personalization, these solutions improve productivity, safety, and user engagement. The company's advancements in over-the-air updates facilitate seamless vehicle interactions and connectivity, ushering in new revenue streams for OEMs while overcoming the challenges of system integration and market competitiveness.
The Metis AIPU PCIe AI Accelerator Card provides an unparalleled performance boost for AI tasks by leveraging multiple Metis AIPUs within a single setup. This card is capable of delivering up to 856 TOPS, supporting complex AI workloads such as computer vision applications that require rapid and efficient data processing. Its design allows for handling both small-scale and extensive applications with ease, ensuring versatility across different scenarios. Across a range of deep learning models, including YOLOv5, ResNet-50, and MobileNet, the card sustains up to 12,800 FPS on ResNet-50 and an impressive 38,884 FPS on MobileNet V2-1.0. The card’s architecture enables high throughput, making it particularly suited for video analytics tasks where speed is crucial. The card also excels in scenarios that demand high energy efficiency, providing best-in-class performance at a significantly reduced operational cost. Coupled with the Voyager SDK, the Metis PCIe card integrates seamlessly into existing AI systems, enhancing development speed and deployment efficiency.
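As a rough illustration of what the quoted throughput figures mean for multi-channel video analytics, the sketch below divides the card's aggregate model FPS by a typical per-camera frame rate. The 30-fps stream rate is an assumption for illustration, and the calculation ignores I/O and scheduling overhead.

```python
# Back-of-envelope stream capacity from the throughput figures quoted above.
RESNET50_FPS = 12_800   # quoted aggregate ResNet-50 throughput
MOBILENET_FPS = 38_884  # quoted aggregate MobileNet V2-1.0 throughput
VIDEO_FPS = 30          # typical camera stream rate (assumption)

def max_streams(model_fps: int, stream_fps: int = VIDEO_FPS) -> int:
    """Upper bound on concurrent real-time streams, ignoring overhead."""
    return model_fps // stream_fps

print(max_streams(RESNET50_FPS))   # ResNet-50: ~426 concurrent 30-fps streams
print(max_streams(MOBILENET_FPS))  # MobileNet V2-1.0: ~1296 streams
```

Even with generous overhead margins, this is the arithmetic behind positioning the card for dense multi-channel deployments.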
The NMP-350 is designed to offer exceptional efficiency in AI processing, specifically targeting endpoint accelerations. This IP is well-suited for markets that require minimal power consumption and cost-effectiveness, such as automotive, AIoT/Sensors, Industry 4.0, smart appliances, and wearables. It enables a wide variety of applications, including driver authentication, digital mirrors, machine automation, and health monitoring. Technically, it delivers up to 1 TOPS and supports up to 1 MB local memory. The architecture is based on the RISC-V or Arm Cortex-M 32-bit CPU, ensuring effective processing capabilities for diverse tasks. Communication is managed via three AXI4 interfaces, each 128 bits wide, to handle Host, CPU, and Data interactions efficiently. The NMP-350 provides a robust foundation for developing advanced AI applications at the edge. Designed for ultimate flexibility, it aids in predictive maintenance and personalization processes in smart environments. With its streamlined architecture, it provides unmatched performance for embedded solutions, enabling seamless integration into existing hardware ecosystems.
Origin E1 neural engines are expertly tuned for the networks typically employed in always-on applications. These include devices such as home appliances, smartphones, and edge nodes requiring around 1 TOPS of performance. This focused optimization makes the E1 LittleNPU processors particularly suitable for cost- and area-sensitive applications, using energy efficiently and keeping processing latency negligible. The design also incorporates a power-efficient architecture that maintains low power consumption while handling always-sensing data operations. This enables continuous sampling and analysis of visual information without compromising on efficiency or user privacy. Additionally, the architecture is rooted in Expedera's packet-based design which allows for parallel execution across layers, optimizing performance and resource utilization. Market-leading efficiency with up to 18 TOPS/W further underlines Origin E1's capacity to deliver outstanding AI performance with minimal resources. The processor supports standard and proprietary neural network operations, ensuring versatility in its applications. Importantly, it accommodates a comprehensive software stack that includes an array of tools such as compilers and quantizers to facilitate deployment in diverse use cases without requiring extensive re-designs. Its application has already seen it deployed in over 10 million devices worldwide, in various consumer technology formats.
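The quoted figures imply a very small power budget for always-on operation, which a one-line calculation makes concrete. This is a sketch from the ~1 TOPS performance class and 18 TOPS/W peak efficiency quoted above; real sustained power depends on workload and utilization.

```python
# Rough power estimate for an always-on NPU from the quoted efficiency.
TOPS = 1.0             # quoted E1 performance class
TOPS_PER_WATT = 18.0   # quoted peak efficiency

watts = TOPS / TOPS_PER_WATT
milliwatts = watts * 1000
print(f"~{milliwatts:.0f} mW at peak efficiency")  # ~56 mW
```

An order of tens of milliwatts is what makes continuous always-sensing operation plausible in battery-powered devices.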
Designed for high-performance environments such as data centers and automotive systems, the Origin E8 NPU cores push the limits of AI inference, achieving up to 128 TOPS on a single core. Its architecture supports concurrent running of multiple neural networks without context switching lag, making it a top choice for performance-intensive tasks like computer vision and large-scale model deployments. The E8's flexibility in deployment ensures that AI applications can be optimized post-silicon, bringing performance efficiencies previously unattainable in its category. The E8's architecture and sustained performance, alongside its ability to operate within strict power envelopes (18 TOPS/W), make it suitable for passive cooling environments, which is crucial for cutting-edge AI applications. It stands out by offering PetaOps performance scaling through its customizable design that avoids penalties typically faced by tiled architectures. The E8 maintains exemplary determinism and resource utilization, essential for running advanced neural models like LLMs and intricate ADAS tasks. Furthermore, this core integrates easily with existing development frameworks and supports a full TVM-based software stack, allowing for seamless deployment of trained models. The expansive support for both current and emerging AI workloads makes the Origin E8 a robust solution for the most demanding computational challenges in AI.
The Origin E2 family of NPU cores is tailored for power-sensitive devices like smartphones and edge nodes that seek to balance power, performance, and area efficiency. These cores are engineered to handle video resolutions up to 4K, as well as audio and text-based neural networks. Utilizing Expedera’s packet-based architecture, the Origin E2 ensures efficient parallel processing, reducing the need for device-specific optimizations, thus maintaining high model accuracy and adaptability. The E2 is flexible and can be customized to fit specific use cases, aiding in mitigating dark silicon and enhancing power efficiency. Its performance capacity ranges from 1 to 20 TOPS and supports an extensive array of neural network types including CNNs, RNNs, DNNs, and LSTMs. With impressive power efficiency rated at up to 18 TOPS/W, this NPU core keeps power consumption low while delivering high performance that suits a variety of applications. As part of a full TVM-based software stack, it provides developers with tools to efficiently implement their neural networks across different hardware configurations, supporting frameworks such as TensorFlow and ONNX. Successfully applied in smartphones and other consumer electronics, the E2 has proved its capabilities in real-world scenarios, significantly enhancing the functionality and feature set of devices.
Cortus's High Performance RISC-V Processor represents the pinnacle of processing capability, designed for demanding applications that require high-speed computing and efficient task handling. It implements the 64-bit RISC-V instruction set architecture in what Cortus describes as the world's fastest Out-of-Order (OoO) execution core, supporting both single-core and multi-core configurations for unparalleled processing throughput. This processor is particularly suited for high-end computing tasks in environments ranging from desktop computing to artificial intelligence workloads. With integrated features such as a multi-socket cache coherent system and an on-chip vector plus AI accelerator, it delivers exceptional computation power, essential for tasks such as bioinformatics and complex machine learning models. Moreover, the processor includes coherent off-chip accelerators, such as CNN accelerators, enhancing its utility in AI-driven applications. The design flexibility extends its application to consumer electronics like laptops and supercomputers, positioning the High Performance RISC-V Processor as an integral part of next-gen technology solutions across multiple domains.
Aimed at performance-driven environments, the NMP-550 is an efficient accelerator IP optimized for diverse markets, including automotive, mobile, AR/VR, drones, and medical devices. This IP is crucial for applications such as driver monitoring, fleet management, image and video analytics, and compliance in security systems. The NMP-550 boasts a processing power of up to 6 TOPS and integrates up to 6 MB of local memory, empowering it to handle complex tasks with ease. It runs on a RISC-V or Arm Cortex-M/A 32-bit CPU and supports multiple high-speed interfaces, specifically three AXI4, 128-bit connections that manage Host, CPU, and Data traffic. This IP is engineered for environments demanding high performance with efficient power use, addressing modern technological challenges in real-time analytics and surveillance. The NMP-550 is adept at improving system intelligence, allowing for enhanced decision-making processes in connected devices.
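The three NMP accelerators described in this section scale along the same axes: throughput, local memory, and host CPU class. The sketch below collects the figures quoted above for side-by-side comparison; all numbers come from the product descriptions in this section.

```python
# Quoted specs of the NMP family members described above, gathered for
# side-by-side comparison (figures taken from the text in this section).
NMP_FAMILY = {
    "NMP-350": {"tops": 1,  "local_mem_mb": 1,  "cpu": "RISC-V / Arm Cortex-M 32-bit"},
    "NMP-550": {"tops": 6,  "local_mem_mb": 6,  "cpu": "RISC-V / Arm Cortex-M/A 32-bit"},
    "NMP-750": {"tops": 16, "local_mem_mb": 16, "cpu": "RISC-V / Arm Cortex-R/A 32-bit"},
}

for name, spec in NMP_FAMILY.items():
    print(f"{name}: {spec['tops']} TOPS, {spec['local_mem_mb']} MB local memory, {spec['cpu']}")
```

The pattern is consistent: local memory in MB tracks peak TOPS one-to-one across the family, with the CPU class stepping up alongside.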
The AndeShape platform supports AndesCore processor system development by providing a versatile infrastructure composed of Platform IP, hardware development platforms, and an ICE Debugger. This allows for efficient integration and rapid prototyping, offering flexibility in design and development across a comprehensive set of hardware options. It aims to reduce design risk and accelerate time-to-market.
The Automotive AI Inference SoC by Cortus is a cutting-edge chip designed to revolutionize image processing and artificial intelligence applications in advanced driver-assistance systems (ADAS). Leveraging RISC-V expertise, this SoC is engineered for low power and high performance, particularly suited to the rigorous demands of autonomous driving and smart city infrastructures. Built to support Level 2 to Level 4 autonomous driving standards, this AI Inference SoC features powerful processing capabilities, enabling complex image processing algorithms akin to those used in advanced visual recognition tasks. Designed for mid to high-end automotive markets, it offers adaptability and precision, key to enhancing the safety and efficiency of driver support systems. The chip's architecture allows it to handle a tremendous amount of data throughput, crucial for real-time decision-making required in dynamic automotive environments. With its advanced processing efficiency and low power consumption, the Automotive AI Inference SoC stands as a pivotal component in the evolution of intelligent transportation systems.
The Metis AIPU M.2 Accelerator Module is a powerful AI processing solution designed for edge devices. It offers a compact design tailored for applications requiring efficient AI computations with minimized power consumption. With a focus on video analytics and other high-demand tasks, this module transforms edge devices into AI-capable systems. Equipped with the Metis AIPU, the M.2 module can achieve up to 3,200 FPS for ResNet-50, providing remarkable performance metrics for its size. This makes it ideal for deployment in environments where space and power availability are limited but computational demands are high. It features an NGFF (Next Generation Form Factor) socket, ensuring it can be easily integrated into a variety of systems. The module leverages Axelera's Digital-In-Memory-Computing technology to enhance neural network inference speed while maintaining power efficiency. It's particularly well-suited for applications such as multi-channel video analytics, offering robust support for various machine learning frameworks, including PyTorch, ONNX, and TensorFlow.
SCR9 is tailored for entry-level server-class applications and high-performance computing. This 64-bit RISC-V core supports a range of extensions, including vector operations and scalar cryptography. Utilizing a dual-issue 12-stage pipeline, SCR9 excels in environments requiring Linux-based operations, enabling advanced data processing capabilities like those needed in AI and personal computing devices.
The Cortus Lotus 1 is a multifaceted microcontroller that packs a robust set of features for a range of applications. This cost-effective, low-power SoC boasts RISC-V architecture, making it suitable for advanced control systems such as motor control, sensor interfacing, and battery-operated devices. Operating up to 40 MHz, its RV32IMAFC CPU architecture supports floating-point operations and hardware-accelerated integer processing, optimizing performance for computationally demanding applications. Designed to enhance code density and reduce memory footprint, Lotus 1 incorporates 256 KBytes of Flash memory and 24 KBytes of RAM, enabling the execution of complex applications without external memory components. Its six independent 16-bit timers with PWM capabilities are perfectly suited for controlling multi-phase motors, positioning it as an ideal choice for power-sensitive embedded systems. This microcontroller's connectivity options, including multiple UARTs, SPI, and TWI controllers, ensure seamless integration within a myriad of systems. Lotus 1 is thus equipped to serve a wide range of market needs, from personal electronics to industrial automation, ensuring flexibility and extended battery life across sectors.
The RF/Analog offerings from Certus Semiconductor represent cutting-edge solutions designed to maximize the potential of wireless and high-frequency applications. Built upon decades of experience and extensive patent-backed technology, these products comprise individual RF components and full-chip transceivers that utilize sophisticated analog technology. Certus's solutions include silicon-proven RF IP and full-chip RF products that offer advanced low-power front-end capabilities for wireless devices. High-efficiency transceivers cover a range of standards like LTE and WiFi, alongside other modern communication protocols. The design focus extends to optimizing power management units (PMU), RF signal chains, and phase-locked loops (PLLs), providing a full-bodied solution that meets high-performance criteria while minimizing power requirements. With the ability to adapt to various process nodes, products in this category are constructed to offer definitive control over power output, noise figures, and gain. This adaptability ensures that they align seamlessly with diverse operational requirements, while cutting-edge developments in IoT and radar technologies exemplify Certus's commitment to innovation. Their RF/Analog IP line is a testament to their leadership in ultra-low power solutions for next-generation wireless applications.
The iniDSP is a 16-bit digital signal processor core optimized for high-performance computational tasks across diverse applications. It boasts a dynamic instruction set, capable of executing complex algorithms efficiently, making it ideal for real-time data processing in telecommunications and multimedia systems. Designed for seamless integration, the iniDSP supports a variety of interface options and is compatible with existing standard IP cores, facilitating easy adaptation into new or existing systems. Inicore's structured design methodology ensures the processor is technology-independent, making it suitable for both FPGA and ASIC implementations. The core's modular construction allows customization to meet specific application needs, enhancing its functionality for specialized uses. Its high-performance architecture is also balanced with power-efficient operations, making it an ideal choice for devices where energy consumption is a critical consideration. Overall, iniDSP embodies a potent mix of flexibility and efficiency for DSP applications.
The General Purpose Accelerator, known as Aptos, from Ascenium is a state-of-the-art innovation designed to redefine computing efficiency. Unlike traditional CPUs, Aptos is an integrated solution that enhances performance across all generic software applications without requiring modifications to the code. This technology utilizes a unique compiler-driven approach and simplifies CPU architecture, making it adept at executing a wide range of computational tasks with significant energy efficiency. At the heart of the Aptos design is the capability to handle tasks typically managed by out-of-order RISC CPUs, yet it does so with a streamlined and parallel approach, allowing data centers to move past current performance barriers. The architecture is aligned with the LLVM compiler, ensuring that it remains source-code compatible with numerous programming languages, an advantage when future-proofing investments in software infrastructure. The efficiency gains from Aptos are notably due to its ability to handle standard high-level language software in a more efficient manner, achieving nearly four times the efficiency compared to existing state-of-the-art CPUs. This is instrumental in reducing the energy footprint of data centers globally, aligning with broader sustainability goals by cutting carbon emissions and operational costs. Moreover, this makes the technology extremely appealing to organizations seeking tangible ROI through energy savings and performance enhancements.
Eliyan’s NuLink technology revolutionizes die-to-die connections in the semiconductor landscape by delivering robust performance and energy efficiency using industry-standard packaging. The NuLink PHY is designed to optimize serial high-speed die-to-die links, accommodating custom and standard interconnect schemes like UCIe and BoW. It achieves significant benchmarks in terms of power efficiency, bandwidth, and scalability, providing the same benefits typical of advanced packaging techniques but within a standard packaging framework. This versatility enables broader cost-effective solutions by circumventing the high cost and complexity often associated with silicon interposers. NuLink Die-to-Die PHY stands out for its integration flexibility, supporting both silicon and organic substrate environments while maintaining superior data throughput and minimal latency. This innovation is particularly beneficial for system architects aiming to maximize performance within chiplet-based architectures, allowing the strategic incorporation of elements such as high-bandwidth memory and silicon photonics. NuLink further advances system integration by enabling simultaneous bidirectional signaling (SBD), doubling the effective data bandwidth on the same interface line. This singular feature is pivotal for intensive processing applications like AI and machine learning, where robust and rapid data interchange is critical. Eliyan’s NuLink can be implemented in diverse application scenarios, showcasing its ability to manage large-scale, multi-die integrations without the customary bottlenecks of area and mechanical structure. By leading system designs away from vendor-specific, cost-prohibitive supply chains, Eliyan empowers designers with increased freedom and efficiency, further underpinning its groundbreaking role in die-to-die connectivity and beyond.
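The effect of simultaneous bidirectional signaling is easy to quantify: both directions share the same wires at once, so effective bandwidth per interface line doubles. The lane count and per-lane rate below are illustrative assumptions, not NuLink specifications.

```python
# Effective bandwidth with simultaneous bidirectional (SBD) signaling,
# as described above. Lane count and per-lane rate are assumed figures.
lanes = 64
gbps_per_lane = 32  # assumed per-lane signaling rate, Gb/s

unidirectional = lanes * gbps_per_lane  # conventional: one direction at a time
sbd_effective = 2 * unidirectional      # SBD: both directions concurrently

print(f"{unidirectional} Gb/s -> {sbd_effective} Gb/s effective with SBD")
```

The doubling comes without adding wires, which is why SBD matters for area- and beachfront-constrained die-to-die interfaces.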
The SiFive Performance family is designed for superior compute density and performance efficiency, particularly for datacenter and AI workloads. The family includes 64-bit out-of-order cores ranging from 3 wide to 6 wide configurations, supported by dedicated vector engines for AI tasks. This design ensures a blend of energy efficiency and area optimization, making these cores ideal for handling complex, data-intensive tasks while maintaining a compact footprint.
Bluespec's Portable RISC-V Cores are crafted to provide extensive flexibility and compatibility across numerous FPGA platforms, including industry leaders such as Achronix, Xilinx, and Lattice. These cores are designed to support both Linux and FreeRTOS, offering developers a broad range of applications in system development and software integration. Leveraging standard open-source development tools, these cores allow engineers to adopt, modify, and deploy RISC-V solutions with minimal friction. This simplifies the development process and enhances compatibility with various hardware scenarios, promoting an ecosystem where innovation can thrive without proprietary constraints. The Portable RISC-V Cores cater to developers who require adaptable and scalable solutions for diverse projects. By accommodating different FPGA platforms and supporting a wide range of development environments, they represent a versatile choice for implementing cutting-edge designs in the RISC-V architecture space.
aiWare is a cutting-edge hardware solution dedicated to facilitating neural processing for automotive AI applications. As part of aiMotive’s advanced offerings, the aiWare NPU (Neural Processing Unit) provides a scalable AI inference platform optimized for cost-sensitive and multi-sensor automotive applications ranging from Level 2 to Level 4 driving automation. With its unique SDK focused on neural network optimization, aiWare offers up to 256 Effective TOPS per core, on par with leading industry efficiency benchmarks. The aiWare hardware IP integrates smoothly into automotive systems due to its ISO 26262 ASIL B certification, making it suitable for production environments requiring rigorous safety standards. Its innovative architecture utilizes both on-chip local memory and dense on-chip RAM for efficient data handling, significantly reducing external memory needs. This focus on minimizing off-chip traffic enhances the overall performance while adhering to stringent automotive requirements. Optimized for high-speed operation, aiWare can reach up to 1024 TOPS, providing flexibility across a wide range of AI workloads including CNNs, LSTMs, and RNNs. Designed for easy layout and software integration, aiWare supports essential activation and pooling functions natively, allowing maximum processing efficiency for neural networks without host CPU interference. This makes it an exemplary choice for automotive-grade AI, supporting various advanced driving capabilities and applications.
The Altera Agilex 7 F-Series SoC offers unparalleled flexibility and high-performance capabilities, enabling the seamless implementation of complex algorithms into a single chip. Targeting sectors like bioscience and radar systems, this SoC optimizes system performance while minimizing power consumption. Its design includes a heatsink with an integrated fan for effective thermal management, ensuring reliable operation in demanding applications. The integration capabilities of the Agilex 7 F-Series make it a versatile choice for developers seeking efficient system solutions.
The Time-Triggered Protocol (TTP) is an advanced communication protocol specifically designed for managing the growing complexity and requirements of distributed fault-tolerant systems. TTP provides a framework for creating modular, scalable control systems that are essential in modern automotive, aerospace, and industrial applications. Its structured time-triggered communication is tailored to support reliable, synchronized distributed computing, which is crucial for safety-critical systems demanding high-precision operations at lower lifecycle costs.

Established as a standard (SAE AS6003), TTP boasts a significant improvement in communication bandwidth over legacy interfaces like ARINC 429 and MIL-1553, enabling efficient integration within sophisticated system architectures. Beyond just enhancing deterministic communication, TTP delivers distributed platform services that simplify designing advanced systems, effectively reducing both software and system lifecycle costs. This attribute makes TTP especially valuable for managing applications where timing and safety are paramount.

Comprehensive toolsets and components, including chip IPs and development systems, support and streamline TTP application development. These resources are pivotal in facilitating rapid prototyping and testing, allowing engineers to implement robust and reliable network solutions efficiently. TTP's capability to reduce system complexity positions it as a vital technology in progressing vehicle electronics, aerospace systems, and other automation-driven industries.
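The determinism described above comes from TTP's time-triggered arbitration: each node transmits only in a statically assigned slot of a repeating TDMA round, so bus access is collision-free by construction. The sketch below shows that core scheduling idea; the slot length and node names are illustrative assumptions, not SAE AS6003 parameters.

```python
# Minimal sketch of time-triggered (TDMA) bus arbitration, the core idea
# behind TTP: ownership of the bus is a pure function of time, so every
# node can predict exactly who transmits when. Slot length and node names
# are illustrative assumptions, not AS6003 parameters.
SLOTS = ["node_A", "node_B", "node_C", "node_D"]  # one static TDMA round
SLOT_US = 250  # slot duration in microseconds (assumed)

def sender_at(t_us: int) -> str:
    """Return which node owns the bus at time t_us."""
    round_pos = (t_us // SLOT_US) % len(SLOTS)
    return SLOTS[round_pos]

print(sender_at(0))    # node_A owns the first slot
print(sender_at(600))  # 600 us falls in slot index 2 -> node_C
```

Because the schedule is global and static, fault containment and clock synchronization checks can be layered on top without any runtime arbitration traffic.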
Built around the Intel Stratix 10 FPGA, the Altera Stratix 10 SoC delivers robust transceiver bandwidth ideal for applications requiring high-performance processing. Suitable for complex computing environments, such as analytic and video processing, this SoC ensures enhanced control and integration through its internal system-on-chip structure. The potent combination of FPGA architecture and integrated circuits makes it a prime choice for agile deployments across various high-demand sectors.
The GenAI v1-Q represents an enhancement over the basic GenAI v1 core, with added support for quantization capabilities, specifically 4-bit and 5-bit quantization. This significantly reduces memory requirements, potentially by as much as 75%, facilitating the execution of large language models within smaller, more cost-effective systems without sacrificing speed or accuracy. The reduced memory usage translates to lower overall costs and diminished energy consumption while maintaining the integrity and intelligence of the models. Designed for seamless integration into various devices, the GenAI v1-Q also ensures compatibility with diverse memory technologies, making it a versatile choice for applications demanding efficient AI performance.
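The ~75% memory-reduction claim above follows directly from the arithmetic of weight precision: 4-bit weights occupy one quarter the space of 16-bit weights. The sketch below works this through for a 7B-parameter model, a size assumed here purely for illustration.

```python
# Weight-memory footprint at different precisions, illustrating the ~75%
# reduction from 16-bit to 4-bit quantization claimed above. The
# 7B-parameter model size is an assumption for illustration.
PARAMS = 7_000_000_000

def weight_bytes(params: int, bits: int) -> float:
    """Bytes needed to store `params` weights at `bits` bits each."""
    return params * bits / 8

fp16_gb = weight_bytes(PARAMS, 16) / 1e9  # 14.0 GB
int4_gb = weight_bytes(PARAMS, 4) / 1e9   # 3.5 GB
saving = 1 - int4_gb / fp16_gb            # 0.75 -> 75% reduction

print(f"{fp16_gb:.1f} GB -> {int4_gb:.1f} GB ({saving:.0%} saved)")
```

This is what moves a model from requiring server-class memory into the range of smaller, cost-effective systems; 5-bit quantization gives a correspondingly smaller (~69%) saving.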
The Prodigy Universal Processor from Tachyum represents a pioneering technological advancement in the field of semiconductors. This processor is designed to consolidate the capabilities of traditional CPUs, GPGPUs, and TPUs into a unified architecture, enabling it to handle a wide range of applications from artificial intelligence to high-performance computing and cloud deployments. With a focus on sustainability, the processor features highly efficient energy use, which not only reduces costs but also significantly decreases environmental impact, supporting the global push towards greener technology. Designed to push the boundaries of what processors can achieve, the Prodigy aims to transcend the limitations of Moore's Law by enhancing computational power while maintaining energy efficiency. This makes it an attractive option for data centers looking to upgrade their infrastructure to leverage cutting-edge technology without incurring prohibitive costs. Besides computational power, the processor maintains compatibility with existing applications, enabling a seamless integration into current systems without the need for extensive modifications. One of the most extraordinary aspects of the Prodigy processor is its ability to facilitate large-scale AI operations, emulating human-like cognitive tasks with remarkable speed and efficiency. This capability is poised to accelerate advancements in areas like machine learning and data analytics, providing a significant boost to industries reliant on sophisticated computational tasks. Furthermore, its adaptable architecture ensures that it can be scaled according to the demands of different applications, making it a versatile addition to any tech-savvy organization.
The Tyr Superchip is engineered to facilitate high performance computing in AI and data processing domains, with a focus on scalability and power efficiency. Designed around a revolutionary multi-core architecture, it features fully programmable cores that are suitable for any AI or general-purpose algorithms, ensuring high flexibility and adaptability. This product is crucial for industries requiring cutting-edge processing capabilities without the overhead of traditional systems, thanks to its support for CUDA-free operations and efficient algorithm execution that minimizes energy consumption.
Avispado is a 64-bit in-order RISC-V processor core engineered for efficiency and versatility within energy-conscious systems. This core supports a 2-wide in-order pipeline, which allows for streamlined instruction decoding and execution. Its compact design fits well in SOCs aimed at machine learning markets where power and space efficiency are crucial, yet it retains the capacity to handle demanding processing tasks. With the inclusion of Gazzillion Misses™ technology, Avispado can handle high sparseness in data effectively, particularly beneficial in machine learning workloads. The core's full compatibility with RISC-V vector specifications and open vector interfaces offers flexibility in deploying various vector solutions, reducing the energy demanded by operations. Avispado is multiprocessor-ready and supports cache-coherent environments, ensuring it can be scaled as operations demand, from minimal cores up to comprehensive systems. It is suitable for applications looking to leverage high throughput with minimized silicon investments, making it a favorable choice for efficiently deploying machine learning and recommendation system support.
Developed to support various standard and custom applications, these ASICs are based on ARM's M-Class architecture, which is renowned for its high performance and low power consumption. Suitable for use in embedded systems, they offer efficient processing capabilities while maintaining minimal power utilization, making them ideal for a wide array of applications. The versatility and adaptability of these ASICs make them perfect for industries ranging from consumer electronics to industrial automation.
The Calibrator for AI-on-Chips epitomizes precision in maintaining high accuracy for AI System-on-Chips through advanced post-training quantization (PTQ) techniques. It offers architecture-aware quantization that sustains accuracy levels up to 99.99% even in fixed-point architectures like INT8. This ensures that AI chips deliver maximum performance while staying within defined precision margins. Central to its operation, Calibrator uses a unique precision simulator to emulate various precision-change points in a data path, incorporating control information that synchronizes with ONNC's compiler for enhanced performance. The integration with ONNC's calibration protocols allows for the seamless refinement of precision, thereby reducing precision drop significantly. Highly adaptable, the Calibrator supports multiple hardware architectures and bit-width configurations, ensuring robust interoperability with various deep learning frameworks. Its proprietary entropy calculation policies and architecture-aware algorithms ensure optimal scaling factors, culminating in a deep learning model that is both compact and precise.
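The Calibrator's entropy-based calibration policies are proprietary and not publicly documented, but the basic mechanics of post-training quantization can be illustrated generically. The sketch below is a simplified, assumed formulation showing symmetric per-tensor INT8 quantization with a max-magnitude scale factor, not the Calibrator's actual algorithm:

```python
import numpy as np

def quantize_int8(tensor):
    """Symmetric per-tensor post-training quantization to INT8.

    Generic PTQ illustration -- NOT the Calibrator's proprietary
    entropy-based policy, which is not publicly documented.
    """
    # Scale maps the largest magnitude onto the INT8 range [-127, 127].
    scale = np.max(np.abs(tensor)) / 127.0
    q = np.clip(np.round(tensor / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    # Recover an approximation of the original floating-point values.
    return q.astype(np.float32) * scale

weights = np.array([0.5, -1.27, 0.03, 1.0], dtype=np.float32)
q, s = quantize_int8(weights)
recovered = dequantize(q, s)
```

Architecture-aware calibrators refine this idea by choosing scale factors per layer or per channel to minimize the accuracy drop across the whole data path, rather than using a single max-based scale as above.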
The GenAI v1 is a hardware core developed by RaiderChip, specifically engineered to meet the rigorous demands of generative AI workloads, often considered among the most challenging in computing. This IP core optimizes efficiency for AI inference, breaking through traditional limitations by improving memory utilization and processing speed. Designed for deployment across a wide range of FPGA devices, particularly the AMD Versal series, it offers impressive AI processing speed while maintaining low power consumption. The GenAI v1 has been proven effective in various cloud environments, notably on AWS F1 instances, where it has demonstrated strong performance running complex LLM models such as Meta's Llama series. Its architecture, which incorporates advanced parallel processing and optimized memory bandwidth utilization, delivers performance metrics that position it well ahead of competing solutions.
The nxFeed Market Data System leverages FPGA technology to deliver ultra-low latency market data handling. It serves as a comprehensive feed handler that decodes, normalizes, and builds order books with ease, significantly reducing processing resources and latency. The system provides a straightforward API, allowing seamless integration with existing trading algorithms or new in-house developments. By deploying on FPGA-based NICs, nxFeed minimizes network load and accelerates data throughput, enabling rapid algorithmic decision-making. Its design simplifies market data application development, making it a vital tool for traders requiring fast and efficient data processing at volatile exchange feeds.
The eFPGA IP Cores v5 from Menta represent a high-density, programmable logic solution designed for integration within SoCs or ASICs. These embedded FPGAs are crafted to meet diverse market needs, offering designers the flexibility to tailor the exact resources required for their applications. Available in both Soft RTL and Hard GDSII formats, these eFPGAs provide unprecedented design clarity and control.

One of the key advantages of Menta's eFPGA is its ability to minimize costs and enhance performance. By embedding FPGA functionality on-chip, designers can bypass the limitations of traditional onboard FPGAs, which often pose challenges related to cost and power consumption at high production volumes. Furthermore, integrated eFPGA solutions eliminate the overhead related to I/O pad count and external chip communication, resulting in reduced latency and enhanced efficiency.

Menta underscores the process-portability of its eFPGA cores, having been silicon-proven on more technology nodes than any other provider. Their standard-cell based approach facilitates rapid porting across various semiconductor foundries and node sizes, ensuring maximum adaptability to the latest technological advancements. Security and field-upgradability are integral to Menta's offerings, allowing companies to protect their most sensitive IP throughout the product lifecycle.
Dyumnin Semiconductors' RISCV SoC is a powerful, 64-bit quad-core server-class processor tailored for demanding applications, integrating a multifaceted array of subsystems. Key features include an AI/ML subsystem equipped with a tensor flow unit for optimized AI operations, and a robust automotive subsystem supporting CAN, CAN-FD, and SafeSPI interfaces.

Additionally, it includes a multimedia subsystem comprising HDMI, DisplayPort, MIPI, camera subsystems, graphics accelerators, and digital audio, offering comprehensive multimedia processing capabilities. The memory subsystem connects to prevalent memory protocols such as DDR, MMC, ONFI, NOR flash, and SD/SDIO, ensuring broad compatibility.

The RISCV SoC's design is modular, allowing customization to meet specific end-user applications and offering a flexible platform for creating SoC solutions with bespoke peripherals. It is also available as an FPGA-based test chip for evaluation, making it well suited to efficient prototyping and development workflows.
Menta's Adaptive Digital Signal Processor (DSP) is a versatile solution designed to offer adaptable signal processing capabilities within embedded FPGA frameworks. Enabled by the Origami tool suite, this DSP solution allows for dynamic configuration, ensuring that each design is perfectly aligned with distinct hardware requirements.

Key to its functionality is the ability to tailor operand sizes for both the multiplier and ALU, and the programmable nature of DSP block operating modes via the bitstream. This adaptability not only empowers designers with fine control over processing architecture but also offers the ability to fine-tune performance metrics such as frequency, area, and latency.

Built with robust support for fast inference, the Adaptive DSP's architecture can be configured at a clock-cycle level. This design model allows for real-time adjustments in DSP block operations, perfect for applications demanding high computational loads across diverse environments. The DSP's ease of integration is facilitated by an intuitive software interface, making it an indispensable tool for engineers seeking to implement comprehensive signal processing within constrained embedded systems.
The RAIV is a flexible, high-performing general-purpose GPU (GPGPU), well suited to industries undergoing rapid transformation in the fourth industrial revolution: autonomous vehicles, IoT, and VR/AR. Built on a SIMT (Single Instruction, Multiple Threads) architecture, the RAIV accelerates AI workloads with high-speed processing while keeping costs low. This semiconductor IP supports diverse machine learning and neural network applications, optimizing high-speed calculations across multiple threads. Its high scalability allows tailored configurations of core units, balancing performance against power efficiency according to application needs. The RAIV is equipped to handle 3D graphics processing and AI integration for edge computing devices, reinforcing its place in advanced technological development. Additionally, the RAIV's support for OpenCL offers compatibility across heterogeneous computing platforms, facilitating versatile system configurations. Its strong performance in AI tasks extends to metaverse applications, presenting a comprehensive solution that unifies graphics acceleration with AI-enhanced computation.
The SAKURA-II AI Accelerator represents the pinnacle of EdgeCortix's innovation, offering cutting-edge efficiency in AI inference, particularly for generative AI applications. This high-performance accelerator is powered by the low-latency Dynamic Neural Architecture (DNA) and handles multi-billion-parameter models efficiently. Its adaptability allows it to process inputs across modalities such as vision, language, and audio. Distinguished by its compactness and minimal power consumption, the SAKURA-II is engineered to deliver excellent results even in physically constrained environments. It supports large, complex models with high DRAM bandwidth and is built for low-latency operation, ensuring fast real-time Batch=1 processing. Notably, its hardware efficiently approximates activation functions, providing substantial support for models such as Llama 2, Stable Diffusion, and ViT. Critically, the SAKURA-II is designed for flexibility, offering robust processing of vision and generative AI workloads within an 8 W power envelope. Its memory subsystem pairs high DRAM bandwidth with ample capacity, handling complex data streams and sustaining high AI compute utilization. This design aligns with the industry's move toward sophisticated, low-power AI applications.
TUNGA is an innovative multi-core RISC-V SoC designed to advance high-performance computing and AI workflows using posit arithmetic. This SoC is equipped with multiple CRISP-cores, enabling efficient real-number computation with the integration of posit numerical representations. The TUNGA system exploits the power of the posit data type, known for offering enhanced computational precision and reduced bit-utilization compared to traditional formats. A standout feature of TUNGA is its fixed-point accumulator structure, QUIRE, which ensures exact calculation of dot products for vector lengths extending to approximately 2 billion elements. This precision makes it highly suitable for tasks in cryptography, AI, and data-intensive computations that require high accuracy. In addition, TUNGA leverages a pool of FPGA gates designed for hardware reconfiguration, facilitating the acceleration of processes such as data center services by optimizing task execution paths and supporting non-standard data types. TUNGA is fully programmable and supports various arithmetic operations for specialized computational needs, particularly within high-demand sectors like AI and machine learning, where processing speed and accuracy are critical. By integrating programmability through FPGA gates, users can tailor the SoC for specific workloads, thereby allowing Calligo's TUNGA to stand out as an adaptable element of next-generation cloud and edge computing solutions.
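The QUIRE accumulator itself is a hardware fixed-point register, but its key property, accumulating a dot product with no intermediate rounding, can be sketched in software. The example below is purely illustrative: Python's `Fraction` stands in for the wide exact accumulator, and no posit arithmetic is modeled.

```python
from fractions import Fraction

def exact_dot(xs, ys):
    """Dot product with an exact accumulator, mimicking the idea behind
    a quire: no rounding occurs until the final result is emitted.

    Illustrative only -- Fraction stands in for the hardware's wide
    fixed-point QUIRE register; posits themselves are not modeled.
    """
    acc = Fraction(0)
    for x, y in zip(xs, ys):
        acc += Fraction(x) * Fraction(y)  # each product added exactly
    return float(acc)  # a single rounding at the very end

def naive_dot(xs, ys):
    # Conventional float accumulation: rounds after every addition.
    acc = 0.0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc

# Cancellation-heavy input where per-step rounding loses the answer:
xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]
# naive_dot absorbs the 1.0 into 1e16 and returns 0.0;
# exact_dot defers rounding and returns the true result, 1.0.
```

The same deferred-rounding principle is what lets a hardware quire compute exact dot products over very long vectors before producing a single correctly rounded result.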
The Tyr AI Processor Family offers a scalable multi-core architecture ideal for any AI or DSP application. It integrates ASIL-D safety standards, making it suitable for advanced driver assistance systems (ADAS) and autonomous driving. Each chip in the Tyr family supports high-level programming and seamless over-the-air updates, providing developers with flexibility and ease of development. The processors achieve near-theoretical efficiency via unique algorithm support, including newer models like BEVformer and Transformers. Additionally, the fully programmable nature of Tyr chips allows them to handle diverse algorithms efficiently, meeting the stringent power and cost requirements of automotive applications.
Atrevido is a customizable 64-bit RISC-V processor core with a standout out-of-order execution engine. The core is adept at handling memory-intensive applications thanks to its support for unaligned memory accesses and robust memory management features, making it Linux-ready out of the box. Designed for high-bandwidth scenarios, Atrevido excels in machine learning and AI workloads and is configurable in 2/3/4-wide out-of-order arrangements for versatility. Equipped with register renaming and vector processing capabilities, Atrevido is tailored for high-performance applications. It supports the RISC-V vector specification and can integrate seamlessly with custom vector units, offering comprehensive vector processing functionality. Gazzillion Misses™ technology further enhances the core, allowing it to handle sparse data efficiently, as found in recommendation systems. Targeted at markets including machine learning and high-performance computing (HPC), Atrevido is optimized for sparse data, typical of sparse matrices, recommendation systems, and key-value stores. The core's design philosophy emphasizes maximizing data throughput while minimizing silicon footprint, making it a fit for both embedded and data-center applications.
Monolithic Microsystems by IMEC represent the frontier of microelectronics, where advanced functionalities are integrated directly on top of CMOS technology. This innovation allows for high-performance and miniaturization within a single compact package. Utilizing diverse process modules, including silicon photonics and MEMS, these microsystems offer vast potential across industries from healthcare to automotive. These systems combine multiple technologies such as photonics, optics, and electronics co-integrated into a singular structure, leading to enhanced operational efficiency and reduced costs in mass manufacturing.
The Processor and Microcontroller Cores offered by So-Logic include a diverse range of popular microprocessor and microcontroller components, designed to meet the needs of modern electronic applications. These cores provide the essential computational capabilities required across various industries and are engineered for optimal performance in embedded systems. So-Logic's portfolio includes cores for widely used microprocessors and microcontrollers, ensuring that developers have the tools they need to build efficient and reliable computing systems. These cores simplify the development process, offering compatibility with a range of development environments and tools. With complete verification and a full suite of supportive resources, the cores facilitate straightforward integration into FPGA platforms. They also come with detailed design notes, comprehensive datasheets, and sample applications that aid in the ease of system development and deployment. Extensive technical support ensures that developers can navigate any challenges that arise during their project implementation, making these Processor and Microcontroller Cores a valuable asset in the creation of advanced electronic systems.
iCEVision enhances the capabilities of the Lattice iCE40 UltraPlus FPGA by offering an easy method for rapid prototyping of user functions and designs. This platform is equipped with exposed I/Os, enabling swift implementation and validation of design concepts with standard camera interfaces such as ArduCam CSI and PMOD.

The development process is facilitated by the Lattice Diamond Programmer software, which allows reprogramming of the onboard SPI flash with custom code, and by iCEcube2, a tool for creating and testing custom designs. These tools make iCEVision an ideal choice for developers exploring complex connectivity solutions in a programmable and flexible environment.

The iCEVision board offers a compact 50x50 mm form factor, 8 Mb of SPI programmable flash memory, and 1 Mb of SRAM, supporting a wide range of custom applications. It comes pre-loaded with a bootloader and an RGB demo application, providing a straightforward path from concept to implementation.
The AMD Zynq Ultrascale+ MPSoC merges ARM microprocessing capabilities with FPGA flexibility to create an agile solution for differentiation and advanced analytics. It targets domains requiring significant computational power such as electronic warfare and radar applications. With advanced SW and HW design tools, it simplifies the development process while maximizing system efficiency through multi-processing capabilities. The Zynq Ultrascale+ offers a comprehensive portfolio suited for high-precision, mission-critical environments.
Digital Pre-Distortion (DPD) technology is essential for enhancing the linearity and efficiency of RF power amplifiers in modern wireless systems. This process addresses the distortion issues linked with broadband signals, mitigating spectral regrowth which typically occurs with wide signal transmissions like WCDMA. Faststream's DPD utilizes the Memory Polynomial Algorithm, optimizing PA performance for various test conditions and ensuring minimal non-linear distortion. Through advanced signal processing, DPD supports enhancements in amplifier efficiency while safeguarding signal integrity. By converting non-linear amplifier characteristics into closer linear representation, DPD significantly improves transmission efficacy and reduces operational costs by optimizing power usage in RF systems. Moreover, Faststream’s approach suits modern communication challenges, with optimized DSP core integration allowing DPD functionalities to be housed efficiently within FPGAs. This setup decreases footprint and cost, presenting a viable option for companies aiming to adhere to stringent spectral masks and error vector magnitude criteria.
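Faststream's implementation details are not public, but the Memory Polynomial model it references has a standard textbook form: y(n) = Σ_k Σ_m a(k,m) · x(n−m) · |x(n−m)|^(k−1), a sum over polynomial orders k and memory taps m. The sketch below is a minimal, assumed formulation of that model; the coefficient values, orders, and function names are illustrative, not production parameters.

```python
import numpy as np

def memory_polynomial(x, coeffs, K, M):
    """Apply a memory polynomial model to a complex baseband signal:

        y(n) = sum_{k=1..K} sum_{m=0..M} a_{k,m} * x(n-m) * |x(n-m)|^(k-1)

    Textbook formulation only -- coefficient estimation (typically a
    least-squares fit against PA measurements) is not shown here.
    """
    N = len(x)
    y = np.zeros(N, dtype=complex)
    for m in range(M + 1):  # memory taps: delayed copies of the input
        xd = np.concatenate([np.zeros(m, dtype=complex), x[:N - m]])
        for k in range(1, K + 1):  # polynomial (nonlinearity) orders
            y += coeffs.get((k, m), 0.0) * xd * np.abs(xd) ** (k - 1)
    return y

# With only the linear, zero-delay term active, the model is identity:
x = np.array([0.5 + 0.1j, -0.2j, 1.0 + 0j])
y = memory_polynomial(x, {(1, 0): 1.0}, K=2, M=1)
```

In an actual predistorter, the coefficients are fitted so that the cascade of this polynomial and the nonlinear power amplifier approximates a linear response, suppressing spectral regrowth.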
The ONNC Compiler is designed to meet the growing demands of AI-on-chip development, serving as a cornerstone for transforming neural networks into machine instructions suitable for diverse processing elements. Its robust architecture supports key deep learning frameworks such as PyTorch and TensorFlow, seamlessly converting various file formats into intermediate representations using MLIR frameworks. ONNC facilitates the compilation process with advanced pre-processing capabilities powered by machine learning algorithms to convert input files optimally. The compiler is highly versatile, supporting both single-backend and multi-backend modes to cater to different IC designs. In the single-backend mode, it generates machine instructions for general-purpose CPUs like RISC-V or domain-specific accelerators like NVDLA. For complex AI SoCs, the multi-backend mode manages resources across different processing elements, ensuring robust machine instruction streams. ONNC's enhanced performance is achieved through hardware/software co-optimization, particularly in handling memory and bus allocations within heterogeneous multicore systems. By employing advanced techniques such as software pipelining and DMA allocation, ONNC maximizes resource utilization and curtails energy consumption without compromising on computational accuracy.
The P8700 Series represents a high-performance multi-core processor family that leverages the RISC-V instruction set architecture to deliver unprecedented flexibility and scalability. Designed for applications that require significant compute power and memory, it offers 4-wide out-of-order pipelines with 2-way simultaneous multi-threading, and can support up to 8 cores per cluster. The processor is tailored for applications in automotive, primarily targeting ASIL B safety standards and ISO 26262 compliance, facilitating its use in safety-critical systems such as ADAS. Incorporating advanced multi-threading capabilities, the P8700 Series enhances bandwidth and power efficiency, crucial for high-performance compute tasks. Its architecture is built to facilitate deterministic latency in heterogeneous computing environments, which is essential for real-time data processing in automotive and cloud data center applications. Moreover, the P8700 Series supports robust system designs with fast data movement capabilities, ensuring optimal performance under varied workload conditions. This processor's configurability is a key advantage, as it allows users to customize it to their specific design requirements while maintaining the benefits of the open RISC-V ecosystem. With its ability to seamlessly integrate with other computing subsystems and accelerators, the P8700 Series empowers developers to construct highly optimized systems that can meet demanding power and capacity constraints across diverse industry applications.
The GateMate FPGA series from Cologne Chip AG is crafted to support a variety of FPGA applications, especially for small to medium-sized enterprises. These field-programmable gate arrays offer first-rate performance in logic capacity, power efficiency, and compatibility with PCB designs. Notably, GateMate FPGAs are recognized for industry-leading low cost, making them accessible for uses ranging from academic initiatives to large production runs. The FPGAs build on an innovative architecture that combines programmable elements with a sophisticated routing mechanism, facilitating efficient multiplier construction. Additionally, memory-intensive applications benefit from block RAM, while the versatile general-purpose I/Os can be configured as single-ended or differential pairs, complemented by a SerDes interface for high-speed communication.
The NoISA Processor by Hotwright is designed to revolutionize how instruction set architecture processors are perceived and utilized. Deviating from conventional ISA processors that rely on a fixed ALU, register file, and hardware controller, the NoISA processor merges these elements into a singular advanced runtime loadable microcoded algorithmic state machine, also known as the Hotstate machine. This machine is programmed via a subset of the C language, allowing unparalleled flexibility in orchestrating data and control operations.

This processor is ideal for scenarios where traditional softcore CPUs exhibit limitations, such as excessive power consumption or inadequate speed. The NoISA processor consumes less energy, making it ideal for edge computing and IoT applications where efficiency is paramount. Additionally, it offers the capability to alter the functionality of an FPGA without altering the device itself, achieved by reloading microcode instead of being confined to fixed instructions.

Moreover, the NoISA processor is equipped for high-performance tasks like managing systolic arrays, showcasing its advantage in applications demanding both speed and adaptability. The flexibility of the NoISA processor means users are not limited by a fixed ISA, allowing them to tweak the processor's performance to attain the highest possible efficiency.
UltraRISC Technology's UR-E Processor Core is engineered for high-efficiency computation, especially suited for edge computing scenarios. It harnesses RISC-V architecture's advantages, ensuring a high degree of efficiency for power-sensitive applications. This core focuses on delivering optimum performance across different application domains, accommodating specific computational requirements with its customizable configuration. Offering compatibility with the RISC-V instruction set, the UR-E core can be tailored to specific needs, thus optimizing the processing capability for edge devices. It's designed to support essential processor core resources and SoC-level IP, thereby enabling robust systems that require efficient processing power. The UR-E core exemplifies UltraRISC's dedication to delivering versatile, high-performance computing solutions.