The UltraLong FFT handles transform lengths that exceed the on-chip memory of an FPGA or ASIC device. When a length-N transform no longer fits in internal memory, the algorithm factors it into shorter length-N1 and length-N2 FFTs, where N = N1 × N2. The decomposition requires three transpose operations in external memory and a rotation (twiddle-factor multiplication) stage between the two FFT passes.
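The decomposition described above matches the textbook "six-step" FFT, which the following minimal Python sketch illustrates. It is not the core's actual implementation: the function names (`dft`, `transpose`, `six_step_fft`), the row-major memory layout, and the choice of which pass uses N1 and which uses N2 are assumptions made for illustration, and a naive DFT stands in for the hardware FFT cores.

```python
import cmath

def dft(row):
    """Naive O(n^2) DFT; a stand-in for a short hardware FFT core."""
    n = len(row)
    return [sum(row[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]

def transpose(m):
    """Matrix transpose; in hardware this is the external-memory step."""
    return [list(col) for col in zip(*m)]

def six_step_fft(x, n1, n2):
    """Length n1*n2 DFT built from length-n2 and length-n1 transforms.

    Assumed layout: the input is stored row-major as n2 rows of n1 samples.
    """
    n = n1 * n2
    assert len(x) == n
    a = [x[r * n1:(r + 1) * n1] for r in range(n2)]
    b = transpose(a)                       # transpose 1: n1 rows of length n2
    c = [dft(row) for row in b]            # first pass: n1 transforms of length n2
    # Rotation stage: multiply element (i, k) by the twiddle factor W_N^(i*k).
    c = [[c[i][k] * cmath.exp(-2j * cmath.pi * i * k / n) for k in range(n2)]
         for i in range(n1)]
    d = transpose(c)                       # transpose 2: n2 rows of length n1
    e = [dft(row) for row in d]            # second pass: n2 transforms of length n1
    f = transpose(e)                       # transpose 3: restores natural output order
    return [v for row in f for v in row]
```

Calling `six_step_fft(x, 3, 4)` on a 12-sample input should agree with a direct 12-point DFT to within floating-point error. The three `transpose` calls are the only steps that touch the full data set at once, which is why they are the ones placed in external memory.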
To sustain continuous data throughput, the design uses separate memory banks and separate FFT cores for the N1 and N2 transforms. Many design configurations are possible: memory banks and FFT cores can be shared between stages, so the design can be matched to the required performance and can conserve logic resources where practical.
Performance is determined primarily by the bandwidth of the external memory. QDR SRAM offers the highest throughput, while DDR SDRAM, with its greater capacity, supports longer FFT lengths. Each UltraLong FFT core is configured for the available memory architecture to make the best use of that bandwidth in data-intensive applications.
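To see why memory bandwidth dominates, a rough upper bound on the sustained sample rate can be sketched as follows. This is a back-of-the-envelope model, not a vendor figure: the function name `max_sample_rate`, the assumption that each transpose writes every sample to external memory once and reads it back once, and the example numbers are all hypothetical.

```python
def max_sample_rate(bandwidth_bytes_per_s, bytes_per_sample, transposes=3):
    """Upper bound on sustained complex samples per second.

    Assumes every transpose costs one write and one read of each sample,
    and that bandwidth_bytes_per_s is the aggregate over all memory banks.
    """
    accesses_per_sample = 2 * transposes
    return bandwidth_bytes_per_s / (accesses_per_sample * bytes_per_sample)

# Hypothetical example: 4.8 GB/s aggregate bandwidth, 8-byte complex samples.
rate = max_sample_rate(4.8e9, 8)
print(f"{rate / 1e6:.0f} Msamples/s")
```

Under these assumptions, doubling the external memory bandwidth doubles the bound, while wider samples lower it proportionally.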