Choose a High-Performance FFT in oneMKL or Intel® IPP

ID 659937
Updated 7/2/2024
Version Latest
Public

Note: This content applies to Intel® oneAPI Math Kernel Library (oneMKL) 2018.0 or later and Intel® Integrated Performance Primitives (Intel® IPP) 2018.0 or later.

Objective

Get information to help you decide whether the fast Fourier transform (FFT) implementation in oneMKL or in Intel IPP is better suited for your application.

Overview

Fourier transforms are used in signal processing, image processing, physics, statistics, finance, cryptography, and many other areas. The discrete Fourier transform (DFT) converts a signal from the time domain to the frequency domain, and the inverse DFT converts it back.

DFT processing time can dominate a software application. Using an FFT, a fast algorithm for computing the DFT, reduces the number of arithmetic operations from O(N^2) to O(N log₂ N). FFTs in oneMKL and Intel IPP are highly optimized for Intel® architecture-based multicore processors using the latest instruction sets, parallelism, and algorithms.
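The gap between the two operation counts can be illustrated with a small, library-independent sketch. The code below is plain Python and does not call oneMKL or Intel IPP; the function names `naive_dft` and `fft_radix2` are hypothetical and chosen only for this illustration. It evaluates the DFT definition directly (quadratic cost) and with a textbook recursive radix-2 Cooley-Tukey FFT, then checks that both give the same spectrum.

```python
import cmath


def naive_dft(x):
    """Direct evaluation of the DFT definition: N outputs, N terms each."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of two."""
    n = len(x)
    if n < 1 or (n & (n - 1)) != 0:
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft_radix2(x[0::2])  # transform of even-indexed samples
    odd = fft_radix2(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


signal = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, -2.0, 1.5]
slow = naive_dft(signal)
fast = fft_radix2(signal)
max_diff = max(abs(a - b) for a, b in zip(slow, fast))
print("largest difference between the two results:", max_diff)
```

Production libraries use far more elaborate kernels than this sketch, but the divide-and-conquer structure is the source of the N log N behavior.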

This article provides guidance for selecting the best FFT for your application. For summaries of the oneMKL and Intel IPP libraries, see Table 1. For details, see the oneMKL website and the Intel IPP website.

Table 1. Comparison of oneMKL and Intel IPP Functionality

Target Applications

oneMKL: Mathematical computation for engineering, scientific, and financial applications

Intel IPP: Imaging, vision, signal, security, and storage applications

Library Structure

oneMKL:

  • BLAS
  • Sparse BLAS
  • LAPACK
  • ScaLAPACK
  • FFT
  • Vector math
  • Vector statistics
  • Random-number generators
  • Partial differential equations
  • Optimization solvers
  • Sparse solvers
  • Deep neural network [1]

Intel IPP:

  • Signal processing
  • Image processing
  • Cryptography
  • Data compression

Linkage Models

oneMKL:

  • Static
  • Dynamic
  • Custom dynamic

Intel IPP:

  • Static
  • Dynamic
  • Custom dynamic

Operating Systems

oneMKL:

  • Windows*
  • Linux*
  • macOS*

Intel IPP:

  • Windows*
  • Linux*
  • Android* [2]
  • macOS*

Processor Support

oneMKL: IA-32 and Intel® 64 architecture-based platforms and compatible platforms

Intel IPP: IA-32 and Intel® 64 architecture-based platforms and compatible platforms

Both libraries contain generic code that is optimized for processors with Intel® Streaming SIMD Extensions (Intel® SSE), as well as code optimized for processors with the Intel SSE2, Intel SSE3, Intel SSE4.1, Intel SSE4.2, Intel® Advanced Vector Extensions, Intel® Advanced Vector Extensions 2, and Intel® Advanced Vector Extensions 512 instruction sets.

Notes for Table 1:

  1. Deep Neural Network has not been a part of the package since 2020.
  2. Android has not been supported since 2020.

FFT Features in oneMKL and Intel IPP

The FFTs in the two libraries target different kinds of applications:

  • oneMKL: engineering and scientific applications
  • Intel IPP: media and communication applications

To help you decide which FFT is best for your application, see Table 2.

Table 2. Comparison of oneMKL and Intel IPP DFT Features

API

oneMKL:

  • DFT
  • Cluster FFT
  • FFTW 2.x and 3.x

Intel IPP:

  • FFT
  • DFT

Interfaces

oneMKL:

  • C, Fortran, and DPC++ APIs [3]
  • LP64 (64-bit long and pointer)
  • ILP64 (64-bit int, long, and pointer)

Intel IPP:

  • C
  • LP64 only

Dimensions

oneMKL: 1-D up to 7-D

Intel IPP:

  • 1-D (signal processing)
  • 2-D (image processing)

Transform Sizes

oneMKL:

  • 32-bit platforms: maximum size is 2^31-1
  • 64-bit platforms: maximum size is 2^64

Intel IPP:

  • FFT: powers of 2 only [4]
  • DFT: maximum size is 2^32 [4]

Mixed Radix Support

oneMKL: 2, 3, 5, 7, 11, 13, and several larger kernels [5]

Intel IPP: DFT with 2, 3, 5, 7, 11, and 13 kernels [5]

Data Types (see Table 3 for details)

oneMKL:

  • Real and complex
  • Single- and double-precision

Intel IPP:

  • Real and complex
  • Single- and double-precision

Scaling

oneMKL: Transforms can be scaled by an arbitrary floating-point number (with the same precision as the input data)

Intel IPP: Integer ("fixed") scaling:

  • Forward 1/N
  • Inverse 1/N
  • Forward + Inverse SQRT(1/N)

Threading

oneMKL: Platform dependent

  • IA-32: All (except 1D when performing a single transform whose size is not a power of two)
  • Intel 64: All (except for in-place power of two)

Intel IPP: 1D and 2D [6]

Notes for Table 2:

  3. DPC++ APIs have been available since version 2021.
  4. The maximum size limits are:
     • For double-precision complex DFT (64fc), the length upper bound is 67108863 (2^26 - 1).
     • For single-precision complex DFT (32fc), the length upper bound is 134217727 (2^27 - 1).
     • For double-precision complex FFT (64fc), the length upper bound is 2^27.
     • For single-precision complex FFT (32fc), the length upper bound is 2^28.
  5. Both libraries support arbitrary radices in an optimized manner, that is, in O(N*log(N)) operations, but these specific radices are better optimized than others.
  6. Since Intel IPP 2021, only the nonthreaded version is available.
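When choosing a transform length, it can help to check how a candidate size relates to the radices and limits listed for Table 2. The following is a minimal sketch in plain Python, not a oneMKL or Intel IPP call. The helper names (`split_by_optimized_radices`, `is_power_of_two`, `fits_ipp_complex_dft`) are hypothetical, and the limits are copied from the table notes rather than queried from either library.

```python
# Radices that Table 2 names as having specifically optimized kernels.
OPTIMIZED_RADICES = (2, 3, 5, 7, 11, 13)

# Upper bounds for Intel IPP complex DFT lengths, copied from the Table 2 notes.
IPP_COMPLEX_DFT_MAX_LEN = {"64fc": 2**26 - 1, "32fc": 2**27 - 1}


def split_by_optimized_radices(n):
    """Return (factors, remainder): the factors of n drawn from
    OPTIMIZED_RADICES, and the part of n that those radices do not cover."""
    if n < 1:
        raise ValueError("length must be positive")
    factors = []
    for radix in OPTIMIZED_RADICES:
        while n % radix == 0:
            factors.append(radix)
            n //= radix
    return factors, n


def is_power_of_two(n):
    """True when n qualifies for a powers-of-two-only FFT interface."""
    return n >= 1 and (n & (n - 1)) == 0


def fits_ipp_complex_dft(n, kind):
    """Check n against the documented complex DFT bound ('64fc' or '32fc')."""
    return 1 <= n <= IPP_COMPLEX_DFT_MAX_LEN[kind]


for length in (1024, 1000, 34):
    factors, remainder = split_by_optimized_radices(length)
    print(length, factors, remainder, is_power_of_two(length))
```

A remainder of 1 means the length factors entirely into the listed radices. Any other remainder still transforms correctly according to the notes above, but through less specialized code paths.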

Data Types and Formats

The FFTs in oneMKL and Intel IPP support a variety of data types and formats for storing signal values. Mixed-type interfaces are also supported. For details, see the product documentation.

Table 3. Comparison of oneMKL and Intel IPP Data Types and Formats

Real FFTs

Precision

oneMKL:

  • Single
  • Double

Intel IPP:

  • Single
  • Double

1D Data Types

oneMKL: Real for all dimensions

Intel IPP: Real for all dimensions

2D Data Types

oneMKL: Real for all dimensions

Intel IPP: Real for all dimensions

1D Packed Formats

oneMKL:

  • CCS
  • Pack
  • Perm
  • CCE

Intel IPP:

  • CCS
  • Pack
  • Perm

2D Packed Formats

oneMKL:

  • CCS
  • Pack
  • Perm
  • CCE

Intel IPP: RCPack2D

3D Packed Formats

oneMKL: CCE

Intel IPP: n/a

Format Conversion Functions

oneMKL: n/a

Intel IPP: n/a

Complex FFTs

Precision

oneMKL:

  • Single
  • Double

Intel IPP:

  • Single
  • Double

1D Data Types

oneMKL: Complex for all dimensions

Intel IPP: Complex for all dimensions

2D Data Types

oneMKL: Complex for all dimensions

Intel IPP: Complex for all dimensions

 

Formats Legend
 

  • CCE: Stores the values of the first half of the output complex conjugate-even signal.
  • CCS: Same as the CCE format for 1D transforms. It differs slightly for 2D real transforms. CCS, Pack, and Perm are not supported for 3D and higher ranks.
  • Pack: Compact representation of a complex conjugate-symmetric sequence.
  • Perm: Same as the Pack format for odd lengths, and an arbitrary permutation of the Pack format for even lengths.
  • RCPack2D: Exploits the complex conjugate symmetry of the transformed data to store only half of the resulting Fourier coefficients.
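The idea behind these packed formats can be shown with a short, library-independent sketch. Because the DFT of a real signal is conjugate-even, X[N-k] = conj(X[k]), only bins 0 through N/2 need to be stored. The Python code below does not call oneMKL or Intel IPP, and all function names are hypothetical. The Pack ordering used in `to_pack` (R0, R1, I1, ..., with the real middle bin last for even N) is an assumption about the usual convention; confirm the exact layout in the product documentation before relying on it.

```python
import cmath


def dft(x):
    """Direct DFT, used here only to produce a spectrum to pack."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def to_cce(spectrum):
    """Keep bins 0..N//2, the non-redundant half of a conjugate-even spectrum."""
    return spectrum[: len(spectrum) // 2 + 1]


def from_cce(half, n):
    """Rebuild all n bins from the stored half using X[n-k] = conj(X[k])."""
    full = list(half)
    for k in range(len(half), n):
        full.append(half[n - k].conjugate())
    return full


def to_pack(half, n):
    """Assumed Pack-style layout: n real numbers, dropping the imaginary
    parts that are always zero (bin 0, and bin n/2 when n is even)."""
    packed = [half[0].real]
    for k in range(1, (n - 1) // 2 + 1):
        packed += [half[k].real, half[k].imag]
    if n % 2 == 0:
        packed.append(half[n // 2].real)
    return packed


signal = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, -2.0, 1.5]
n = len(signal)
full = dft(signal)
half = to_cce(full)
rebuilt = from_cce(half, n)
packed = to_pack(half, n)
max_diff = max(abs(a - b) for a, b in zip(full, rebuilt))
print(len(full), "complex bins ->", len(half), "stored bins or", len(packed), "reals")
```

The stored half holds N/2 + 1 complex values, while the Pack-style layout fits the same information into exactly N real values, which is what allows an in-place real transform.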

Performance

oneMKL and Intel IPP are optimized for current and future Intel processors, and each is tuned for a different area:
 

  • oneMKL is suitable for the large problem sizes typical of Fortran and C/C++ high-performance computing software such as engineering, scientific, and financial applications.
  • Intel IPP is designed for smaller problem sizes including those used in multimedia, data processing, communications, and embedded C/C++ applications.

Choosing the Best FFT for Your Application

Before making a decision, you must understand the specific requirements and constraints of the application. Consider these questions:
 

  • What are the performance requirements for the application? How is performance measured? What are the measurement criteria? Is a specific benchmark used? What are the known performance bottlenecks?
  • What type of application is being developed? What are the main operations being performed and on what kind of data?
  • What API is currently being used in the application for transforms? What programming languages is the application code written in?
  • Does the FFT output data need to be scaled (normalized)? What type of scaling is required?
  • What kind of input and output data does the transform process? What are the valid and invalid values? What type of precision is required?
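The scaling question matters because an unscaled forward transform followed by an unscaled inverse transform returns the input multiplied by N. The sketch below is plain Python, not oneMKL or Intel IPP code, and its function names are hypothetical. It compares the three normalization conventions listed under Scaling in Table 2 and shows that each one restores the original signal on a round trip.

```python
import cmath
import math


def dft(x, sign, scale):
    """Scaled DFT; sign=-1 gives the forward transform, sign=+1 the inverse."""
    n = len(x)
    return [
        scale * sum(x[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def round_trip(x, forward_scale, backward_scale):
    """Forward transform followed by inverse transform."""
    spectrum = dft(x, -1, forward_scale)
    return dft(spectrum, +1, backward_scale)


x = [1.0, -2.0, 3.5, 0.0, 4.0, -1.5]
n = len(x)

# (forward scale, backward scale) for each convention; the product is 1/N.
conventions = {
    "forward 1/N": (1.0 / n, 1.0),
    "inverse 1/N": (1.0, 1.0 / n),
    "forward + inverse sqrt(1/N)": (1.0 / math.sqrt(n), 1.0 / math.sqrt(n)),
}

errors = {
    name: max(abs(a - b) for a, b in zip(x, round_trip(x, f, b)))
    for name, (f, b) in conventions.items()
}

# With no scaling at all, the round trip returns N times the input.
unscaled = round_trip(x, 1.0, 1.0)

for name, err in errors.items():
    print(name, "round-trip error:", err)
```

Whichever convention the application expects, the forward and inverse scale factors must multiply to 1/N, and code that mixes libraries needs to agree on where that factor is applied.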

Summary

oneMKL and Intel IPP both provide optimized FFT functions. For more detailed information on the FFT APIs, parameters, and formats, see the oneMKL and Intel IPP product documentation.
 
