Porting Guide for ICC Users to DPCPP or ICX

ID 658402
Updated 3/5/2024
Version Latest
Public

This porting guide provides information and suggestions for Intel® C++ Compiler Classic (ICC) users migrating to the new LLVM-based Intel® oneAPI DPC++/C++ Compiler (DPCPP and ICX). A similar guide, Porting Guide for ifort Users to ifx, is available for Fortran users.

Nomenclature

For simplicity and clarity, this document uses the following informal names:

  • ICX/ICPX - Intel® oneAPI DPC++/C++ Compiler
  • ICC/ICPC/ICL Classic - Intel® C++ Compiler Classic

Unless stated otherwise, everything this document says about ICX also applies to ICPX.

Guiding Principles for ICX

The following are the guiding principles for ICX:

  • ICX and ICC Classic use different compiler drivers. The Intel® C++ Compiler Classic compiler drivers are icc, icpc, and icl.  The Intel® oneAPI DPC++/C++ Compiler drivers are icx and icpx. Use icx to compile and link C programs, and icpx for C++ programs.
  • Unlike the icc driver, icx does not use the file extension to determine whether to compile as C or C++. Users must invoke icpx to compile C++ files. In addition to serving as a core C++ compiler, ICX/ICPX also compiles SYCL/DPC++ code when the additional flag -fsycl is passed.

Intel® oneAPI DPC++/C++ Compiler is a new compiler. It has functional and behavioral differences compared to Intel® C++ Compiler Classic, so expect that existing applications built with Intel® C++ Compiler Classic will need some porting.

The transition from Intel® C++ Compiler Classic to Intel® oneAPI DPC++/C++ Compiler should be smooth, but you must port and tune existing applications as part of the process.

Major Changes in Compiler Defaults

The major changes in compiler defaults are listed below:

  • The Intel® oneAPI DPC++/C++ Compiler drivers are icx and icpx.
  • Intel® C++ Compiler Classic uses the icc, icpc, or icl drivers, but this compiler will be deprecated in the upcoming release.
  • DPC++ users can use the icx/icpx driver along with the -fsycl flag, which invokes ICX with SYCL extensions.
  • Unlike Clang*, the ICX default floating-point model was chosen to match ICC behavior: by default it is -fp-model=fast (/fp:fast for Windows*).
  • Macro naming is changing. Check the release notes for future macros to be included in ICX.
  • No diagnostic numbers are listed for remarks, warnings, or notes. Every diagnostic is emitted with the corresponding compiler option to disable it.
  • Compiler intrinsics cannot be automatically recognized without processor targeting options, unlike the behavior in Intel® C++ Compiler Classic. If you use intrinsics, read more on this behavior change later in this document.
  • __try / __except Microsoft* extensions are not supported in combination with the /Qiopenmp option.

Performance

  • Vectorization

    • With ICX 2022.0.0 and later releases, -O2 and -O3 are not sufficient to enable Intel advanced loop optimizations and vectorization. To enable extra levels of loop optimization and vectorization, use the processor targeting option -x or /Qx along with a target architecture, for example -xskylake-avx512. Alternatively, use the -xhost or /Qxhost option to enable all available Intel optimizations and advanced vectorization for the processor of the platform where you compile your code. For more details on vectorization and its implementation, refer to Vectorization in LLVM and GCC for Intel CPUs and GPUs.

Important New Options

Options to Aid Intel Analyzers and Other Profiling Tools

The following options assist analyzers and other profiling tools.

  • -gline-tables-only
    • This option is helpful for profiling tools.
    • It generates line table debug information only.
    • It allows symbolic back traces with inlining information, but does not include any information about variables, their locations, or types.
  • -fdebug-info-for-profiling
    • Adds extra debug information for more accurate profiles.

ICX OpenMP* Options

  • -fiopenmp
    • Compile and recognize OpenMP parallel and SIMD pragmas/directives and clauses and use the Intel OpenMP runtime libraries. 
  • -fopenmp NOT RECOMMENDED
    • Compile and recognize OpenMP parallel and SIMD pragmas/directives and clauses and use the open-source OpenMP runtime. Only use for compatibility testing, not performance. For performance and features use -fiopenmp.
  • -fopenmp-targets=spir64
    • This option is needed when OpenMP 4.5/5.0 TARGET pragmas/directives are used. 
    • OpenMP 4.5 and above TARGET directives are only recognized by the ICX Compiler that is included in the Intel® oneAPI HPC Toolkit. Use the two compiler options above together:
      • icx -fiopenmp -fopenmp-targets=spir64
      • icpx -fiopenmp -fopenmp-targets=spir64
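As a rough illustration of the kind of code these two options apply to, the sketch below marks a SAXPY loop with an OpenMP TARGET construct. The function name saxpy_offload is invented for this example and does not come from the original guide. When the file is built without any OpenMP option, the pragma is ignored and the loop simply runs serially on the host.

```c
#include <stddef.h>

/* Hypothetical helper (an assumption for illustration): y[i] += a * x[i].
 * Built with "icx -fiopenmp -fopenmp-targets=spir64", the loop becomes a
 * candidate for offload. Built without OpenMP options, the pragma is
 * ignored and the loop runs serially on the host. */
void saxpy_offload(float a, const float *x, float *y, int n)
{
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; i++) {
        y[i] += a * x[i];
    }
}
```

Calling saxpy_offload(2.0f, x, y, n) updates y in place. The map clauses state which arrays are copied to the device and which are copied back.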

Compiler Versioning

  • A new versioning macro is defined for icx
    • __INTEL_LLVM_COMPILER 
  • Version String
    The version string for the LLVM-based compiler is new. Intel® oneAPI uses semantic versioning. Intel® oneAPI Toolkit and Component Versioning Schema explains more about the Intel oneAPI versioning schema. 

    An example:
    icx --version
    Intel(R) oneAPI DPC++ Compiler 2023.0.0 (2023.0.0.20221201)


    The format is:

    MAJOR.MINOR.PATCH (MAJOR.MINOR.PATCH.build-string)
    where:
    • MAJOR is the product version. It may not always match the calendar year.
    • MINOR is a single-digit minor version number, incremented as needed for minor releases.
    • PATCH starts at “0” for the initial release and is incremented when a critical patch with specific bug or security fixes is released.
    The build-string is the build date, in the form YYYYMMDD.
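The versioning macro can be used to separate ICX-specific code paths from ICC Classic ones at compile time. The following is a minimal sketch. The function names compiler_family and is_icx are made up for this example, and the sketch assumes that __INTEL_COMPILER identifies ICC Classic.

```c
/* Returns a short label for the compiler that built this file.
 * __INTEL_LLVM_COMPILER is the ICX macro named above; __INTEL_COMPILER
 * is assumed here to identify ICC Classic. The labels are arbitrary. */
const char *compiler_family(void)
{
#if defined(__INTEL_LLVM_COMPILER)
    return "icx";
#elif defined(__INTEL_COMPILER)
    return "icc-classic";
#else
    return "other";
#endif
}

/* Returns 1 when built with ICX/ICPX, 0 otherwise. */
int is_icx(void)
{
#if defined(__INTEL_LLVM_COMPILER)
    return 1;
#else
    return 0;
#endif
}
```

Testing __INTEL_LLVM_COMPILER first keeps the check correct regardless of which other vendor macros a given release defines.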

Important Compiler Options Mapping

The key points about compiler option mapping are listed below:

  • Intel® oneAPI DPC++/C++ Compiler drivers icx and icpx will accept most of Intel® C++ Compiler Classic options or Clang*/LLVM Compiler options.
    • Clang*/LLVM Compiler options are interpreted directly.
    • Classic ICC Compiler options passed to ICX are translated to their Clang*/LLVM equivalents, wherever possible.
  • Not all Intel® C++ Compiler Classic options are accepted and/or implemented in ICX. 
  • Undocumented options from Intel® C++ Compiler Classic are NOT implemented and there are no plans to do so. Remember, this is a very different compiler: the old internal, undocumented Intel® C++ Compiler Classic (ICC) options have no meaning or mapping in the Intel® oneAPI DPC++/C++ Compiler. If there is functionality in an undocumented option that you think you need, submit a bug report through the Online Service Center (OSC) explaining the behavior you expect and how ICX is not providing what you need. “Because ICC accepted this option and it’s in my makefile” is not justification. This is a different compiler with different optimizations and behavior. Try ICX without the option.
  • Intel® C++ Compiler Classic (ICC) options: a diagnostic warning is emitted for ICC Classic options that are CURRENTLY not planned to be implemented in ICX:
    command line warning #10430: Unsupported command line options encountered
    These options as listed are not supported with the compiler selected.
    For more information, use '-qnextgen-diag'.
    • The ICX option -qnextgen-diag causes the ICX compiler to emit a long list of ICC options that are NOT accepted by ICX.
  • ICC options that ARE IMPLEMENTED or will be implemented soon are accepted quietly.
  • All Clang*/LLVM options for the Clang version included in ICX are accepted and implemented. However, it may sometimes be necessary to pass options directly to Clang. To do so, use the following option:
    • -Xclang
    • If the option has arguments, use multiple -Xclang options.  
      For example, to pass -target-feature +aes, use -Xclang -target-feature -Xclang +aes
    • This -Xclang option is for both Linux* and Windows*. 
  • GNU* and Microsoft* compatible options are accepted by ICC and ICX.

Pragma Support

Do NOT assume ICC or GCC pragmas are supported by ICX!

ICC has many proprietary Intel® pragmas. Excluding OpenMP* pragmas, only a subset of these Intel® pragmas is supported in ICX. Check for unsupported pragmas as a first porting step.

You can check for unsupported pragmas using the ICX supported option -Wunknown-pragmas:
icx -Wunknown-pragmas

Consider this example:

cat unknown-pragmas.c 
int main(void) {
float arr[1000]; 

#pragma totallybogus
#pragma simd
#pragma vector
for (int k=0; k<1000; k++) {
	arr[k] = 42.0;
}
}

icx -c -Wunknown-pragmas unknown-pragmas.c
unknown-pragmas.c:4:9: warning: unknown pragma ignored [-Wunknown-pragmas]
#pragma totallybogus
        ^
unknown-pragmas.c:5:9: warning: unknown pragma ignored [-Wunknown-pragmas]
#pragma simd
        ^
2 warnings generated.

Notice two things in this example:

  • “#pragma totallybogus” is a pragma that does not exist in ICC, GCC, or ICX. It makes sense this pragma is called out as a warning. 
  • “#pragma simd” WAS a supported pragma for ICC. This pragma is NOT supported in ICX. ICX will ignore this pragma and will not do what the user expects from ICC (pragma SIMD should be replaced with OpenMP SIMD pragmas). 

In the final case, “#pragma vector” is recognized and implemented by ICX, therefore, there is no warning.
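To make the replacement concrete, here is a small sketch of the same loop using the OpenMP SIMD pragma in place of the ICC-only #pragma simd. The function name fill is invented for this example. As noted later in this guide, OMP SIMD pragmas are recognized at O2 or O3, or when -fopenmp-simd is used. Without OpenMP support the pragma is ignored and the loop still produces the same values.

```c
/* Fill arr[0..n) with value. "#pragma omp simd" is the OpenMP
 * replacement for the ICC-only "#pragma simd". */
void fill(float *arr, int n, float value)
{
    #pragma omp simd
    for (int k = 0; k < n; k++) {
        arr[k] = value;
    }
}
```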

Predefined Macro Support

Macros are being added dynamically.
To see all the macros defined by the compiler, use the -E -dM options, which either create a file with a .ii extension containing the defined values or write the values to stdout.
The following is an example for SYCL/Data Parallel C++, where the compiler creates the file, hello.ii. 

icpx -fsycl -E -dM ./hello.cpp 
more hello.ii 

In case of C/C++, the output will be sent to stdout: 

icpx -E -dM ./hello.cpp 

For any given version of ICX use the below command to output the currently defined macros. 

icx -x c /dev/null -dM -E

Built-In Functions

Clang* Built-In functions are documented in the open source Clang documentation.

Support for Pre-Compiled Header Files (PCH)

ICX supports creation of “relocatable” precompiled headers. These are built with a given path into your build directory, to be used later from an installed location. The --relocatable-pch option enables this feature. For more information, refer to Relocatable PCH Files.

This is a big improvement over ICC, which had limitations with pre-compiled headers. 

ICX uses the Clang method of creating and using Pre-Compiled Headers (PCH). It is a 2-step process:

  1. To create PCH (Linux icx example, similar for Windows):
    icx -x c-header file.h   // creates file.h.gch

     

  2. To use PCH:
    icx -include-pch file.h.gch file.c // uses PCH file when compiling file.c

To create a “relocatable” PCH with an explicitly named PCH file:

icx --relocatable-pch -isysroot /path/to/build /path/to/build/file.h file.h.pch // creates the relocatable PCH file.h.pch

Changes in Diagnostics Options and Diagnostic Message Numbering

The following table lists the ICC compiler diagnostic options and their ICX replacements:

Linux option          Windows option          Replacement
-diag-                /Qdiag                  Not supported, details below
-diag-dump            /Qdiag-dump             Not supported
-diag-enable=power    /Qdiag-enable:power     Not supported but under consideration
-diag-error-limit     /Qdiag-error-limit      -fmax-errors=
-diag-file            /Qdiag-file             -serialize-diagnostics
-diag-file-append     /Qdiag-file-append      Not supported
-diag-id-numbers      /Qdiag-id-numbers       Not supported
-diag-once            /Qdiag-once             Not supported

Neither the -diag- options nor numeric diagnostic message IDs are supported. The Intel® oneAPI DPC++/C++ Compiler, based on LLVM technology, classifies diagnostic messages using descriptive phrases. The Clang manual lists the descriptive phrases that can be used to enable or disable each diagnostic. For more information, refer to Diagnostic flags in Clang.

Equivalent diagnostic control options exist for both the Linux and Windows Compilers. This section uses the Linux options for demonstration. Refer to the relevant Windows option from the table above to migrate the Windows diagnostic control.

For example, consider the following test case, in which the file unknown-pragma.c contains this line:

#pragma unknown_pragma

Compiling with icc gives the following warning message:

icc -c unknown-pragma.c
unknown-pragma.c(1): warning #161: unrecognized #pragma
  #pragma unknown_pragma

The unrecognized pragma diagnostic #161 can be silenced in ICC with the option -diag-disable:161.

However, ICX does not have numbered diagnostic messages. Instead, each warning message prints a hint naming the option that enables or disables that diagnostic. You can use -Wall to enable all warning diagnostics that pertain to your program.

In ICX, the unknown pragma diagnostic is silent by default. To enable it, use -Wunknown-pragmas. To disable it, use -Wno-unknown-pragmas. Incidentally, you can always use the -Wno- prefix to disable any diagnostic.

icx -Wall -c ~/unknown-pragma.c 
unknown-pragma.c:1:9: warning: unknown pragma ignored 
      [-Wunknown-pragmas] 
#pragma unknown_pragma 

To migrate the diagnostic control options of your application from existing ICC projects, you need to build your source with ICX and read through the diagnostic output. Look for the suggested -Wno- options which disable the diagnostics that you do not wish to see and modify your build procedures to use those options. 

To increase the severity of the diagnostic from a warning to an error, use -Werror=unknown-pragmas. This corresponds to the ICC option -diag-error:161. ICX provides no method to decrease the severity of error messages.

About Clang Enhanced Diagnostics

The creators of the Clang compiler have put substantial effort in creating more useful diagnostic messages. You will find that the Clang diagnostics have improved in several ways:

  • Colorized diagnostics to make the diagnostic more readable, clearly distinguishing between program source text and diagnostic text.
  • Precise source location information, including line and column number, along with range highlighting for related text. 
  • Fix-it hints, suggesting how to correct the issue being reported.
  • Enhanced syntax error recovery, so that the issue can be reported exactly, as well as allowing compilation to continue to find further issues and much more.

For more information, refer to Clang’s Expressive Diagnostics.

Linking, IPO, and PGO Changes

The ICX compiler has different methods for Linking, Interprocedural Optimizations (IPO) and Profile Guided Optimizations (PGO). If you are using these features, be aware of the following: 

  • PGO: LLVM supports profile guided optimization with two different kinds of profiling. A sampling profiler can generate a profile with very low runtime overhead, or you can build an instrumented version of the code that collects more detailed profile information. Both kinds of profiles can provide execution counts for instructions in the code, information on branches taken, and function invocation. Profile generation by instrumentation uses -fprofile-instr-generate. A profile generated via -fprofile-instr-generate must be used with -fprofile-instr-use. Similarly, sampling profiles generated by external profilers must be converted and used with -fprofile-sample-use. For more information, refer to Profile Guided Optimization.
  • IPO: LLVM uses Link Time Optimization (LTO) technology, which is termed as “Interprocedural Optimization” (IPO) in ICC. 
  • For more information on LLVM LTO, refer to LLVM Link Time Optimization: Design and Implementation
  • In your Makefiles or Project Settings if you have used ‘xilink’ or ‘xild’, replace these with the equivalent native linkers. Similarly replace ‘xiar’ with ‘ar’ or similar archive tool to enable this change. 
  • Uninitialized global variables are placed in the .bss (block started by symbol) section by default.

In ICX, uninitialized global variables are placed in .bss. If symbols are in .bss, the linker does not allow more than one definition.

In contrast, ICC, GCC versions before 10, and older Clang versions place uninitialized global variables in the common section. If symbols are in the common section, the linker allows multiple definitions.
For example:

Here, a structure my_struct is defined in test.h, and test.h is included in both test1.c and test2.c.

$ cat test.h
struct my_struct
{
    char my_structid[8];
    int  temp2[6];
};
#if _EXTERN
extern struct my_struct my_struct;
#else
struct my_struct my_struct;
#endif

$ cat test1.c
#include "test.h"

$ cat test2.c
#include "test.h"
int main()
{
    return 0;
}

With ICC, GCC versions before 10, and older Clang versions, this compiles without error, but with ICX the link step fails with the following error:

/usr/bin/ld: /tmp/test2-e6091a.o:(.bss+0x0): multiple definition of `my_struct'; /tmp/test1-b4c494.o:(.bss+0x0): first defined here 
clang: error: linker command failed with exit code 1 (use -v to see invocation) 

There are two ways to address this: 

  • Compile with the -fcommon flag, which tells icx to place these symbols in the common section instead of .bss.
    $ icx test1.c test2.c -fcommon
  • Declare the global variable as extern.
    $ icx test1.c test2.c -D_EXTERN
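A variation on the second approach, common in C projects though not part of the original example, is to keep only the extern declaration in the header and place a single definition in exactly one source file. The sketch below condenses both parts into one listing for brevity. The accessor first_temp is a hypothetical helper added only so the behavior can be observed.

```c
/* --- what would live in test.h --- */
struct my_struct {
    char my_structid[8];
    int  temp2[6];
};
extern struct my_struct my_struct;   /* declaration only: reserves no storage */

/* --- what would live in exactly one .c file --- */
struct my_struct my_struct;          /* the single definition, zero-initialized */

/* Hypothetical accessor so the sketch has observable behavior */
int first_temp(void)
{
    return my_struct.temp2[0];
}
```

With this layout every translation unit that includes the header sees the same declaration, and the linker finds exactly one definition whether or not -fcommon is in effect.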

Language Features

Intel® Cilk™ Plus is not supported in the ICX compiler. Customers are expected to port their programs from Intel® Cilk™ Plus to OpenMP* or Intel® TBB.

For more information, refer to Migrate your application to use OpenMP or Intel(R) TBB instead of Intel(R) Cilk(TM) Plus. This includes #pragma simd, which appears in many ICC-tuned codes from that era. Replace #pragma simd with OpenMP* SIMD pragmas. “OMP SIMD” pragmas are recognized at O2 or O3, or if the -fopenmp-simd option is used.

Intrinsic Usage Model Change

  • ICX does type checking for arguments to intrinsics when inlining whereas ICC does not. Therefore, you may see warnings or errors from ICX about arguments to intrinsics that did not appear in ICC.  
  • ICX requires the immintrin.h header file. ICC does not require it as long as the __INTEL_COMPILER_USE_INTRINSIC_PROTOTYPES macro is defined.
  • Unlike ICC, ICX requires a processor- or architecture-specific compiler option to be enabled before the corresponding intrinsics can be used.
  • To use intrinsic-based code with ICX, follow these instructions:
    • Use the compiler option -march, -m, or -x so that the compiler recognizes the processor- or architecture-specific intrinsics.
    • ICC compatibility with respect to intrinsics is under evaluation; check the Release Notes for the latest updates.
    • Include the immintrin.h header file, which provides the intrinsic declarations.

Example of LLVM Intrinsics Handling Differences

The example below illustrates the type-checking difference: ICX checks the arguments to the intrinsic, whereas ICC lets the argument error pass.

cat sample_mm_prefetch.c 
#include <immintrin.h>
 
#define CACHE_LINE_SIZE 64
 
__attribute__((always_inline))
inline void Prefetch_Block(const void* addr, size_t sz, int hint)
{
    char* pref_addr = (char*)addr;
    size_t pref_iters = (sz + CACHE_LINE_SIZE - 1) / CACHE_LINE_SIZE;
 
    for (int i = 0; i < pref_iters; i++)
    {
        _mm_prefetch(pref_addr, hint /*_MM_HINT_T1*/);
        pref_addr += CACHE_LINE_SIZE;
    }
}

$ icc -c sample_mm_prefetch.c 
 
$ icx -c sample_mm_prefetch.c
sample_mm_prefetch.c:13:9: error: argument to '__builtin_prefetch' must be a constant integer
        _mm_prefetch(pref_addr, hint /*_MM_HINT_T1*/);
        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/nfs/pdx/disks/cts2/tools/compiler/cpro/Compiler/19.1/initial/compilers_and_libraries_2020.0.166/linux/lib/clang/10.0.0/include/xmmintrin.h:2103:31: note: 
      expanded from macro '_mm_prefetch'
#define _mm_prefetch(a, sel) (__builtin_prefetch((void *)(a), \
                              ^
1 error generated.
compilation aborted for sample_mm_prefetch.c (code 1)

In this case, the hint argument to _mm_prefetch must be a compile-time constant. Although the documentation for the _mm_prefetch intrinsic does not specify this, the intrinsic is defined for a constant argument.

Note that ICC did not perform this type checking, whereas ICX did.

The example below demonstrates the change in behavior for ICX, where enabling a processor- or architecture-specific compiler option is mandatory.

The ISA named in these error diagnostics is currently incorrect. This has been reported to the open source community for fixing.
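One possible way to port the prefetch helper, sketched here under the assumption that a fixed hint is acceptable for the caller, is to pass the hint as a literal constant rather than as a runtime parameter. The name prefetch_block_t1 and the returned count are additions for this illustration only.

```c
#include <stddef.h>
#include <immintrin.h>

#define CACHE_LINE_SIZE 64

/* Prefetch every cache line in [addr, addr + sz) with a fixed T1 hint.
 * Returns the number of prefetches issued (added so the result can be
 * checked; the original helper returned nothing). */
size_t prefetch_block_t1(const void *addr, size_t sz)
{
    const char *pref_addr = (const char *)addr;
    size_t pref_iters = (sz + CACHE_LINE_SIZE - 1) / CACHE_LINE_SIZE;

    for (size_t i = 0; i < pref_iters; i++) {
        /* The hint is a literal constant, so the expansion to
         * __builtin_prefetch receives a constant integer. */
        _mm_prefetch(pref_addr, _MM_HINT_T1);
        pref_addr += CACHE_LINE_SIZE;
    }
    return pref_iters;
}
```

If several hints are needed, one small wrapper per hint value keeps each call site constant.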

$ cat intrinsic.cpp
#include<iostream>
#include<immintrin.h>  //ICX needs the include
using namespace std;
void add_sse(float *a, int N){
        __m128 x, y;
        y = _mm_set_ps1(1.f);
        for(int i = 0; i < N/4; i++)
        {
                x = _mm_load_ps(a);
                x = _mm_add_ps(x, y);
                _mm_store_ps(a, x);
                a+=4;
        }
}
void add_avx(float *a, int N){
        __m256 x, y;
        y = _mm256_set_ps(1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f);
        for(int i = 0; i < N/8; i++)
        {
                x = _mm256_load_ps(a);
                x = _mm256_add_ps(x, y);
                _mm256_store_ps(a, x);
                a+=8;
        }
}
void add_avx512(float *a, int N){
        __m512 x, y;
        y = _mm512_set_ps(1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f);
        for(int i = 0; i < N/16; i++)
        {
                x = _mm512_load_ps(a);
                x = _mm512_add_ps(x, y);
                _mm512_store_ps(a, x);
                a+=16;
        }
}
int main(){
        float a[32];
        for(int i = 0; i < 32; i++)
                a[i] = i;
        #ifdef SSE
                add_sse(a,32);
        #elif AVX
                add_avx(a,32);
        #else
                add_avx512(a,32);
        #endif
        std::cout<<"a[15] = "<<a[15]<<"\n";
        return 0;
}

The above code compiles with ICC but not with ICX. Here is what happens with ICX:

$ icpx intrinsic.cpp -DSSE
intrinsic.cpp:19:13: error: always_inline function '_mm256_set_ps' requires target feature 'sse4.2', but would be inlined into function 'add_avx' that is compiled without support for 'sse4.2'
        y = _mm256_set_ps(1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f);
            ^
intrinsic.cpp:22:21: error: always_inline function '_mm256_load_ps' requires target feature 'sse4.2', but would be inlined into function 'add_avx' that is compiled without support for 'sse4.2'
                x = _mm256_load_ps(a);
                    ^
intrinsic.cpp:23:21: error: always_inline function '_mm256_add_ps' requires target feature 'sse4.2', but would be inlined into function 'add_avx' that is compiled without support for 'sse4.2'
                x = _mm256_add_ps(x, y);
                    ^
intrinsic.cpp:24:17: error: always_inline function '_mm256_store_ps' requires target feature 'sse4.2', but would be inlined into function 'add_avx' that is compiled without support for 'sse4.2'
                _mm256_store_ps(a, x);
4 errors generated.
compilation aborted for intrinsic.cpp (code 1)

If -mavx is used to enable the Intel® AVX ISA, errors are reported for the AVX-512 intrinsics.

$ icpx intrinsic.cpp -DSSE -mavx
intrinsic.cpp:31:13: error: always_inline function '_mm512_set_ps' requires target feature 'avx2', but would be inlined into function 'add_avx512' that is compiled without support for 'avx2'
        y = _mm512_set_ps(1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f, 1.f);
            ^
intrinsic.cpp:34:21: error: always_inline function '_mm512_load_ps' requires target feature 'avx2', but would be inlined into function 'add_avx512' that is compiled without support for 'avx2'
                x = _mm512_load_ps(a);
                    ^
intrinsic.cpp:35:21: error: always_inline function '_mm512_add_ps' requires target feature 'avx2', but would be inlined into function 'add_avx512' that is compiled without support for 'avx2'
                x = _mm512_add_ps(x, y);
                    ^
intrinsic.cpp:36:17: error: always_inline function '_mm512_store_ps' requires target feature 'avx2', but would be inlined into function 'add_avx512' that is compiled without support for 'avx2'
                _mm512_store_ps(a, x);
4 errors generated.
compilation aborted for intrinsic.cpp (code 1)

Enable the Intel® AVX-512 ISA using -march=skylake-avx512 compiler option to resolve the error.

Intrinsics Via Function Definition __attribute__((target()))

In the above example we used a compiler option to target a specific instruction set (-march=skylake-avx512). This works if only one instruction set is used in the source file. Often, however, source files contain intrinsic data declarations and intrinsic instructions from multiple instruction sets, so that specific functions or code sections can be called based on runtime processor discovery. Typically, these functions or code sections are protected by #ifdef blocks for specific target architectures, and the user code dispatches to them based on the processor.

The Clang/LLVM community highly encourages users to mark function definitions that contain intrinsics with the gcc-style target attribute:

__attribute__((target(<required target>))) 

This marks the functions as intended for execution on specific target architectures instead of relying on the default processor targeting. Use of this attribute provides significantly better compile-time error checking. It requires putting the code for each target architecture into separate functions and applying the target attribute to each function definition. The attribute also documents the intrinsics level for the function and the set of intrinsics that should be allowed within it. For more information on attribute target and gcc-style function multi-versioning, refer to:

  • Attribute target: Clang attribute target
  • Function Multiversioning: GCC-style multiversioning with an example.

An example of Multi-versioning:

#include <stdio.h>

__attribute__ ((target("avx2")))
void dispatch_func() {
  printf("\nCode for Intel Core processors supporting Intel AVX2 goes here\n");
}
 
__attribute__ ((target("sse4.2")))
void dispatch_func() {
  printf("\nCode for Intel Core processors supporting SSE4.2 goes here\n");
}
 
__attribute__ ((target("ssse3")))
void dispatch_func() {
  printf("\nCode for Intel Core 2 Duo processors supporting SSSE3 goes here\n");
}

__attribute__ ((target("default")))
void dispatch_func() {
  printf("\nCode for default implementation goes here\n");
}
 
int main() {
  dispatch_func();
  printf("Return from dispatch_func\n");
  return 0;
}
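The target attribute can also be combined with hand-written dispatch when the functions contain intrinsics. The following sketch is not from the original guide: the function names are invented, and it assumes that the GCC/Clang builtin __builtin_cpu_supports is available in ICX as it is in upstream Clang. Only the AVX function is compiled with AVX enabled, so no -mavx or -x option is needed for the whole file.

```c
#include <immintrin.h>

/* Scalar fallback: valid on any x86 processor. */
static void add_one_scalar(float *a, int n)
{
    for (int i = 0; i < n; i++) {
        a[i] += 1.0f;
    }
}

/* Only this function is compiled with AVX enabled, so the AVX
 * intrinsics inside it are accepted without a file-wide option. */
__attribute__((target("avx")))
static void add_one_avx(float *a, int n)
{
    const __m256 one = _mm256_set1_ps(1.0f);
    int i = 0;
    for (; i + 8 <= n; i += 8) {
        __m256 x = _mm256_loadu_ps(a + i);            /* unaligned load */
        _mm256_storeu_ps(a + i, _mm256_add_ps(x, one));
    }
    for (; i < n; i++) {                              /* remainder elements */
        a[i] += 1.0f;
    }
}

/* Hand-written dispatch. __builtin_cpu_supports is a GCC/Clang builtin
 * and is assumed here to behave the same way under ICX. */
void add_one(float *a, int n)
{
    if (__builtin_cpu_supports("avx")) {
        add_one_avx(a, n);
    } else {
        add_one_scalar(a, n);
    }
}
```

The unaligned load and store intrinsics are used so that the caller does not have to guarantee 32-byte alignment of the array.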

 

Legacy Intrinsics Promotion with Option intrinsic-promote

For legacy applications with ICC-style intrinsics, the ICX compiler provides a new option. Its use is not recommended because it is error-prone. This option attempts to automatically promote functions containing intrinsics to the maximum target architecture of the intrinsics inside that function.

A function containing sections with differing targets, for example user-written processor dispatch code, can cause runtime faults.

Therefore, we do not encourage use of this option. We are working on better long-term solutions for legacy ICC intrinsics behavior.

Windows* syntax: /Qintrinsic-promote

Linux* syntax: -mintrinsic-promote

If this option is used, functions containing calls to intrinsics that require a specific CPU feature will have their target architecture automatically promoted to an architecture that allows the required feature. All code within the function will be compiled with that target architecture, and the resulting code for such functions will not execute correctly on processors that do not support the required feature. The user is responsible for guarding the execution path at run time so that such functions are not dynamically reachable when the program is run on processors that do not support the required feature.

This option is provided as a convenience for compiling legacy code. Use __attribute__((target())) to mark functions that are intended to be executed on specific target architectures instead. Use of this attribute will provide significantly better compile time error checking. 

Intel Proprietary Processor Targeting Pragmas and Functions Support

  • Intel proprietary pragmas “optimization_parameter *”:
    #pragma [intel] optimization_parameter target_arch=
    #pragma [intel] optimization_parameter inline-max-total-size=n
    #pragma [intel] optimization_parameter inline-max-per-routine=n

    These pragmas are not supported in ICX. Replace them with __attribute__((target())) as described earlier in this document.
  • The Intel proprietary intrinsic function _may_i_use_cpu_feature() is supported and may be used.
  • The Intel proprietary intrinsic function _allow_cpu_features() may be used in ICX as a function attribute, like __attribute__((allow_cpu_features()))

Floating Point Reproducibility Controls

The following is the current state of the floating point (FP) model support in ICX:

  • The default FP model is -fp-model=fast -fma to match the behavior of ICX with ICC.
  • -fp-model precise is supported. 
  • -fp-model consistent is not supported. As a workaround, -fp-model=precise -fimf-arch-consistency=true -no-fma achieves the same result.
  • -fp-model=strict tells the compiler to strictly adhere to value-safe optimizations.
  • -fp-speculation=safe/fast/strict/off are all supported.
  • There is no support for #pragma fenv_access.
  • The math library related features in ICC are currently being ported to ICX. We have implemented the IMF (Intel® Math Library) attributes in ICX. 
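To illustrate why the choice of FP model can matter, the sketch below shows Kahan (compensated) summation, a well-known algorithm whose correction term is algebraically zero. It is an illustrative example rather than something from the original guide. Under a value-unsafe model such as the default -fp-model=fast, the compiler is permitted to reassociate floating-point operations and may simplify the correction away, while -fp-model=precise is expected to preserve it. The function name kahan_sum is chosen for this example.

```c
/* Kahan (compensated) summation. The algorithm relies on the compiler
 * NOT reassociating floating-point operations. */
float kahan_sum(const float *v, int n)
{
    float sum = 0.0f;
    float c = 0.0f;            /* running compensation for lost low-order bits */
    for (int i = 0; i < n; i++) {
        float y = v[i] - c;
        float t = sum + y;
        c = (t - sum) - y;     /* algebraically zero; a fast FP model may fold it away */
        sum = t;
    }
    return sum;
}
```

Code of this kind is a reasonable candidate for building with -fp-model=precise when porting, then comparing results against the ICC build.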

Brutus or Bisectional Optimization Support

If you are unfamiliar with “Brutus” in ICC or bisectional optimizations in Clang/LLVM, you may skip this section.

ICX supports the Clang/LLVM option -opt-bisect-limit=N for bisectional optimization debugging. This is similar to the Brutus option in ICC. A community effort is underway to enhance optimization debugging capabilities in Clang/LLVM through the open source community.

For more information, refer to opt-bisect-limit.

Appendix: References

Useful references include

Intel® oneAPI DPC++/C++ Compiler Developer Guide and Reference

Intel oneAPI Programming Guide

C++20 Features Supported by Intel® C++ Compilers

C++23 Features Supported by Intel® C++ Compilers

C23 Features Supported by Intel® C++ Compilers

OpenMP* Features and Extensions Supported in Intel® oneAPI DPC++/C++ Compiler (icx)

SYCL* 2020 Specification Features and DPC++ Language Extensions Supported in Intel® oneAPI DPC++/C++ Compiler (dpcpp)

Intel® oneAPI DPC++/C++ Compiler Release Notes

Porting Guide for ifort Users to ifx