Intel® oneAPI DPC++ Library (oneDPL) 2021 Release Notes

ID 763036
Updated 12/8/2022
Version Latest
Public

Where to Find the Release

Download the toolkit from the Base Toolkit Download page and follow the installation instructions.

Overview

The Intel® oneAPI DPC++ Library (oneDPL) accompanies the Intel® oneAPI DPC++/C++ Compiler and provides high-productivity APIs that aim to minimize the programming effort of C++ developers creating efficient heterogeneous applications.

2021.7.1

New Features

  • Added the ability to construct a zip_iterator from a std::tuple of iterators.
  • Added 9 more serial-based versions of algorithms: is_heap, is_heap_until, make_heap, push_heap, pop_heap, is_sorted, is_sorted_until, partial_sort, partial_sort_copy. For details, refer to the Tested Standard C++ API Reference.
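
As a reminder of what the heap and partial-sort functions listed above do, here is a minimal host-side sketch that uses only the standard library. The helper names top_three_sorted and heap_round_trip are hypothetical, and the sketch does not launch a kernel; it only illustrates the semantics of the standard calls.

```cpp
#include <algorithm>
#include <array>

// Host-only illustration with the standard library; no SYCL kernel is involved.
// Helper names are hypothetical and exist only for this example.

// Returns a copy whose first three elements are the three smallest, ascending.
inline std::array<int, 6> top_three_sorted(std::array<int, 6> data) {
    std::partial_sort(data.begin(), data.begin() + 3, data.end());
    return data;
}

// Builds a max-heap, pops the maximum to the back, then pushes it back in.
inline bool heap_round_trip(std::array<int, 6> data) {
    std::make_heap(data.begin(), data.end());
    bool ok = std::is_heap(data.begin(), data.end());

    std::pop_heap(data.begin(), data.end());  // maximum is now data.back()
    ok = ok && std::is_heap(data.begin(), data.end() - 1);
    ok = ok && data.back() == *std::max_element(data.begin(), data.end());

    std::push_heap(data.begin(), data.end());  // whole range is a heap again
    return ok && std::is_heap_until(data.begin(), data.end()) == data.end();
}
```

Calling top_three_sorted on {5, 1, 4, 2, 6, 3} leaves 1, 2, 3 in the first three positions; the order of the remaining elements is unspecified.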
       

Fixed Issues

  • Added namespace alias dpl = oneapi::dpl.
  • Fixed error in reduce_by_segment algorithm.
  • Fixed errors when data size is 0 in upper_bound, lower_bound and binary_search algorithms.
  • Fixed incorrect results in algorithms called with a permutation iterator.

Deprecation Notice

  • None in this release.

Known Issues and Limitations

New in This Release

  • None in this release.

Existing Issues

  • std::tuple, std::pair cannot be used with SYCL buffers to transfer data between host and device.
  • std::array cannot be swapped in DPC++ kernels with std::swap function or swap member function in the Microsoft* Visual C++ standard library.
  • The oneapi::dpl::experimental::ranges::reverse algorithm is not available with the -fno-sycl-unnamed-lambda option.
  • STL algorithm functions (such as std::for_each) used in DPC++ kernels do not compile with the debug version of the Microsoft* Visual C++ standard library.

NOTE: See the oneDPL Guide for other restrictions and known limitations.

2021.7.0

2021.6.0

2021.5.0

2021.4.0

2021.3.0

New Features

  • Added the range-based versions of the following algorithms: all_of, any_of, count, count_if, equal, move, remove, remove_if, replace, replace_if.
  • Added the following utility ranges (views): generate, fill, rotate.

Changes to Existing Features

  • Improved performance of discard_block_engine (including ranlux24, ranlux48, ranlux24_vec, ranlux48_vec predefined engines) and normal_distribution.
  • Added two constructors to transform_iterator: the default constructor and a constructor from an iterator without a transformation. A transform_iterator constructed in either of these ways uses a transformation functor of the type passed in the template arguments.
  • transform_iterator can now work on top of forward iterators.

Fixed Issues

  • Fixed execution of the swap_ranges algorithm with the unseq and par execution policies.
  • Fixed an issue causing memory corruption and double freeing in scan-based algorithms compiled with the -O0 and -g options and run on CPU devices.
  • Fixed incorrect behavior in the exclusive_scan algorithm that occurred when the input and output iterator ranges overlapped.
  • Fixed error propagation for async runtime exceptions by consistently calling sycl::event::wait_and_throw internally.
  • Fixed the warning: local variable will be copied despite being returned by name [-Wreturn-std-move].
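
The overlapping-range case mentioned for exclusive_scan can be pictured with the standard library's serial std::exclusive_scan (C++17), where the output iterator is the same as the input iterator. This is a host-only sketch; the helper name exclusive_scan_in_place is hypothetical, and the code does not exercise the oneDPL implementation that was fixed.

```cpp
#include <numeric>
#include <vector>

// In-place exclusive prefix sum: the output range is the input range itself.
// Element i of the result is init plus the sum of the original elements
// before position i.
inline std::vector<int> exclusive_scan_in_place(std::vector<int> v, int init = 0) {
    std::exclusive_scan(v.begin(), v.end(), v.begin(), init);
    return v;
}
```

For the input {1, 2, 3, 4} with init 0, the expected result is {0, 1, 3, 6}.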

Known Issues and Limitations

  • No new issues in this release.

Existing Issues

  • The exclusive_scan and transform_exclusive_scan algorithms may produce wrong results with vector execution policies when a program is built with GCC 10 and the -O0 option.
  • Some algorithms may hang when a program is built with the -O0 option and executed on GPU devices, and a large number of elements is processed.
  • The use of oneDPL together with the GNU C++ standard library (libstdc++) version 9 or 10 may lead to compilation errors (caused by oneTBB API changes). To overcome these issues, include oneDPL header files before the standard C++ header files, or disable parallel algorithms support in the standard library. For more information, see the Intel® oneAPI Threading Building Blocks (oneTBB) Release Notes.
  • The using namespace oneapi; directive in oneDPL program code may result in compilation errors with some compilers, including GCC 7 and earlier. Instead of this directive, explicitly use the oneapi::dpl namespace or create a namespace alias.
  • The implementation does not yet provide namespace oneapi::std as defined in the oneDPL Specification.
  • The use of the range-based API requires C++17 and the C++ standard libraries coming with GCC 8.1 (or higher) or Clang 7 (or higher).
  • std::tuple, std::pair cannot be used with SYCL buffers to transfer data between host and device.
  • When used within DPC++ kernels or transferred to/from a device, std::array can only hold objects whose type meets DPC++ requirements for use in kernels and for data transfer, respectively.
  • std::array::at member function cannot be used in kernels because it may throw an exception; use std::array::operator[] instead.
  • std::array cannot be swapped in DPC++ kernels with std::swap function or swap member function in the Microsoft* Visual C++ standard library.
  • Due to specifics of Microsoft* Visual C++, some standard floating-point math functions (including std::ldexp, std::frexp, std::sqrt(std::complex<float>)) require device support for double precision.
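
The advice above to prefer std::array::operator[] over std::array::at can be sketched as follows. This is plain standard C++ compiled for the host; the function name sum_unchecked is hypothetical, and the caller is assumed to guarantee that indices stay in range.

```cpp
#include <array>
#include <cstddef>

// Sums the elements with operator[], which performs no bounds check and
// therefore has no throwing path. a.at(i) could throw std::out_of_range,
// which is why it is unsuitable for kernel code.
template <std::size_t N>
constexpr float sum_unchecked(const std::array<float, N>& a) noexcept {
    float total = 0.0f;
    for (std::size_t i = 0; i < N; ++i) {
        total += a[i];
    }
    return total;
}
```

Because the loop bound is the compile-time size N, every index is valid by construction, so the bounds check that at would add is not needed here.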

2021.2.0

New Features

  • Added support for parallel, vector and DPC++ execution policies for the following algorithms: shift_left, shift_right.
  • Added the range-based versions of the following algorithms: sort, stable_sort, merge.
  • Added experimental asynchronous algorithms: copy_async, fill_async, for_each_async, reduce_async, sort_async, transform_async, transform_reduce_async. These algorithms are declared in the oneapi::dpl::experimental namespace and implemented only for DPC++ policies. To make these algorithms available, include the <oneapi/dpl/async> header. Use of the asynchronous API requires C++11.
  • Added the utility function wait_for_all, which enables waiting for completion of an arbitrary number of events.
  • Added the ONEDPL_USE_PREDEFINED_POLICIES macro, which enables predefined policy objects and make_device_policy, make_fpga_policy functions without arguments. It is turned on by default.

Changes to Existing Features

  • Improved performance of the following algorithms: count, count_if, is_partitioned, lexicographical_compare, max_element, min_element, minmax_element, reduce, transform_reduce, as well as sort and stable_sort when Radix sort is used.
    Note: The sorting algorithms in oneDPL use Radix sort for arithmetic data types compared with std::less or std::greater, otherwise Merge sort.
  • Improved performance of the linear_congruential_engine RNG engine (including minstd_rand, minstd_rand0, minstd_rand_vec, minstd_rand0_vec predefined engines).
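
The selection rule in the note on sorting above (Radix sort for arithmetic types compared with std::less or std::greater, Merge sort otherwise) can be written down as a compile-time predicate. The trait would_use_radix_sort and the comparator ByAbsoluteValue are hypothetical names invented for this sketch; it mirrors only the rule as worded in the note and says nothing about how the library implements the choice, nor about transparent comparators such as std::less<>.

```cpp
#include <functional>
#include <type_traits>

// Hypothetical trait: true when the note's stated condition for Radix sort
// holds, i.e. T is arithmetic and Compare is std::less<T> or std::greater<T>.
template <typename T, typename Compare>
struct would_use_radix_sort
    : std::integral_constant<
          bool,
          std::is_arithmetic<T>::value &&
              (std::is_same<Compare, std::less<T>>::value ||
               std::is_same<Compare, std::greater<T>>::value)> {};

// A custom comparator: by the note's rule this falls back to Merge sort.
struct ByAbsoluteValue {
    bool operator()(int a, int b) const {
        return (a < 0 ? -a : a) < (b < 0 ? -b : b);
    }
};

// A non-arithmetic element type: also the Merge sort case under the rule.
struct Point {
    int x;
    int y;
};
```

Under this sketch, int with std::greater<int> satisfies the predicate, while int with ByAbsoluteValue and Point with std::less<Point> do not.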

Fixed Issues

  • Fixed runtime errors occurring with the find_end, search, and search_n algorithms when a program is built with the -O0 option and executed on CPU devices.
  • Fixed the majority of unused parameter warnings.

Known Issues and Limitations

  • The exclusive_scan and transform_exclusive_scan algorithms may produce wrong results with vector execution policies when a program is built with GCC 10 and the -O0 option.
  • Some algorithms may hang when a program is built with the -O0 option and executed on GPU devices, and a large number of elements is processed.
  • The use of oneDPL together with the GNU C++ standard library (libstdc++) version 9 or 10 may lead to compilation errors (caused by oneTBB API changes). To overcome these issues, include oneDPL header files before the standard C++ header files, or disable parallel algorithms support in the standard library. For more information, see the Intel® oneAPI Threading Building Blocks (oneTBB) Release Notes.
  • The using namespace oneapi; directive in oneDPL program code may result in compilation errors with some compilers, including GCC 7 and earlier. Instead of this directive, explicitly use the oneapi::dpl namespace or create a namespace alias.
  • The implementation does not yet provide namespace oneapi::std as defined in the oneDPL Specification.
  • The use of the range-based API requires C++17 and the C++ standard libraries coming with GCC 8.1 (or higher) or Clang 7 (or higher).
  • std::tuple, std::pair cannot be used with SYCL buffers to transfer data between host and device.
  • When used within DPC++ kernels or transferred to/from a device, std::array can only hold objects whose type meets DPC++ requirements for use in kernels and for data transfer, respectively.
  • std::array::at member function cannot be used in kernels because it may throw an exception; use std::array::operator[] instead.
  • std::array cannot be swapped in DPC++ kernels with std::swap function or swap member function in the Microsoft* Visual C++ standard library.
  • Due to specifics of Microsoft* Visual C++, some standard floating-point math functions (including std::ldexp, std::frexp, std::sqrt(std::complex<float>)) require device support for double precision.

2021.1.1

New Features

  • Added new random number distributions: exponential_distribution, bernoulli_distribution, geometric_distribution, lognormal_distribution, weibull_distribution, cauchy_distribution, extreme_value_distribution.
  • Added the serial-based versions of the following algorithms: all_of, any_of, none_of, count, count_if, for_each, find, find_if, find_if_not. For the detailed list, refer to the Tested Standard C++ API Reference.
  • Improved performance of search and find_end algorithms on GPU devices.

Fixed Issues

  • Fixed SYCL* 2020 feature deprecation warnings.
  • Fixed some corner cases of normal_distribution functionality.
  • Fixed a floating-point exception occurring on CPU devices when a program uses many oneDPL algorithms and DPC++ kernels.
  • Fixed possible hangs and data races in the following algorithms used with DPC++ execution policies: count, count_if, is_partitioned, lexicographical_compare, max_element, min_element, minmax_element, reduce, transform_reduce.

Known Issues and Limitations

New in This Release

  • The definition of a lambda function used with parallel algorithms should not depend on preprocessor macros that make it differ between the host and the device. Otherwise, the behavior is undefined.
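
The restriction on macro-dependent lambdas can be illustrated with a small host-side sketch. The problematic form appears only as a comment; it uses __SYCL_DEVICE_ONLY__, the macro that DPC++ defines during the device compilation pass. The function name doubled is hypothetical, and std::transform stands in here for a parallel algorithm call.

```cpp
#include <algorithm>
#include <vector>

// Problematic pattern (comment only): the lambda body differs between the
// host and device compilation passes, so the two passes see different
// definitions of the same lambda.
//
//   auto scale = [](int x) {
//   #ifdef __SYCL_DEVICE_ONLY__
//       return x * 2;
//   #else
//       return x * 3;
//   #endif
//   };

// Safer pattern: a single definition that is identical in every pass.
inline std::vector<int> doubled(std::vector<int> v) {
    auto scale = [](int x) { return x * 2; };
    std::transform(v.begin(), v.end(), v.begin(), scale);
    return v;
}
```

If host and device genuinely need different behavior, one option is to keep the lambda itself identical and move the difference into data passed to it, such as a captured value.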

Existing Issues

  • The exclusive_scan and transform_exclusive_scan algorithms may produce wrong results with vector execution policies when a program is built with GCC 10 and the -O0 option.
  • Some algorithms may hang when a program is built with the -O0 option and executed on GPU devices, and a large number of elements is processed.
  • The use of oneDPL together with the GNU C++ standard library (libstdc++) version 9 or 10 may lead to compilation errors (caused by oneTBB API changes). To overcome these issues, include oneDPL header files before the standard C++ header files, or disable parallel algorithms support in the standard library. For more information, see the Intel® oneAPI Threading Building Blocks (oneTBB) Release Notes.
  • The using namespace oneapi; directive in oneDPL program code may result in compilation errors with some compilers, including GCC 7 and earlier. Instead of this directive, explicitly use the oneapi::dpl namespace or create a namespace alias.
  • The implementation does not yet provide namespace oneapi::std as defined in the oneDPL Specification.
  • The use of the range-based API requires C++17 and the C++ standard libraries coming with GCC 8.1 (or higher) or Clang 7 (or higher).
  • std::tuple, std::pair cannot be used with SYCL buffers to transfer data between host and device.
  • When used within DPC++ kernels or transferred to/from a device, std::array can only hold objects whose type meets DPC++ requirements for use in kernels and for data transfer, respectively.
  • std::array::at member function cannot be used in kernels because it may throw an exception; use std::array::operator[] instead.
  • std::array cannot be swapped in DPC++ kernels with std::swap function or swap member function in the Microsoft* Visual C++ standard library.
  • Due to specifics of Microsoft* Visual C++, some standard floating-point math functions (including std::ldexp, std::frexp, std::sqrt(std::complex<float>)) require device support for double precision.

Additional Documentation

Notices and Disclaimers

Intel technologies may require enabled hardware, software or service activation.

No product or component can be absolutely secure.

Your costs and results may vary.

© Intel Corporation. Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.

No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document.

The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request.

Intel disclaims all express and implied warranties, including without limitation, the implied warranties of merchantability, fitness for a particular purpose, and non-infringement, as well as any warranty arising from course of performance, course of dealing, or usage in trade.