Enhance Deep Learning Workloads on the Latest Intel® Xeon® Processors
Overview
The 4th gen Intel® Xeon® Scalable processors (formerly code named Sapphire Rapids) offer several built-in features for boosting performance and efficiency of deep learning applications.
This session focuses on one of them—Intel® Advanced Matrix Extensions (Intel® AMX)—and how to take advantage of its AI acceleration power to boost model training and inference using Intel optimizations for PyTorch* and TensorFlow*.
Topics covered include:
- An overview of the Intel optimizations, including performance and features on the latest Intel CPUs and how they compare to stock PyTorch and TensorFlow.
- How the optimizations reduce memory footprint and improve performance by automatically mixing precision using the bfloat16 or float16 data types (the first sketch after this list shows the PyTorch flow).
- Using Intel® oneAPI Deep Neural Network Library (oneDNN) with the Intel optimizations for PyTorch and TensorFlow to take advantage of other built-in acceleration features of 4th gen Intel Xeon processors, such as Intel® Advanced Vector Extensions 512 (Intel® AVX-512) and Vector Neural Network Instructions (VNNI); the second sketch below shows how to verify which instruction set oneDNN dispatches to.
- Reducing model inference time with quantization features in Intel® Extension for PyTorch* (the third sketch below walks through static int8 quantization).
- How speedups can be gained over stock PyTorch and TensorFlow on new Amazon Web Services* instances built on 4th gen Intel Xeon Scalable processors.
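As a concrete illustration of the mixed-precision point above, here is a minimal sketch of bfloat16 inference with Intel® Extension for PyTorch* (IPEX). It assumes IPEX and torchvision are installed alongside PyTorch; the ResNet-50 model and random input are placeholders, not part of the session material.

```python
# Minimal sketch: bfloat16 automatic mixed precision on CPU with
# Intel Extension for PyTorch (IPEX). Model and input are placeholders.
import torch
import torchvision.models as models
import intel_extension_for_pytorch as ipex

model = models.resnet50(weights=None).eval()
data = torch.rand(1, 3, 224, 224)

# ipex.optimize fuses operators and, with dtype=torch.bfloat16, prepares
# weights so eligible ops can run on Intel AMX tiles.
model = ipex.optimize(model, dtype=torch.bfloat16)

# CPU autocast runs eligible ops in bfloat16 and keeps
# precision-sensitive ops in float32.
with torch.no_grad(), torch.cpu.amp.autocast(dtype=torch.bfloat16):
    output = model(data)

print(output.shape)  # torch.Size([1, 1000])
```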
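To check that the oneDNN-backed kernels mentioned above actually dispatch to Intel AMX, AVX-512, or VNNI instructions, oneDNN's verbose logging and ISA cap can be used. This sketch assumes the environment variables are set before the framework loads; the exact log text varies by oneDNN version.

```python
# Minimal sketch: using oneDNN environment variables to log which
# instruction set its kernels dispatch to. Set them before importing
# the framework, since oneDNN reads them at initialization.
import os
os.environ["ONEDNN_VERBOSE"] = "1"                    # log each primitive execution
os.environ["ONEDNN_MAX_CPU_ISA"] = "AVX512_CORE_AMX"  # cap the ISA (e.g. AVX512_CORE_VNNI to exclude AMX)

import torch  # imported after the variables are set

# Running any convolution now prints lines naming the chosen
# implementation and ISA, e.g. "...convolution,...avx512_core...".
conv = torch.nn.Conv2d(3, 64, kernel_size=3)
with torch.no_grad():
    _ = conv(torch.rand(1, 3, 224, 224))
```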
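For the quantization topic, a minimal sketch of post-training static int8 quantization with Intel® Extension for PyTorch* follows; int8 convolutions and matmuls can then use VNNI or Intel AMX instructions. The model, calibration loop, and qconfig are illustrative placeholders matching recent IPEX releases; check the API of the version you install.

```python
# Minimal sketch: post-training static int8 quantization with Intel
# Extension for PyTorch. Model and calibration batches are placeholders.
import torch
import torchvision.models as models
import intel_extension_for_pytorch as ipex
from intel_extension_for_pytorch.quantization import prepare, convert

model = models.resnet50(weights=None).eval()
example_input = torch.rand(1, 3, 224, 224)

# Insert observers that record activation ranges during calibration.
qconfig = ipex.quantization.default_static_qconfig
prepared = prepare(model, qconfig, example_inputs=example_input, inplace=False)

# Calibrate on representative data (random tensors stand in here).
with torch.no_grad():
    for _ in range(10):
        prepared(torch.rand(1, 3, 224, 224))

# Convert to int8, then trace and freeze for deployment.
quantized = convert(prepared)
with torch.no_grad():
    traced = torch.jit.trace(quantized, example_input)
    traced = torch.jit.freeze(traced)
    output = traced(example_input)
```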
Skill level: Novice
Featured Software
- The Intel optimizations are available as part of the Intel® AI Analytics Toolkit, which accelerates data science and AI pipelines, from preprocessing through machine learning, and provides interoperability for efficient model development. Stand-alone versions are also available: PyTorch Optimization | TensorFlow Optimization.
- Get oneDNN, which improves deep learning (DL) application and framework performance on CPUs and GPUs with highly optimized implementations of DL building blocks, as a stand-alone download or as part of the Intel® oneAPI Base Toolkit.
Code Samples
Download a variety of samples on GitHub*, including:
- Get Started with Intel® Extension for PyTorch*
- Optimize PyTorch Models Using Quantization
- PyTorch Training Optimizations with bfloat16 for Intel AMX