Intel® Extension for PyTorch* Cheat Sheet

Get started with Intel® Extension for PyTorch* using the following commands.

This extension provides the most up-to-date features and optimizations on Intel hardware, most of which will eventually be upstreamed to stock PyTorch releases.

For additional installation methods, see the Installation Guide.

Note: This extension has version requirements for PyTorch.

For more information, see Intel® Extension for PyTorch*.

"

Basic CPU Installation Using PyPI*

python -m pip install intel_extension_for_pytorch

Basic CPU Installation Using Anaconda*

conda install -c intel intel-extension-for-pytorch

Basic GPU Installation Using PyPI

python -m pip install torch==1.13.0a0 -f https://developer.intel.com/ipex-whl-stable-xpu

python -m pip install intel_extension_for_pytorch==1.13.10+xpu -f https://developer.intel.com/ipex-whl-stable-xpu

Import Intel® Extension for PyTorch*

import intel_extension_for_pytorch as ipex

Set the Backend to Use the GPU (Default: CPU)

model = model.to('xpu')

data = data.to('xpu')
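
Putting it together, a minimal inference sketch on the GPU, assuming model and data are already defined and an XPU device is available:

import torch
import intel_extension_for_pytorch as ipex

# Move the model and input to the GPU, then apply the extension's optimizations
model = model.to('xpu')
data = data.to('xpu')
model.eval()
optimized_model = ipex.optimize(model)

# Run the forward pass on the GPU
with torch.no_grad():
    output = optimized_model(data)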

Capture a Verbose Log (Command Prompt)

export ONEDNN_VERBOSE=1

Capture a Verbose Log on Demand (in the Code)

import torch.backends.mkl as torch_mkl

import torch.backends.mkldnn as torch_mkldnn

 

with torch_mkl.verbose(torch_mkl.VERBOSE_ON), torch_mkldnn.verbose(torch_mkldnn.VERBOSE_ON):

    model(data)

Optimization During Training

model = …

optimizer = ...

model.train()

optimized_model, optimized_optimizer = ipex.optimize(model, optimizer=optimizer)
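
A minimal training-loop sketch using the returned objects; the loss function and data loader here are assumed placeholders:

import torch

criterion = torch.nn.CrossEntropyLoss()  # assumed loss function

for inputs, labels in train_loader:  # train_loader is an assumed DataLoader
    optimized_optimizer.zero_grad()
    outputs = optimized_model(inputs)
    loss = criterion(outputs, labels)
    loss.backward()
    optimized_optimizer.step()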

Optimization During Inference (Performed After Loading Weights)

model = ...

model.load_state_dict(torch.load(PATH))

model.eval()

optimized_model = ipex.optimize(model)
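
A short usage sketch for the optimized model; data is an assumed sample input:

import torch

with torch.no_grad():
    output = optimized_model(data)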

Optimization Using the Low-Precision Data Type bfloat16 During Training (Default FP32)

optimized_model, optimized_optimizer = ipex.optimize(model, optimizer=optimizer, dtype=torch.bfloat16)

 

with torch.no_grad():

    with torch.cpu.amp.autocast():

        optimized_model(data)
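
For the training loop itself, a minimal sketch of one bfloat16 step; loss_fn, data, and target are assumed placeholders:

import torch

# Keep the forward pass and loss inside autocast; run backward and the
# optimizer step outside it.
optimized_optimizer.zero_grad()
with torch.cpu.amp.autocast():
    output = optimized_model(data)
    loss = loss_fn(output, target)
loss.backward()
optimized_optimizer.step()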

Optimization Using the Low-Precision Data Type bfloat16 During Inference (Default FP32)

optimized_model = ipex.optimize(model, dtype=torch.bfloat16)

 

with torch.cpu.amp.autocast():

    optimized_model(data)

Run a Launch Script from a Command Prompt: Automate Configuration Settings for Performance Tuning

ipexrun [knobs] <your_pytorch_script> [args]

Run with Non-Uniform Memory Access (NUMA) Binding from a Command Prompt

numactl --cpunodebind N --membind N python <script>

Set the Number of Threads Using GNU* OpenMP*

export OMP_NUM_THREADS=<num threads>

Bind Threads to Specific CPUs Using GNU OpenMP

export GOMP_CPU_AFFINITY=<space- or comma-separated list of CPUs>

Specify Whether Threads May Move Between Processors Using GNU OpenMP

export OMP_PROC_BIND=<value>

Determine Thread Scheduling Using GNU OpenMP

export OMP_SCHEDULE=<value>

Switch to the Intel® OpenMP* Runtime (libiomp)

export LD_PRELOAD=<path>/libiomp5.so:$LD_PRELOAD

Bind Threads to Physical Processing Units Using Intel OpenMP

export KMP_AFFINITY=granularity=fine,compact,1,0

Use Intel OpenMP to Set the Wait Time (ms) After Completing a Parallel Region Before a Thread Sleeps

export KMP_BLOCKTIME=<time>

A value of 0 or 1 is recommended for convolutional neural network (CNN) based models.

Tune an Intel® oneAPI Deep Neural Network Library (oneDNN) Primitive Cache (Note the Increased Memory Use: Adjust as Needed)

export ONEDNN_PRIMITIVE_CACHE_CAPACITY=<tuning size>

Note: <tuning size> has an upper limit of 65536 cached primitives.

Flush Denormal Numbers (Extremely Small Numbers) to Zero to Boost Performance

torch.set_flush_denormal(True)

 

Import Quantization Functions

from intel_extension_for_pytorch.quantization import prepare, convert

Post-Training int8 Quantization (Static): Reduce Model Size and Memory Bandwidth While Speeding Up Inference Time by Quantizing Weights and Activations

model = …

model.eval()

data = …

qconfig = ipex.quantization.default_static_qconfig

prepared_model = prepare(model, qconfig, example_inputs=data, inplace=False)

for d in calibration_data_loader():

  prepared_model(d)

converted_model = convert(prepared_model)
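
For deployment, the converted model is commonly traced and frozen with TorchScript; a sketch assuming data is a representative sample input:

import torch

with torch.no_grad():
    traced_model = torch.jit.trace(converted_model, data)
    traced_model = torch.jit.freeze(traced_model)
    output = traced_model(data)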

Post-Training int8 Quantization (Dynamic): Reduce Model Size and Memory Bandwidth While Speeding Up Inference Time with On-the-Fly Quantization of Activations, Without Calibration

model = …

model.eval()

data = …

dynamic_qconfig = ipex.quantization.default_dynamic_qconfig

prepared_model = prepare(model, dynamic_qconfig, example_inputs=data)

converted_model = convert(prepared_model)

"

For more information and support, or to report any issues, see:

PyTorch Issues on GitHub*

Intel® AI Analytics Toolkit Forum

 

Sign up and try this extension for free using Intel® Developer Cloud for oneAPI.

"