Identify and Address Remote NUMA Performance Impact
Platform Analysis in Intel® VTune™ Profiler collects coarse-grained, system-level metrics to identify hardware bottlenecks and inefficient use of hardware. The example workload in this video has steady memory use, but it is memory bound and has a high percentage of remote memory accesses. On a dual-socket system with non-uniform memory access (NUMA), a remote access is one where the data resides in DRAM attached to the other socket. Satisfying these remote accesses also generates heavy, spiky cross-socket traffic over Intel® Ultra Path Interconnect (Intel® UPI).
By setting the affinity of all threads in the workload to cores on a single socket, you keep the threads close to the memory they use, which reduces remote accesses and improves memory access performance for this workload.
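As a rough illustration of the affinity change described above, the following Python sketch pins the calling process to the cores of one NUMA node on Linux. It is not taken from the video: the helper names `parse_cpulist` and `pin_to_numa_node` are hypothetical, and the sketch assumes a Linux system that exposes each node's cores in `/sys/devices/system/node/node<N>/cpulist`.

```python
import os
from pathlib import Path


def parse_cpulist(text):
    """Parse a Linux cpulist string such as '0-17,36-53' into a set of CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate an empty list
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def pin_to_numa_node(node=0):
    """Restrict the caller to the cores of one NUMA node (hypothetical helper).

    Assumes Linux sysfs; os.sched_setaffinity is not available on all platforms.
    """
    path = Path(f"/sys/devices/system/node/node{node}/cpulist")
    node_cpus = parse_cpulist(path.read_text())
    # Stay within the CPUs this process is already allowed to use.
    allowed = node_cpus & os.sched_getaffinity(0)
    if not allowed:
        raise RuntimeError(f"no usable CPUs found on NUMA node {node}")
    os.sched_setaffinity(0, allowed)
    return allowed


# Example use (Linux only), early in the program before worker threads start:
#     pin_to_numa_node(0)
```

Calling the helper before any worker threads are created matters, because threads started later inherit the affinity mask. CPU affinity alone does not move memory that was already allocated on the other socket; a launcher-level alternative that binds both cores and memory is `numactl --cpunodebind=0 --membind=0` followed by the workload command.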
Intel VTune Profiler visualizes memory access latency and the types of memory accesses across the platform. This makes it easier to identify the root cause of a performance issue in a memory-bound workload and to find a solution. Test your multisocket server workload with Intel VTune Profiler today.
Product and Performance Information
Performance varies by use, configuration, and other factors. Learn more at www.Intel.cn/PerformanceIndex.