DPDK Application Profiling using Intel® VTune™ Amplifier

Intel® VTune™ Amplifier is a powerful tool for profiling applications to find performance overhead. This video gives a step-by-step demonstration of how to use Intel® VTune™ Amplifier to find performance bottlenecks in an application and to see where in the source code changes are needed to improve performance.

 

Title: Using VTune for application performance analysis

In this demo setup, I've already installed VTune, and I will be using the DPDK sample application testpmd for profiling.

Make sure you build a debug version of your application; this allows VTune to collect more information for you.

You can build a debug version by adding the "-g" flag to your application's makefile.
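
Before profiling, it can help to confirm that the binary really carries debug information. The snippet below is a rough sketch, not part of the video: it assumes an ELF binary with DWARF data embedded in the same file, and it only looks for the ".debug_info" section name in the file's bytes. The helper name has_debug_info is hypothetical, and a tool such as readelf gives a more reliable answer.

```python
def has_debug_info(path):
    """Heuristic: True if the file looks like an ELF binary that names
    a '.debug_info' section (as binaries built with -g usually do)."""
    with open(path, "rb") as f:
        data = f.read()
    # ELF files start with the 4-byte magic 0x7f 'E' 'L' 'F'
    if not data.startswith(b"\x7fELF"):
        return False
    # Section names are stored as plain strings in the section name table
    return b".debug_info" in data
```

For example, has_debug_info("testpmd") should return True for a build made with "-g"; it returns False if the debug information was stripped or stored in a separate file.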

Run the application that you want to profile, which in my case is testpmd.

Now the DPDK sample application testpmd is running.

Run VTune to profile testpmd.

For that, first set up the environment to run VTune.

Next, to profile testpmd, we can use either the VTune GUI or the command line. To use the GUI, we can use the amplxe-gui command. However, I will use the command-line option here, because I want to capture information for only a few particular performance counters.

This is the VTune script I will run from the command line.

It says: collect samples for the target process named "testpmd" for a duration of 40 seconds for the above-mentioned performance counters.
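
The script itself is shown only on screen, so the following is a sketch of what such a command might look like, not the exact script from the video. It assumes the legacy amplxe-cl command-line tool with its custom hardware-event collector ("runsa"); the option spellings should be checked against "amplxe-cl -help collect" for your version. The event names are placeholders, because the transcript does not list the counters used, and build_collect_cmd is a hypothetical helper.

```python
import shlex

def build_collect_cmd(process, seconds, events):
    """Build an argv list for a custom hardware-event collection.

    Assumption: legacy `amplxe-cl` with the `runsa` collector; verify the
    option names against your installed VTune version before running.
    """
    if not events:
        raise ValueError("at least one event name is required")
    return [
        "amplxe-cl",
        "-collect-with", "runsa",                     # custom event-based sampling
        "-knob", "event-config=" + ",".join(events),  # counters to sample
        "-target-process", process,                   # attach to a running process by name
        "-duration", str(seconds),                    # stop collecting after N seconds
    ]

# Placeholder event names: substitute the counters you care about
events = ["EVENT_A", "EVENT_B"]
cmd = build_collect_cmd("testpmd", 40, events)
print(shlex.join(cmd))
```

Keeping the command as a list makes it easy to pass to subprocess.run once the environment for VTune has been set up in the same shell session.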

Let's run this script. VTune is now observing testpmd for 40 seconds, and then it will give us the result.

After getting the results, scroll up a bit. You can see the path to the directory where this result has been stored.

You can also find more details, such as the operating system used, hardware events, etc. Now let's open this result in graphical user interface mode.

Go to "open results" and give the path where the recently captured result has been stored.
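
If you lose track of the path printed in the terminal, a small helper can locate the newest result directory. This is a sketch under an assumption: it expects results to be written to the working directory with the default naming pattern of "r", three digits, and a short analysis suffix (for example r000runsa). The function name latest_result_dir is hypothetical.

```python
import os
import re

# Assumed default result naming: 'r' + three digits + analysis suffix
RESULT_NAME = re.compile(r"^r\d{3}\w+$")

def latest_result_dir(parent="."):
    """Return the most recently modified result directory under `parent`,
    or None if no directory matches the assumed naming pattern."""
    candidates = []
    for name in os.listdir(parent):
        path = os.path.join(parent, name)
        if RESULT_NAME.match(name) and os.path.isdir(path):
            candidates.append(path)
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)

print(latest_result_dir("."))
```

The returned path is the one to give to the GUI when opening the captured result.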

So here are the performance counter results captured using VTune. We can see that L1 and L2 misses occur.

Then you can check the hotspots information. This section gives more information about the most active functions in your application and how your application's CPU usage varied over time.

If you're interested in more specific and detailed information about your application's most active functions, click on bottom-up.

Here we also had the debug version of the application; that's why we can click on a source file and actually see which region of code we need to change to fix the performance degradation.

Check out the other tabs; there is a lot of useful information there as well.

This completes this video on "Using VTune for application performance analysis".

In summary, you saw how to run VTune from the command line and capture results for application profiling and analysis.

You also saw how running VTune on the "debug" version of the application provided more detailed information for code optimization.