Introduction
This guide demonstrates the capabilities of Intel® processor-based developer kits through a sequence of exercises that progress from simple to more complex modifications.
By the end of this guide, you will be able to:
- Describe the hardware setup of the system.
- Run a sample application on the developer kit with two different data sets to compare CPU performance.
- Modify settings to offload application processing from the CPU by running the application on the integrated GPU and the Intel® Vision Accelerator.
- Compare CPU, GPU, and Intel® Vision Accelerator performance on two different data sets to understand the impact of the developer kit.
Hardware Setup Diagram
Running a Sample Application on the Developer Kit
This getting started guide uses a sample application from the Intel® Distribution of OpenVINO™ toolkit that is already installed on the developer kit. Alternatively, if you have developed your own application on your developer workstation, you can copy it to the developer kit and run it there.
Sample Application Overview
We will be using the sample application “Object Detection YOLO* V3 Python* Demo” which is included in the Intel® Distribution of OpenVINO™ toolkit. The sample uses OpenCV to display the resulting frame with detections (rendered as bounding boxes and labels, if provided). In the default mode, the sample application displays:
- OpenCV time: Time to decode the frame, render the bounding boxes and labels, and display the results.
- Detection time: Inference time for the object detection network. It is reported in the Sync mode only.
- Wall clock time: Combined application-level performance.
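To make the three timings above more concrete, here is a minimal sketch of how such per-frame numbers could be collected. It is not the demo's actual code: `decode_frame`, `infer`, and `render` are hypothetical stand-ins for the OpenCV and Inference Engine calls, and only the Python standard library is used.

```python
import time


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical stand-ins for the demo's stages (assumptions, not the real API).
def decode_frame(index):
    return {"index": index}


def infer(frame):
    return [("person", 0.9)]


def render(frame, detections):
    return len(detections)


def process_frame(index):
    """Process one frame and report the three timings in seconds."""
    wall_start = time.perf_counter()
    frame, t_decode = timed(decode_frame, index)
    detections, t_detect = timed(infer, frame)
    _, t_render = timed(render, frame, detections)
    wall = time.perf_counter() - wall_start
    return {
        "opencv_time": t_decode + t_render,  # decoding plus rendering/display
        "detection_time": t_detect,          # inference only
        "wall_clock_time": wall,             # everything, end to end
    }


if __name__ == "__main__":
    for name, seconds in process_frame(0).items():
        print(f"{name}: {seconds * 1000:.3f} ms")
```

In this sketch the wall clock time covers all three stages, so it is never smaller than the detection time alone.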
We will run this sample on three different processing units: the CPU, the GPU, and the Intel® Vision Accelerator. For each unit, we will use two detection types, single detection and multiple detection, as described below.
- Single detection: The basic data set is used to detect one person at a time (one-by-one person detection).
- Multi detection: The advanced data set is used to detect multiple object types, such as person, car, and bicycle.
Now that you have an overview of the application, run it by following the steps below.
View CPU utilization
Open the System Monitor to view CPU utilization on your device. With no sample running, CPU usage should be low; you can then compare how it changes when you move processing to the GPU or a vision accelerator.
- Click the Ubuntu* icon in your taskbar. From the search bar, search for and open the System Monitor application.
- Click the Resources tab to bring up the CPU History chart. You can see how the CPU utilization changes when you run the sample on the CPU, as described below.
Keep the System Monitor open afterwards so you can compare the utilization numbers when you offload application processing to the GPU or an accelerator.
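If you prefer a number in the terminal over the System Monitor chart, the following minimal sketch estimates overall CPU utilization from two readings of /proc/stat. It assumes a Linux system with a reasonably recent kernel (at least the user, nice, system, idle, and iowait fields present); the function names are illustrative and not part of the toolkit.

```python
import os
import time


def parse_cpu_line(line):
    """Return (idle, total) jiffies from the aggregate 'cpu' line of /proc/stat."""
    # Use the first eight counters: user, nice, system, idle, iowait, irq,
    # softirq, steal. Guest counters are skipped because they are already
    # included in user and nice.
    fields = [int(x) for x in line.split()[1:9]]
    idle = fields[3] + fields[4]  # idle + iowait
    return idle, sum(fields)


def utilization(before, after):
    """Percentage of non-idle time between two (idle, total) samples."""
    idle_delta = after[0] - before[0]
    total_delta = after[1] - before[1]
    if total_delta <= 0:
        return 0.0
    return 100.0 * (1.0 - idle_delta / total_delta)


def read_cpu_sample(path="/proc/stat"):
    """Read the first line of /proc/stat, which aggregates all cores."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


if __name__ == "__main__" and os.path.exists("/proc/stat"):
    first = read_cpu_sample()
    time.sleep(1.0)
    second = read_cpu_sample()
    print(f"CPU utilization: {utilization(first, second):.1f}%")
```

Run it once while the sample executes on the CPU and again while it executes on the GPU or the accelerator to compare the readings.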
Steps to Run the Sample Application
Run the Sample Application on CPU
Step 1: Open a terminal on your developer kit by clicking the terminal icon in the startup menu.
Step 2: Copy the command below into the terminal window and run it to go to the sample application folder.
cd /home/devkit/Downloads/YOLOv3
Step 3: Initialize the Intel® Distribution of OpenVINO™ toolkit environment on your developer kit. Copy the command below into the terminal window and run it.
source /opt/intel/openvino/bin/setupvars.sh
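Because the environment is initialized per terminal session, it can help to confirm that it took effect before running the sample. The sketch below is one possible check. It assumes that setupvars.sh exports a variable named INTEL_OPENVINO_DIR pointing at the install directory; verify this against your toolkit version, because the guide itself does not state it.

```python
import os


def openvino_env_ready(environ=os.environ):
    """Return True if INTEL_OPENVINO_DIR (an assumed variable name) is set
    and points at an existing directory."""
    root = environ.get("INTEL_OPENVINO_DIR", "")
    return bool(root) and os.path.isdir(root)


if __name__ == "__main__":
    if openvino_env_ready():
        print("OpenVINO environment looks initialized.")
    else:
        print("INTEL_OPENVINO_DIR is not set; run setupvars.sh in this terminal first.")
```

Run the check in the same terminal window in which you sourced setupvars.sh, since a new window starts without those variables.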
Once the environment is initialized, run the sample application in the same terminal window with the commands below.
Step 4: Single detection running the sample on CPU (person detection data set)
Copy the command below into the terminal window and run it.
python3 object_detection_demo_yolov3_async.py -i /home/devkit/Downloads/YOLOv3/Sample_videos/one-by-one-person-detection.mp4 -m /home/devkit/Downloads/YOLOv3/tensorflow-yolo-v3/FP32/frozen_darknet_yolov3_model.xml -l /opt/intel/openvino/deployment_tools/inference_engine/lib/intel64/libcpu_extension_avx2.so -t 0.1
Press the Tab key while the sample runs to switch between synchronous and asynchronous modes.
You should see one person at a time being identified in the frame, along with the inference time. Note that some detections may be hard to see in the video because the target objects have similar colors.
Single detection is only a basic "Hello world" step to get started. To observe the actual performance of the developer kit, run the multiple detection samples on the CPU, GPU, and Intel® Vision Accelerator (VPU).
Step 5: Multiple detection running the sample on CPU (person, car, and bicycle detection data set)
Copy the command below into the terminal window and run it.
python3 object_detection_demo_yolov3_async.py -i /home/devkit/Downloads/YOLOv3/Sample_videos/person-bicycle-car-detection.mp4 -m /home/devkit/Downloads/YOLOv3/tensorflow-yolo-v3/FP32/frozen_darknet_yolov3_model.xml -l /opt/intel/openvino/deployment_tools/inference_engine/lib/intel64/libcpu_extension_avx2.so -t 0.1
Press the Tab key while the sample runs to switch between synchronous and asynchronous modes.
Your results may vary.
Now that you have run the sample application with both data sets on the CPU, we will change the runtime parameters to run the application on the integrated GPU and the Intel® Vision Accelerator, both of which are built into the developer kit. Moving the application to the integrated GPU or the Intel® Vision Accelerator frees the CPU for other applications.
Run the Sample Application on GPU
Step 1: Single detection running the sample on GPU (person detection data set)
Copy the command below into the terminal window and run it.
python3 object_detection_demo_yolov3_async.py -i /home/devkit/Downloads/YOLOv3/Sample_videos/one-by-one-person-detection.mp4 -m /home/devkit/Downloads/YOLOv3/tensorflow-yolo-v3/FP32/frozen_darknet_yolov3_model.xml -d GPU -t 0.1
Press the Tab key while the sample runs to switch between synchronous and asynchronous modes.
Step 2: Multiple detection running the sample on GPU (person, car, and bicycle detection data set)
Copy the command below into the terminal window and run it.
python3 object_detection_demo_yolov3_async.py -i /home/devkit/Downloads/YOLOv3/Sample_videos/person-bicycle-car-detection.mp4 -m /home/devkit/Downloads/YOLOv3/tensorflow-yolo-v3/FP32/frozen_darknet_yolov3_model.xml -d GPU -t 0.1
Press the Tab key while the sample runs to switch between synchronous and asynchronous modes.
Your results may vary. Note that the system can detect multiple different objects at a time, but this may not be clearly visible in the given sample because the target objects appear in different time frames.
Run the Sample Application on Intel® Vision Accelerator
Step 1: Single detection running the sample on the Intel® Vision Accelerator (person detection data set)
Copy the command below into the terminal window and run it.
python3 object_detection_demo_yolov3_async.py -i /home/devkit/Downloads/YOLOv3/Sample_videos/one-by-one-person-detection.mp4 -m /home/devkit/Downloads/YOLOv3/tensorflow-yolo-v3/FP32/frozen_darknet_yolov3_model.xml -d HDDL -t 0.1
Press the Tab key while the sample runs to switch between synchronous and asynchronous modes.
Step 2: Multiple detection running the sample on the Intel® Vision Accelerator (person, car, and bicycle detection data set)
Copy the command below into the terminal window and run it.
python3 object_detection_demo_yolov3_async.py -i /home/devkit/Downloads/YOLOv3/Sample_videos/person-bicycle-car-detection.mp4 -m /home/devkit/Downloads/YOLOv3/tensorflow-yolo-v3/FP32/frozen_darknet_yolov3_model.xml -d HDDL -t 0.1
Press the Tab key while the sample runs to switch between synchronous and asynchronous modes.
Your results may vary. Note that the system can detect multiple different objects at a time, but this may not be clearly visible in the given sample because the target objects appear in different time frames.
By running the application on the Intel® Vision Accelerator, you offload inference to the accelerator and free up the CPU for other applications.
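The six runs in this guide differ only in the input video and the target device: the CPU runs pass the CPU extension library with -l, while the GPU and accelerator runs pass -d GPU or -d HDDL instead. The following sketch rebuilds those command lines from the paths used in this guide. The helper name `build_command` and the dictionary keys "single" and "multi" are illustrative choices, not part of the toolkit.

```python
import shlex

# Paths as used in the commands of this guide.
BASE = "/home/devkit/Downloads/YOLOv3"
MODEL = f"{BASE}/tensorflow-yolo-v3/FP32/frozen_darknet_yolov3_model.xml"
CPU_EXTENSION = (
    "/opt/intel/openvino/deployment_tools/inference_engine/"
    "lib/intel64/libcpu_extension_avx2.so"
)
VIDEOS = {
    "single": f"{BASE}/Sample_videos/one-by-one-person-detection.mp4",
    "multi": f"{BASE}/Sample_videos/person-bicycle-car-detection.mp4",
}


def build_command(device, dataset, threshold=0.1):
    """Return the demo command as an argument list for one device and data set."""
    if device not in ("CPU", "GPU", "HDDL"):
        raise ValueError(f"unsupported device: {device}")
    cmd = [
        "python3", "object_detection_demo_yolov3_async.py",
        "-i", VIDEOS[dataset],
        "-m", MODEL,
    ]
    if device == "CPU":
        cmd += ["-l", CPU_EXTENSION]  # the CPU runs load the extension library
    else:
        cmd += ["-d", device]         # the GPU and HDDL runs select the device
    cmd += ["-t", str(threshold)]
    return cmd


if __name__ == "__main__":
    for device in ("CPU", "GPU", "HDDL"):
        for dataset in ("single", "multi"):
            print(" ".join(shlex.quote(arg) for arg in build_command(device, dataset)))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; on the developer kit you can paste any printed line into the terminal from the sample application folder.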
Learn More
You have now successfully run the pre-installed sample application with the sample data sets. Next, you can explore additional documentation to help you customize and retrain your own model and application:
- Remotely Run Demos using SSH: This article uses an example kit to show how to connect remotely to a developer kit from your own laptop or desktop system and execute a demo.
- Retrain TensorFlow Models to work with new data sets: This article shows how to retrain a model with the data sets you want to use for your application.
- Model Optimizer: Shows how to optimize a model for the Inference Engine after you have trained it with a new data set.
- Model Downloader: This article provides steps for using the Model Downloader command line interface. The tool makes it easier to download various publicly available, open source, pre-trained Deep Neural Network (DNN) models.
- The Open Model Zoo Repository: Provides additional models and demos to help you expedite development with pre-trained models and additional tools.