Using Docker* Containers with Open vSwitch* and DPDK on Ubuntu* 17.10

ID: 660442
Updated: 4/12/2018
Version: Latest
Public


Overview

This article describes how to configure containers to take advantage of Open vSwitch* with the Data Plane Development Kit (OvS-DPDK). With the rise of cloud computing, it's increasingly important to get the most out of your server's resources, and these technologies help us achieve that goal. To begin, we will install Docker* (v1.13.1), DPDK (v17.05.2), and OvS (v2.8.0) on Ubuntu* 17.10, and then configure OvS to use the DPDK. Finally, we will use iPerf3 to compare network throughput between Docker's default networking and OvS-DPDK.

We configure Docker to create a logical switch using OvS-DPDK, and then connect two Docker containers to the switch. We then run a simple iPerf3 OvS-DPDK test case. The following diagram captures the setup.

Test configuration diagram

Installing Docker and OvS-DPDK

Run the following commands to install Docker and OvS-DPDK.

sudo apt install docker.io
sudo apt install openvswitch-switch-dpdk

After installing OvS-DPDK, point Ubuntu's ovs-vswitchd alternative at the DPDK-enabled binary and restart the OvS service.

sudo update-alternatives --set ovs-vswitchd /usr/lib/openvswitch-switch-dpdk/ovs-vswitchd-dpdk
sudo systemctl restart openvswitch-switch.service

Configure Ubuntu* 17.10 for OvS-DPDK

The system used in this demo is a two-socket server with 22 cores per socket (44 physical cores in total), with Intel® Hyper-Threading Technology (Intel® HT Technology) enabled for 88 logical cores. The CPU model is an Intel® Xeon® processor E5-2699 v4 @ 2.20 GHz. To configure Ubuntu for optimal use of OvS-DPDK, we change the GRUB* command-line options that are passed to Ubuntu at boot time. To do this we edit the following config file:

/etc/default/grub

Change the setting GRUB_CMDLINE_LINUX_DEFAULT to the following:

GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1G hugepagesz=1G hugepages=16 hugepagesz=2M hugepages=2048 iommu=pt intel_iommu=on isolcpus=1-21,23-43,45-65,67-87"

This makes GRUB aware of the new options to pass to Ubuntu at boot time. We set isolcpus so that the Linux* scheduler runs on only two physical cores (cores 0 and 22, plus their hyper-threaded siblings 44 and 66). Later, we will allocate the remaining cores to the DPDK. We also set the number of pages and the page sizes for hugepages. For details on why hugepages are required and how they can help improve performance, refer to the section called Use of Hugepages in the Linux Environment in the System Requirements chapter of the Getting Started Guide for Linux at dpdk.org.

Note: The isolcpus setting varies depending on how many cores are available per CPU.
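
If your CPU layout differs, one way to work out suitable isolcpus values is to inspect the topology first. The commands below are a generic check and are not part of the original article:

# Show each logical CPU with its socket and physical core ID.
lscpu --extended=CPU,SOCKET,CORE
# Hyper-threaded siblings of a given core (CPU 0 here) are listed in sysfs.
cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list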

Also, we edit /etc/dpdk/dpdk.conf to specify the number of hugepages to reserve on system boot. Uncomment and change the setting NR_1G_PAGES to the following:

NR_1G_PAGES=8

Depending on the system memory size and the problem size of your DPDK application, you may increase or decrease the number of 1 GB pages. Using hugepages improves performance because fewer pages are needed, so less time is spent on Translation Lookaside Buffer (TLB) lookups.

After both files have been updated, run the following commands:

sudo update-grub
sudo reboot

Reboot to apply the new settings. If needed, enter the BIOS during boot and enable the following (a quick way to verify these settings afterward is shown after the list):

  • Intel® Virtualization Technology (Intel® VT) for IA-32, Intel® 64 and Intel® architecture
  • Intel VT for Directed I/O
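
Once the system is back up, one way to confirm that the IOMMU/VT-d support came up as expected is to search the kernel log; this is an extra check beyond the article's steps:

# DMAR/IOMMU messages in the kernel log indicate that intel_iommu=on took effect and VT-d is active.
dmesg | grep -i -e DMAR -e IOMMU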

After logging back into your Ubuntu session, create a mount path for your hugepages:

sudo mkdir -p /mnt/huge
sudo mkdir -p /mnt/huge_2mb
sudo mount -t hugetlbfs none /mnt/huge
sudo mount -t hugetlbfs none /mnt/huge_2mb -o pagesize=2MB
sudo mount -t hugetlbfs none /dev/hugepages
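
These mounts do not persist across reboots on their own. If you want them recreated automatically, one option (an addition beyond the article's steps) is to add hugetlbfs entries to /etc/fstab along these lines:

none /mnt/huge     hugetlbfs defaults     0 0
none /mnt/huge_2mb hugetlbfs pagesize=2MB 0 0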

To ensure that the changes are in effect, run the commands below:

grep HugePages_ /proc/meminfo
cat /proc/cmdline

If the changes took place, your output from the above commands should look similar to the image below:

Output of grep HugePages_ /proc/meminfo

Output of cat /proc/cmdline
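
In addition to /proc/meminfo, the per-size hugepage pools can be checked individually through sysfs; this is an extra check, not part of the original article:

cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages   # 1 GB pages
cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages      # 2 MB pages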

Configuring OvS-DPDK Settings

To initialize the ovs-vsctl database, a one-time step, we will run the command sudo ovs-vsctl --no-wait init. The OvS database will contain user-set options for OvS and the DPDK. To pass arguments to the DPDK, we will use the command-line utility as follows:

sudo ovs-vsctl set Open_vSwitch . <argument>

Additionally, the OvS-DPDK package relies on the following config files:

  • /etc/dpdk/dpdk.conf – Configures hugepages
  • /etc/dpdk/interfaces – Configures/assigns network interface cards (NICs) for DPDK use (a sample entry is shown below)
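
For reference, an /etc/dpdk/interfaces entry follows the pattern below. The PCI address is a placeholder for your own device, and the exact commented template in the packaged file may differ slightly:

# <bus> <device id> <driver>: bind one NIC to vfio-pci at boot (placeholder address)
pci 0000:03:00.0 vfio-pci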

Next, we configure OvS to use DPDK with the following command:

sudo ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-init=true

After OvS is set up to use the DPDK, we will change two important DPDK configuration settings and one OvS setting, and then bind our NIC devices to the DPDK.

DPDK settings

  • dpdk-lcore-mask: Specifies the CPU cores on which dpdk lcore threads should be spawned. A hex string is expected.
  • dpdk-socket-mem: Comma-separated list of memory to preallocate from hugepages on specific sockets.

OvS settings

  • pmd-cpu-mask: PMD (poll-mode driver) threads can be created and pinned to CPU cores by explicitly specifying pmd-cpu-mask. These threads poll the DPDK devices for new packets instead of relying on the NIC driver to raise an interrupt when a new packet arrives.

Use the following commands to configure these settings:

sudo ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-lcore-mask=0xfffffbffffe
sudo ovs-vsctl --no-wait set Open_vSwitch . other_config:dpdk-socket-mem="1024,1024"
sudo ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=0x800002

For dpdk-lcore-mask we used a mask of 0xfffffbffffe to specify the CPU cores on which the dpdk lcore threads should spawn. For more information about masking, read Mask (computing). On our system, this spawns the lcore threads on cores 1-21 and 23-43; cores 0 and 22, which the isolcpus setting left to the Linux scheduler, are excluded. Similarly, for pmd-cpu-mask we used the mask 0x800002, which pins one PMD thread to core 1 on non-uniform memory access (NUMA) node 0 and one to core 23 on NUMA node 1. Finally, since we have a two-socket system, we allocate 1 GB of hugepage memory per NUMA node; that is, "1024,1024". For a single-socket system, the string would be "1024".
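
If you need to build these masks for a different core layout, the arithmetic is simply setting one bit per core ID. A small shell sketch, using the same cores as the pmd-cpu-mask above:

# Build a CPU mask by OR-ing one bit per core ID; cores 1 and 23 reproduce
# the pmd-cpu-mask used above.
mask=0
for core in 1 23; do
    mask=$(( mask | (1 << core) ))
done
printf '0x%x\n' "$mask"   # prints 0x800002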

Binding Devices to DPDK

To bind your NIC device to the DPDK, run the dpdk-devbind command. For example, to unbind eth1 from its current kernel driver and bind it to the vfio-pci driver, run dpdk-devbind --bind=vfio-pci eth1. Before using the vfio-pci driver, run modprobe to load it and its dependencies.

This is what it looked like on our system, with two 10 Gb interfaces available:

sudo modprobe vfio-pci
sudo dpdk-devbind --bind=vfio-pci enp3s0f0
sudo dpdk-devbind --bind=vfio-pci enp3s0f1
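
dpdk-devbind also accepts PCI addresses, which is useful once an interface is bound and no longer listed under its kernel name; the address below is only a placeholder for your own device:

sudo dpdk-devbind --bind=vfio-pci 0000:03:00.0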

To check whether the NIC cards you specified are bound to the DPDK, run the command:

sudo dpdk-devbind --status

If all is correct, you should have an output similar to the image below:

Binding of NICs to the DPDK

Configuring Open vSwitch with Docker

To have Docker use OvS as a logical networking switch, we must complete a few more steps. First, we install Open Virtual Network (OVN) and Consul*, a distributed key-value store, by running the commands below. With OVN and Consul we will run in overlay mode, which makes a multi-tenant, multi-host network possible.

sudo apt install ovn-common
sudo apt install ovn-central
sudo apt install ovn-host
sudo apt install ovn-docker
sudo apt install python-openvswitch
wget https://releases.hashicorp.com/consul/1.0.6/consul_1.0.6_linux_amd64.zip
unzip consul_1.0.6_linux_amd64.zip
sudo mv consul /usr/local/bin

The Open Virtual Network* (OVN) is a system that supports virtual network abstraction. It complements the existing capabilities of OVS with native support for virtual network abstractions, such as virtual L2 and L3 overlays and security groups, along with services such as DHCP. Like OVS, OVN's design goal is a production-quality implementation that can operate at significant scale.

Once we have OVN and Consul installed we will:

  1. Stop the Docker daemon.
  2. Set up the Consul agent.
  3. Restart the Docker daemon.
  4. Configure OVN in overlay mode.

Consul key-value store

To start a Consul key-value store, open another terminal and run the following commands:

mkdir /tmp/consul
sudo consul agent -ui -server -data-dir /tmp/consul -advertise 127.0.0.1 -bootstrap-expect 1

This starts a Consul agent that advertises only locally, expects a single server in the Consul cluster, and stores its state in /tmp/consul. A Consul key-value store is needed for service discovery in multi-host, multi-tenant configurations.

An agent is the core process of Consul. The agent maintains membership information, registers services, runs checks, responds to queries, and more. The agent must run on every node that is part of a Consul cluster.
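
As a quick sanity check (not part of the original steps), you can confirm from another terminal that the agent is running and the local node has joined:

consul members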

Restart the Docker daemon

Once the Consul agent has started, open another terminal and restart the Docker daemon with the following arguments, with $HOST_IP set to the local IPv4 address:

sudo dockerd --cluster-store=consul://127.0.0.1:8500 --cluster-advertise=$HOST_IP:0

The --cluster-store option tells the Docker Engine the location of the key-value store for the overlay network. More information can be found in the Docker documentation.
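
If you would rather not launch dockerd by hand each time, the same options could be persisted in /etc/docker/daemon.json. This is a sketch, assuming the cluster-store and cluster-advertise keys supported by Docker releases of this generation, with HOST_IP set as above:

sudo tee /etc/docker/daemon.json <<EOF
{
    "cluster-store": "consul://127.0.0.1:8500",
    "cluster-advertise": "${HOST_IP}:0"
}
EOF
sudo systemctl restart docker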

Configure OVN in overlay mode

To configure OVN in overlay mode, follow the instructions at:

http://docs.openvswitch.org/en/latest/howto/docker/#the-overlay-mode

After OVN has been set up in overlay mode, install Python* Flask and start the ovn-docker-overlay-driver.

The ovn-docker-overlay-driver uses the Python Flask module to listen for Docker's networking API calls.

sudo pip install flask
sudo ovn-docker-overlay-driver --detach

To create the OvS-DPDK logical switch that Docker will use, run the following command:

sudo docker network create -d openvswitch --subnet=192.168.22.0/24 ovs

To list the logical network switches, run:

sudo docker network ls
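
To see the subnet, driver, and any containers attached to the new network, you can also inspect it (an optional check):

sudo docker network inspect ovs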

Pull iperf3 Docker Image

To download the iperf3 Docker image, run the following command:

sudo docker pull networkstatic/iperf3

Later we will create two iperf3 containers for a multi-tenant, single-host benchmark.

Default versus OvS-DPDK Networking Benchmark

First, we generate an artificial workload on our server to simulate a server in production. To do this, install the stress tool:

sudo apt install stress

After installation, run stress with the following arguments (or whatever combination puts a meaningful load on your server):

sudo stress -c 6 -i 6 -m 6 -d 6 
-c, --cpu N        spawn N workers spinning on sqrt()  
-i, --io N         spawn N workers spinning on sync()  
-m, --vm N         spawn N workers spinning on malloc()/free()  
-d, --hdd N        spawn N workers spinning on write()/unlink()      
    --hdd-bytes B  write B bytes per hdd worker (default is 1GB)

Default Docker networking test

With our system under load, our first test runs the iperf3 container using Docker's default network configuration, which uses Linux bridges. To benchmark throughput over Docker's default networking, open two terminals. In the first terminal, run the iperf3 container in server mode.

sudo docker run -it --rm --name=iperf3-server -p 5201:5201 networkstatic/iperf3 -s

Open another terminal, and then run iperf3 in client mode.

docker inspect --format "{{ .NetworkSettings.IPAddress }}" iperf3-server
sudo docker run  -it --rm networkstatic/iperf3 -c ip_address

We obtain the Docker-assigned IP address of the container named "iperf3-server" and then start another container in client mode, passing that IP address in place of ip_address.
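
One way to wire those two steps together in a single shell snippet, equivalent to the commands above:

# Capture the server container's IP address and pass it to the iperf3 client.
SERVER_IP=$(docker inspect --format "{{ .NetworkSettings.IPAddress }}" iperf3-server)
sudo docker run -it --rm networkstatic/iperf3 -c "$SERVER_IP"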

Using Docker's default networking, we were able to achieve the network throughput shown below:

iperf3 benchmark with default networking

OvS-DPDK Docker networking test

With our system still under load, we will repeat the same test as above, but will have our containers connect to the OvS-DPDK network.

sudo docker run -it --net=ovs --rm --name=iperf3-server -p 5201:5201 networkstatic/iperf3 -s

Open another terminal and run iperf3 in client mode.

sudo docker run -it --net=ovs --rm networkstatic/iperf3 -c ip_address
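
Because the containers are now attached to the user-defined ovs network, the server's address is reported under that network rather than the default bridge. A sketch of fetching it, assuming Docker's standard per-network NetworkSettings layout:

# Look up the server's IP on the "ovs" network and start the client against it.
SERVER_IP=$(docker inspect --format "{{ .NetworkSettings.Networks.ovs.IPAddress }}" iperf3-server)
sudo docker run -it --net=ovs --rm networkstatic/iperf3 -c "$SERVER_IP"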

Using Docker with OvS-DPDK networking, we were able to achieve the network throughput shown below:

iperf3 benchmark with OvS-DPDK networking

Summary

By setting up Docker to use OvS-DPDK on our server, we achieved approximately a 1.5x increase in network throughput compared to Docker's default networking. It is also possible to use OvS-DPDK with OVN in overlay mode to create a multi-tenant, multi-host swarm or cluster in your server environment.

About the Author

Yaser Ahmed is a software engineer at Intel Corporation and has an MS degree in Applied Statistics from DePaul University and a BS degree in Electrical Engineering from the University of Minnesota.

"