Find Bugs Quickly Using Sanitizers with the Intel® oneAPI DPC++/C++ Compiler

In this tutorial, you will learn how to use sanitizers with the Intel® oneAPI DPC++/C++ Compiler to efficiently and accurately detect common issues in your C/C++ code.

Note: C++ with SYCL programs designed to run on a host CPU and a compute accelerator target device are currently not supported.

Sanitizers help identify and pinpoint undesirable or undefined behavior in your code. They are enabled with a compiler option that instruments your program, adding safety checks to the binary.

As a side effect, there will be performance and file size overhead. The impact on performance and executable size depends on the specific sanitizer used and on the characteristics of the program being analyzed, such as its size, the amount of memory it allocates, and the number of threads it uses. Thus, sanitizers should only be used for debugging and code verification, not in production code.

The benefits of using these separate sanitizer-enabled build steps are, however, tremendous as they help to detect and prevent bugs and security vulnerabilities. Their use can be vital for regular software testing in a CI/CD DevOps environment.

They also provide a convenient way for software developers to verify code changes before submitting them to a repository branch.  

In fact, sanitizers, as used with LLVM-based compilers like Clang* or the Intel® oneAPI DPC++/C++ Compiler, are fairly lightweight. This is especially true if you compare them with other open-source software testing solutions like Valgrind* or commercial code analyzer solutions for functional testing and coding standards compliance like Parasoft’s Insure++*, PVS Studio*, AbsInt Astrée*, or QA Systems Cantata*. Usually, sanitizers increase execution time by a factor of 2-3, while Valgrind can introduce overheads of up to 100x.

This makes sanitizers quite useful for testing or debugging a program as part of your regular software development flow or for identifying runtime issues that occur late in the execution of a larger application.

If, instead, you compare with more traditional interactive debug approaches like the use of GDB*, there does, of course, remain one drawback. The use of sanitizers requires recompilation of the program. Ideally, if your program depends on other shared libraries, these, too, should be recompiled with sanitizers enabled (except for the standard libc/libc++, of course). The benefit is that the code instrumentation will do the bug-hunting for you.

In this tutorial, we will take a closer look at the following sanitizers:

  1. AddressSanitizer - detect memory safety bugs
  2. UndefinedBehaviorSanitizer - detect undefined behavior bugs
  3. MemorySanitizer - detect use of uninitialized memory bugs

The example source code used throughout this tutorial can be found in the archive file sanitizers-tutorial.tgz

1. Detecting Memory Safety Bugs With the AddressSanitizer

To demonstrate the different capabilities of sanitizers, we will use a small program that prints the Fibonacci sequence, a sequence in which each number is the sum of the two preceding ones starting with 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, …

Code Sample

Let us start with the following code example, which can be found as fibonacci_v1.c inside the tutorial source archive.

#include <stdlib.h>
#include <stdio.h>

/**
 * Fill array arr of length n
 * with the first n fibonacci numbers
 */
void set_fibonacci_list(int *arr, int n) {
  arr[0] = 0;
  arr[1] = 1;

  for (int i = 2; i < n; i++) {
    arr[i] = arr[i-1] + arr[i-2];
  }
}

/**
 * Print the first n fibonacci numbers
 * */
void print_fibonacci(int n) {
  int fibos[n];

  set_fibonacci_list(fibos, n);

  printf("Fibonacci Sequence\n");
  printf("==================\n");
  for (int i = 0; i < n; i++) {
    printf("%d\n", fibos[i]);
  }
  printf("==================\n");
  if (n > 1 && fibos[n-2] != 0) {
    printf("Golden ratio approximation: %g\n", ((double)fibos[n-1])/fibos[n-2]);
  }
}

int main(int argc, char *argv[]) {
  if (argc != 2) {
    printf("Usage: %s NUM\n", argv[0]);
    return 1;
  }

  print_fibonacci(atoi(argv[1]));

  return 0;
}

Figure 1. Initial Fibonacci sequence example source code

This program takes the number of Fibonacci numbers to print as a command line parameter. It then computes the Fibonacci sequence in the function set_fibonacci_list and prints it to the screen in the function print_fibonacci.

Running the Sanitizer

We will now use the AddressSanitizer to detect potential memory-related bugs in this program. The AddressSanitizer can detect multiple memory safety bugs, including out-of-bounds accesses on the stack and heap and use-after-free bugs.

To compile the program using the AddressSanitizer, use the following command:

$ icx src/fibonacci_v1.c -O0 -g -fsanitize=address -fno-omit-frame-pointer -o fibonacci_v1_with_asan

The compiler option -fsanitize=address activates the sanitizer.

Flags -O0 -g -fno-omit-frame-pointer are added to get the best diagnostic output in case we indeed find a coding issue, but these options are not mandatory.

Note that -g implicitly sets -O0 and -fno-omit-frame-pointer, so these options are listed only to show the complete set of parameters.

There are additional sanitizer-related flags that you can pass to the command line. Please refer to the Clang Compiler User’s Manual for a complete list.

For comparison, we can also compile a version without the sanitizer:

$ icx src/fibonacci_v1.c -O0 -g -o fibonacci_v1

Now, you can run both executables with some value for N. They both should print the same output:

$ ./fibonacci_v1 10
Fibonacci Sequence
==================
0
1
1
2
3
5
8
13
21
34
==================
Golden ratio approximation: 1.61905
$ ./fibonacci_v1_with_asan 10
Fibonacci Sequence
==================
0
1
1
2
3
5
8
13
21
34
==================
Golden ratio approximation: 1.61905

However, the program contains a bug:

When n < 2, in the set_fibonacci_list function, we assign the initial Fibonacci values to indices that are out of bounds!

Let us try to run the programs with 0 as the argument and see what happens:

$ ./fibonacci_v1 0
Fibonacci Sequence
==================
$ ./fibonacci_v1_with_asan 0
====================================================
==9006==ERROR: AddressSanitizer: dynamic-stack-buffer-overflow on address 0x7ffd21da5a20 at pc 0x000000506601 bp 0x7ffd21da5990 sp 0x7ffd21da5988                                 
...

This illustrates the power of the AddressSanitizer. Normal program execution did not fail in this example. So, we might have easily missed the bug. In other configurations, the program might have crashed. But the crash could have also happened at a later point. In the worst case, a program does not crash but produces wrong results!

AddressSanitizer, on the other hand, immediately detects the error and aborts the execution, showing a verbose diagnostic report. This report includes:

  • The type, location and register values of the bug:
    dynamic-stack-buffer-overflow on address 0x7ffd21da5a20 at pc 0x000000506601 bp 0x7ffd21da5990 sp 0x7ffd21da5988
  • A traceback of where the bug occurred (here the debug compiler flags help):
    WRITE of size 8 at 0x7ffd21da5a20 thread T0
        #0 0x506600 in set_fibonacci_list /home/user/sanitizer_tutorial/src/fibonacci_v1.c:10:10
        #1 0x5067c1 in print_fibonacci /home/user/sanitizer_tutorial/src/fibonacci_v1.c:24:3
        #2 0x50691e in main /home/user/sanitizer_tutorial/src/fibonacci_v1.c:42:3
        #3 0x7feff05077b2 in __libc_start_main (/lib64/libc.so.6+0x237b2) (BuildId: ade58d86662aceee2210a9ef12018705e978965d)
        #4 0x41eb2d in _start (/home/user/sanitizer_tutorial/fibonacci_v1_with_asan+0x41eb2d) 

     

Valgrind did not detect the bug in this case, because its default tool, Memcheck, does not perform bounds checking on stack arrays:

$ valgrind ./fibonacci_v1 0
==166581== Memcheck, a memory error detector
==166581== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==166581== Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
==166581== Command: ./fibonacci_v1 0
==166581==
Fibonacci Sequence
==================
==================
==166581==
==166581== HEAP SUMMARY:
==166581==     in use at exit: 0 bytes in 0 blocks
==166581==   total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==166581==
==166581== All heap blocks were freed -- no leaks are possible
==166581==
==166581== For lists of detected and suppressed errors, rerun with: -s
==166581== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
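
For comparison, here is a minimal sketch of a heap-based variant, assuming you are free to replace the variable-length array with a malloc'd buffer. Tools that track heap blocks, Memcheck included, can generally report writes past the end of such a buffer. The helper name make_fibonacci_heap is hypothetical and not part of the tutorial archive.

```c
#include <stdlib.h>

/* Hypothetical helper (not in the tutorial archive): returns a heap-allocated
 * array holding the first n Fibonacci numbers, or NULL if n <= 0 or the
 * allocation fails. The caller is responsible for calling free(). */
int *make_fibonacci_heap(int n) {
  if (n <= 0) {
    return NULL;               /* nothing to allocate, nothing to write */
  }
  int *arr = malloc((size_t)n * sizeof *arr);
  if (arr == NULL) {
    return NULL;
  }
  arr[0] = 0;                  /* safe: n >= 1 here */
  if (n > 1) {
    arr[1] = 1;
  }
  for (int i = 2; i < n; i++) {
    arr[i] = arr[i-1] + arr[i-2];
  }
  return arr;
}
```

A caller would check the result for NULL before printing and release the buffer with free() afterwards.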

Fixing the Issue

Let us fix the memory safety bug we detected by adjusting the set_fibonacci_list function:

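void set_fibonacci_list(int *arr, int n) {
  /* arr[0] exists only if the array has at least one element */
  if (n > 0) {
    arr[0] = 0;
  }
  /* arr[1] exists only if the array has at least two elements */
  if (n > 1) {
    arr[1] = 1;
  }
  for (int i = 2; i < n; i++) {
    arr[i] = arr[i-1] + arr[i-2];
  }
}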

Figure 2. Fix in set_fibonacci_list function for n<2
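
The boundary cases can also be checked without a sanitizer. The following is a minimal sketch, assuming a guard of n > 0 before the write to arr[0] and n > 1 before the write to arr[1]. It fills a buffer with a sentinel value and verifies that exactly the first n slots were overwritten. The names fill_fibonacci and writes_stay_in_bounds are hypothetical and not part of the tutorial archive.

```c
#define SENTINEL (-1)   /* Fibonacci numbers are never negative */

/* Hypothetical stand-in for set_fibonacci_list with guarded initial writes. */
void fill_fibonacci(int *arr, int n) {
  if (n > 0) {
    arr[0] = 0;
  }
  if (n > 1) {
    arr[1] = 1;
  }
  for (int i = 2; i < n; i++) {
    arr[i] = arr[i-1] + arr[i-2];
  }
}

/* Returns 1 if fill_fibonacci wrote the first n slots of a sentinel-filled
 * buffer and left every slot after them untouched, 0 otherwise. */
int writes_stay_in_bounds(int n) {
  enum { CAPACITY = 16 };
  int buf[CAPACITY];

  if (n < 0 || n > CAPACITY - 2) {
    return 0;                        /* keep room for trailing sentinels */
  }
  for (int i = 0; i < CAPACITY; i++) {
    buf[i] = SENTINEL;
  }
  fill_fibonacci(buf, n);
  for (int i = 0; i < n; i++) {
    if (buf[i] == SENTINEL) {
      return 0;                      /* a slot that should be set was skipped */
    }
  }
  for (int i = n; i < CAPACITY; i++) {
    if (buf[i] != SENTINEL) {
      return 0;                      /* a slot past the end was overwritten */
    }
  }
  return 1;
}
```

Calling writes_stay_in_bounds for n = 0, 1, and 2 exercises exactly the inputs that triggered the AddressSanitizer report.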

You can find the new program in src/fibonacci_v2.c inside the tutorial source archive.

After recompilation, we can now check that the bug is gone by rerunning the sanitized version:

$ ./fibonacci_v2_with_asan 0
Fibonacci Sequence
==================

Perfect! We fixed the program.

2. Detecting Undefined Behavior With the UndefinedBehaviorSanitizer (UBSan)

After fixing memory-related bugs in the Fibonacci program, we can now do some more basic manual functional testing with the program.

Observing an Issue

For example, we can try to use larger values for N:

$ ./fibonacci_v2_with_asan 100
Fibonacci Sequence
==================
0
1
...
-889489150
==================
Golden ratio approximation: 9.81579

We can see that the output is wrong: The golden ratio seems very off, and Fibonacci numbers should never be negative!

Running the Sanitizer

To find out what is going wrong, let us now use the UndefinedBehaviorSanitizer (UBSan), a sanitizer that can detect types of undefined behavior in your program:

$ icx src/fibonacci_v2.c -O0 -g -fsanitize=undefined -fno-omit-frame-pointer -o fibonacci_v2_with_ubsan

Use the -fsanitize=undefined option to enable UBSan. UBSan will catch a set of common undefined behavior types. Please refer to the UndefinedBehaviorSanitizer documentation to learn how to enable checks on different or additional undefined behavior types.

Let us run our sanitizer-enabled binary:

$ ./fibonacci_v2_with_ubsan 100
src/fibonacci_v2.c:17:23: runtime error: signed integer overflow: 1836311903 + 1134903170 cannot be represented in type 'int'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior src/fibonacci_v2.c:17:23 in
Fibonacci Sequence
==================
0
...

Identifying the Cause

UBSan has successfully identified the problem: The Fibonacci sequence grows quickly and leads to a signed integer overflow, which is undefined behavior according to the C standard.

Similar to the AddressSanitizer, we are getting verbose diagnostic output:

  • The type of undefined behavior (signed integer overflow)
  • Additional information about the problem (1836311903 + 1134903170 cannot be represented in type 'int')
  • The location of the bug (undefined-behavior src/fibonacci_v2.c:17:23)

Note that, in contrast to the AddressSanitizer, the program is not aborted by default when undefined behavior is detected.
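
To avoid the overflow rather than just detect it, the addition can be checked before it is performed. The following is a minimal sketch using only comparisons against INT_MAX and INT_MIN. The helper name checked_add_int is hypothetical and not part of the tutorial archive.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *sum and returns true if the result
 * is representable as an int. Returns false and leaves *sum untouched if the
 * addition would overflow. The check itself never overflows. */
bool checked_add_int(int a, int b, int *sum) {
  if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b)) {
    return false;
  }
  *sum = a + b;
  return true;
}
```

With a 32-bit int, calling this helper with the two operands from the UBSan report (1836311903 and 1134903170) returns false instead of invoking undefined behavior.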

3. Detecting Uninitialized Memory Usage With the MemorySanitizer

The MemorySanitizer allows you to catch bugs caused by uninitialized memory usage. You can enable the sanitizer via the -fsanitize=memory flag.

Important Notes on MemorySanitizer Usage:

  • The sanitizer does not fail immediately on uninitialized memory reads. It only fails once a branch, syscall, or dynamic call depends directly or indirectly on uninitialized memory.
  • All project dependencies should be recompiled with MemorySanitizer. Otherwise, there might be many false positives.

Let us look closer at how the MemorySanitizer can detect coding issues in our program.

One alternative way to fix the integer overflow bug from the previous section is to limit the number of Fibonacci numbers computed to at most 47, because the 48th number in the sequence, 2971215073, no longer fits into a 32-bit int.
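
The limit of 47 can be derived rather than hard-coded. The following is a minimal sketch that counts how many leading Fibonacci numbers fit into an int, checking each addition against INT_MAX before performing it. The function name max_fibonacci_count is hypothetical and not part of the tutorial archive.

```c
#include <limits.h>

/* Hypothetical helper: returns how many numbers of the sequence
 * 0, 1, 1, 2, 3, ... can be stored in an int without overflow. */
int max_fibonacci_count(void) {
  int a = 0;       /* F(k-1) */
  int b = 1;       /* F(k)   */
  int count = 2;   /* 0 and 1 always fit */

  /* Continue only while a + b is still representable as an int. */
  while (a <= INT_MAX - b) {
    int next = a + b;
    a = b;
    b = next;
    count++;
  }
  return count;
}
```

On a platform where int is 32 bits wide, the loop stops once b holds 1836311903, and the function returns 47.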

Let's say that naively, we add the limitation to our set_fibonacci_list function:

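void set_fibonacci_list(int *arr, int n) {
  /* arr[0] exists only if the array has at least one element */
  if (n > 0) {
    arr[0] = 0;
  }
  /* arr[1] exists only if the array has at least two elements */
  if (n > 1) {
    arr[1] = 1;
  }
  /* only the first 47 Fibonacci numbers fit into a 32-bit int */
  if (n > 47) {
    n = 47;
  }
  for (int i = 2; i < n; i++) {
    arr[i] = arr[i-1] + arr[i-2];
  }
}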

Figure 3. Limit the number of items in set_fibonacci_list function to n=47

You can find the newly updated program in src/fibonacci_v3.c inside the tutorial source archive.

Observing an Issue

Now we can re-compile and re-run the program:

$ icx src/fibonacci_v3.c -O0 -g -fsanitize=undefined -fno-omit-frame-pointer -o fibonacci_v3_with_ubsan

$ ./fibonacci_v3_with_ubsan 100
Fibonacci Sequence
==================
0
1
...
==================
Golden ratio approximation: -0.000100335
...

The good news is that UBSan does not complain anymore, meaning the program no longer contains a signed integer overflow. The bad news is that we still have negative Fibonacci numbers in our list, and the golden ratio approximation is still off.

Another observation is that the program's output changes from run to run.

This is a hint that there might be some uninitialized memory usage.

Running the Sanitizer

We can use the MemorySanitizer to double-check that. Use the following command to compile the program with the MemorySanitizer:

$ icx src/fibonacci_v3.c -O0 -g -fsanitize=memory -fsanitize-memory-track-origins=2 -fno-omit-frame-pointer -o fibonacci_v3_with_msan

The -fsanitize=memory flag enables the MemorySanitizer. To additionally track from which variable the uninitialized memory was derived, you can optionally pass the -fsanitize-memory-track-origins=2 flag.

Running the MemorySanitizer-enabled program yields:

$ ./fibonacci_v3_with_msan 100
Fibonacci Sequence
==================
0
1
1
...
==177412==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x4afd9e in print_fibonacci /home/user/sanitizer_tutorial/src/fibonacci_v3.c:36:5
    #1 0x4b03ec in main /home/user/sanitizer_tutorial/src/fibonacci_v3.c:53:3
    #2 0x7f1eac55a7b2 in __libc_start_main (/lib64/libc.so.6+0x237b2) (BuildId: ade58d86662aceee2210a9ef12018705e978965d)
    #3 0x41f2dd in _start (/home/user/sanitizer_tutorial/fibonacci_v3_with_msan+0x41f2dd)

  An uninitialized value was created by an allocation of 'vla' in the stack frame of function 'print_fibonacci'
    #0 0x4af6c0 in print_fibonacci /home/user/sanitizer_tutorial/src/fibonacci_v3.c:29:3

SUMMARY: MemorySanitizer: use-of-uninitialized-value /home/user/sanitizer_tutorial/src/fibonacci_v3.c:36:5 in print_fibonacci
Exiting

Identifying the Cause

We can see that MemorySanitizer reports a use-of-uninitialized-value bug. The reason is that while we only fill the first 47 entries of the Fibonacci array, we still print and use the unassigned (and, thus, uninitialized) array values.

To fix the problem, we should add an extra check to the main function:

  if (atoi(argv[1]) > 47) {
    printf("Please provide a number of 47 or less.\n");
    return 1;
  }

Figure 4. Add check for input parameter for Fibonacci sequence exceeding n=47
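
The check in Figure 4 relies on atoi, which cannot report a parse failure and does not reject negative values. A negative value would end up as the size of the variable-length array in print_fibonacci, which is undefined behavior. The following is a minimal sketch of a stricter check based on strtol. The names parse_count and MAX_FIBONACCI_COUNT are hypothetical and not part of the tutorial archive.

```c
#include <errno.h>
#include <stdlib.h>

enum { MAX_FIBONACCI_COUNT = 47 };   /* limit taken from the tutorial text */

/* Hypothetical helper: parses text as a decimal count in the range
 * [0, MAX_FIBONACCI_COUNT]. Returns 0 and stores the value in *count on
 * success. Returns -1 and leaves *count untouched on any error. */
int parse_count(const char *text, int *count) {
  char *end = NULL;

  errno = 0;
  long value = strtol(text, &end, 10);

  /* Reject empty input, trailing garbage, and values outside long's range. */
  if (end == text || *end != '\0' || errno == ERANGE) {
    return -1;
  }
  /* Reject values outside the range the program can handle. */
  if (value < 0 || value > MAX_FIBONACCI_COUNT) {
    return -1;
  }
  *count = (int)value;
  return 0;
}
```

In main, the program would call parse_count(argv[1], &n) once, print the usage or range message if it returns -1, and pass n to print_fibonacci otherwise.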

You can find the final program in src/fibonacci_v4.c inside the tutorial source archive.
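
As an alternative design, the fill function can report how many entries it actually wrote, so that the caller never reads past them. The following is a minimal sketch of that idea and is not the approach used in the tutorial archive. The names set_fibonacci_list_clamped and MAX_FIBONACCI_COUNT are hypothetical.

```c
enum { MAX_FIBONACCI_COUNT = 47 };   /* limit taken from the tutorial text */

/* Hypothetical variant of set_fibonacci_list: writes at most
 * MAX_FIBONACCI_COUNT entries and returns the number of entries written.
 * Entries at or beyond the returned index are left untouched. */
int set_fibonacci_list_clamped(int *arr, int n) {
  if (n > MAX_FIBONACCI_COUNT) {
    n = MAX_FIBONACCI_COUNT;
  }
  if (n < 0) {
    n = 0;
  }
  if (n > 0) {
    arr[0] = 0;
  }
  if (n > 1) {
    arr[1] = 1;
  }
  for (int i = 2; i < n; i++) {
    arr[i] = arr[i-1] + arr[i-2];
  }
  return n;
}
```

The printing loop and the golden ratio computation would then use the returned count instead of the requested n, so only initialized entries are ever read.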

Congratulations, you have fixed all bugs in the program!

Summary

In this tutorial, we introduced you to the fundamentals of sanitizer usage with the Intel oneAPI DPC++/C++ Compiler and used three sanitizers to catch several bugs in a simple program.

By using sanitizers, you can effectively catch issues early in the development process, saving time and reducing the likelihood of costly errors in production code.

Get The Software  

You can install the Intel® oneAPI DPC++/C++ Compiler as a part of the Intel® oneAPI Base Toolkit or the Intel® oneAPI HPC Toolkit. You can also download a standalone version of the compiler or test it across Intel® CPUs and GPUs on the Intel® Developer Cloud platform.