Intel® Threading Building Blocks for Heterogeneous Computing

Computing platforms are becoming increasingly heterogeneous with the combination of CPUs, GPUs, FPGAs, co-processors, and domain-specific compute engines. But programming to take advantage of these heterogeneous environments in a single application remains a challenge. This video provides a brief overview of the heterogeneous capability of Intel® Threading Building Blocks (Intel® TBB), a leading C++ template library for shared-memory parallel programming, and discusses its new and upcoming features for harnessing the spectrum of compute resources made available in heterogeneous systems. These features enable asynchronous communication with devices using standard programming models, such as OpenCL™*, and also provide the basic building blocks needed to integrate proprietary APIs for accessing devices into an Intel TBB application in a composable way. Join us to see how you can use Intel TBB not only to exploit the CPU cores on your system, but also to coordinate the other compute resources.

Hi, my name is Sharmila. Today we are going to talk about heterogeneous computing features in Intel threading building blocks. Let's get started. 

Our hardware landscape is evolving fast. Computing platforms are becoming increasingly heterogeneous, with a combination of CPUs, GPUs, FPGAs, co-processors, and domain-specific compute engines. But programming a single application to take advantage of these heterogeneous environments remains a challenge. Intel TBB, and the flow graph feature in it, can help you address this challenge. 

Developing for heterogeneous systems can be difficult because the programming models associated with different system components are disjoint. On one hand, a multi-core HPC developer might prefer environments based on C, C++, Fortran, or Java, or libraries such as OpenMP or TBB. On the other hand, we have GPGPU developers who may use languages like CUDA, DirectCompute, or open specifications like OpenCL. 

This divergence in programming tools can cause significant development and maintenance costs. With these differences in architecture, it becomes impractical to use the same programming model everywhere. Intel TBB's vision is to act as a coordination layer between these devices, the hardware, and the programming models, the software environment. 

So how can TBB help? Intel TBB's flow graph feature addresses the three common questions that come up with heterogeneous computing. One, how do you get your computation onto a device that is not necessarily programmed in C++? Two, how do you figure out the execution order of your application, and how do you get notified when a job finishes on one device so that work can start on another device? Three, how do you optimize the various code pieces so that tasks get assigned to devices and kernels in a way that delivers optimal performance? 
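To make that coordination pattern concrete, here is a minimal sketch, not taken from the video, of a TBB flow graph that hands work to a device asynchronously and reacts when the device signals completion. The async_node, function_node, and gateway interfaces are part of the Intel TBB flow graph API; the submit_to_device function and its doubling "kernel" are hypothetical stand-ins for whatever offload mechanism (OpenCL or a proprietary API) your device actually provides.

    // Sketch only: submit_to_device and the doubling "kernel" are hypothetical.
    #include <tbb/flow_graph.h>
    #include <iostream>
    #include <thread>

    using namespace tbb::flow;

    using device_node_t = async_node<int, int>;
    using gateway_t = device_node_t::gateway_type;

    // Hypothetical offload call: stands in for an OpenCL or proprietary API.
    // It runs the "kernel" on another thread and reports the result back to
    // the flow graph through the gateway.
    void submit_to_device(int input, gateway_t& gateway) {
        gateway.reserve_wait();                 // tell the graph async work is in flight
        std::thread([input, &gateway] {
            int result = input * 2;             // placeholder for real device work
            gateway.try_put(result);            // completion notification to the graph
            gateway.release_wait();             // async work is done
        }).detach();
    }

    int main() {
        graph g;

        // CPU stage: prepare work on the host.
        function_node<int, int> prepare(g, unlimited, [](int x) { return x + 1; });

        // Device stage: asynchronous offload; the graph's worker threads are
        // not blocked while the device runs.
        device_node_t offload(g, unlimited, [](int x, gateway_t& gw) {
            submit_to_device(x, gw);
        });

        // CPU stage: consume device results as they arrive.
        function_node<int, continue_msg> consume(g, serial, [](int r) {
            std::cout << "device result: " << r << "\n";
            return continue_msg();
        });

        make_edge(prepare, offload);
        make_edge(offload, consume);

        prepare.try_put(10);
        g.wait_for_all();   // waits for both CPU tasks and reserved async work
        return 0;
    }

In this sketch, the graph edges express the execution order, the gateway's try_put delivers the "job done" notification from the device back into the graph, and the CPU stages keep running as TBB tasks, which is how the flow graph answers the three questions above.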

Our vision with TBB is to provide one parallel software platform to coordinate across these libraries, programming models, and devices. We are constantly developing additional heterogeneous support, with both low-level building blocks and high-level support. 

Thank you for watching. For more details, check out our webinar linked in the description below, and don't forget to like this video and subscribe.