Essential Tools for Jumpstarting AI Development Projects

Author

@IntelDevTools

AI projects are compute- and data-intensive. Many projects start with hardware platforms chosen and optimized only for the training phase of AI, yet data manipulation and preprocessing can also slow development. Often, deployed models must run in browsers or on edge devices, which means a smaller memory footprint, while still providing fast turnaround on a different processing architecture. As a result, many AI projects must maintain multiple code repositories.

But imagine if you could create just one TensorFlow* project and run the same code on all processors. Or if you could train and run your PyTorch* models and automatically use the best configurations and optimizations for the hardware they're running on. Or imagine if you could analyze your entire end-to-end machine learning pipeline using just one tool. This is what Intel's AI tools, libraries, and framework optimizations, all built on the oneAPI open standard, provide for AI developers.

This article examines oneAPI and explores some of Intel's many tools built on it to see how oneAPI makes AI development more straightforward and manageable.

A Single Code Base to Target Multiple Architectures

oneAPI is a unified, cross-platform, cross-industry standard for building data and machine learning applications that enables developers to build and innovate faster. The same code can target multiple architectures, and the optimizations that the toolkits provide improve performance on each of them. Companies have been collaborating to deliver tools that follow this standard, giving data scientists and AI and machine learning developers a better development experience.

People working on AI projects no longer need to manage, or worry about, the processor and accelerator architectures that their projects run on and support. Intel's AI optimizations reduce not only the compute time of AI models but also the development time and effort involved for teams across the industry.

AI Tools, Libraries, and Framework Optimizations

Here's a brief overview of some tools designed to jump-start your AI projects with faster development and performance.

AI Tools

AI Tools is a collection of familiar Python*-based libraries, tools, and frameworks optimized to get the most performance out of Intel® CPUs, GPUs, and accelerators.

Software optimizations built into standard frameworks such as TensorFlow, PyTorch*, scikit-learn*, and XGBoost can improve memory usage for large models and datasets, and can significantly speed up training and inference. Because these optimizations are powered by oneAPI, you can deploy them across hardware architectures.

Intel® Distribution of Modin* is a drop-in replacement for pandas, enabling distributed DataFrame processing that uses all of your machine's available cores. AI Tools also includes a model compression tool, Intel® Neural Compressor. Model compression is a collection of techniques, such as quantization, pruning, and knowledge distillation, that reduce the size of your model while maintaining your required level of prediction accuracy.
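To make the quantization idea concrete, here is a minimal sketch in plain Python. It is not the Intel Neural Compressor API; the function names `quantize_int8` and `dequantize` and the sample weights are hypothetical, chosen only to show how floating-point weights can be mapped to 8-bit integers with a single scale factor and how much precision is lost on the way back.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # One shared scale factor; guard against an all-zero tensor.
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical layer weights, for illustration only.
weights = [0.42, -1.27, 0.03, 0.9, -0.55]

quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print("int8 values:", quantized)
print("scale factor:", scale)
print("largest round-trip error:", max_error)
```

In this scheme each weight is recovered to within half a quantization step (`scale / 2`), while each stored value shrinks from 32 bits to 8. Production tools typically layer calibration data and accuracy checks on top of this basic mapping.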

Finally, AI Tools includes Intel® AI Reference Models, a collection of pretrained models, sample scripts, and step-by-step tutorials to help you start as quickly as possible.

Intel® Distribution of OpenVINO™ Toolkit

The Intel® Distribution of OpenVINO™ toolkit can load, convert, and optimize models trained in popular frameworks and formats such as TensorFlow*, PyTorch*, Caffe*, and ONNX* (Open Neural Network Exchange). You can deploy the same application to combinations of CPU, GPU, and accelerator hardware, while inferencing in parallel across them. Similarly, after fine-tuning and optimization, you can deploy your application on-premises, in the cloud, to an edge device, or in the browser.

This toolkit also comes with its own model zoo and utilities that are vital for deployment, such as quantization, a model optimizer, and a model server.

BigDL 2.0

For large-scale, distributed big data and AI applications, there is BigDL. As the name suggests, BigDL provides various deep learning features for handling large amounts of data. DLlib, the flagship BigDL library, lets you use Apache Spark* as the pipeline for quickly building and deploying full end-to-end AI applications. Privacy-preserving machine learning (PPML) adds Intel® Software Guard Extensions protection to distributed big data and AI applications in private and public cloud environments. BigDL also includes libraries that make it easy to build time-series analysis and recommendation applications.

AI Reference Kits

Intel and Accenture* have developed AI reference kits to help jump-start projects in a given application and industry. These kits include an open source trained model for an application like natural language processing, computer vision, or predictive maintenance. And they come with industry-specific data sets, such as airline traveler text snippets, chest X-ray images, and utility pole attributes. They're a great starting point from which you can use the included oneAPI components and libraries to apply transfer learning with your own specific data.

Jump-Starting AI Development

To give you some ideas for using these tools, let's explore use cases where you can combine these tools into an AI application.

Document Routing with Intelligent Indexing

Large organizations such as enterprises, governments, and healthcare and education providers may have to manually route millions of documents each year. An AI model that classifies documents based on terms in the text can make this much more efficient. You can train a support vector classification (SVC) model using Intel® Extension for Scikit-learn* with your own relevant dataset. Because the dataset is large, Intel Distribution of Modin helps accelerate the preprocessing. Learn how to get started with a model pretrained on a generic dataset with the Document Routing with Intelligent Indexing reference kit.
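The classification step described above might look like the following minimal sketch. It is not the reference kit's code: the six toy documents, the `billing` and `medical` labels, and the `router` name are made up for illustration, and the example assumes scikit-learn is installed. If Intel Extension for Scikit-learn is available, `patch_sklearn()` is applied before the scikit-learn imports; otherwise the code falls back to stock scikit-learn.

```python
# Optional acceleration: patch scikit-learn before importing its estimators.
try:
    from sklearnex import patch_sklearn
    patch_sklearn()
except ImportError:
    pass  # stock scikit-learn is used instead

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical training data: document text and the department it belongs to.
documents = [
    "invoice payment due for account",
    "billing statement and payment receipt",
    "overdue invoice reminder",
    "patient lab results and diagnosis",
    "prescription refill for patient",
    "radiology report for patient scan",
]
labels = ["billing", "billing", "billing", "medical", "medical", "medical"]

# TF-IDF turns each document into term weights; a linear SVC separates the classes.
router = make_pipeline(
    TfidfVectorizer(stop_words="english"),
    SVC(kernel="linear"),
)
router.fit(documents, labels)

predicted = router.predict(["payment reminder for overdue invoice"])[0]
print("route to:", predicted)
```

A real deployment would train on a much larger labeled corpus and hold out a test set to measure accuracy before routing documents automatically.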

Detect Objects in a Live Video Feed

If you have a video source, such as a live webcam, you can get started by downloading a pretrained TensorFlow object detection model. Because you are deploying this model to a browser to run inference on the live video feed, efficiency is important. When you import the trained TensorFlow model into the Intel Distribution of OpenVINO toolkit, convert it to the FP16 data type. Combining shorter word-length data types with the oneAPI software optimizations in the Intel Distribution of OpenVINO toolkit provides the performance you need to detect objects within the video in real time. You can try it yourself with this tutorial, or you can start with a pretrained PyTorch model, converting and optimizing it to suit your needs.
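The storage effect of a shorter word length can be seen without any toolkit. The sketch below is an illustration only: it uses Python's standard `struct` module rather than OpenVINO's own conversion tools, and the weight values are made up. It packs the same numbers as 32-bit and 16-bit floats, then measures the size difference and the rounding error that FP16 introduces.

```python
import struct

# Hypothetical model weights, for illustration only.
weights = [0.1, -0.25, 0.333, 1.5, -2.75]
count = len(weights)

# "f" is IEEE 754 single precision (4 bytes); "e" is half precision (2 bytes).
fp32_bytes = struct.pack(f"<{count}f", *weights)
fp16_bytes = struct.pack(f"<{count}e", *weights)

# Read the FP16 values back to see how far they drift from the originals.
restored = struct.unpack(f"<{count}e", fp16_bytes)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(f"FP32 size: {len(fp32_bytes)} bytes")
print(f"FP16 size: {len(fp16_bytes)} bytes")
print(f"largest round-trip error: {max_error:.6f}")
```

The FP16 copy takes half the memory, at the cost of a small rounding error per value. Whether that error is acceptable depends on the model, so accuracy should be checked after conversion.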

Intelligent Power Prediction

Power output from energy sources like wind power and photovoltaics depends on the weather, and the weather affects the demand for power. Therefore, optimizing power generation plans to efficiently meet demands requires complex prediction systems. This is a multimodel problem that requires weather forecasting and a model for how weather affects power generation and energy demands. This is a big data problem for which BigDL can provide a solution: Data from multiple sources must be collected, stored, governed, processed, and used across various deep learning and machine learning models. Learn how Goldwind* SE used BigDL to manage all this data and build prediction models to improve forecast accuracy from 59% to 79%.

Build Faster and Better

AI and data science are still new fields, with growing interest and opportunities throughout the industry. oneAPI enables you to write your code once, deploy it on multiple architectures, and run it with optimizations suited to each one. As a result, you can build better AI solutions and innovate faster while supporting different processor architectures with fewer problems.

With these tools and frameworks from Intel, you can build AI projects using standard and familiar methods and coding languages. You can also iterate and fully debug end-to-end in one place—without needing to use separate software for different parts of the workflow. This means a simplified, more streamlined development process.

AI Tools

Accelerate data science and AI pipelines—from preprocessing through machine learning—and provide interoperability for efficient model development.

Intel® Distribution of OpenVINO™ Toolkit

Optimize models trained using popular frameworks like TensorFlow*, PyTorch*, and Caffe*, and deploy across a mix of Intel® hardware and environments.
