OneOligo: Use oneAPI to Accelerate DNA Data Storage

Speaker: Raja Appuswamy, EURECOM

In the European Commission-funded Future and Emerging Technologies initiative OligoArchive, we are working on transforming DNA, the biological building block of life, into a digital building block for long-term data archival. One of the key steps in retrieving digital data stored in DNA involves clustering billions of strings with respect to edit distance. Because edit distance is computationally expensive to evaluate, this step has become a critical bottleneck in the DNA data retrieval pipeline. In this talk, we present project OneOligo, our scalable, hardware-accelerated solution for DNA read clustering. We first provide an overview of the DNA data storage pipeline. Then, we present OneJoin, a string-similarity join algorithm that combines algorithmic advances in low-distortion embedding with the cross-architecture programming ability offered by DPC++ to scale up clustering across CPUs and GPUs.
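To make the bottleneck concrete, the following is a minimal sketch of the textbook dynamic-programming edit distance (Wagner-Fischer) together with a naive all-pairs similarity join, written in plain C++. It is not the OneJoin algorithm and does not use DPC++ or low-distortion embedding; the function names `edit_distance` and `naive_similarity_join` and the `threshold` parameter are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Textbook Wagner-Fischer edit distance: insertions, deletions, and
// substitutions each cost 1. Keeps only two rows of the table, so memory
// is O(|b|) while time remains O(|a| * |b|).
std::size_t edit_distance(const std::string& a, const std::string& b) {
    std::vector<std::size_t> prev(b.size() + 1), curr(b.size() + 1);
    for (std::size_t j = 0; j <= b.size(); ++j) prev[j] = j;
    for (std::size_t i = 1; i <= a.size(); ++i) {
        curr[0] = i;
        for (std::size_t j = 1; j <= b.size(); ++j) {
            std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
            curr[j] = std::min({prev[j] + 1, curr[j - 1] + 1, subst});
        }
        std::swap(prev, curr);
    }
    return prev[b.size()];
}

// Hypothetical baseline: compare every pair of reads and report the index
// pairs (i, j), i < j, whose edit distance is at most `threshold`.
// The number of comparisons grows quadratically with the number of reads.
std::vector<std::pair<std::size_t, std::size_t>> naive_similarity_join(
    const std::vector<std::string>& reads, std::size_t threshold) {
    std::vector<std::pair<std::size_t, std::size_t>> matches;
    for (std::size_t i = 0; i < reads.size(); ++i) {
        for (std::size_t j = i + 1; j < reads.size(); ++j) {
            if (edit_distance(reads[i], reads[j]) <= threshold) {
                matches.emplace_back(i, j);
            }
        }
    }
    return matches;
}
```

With n reads of length m, this baseline performs on the order of n² comparisons, each costing on the order of m² operations, which is impractical for billions of reads. A cheaper filtering step before exact verification, such as the embedding-based approach named in the abstract, is one way to avoid most of these comparisons.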

Additional Resources

Great Cross-Architecture Challenge—A Coding Challenge

Calling all C++, DPC++, and CUDA developers. We’re searching for the next oneAPI hero—someone who can write code that will run on the latest CPUs, GPUs, and FPGAs. Submit your best projects to win some amazing prizes.

Supercomputing 2020 (SC20) Recorded Sessions on oneAPI

Self-paced Trainings Using Jupyter* Notebooks

Sign Up for Intel® DevCloud for oneAPI

Join

Intel® DevMesh Community

Intel® Innovator Program