Beijing Genomics Institute (BGI), the world’s largest genomics institute, has slashed the time to analyze batches of DNA sequencing data from nearly four days to just six hours using an NVIDIA® Tesla™ GPU-based server farm. The speedup is considered a critically important step toward determining, in an affordable manner, the sequence of chemical building blocks that make up a DNA molecule. This is key for the genomics industry to achieve its target of the $1,000 genome – the point at which genomics can be used in clinical diagnostic tests as a practical component of patient care.
“We are drowning in the genome data that our high-throughput sequencing machines create every day,” said Dr. Bingqiang Wang, head of high-performance computing at BGI. “GPU acceleration of our genome analysis applications enables our scientists to crunch through data and gain insights into bacteria, plants and humans faster than was ever possible. It offers the potential for researchers and healthcare professionals to identify highly effective and affordable individualized medicines and treatments.”
BGI researchers and collaborators have developed three genome data analysis applications that are accelerated by NVIDIA Tesla GPUs:
- SOAP3 aligner – Aligns short reads from the sequencing machine against existing reference genome sequences. Through GPU acceleration, the SOAP3 aligner can find all three-mismatch alignments in tens of seconds per one million reads, instead of tens of minutes without GPU acceleration. This means that sequencing and assembling of individual genomes for comparison to those previously sequenced and studied can be performed quickly to understand potential future disease states and treatments.
- GSNP (SNP detection) – A GPU-accelerated version of the widely used SOAPsnp software that detects single nucleotide polymorphisms (SNPs), variations at a single nucleotide position in the DNA of a genome. These SNP variations can be used to study how individuals develop diseases differently and respond to bacteria, viruses and medicines.
- GAMA (high-resolution genotyping tool) – Finds the frequency distribution of particular gene variants in a set of genes, such as variants associated with eye color or a propensity for prostate cancer.
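To make the idea of a "three-mismatch alignment" concrete, here is a minimal sketch in Python. It is a brute-force illustration only, not the method used by SOAP3 or any BGI tool: the function names `hamming_within` and `align_read` and the toy sequences are assumptions introduced for this example, and a production aligner would rely on an index and parallel hardware rather than scanning every position.

```python
def hamming_within(a, b, limit):
    """Count mismatches between two equal-length strings.

    Returns None as soon as the count exceeds `limit`, so hopeless
    candidates are abandoned early.
    """
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > limit:
                return None
    return mismatches


def align_read(read, reference, max_mismatches=3):
    """Return (position, mismatches) for every window of the reference
    that matches the read with at most `max_mismatches` substitutions.

    Hypothetical helper for illustration; insertions and deletions
    are not considered.
    """
    hits = []
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        count = hamming_within(read, window, max_mismatches)
        if count is not None:
            hits.append((pos, count))
    return hits


if __name__ == "__main__":
    # Toy data, invented for this example.
    reference = "ACGTACGTTAGCCGATAGGCTA"
    read = "TAGCCGTT"
    for pos, count in align_read(read, reference):
        print(f"position {pos}: {count} mismatch(es)")
```

Each read is checked independently of every other read, which is why this kind of workload lends itself to the massively parallel execution that GPUs provide.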
“The only way for science to reach the $1,000 genome milestone is through technologies that make analyzing DNA data easier, faster and more affordable,” said Sumit Gupta, manager of the Tesla business at NVIDIA. “GPU computing enables researchers to achieve game-changing speedups in their scientific applications, which can help reduce the cost and complexity of all types of critical research.”
BGI does groundbreaking work in sequencing the genomes of a wide range of life forms, from plants and E. coli to the giant panda, to develop better medicines, improve healthcare and create genetically enhanced food. BGI’s sequencing output is expected to soon surpass the equivalent of 700,000 human genomes per year, a dramatic increase over initial efforts, which took 13 years to sequence a single genome.
Tesla GPUs are massively parallel accelerators based on the NVIDIA CUDA® parallel computing architecture. Application developers can accelerate their applications by using CUDA C, CUDA C++ or CUDA Fortran, or by using easy-to-use directive-based compilers.