In this paper, the authors present a complete application-level study of using GPUs to accelerate data-intensive document clustering algorithms. They first propose a hardware-accelerated variant of the TF-IDF rank search algorithm that exploits GPU devices through NVIDIA’s CUDA. They then develop two highly parallelized methods to build hash tables, one with and one without support for atomic instructions. Floating-point calculations do not dominate this text mining domain, and its text processing characteristics limit the effectiveness of GPUs because of non-synchronized branching and diverging, data-dependent loop bounds. Even so, the authors achieve a significant speedup over the baseline algorithm on a general-purpose CPU. More specifically, they achieve up to a 30-fold speedup over CPU-based algorithms for selected phases of the problem solution on GPUs, with overall wall-clock speedups ranging from six-fold to eight-fold depending on algorithmic parameters.
Document clustering is a central method for mining massive amounts of data. Due to the explosion of raw documents generated on the Internet and the need to analyze them efficiently in various intelligent information systems, clustering techniques have reached their limits on single processors. General-purpose multi-core chips are increasingly deployed in response to diminishing returns in single-processor speedup caused by the frequency wall, but multi-core chips provide only linear speedups while the number of documents on the Internet grows exponentially. Hardware accelerators hold new promise for improving the performance of data-intensive problems such as document clustering. They offer more radical designs with a higher level of parallelism, but they require adaptation to novel programming environments.
In this paper, we assess the benefits of exploiting the computational power of Graphics Processing Units (GPUs) to study two fundamental problems in document mining, namely TF-IDF (Term Frequency-Inverse Document Frequency) and document clustering. We transform traditional algorithms into accelerated parallel counterparts that can be efficiently executed on many-core GPU architectures. We assess our implementations on various platforms ranging from stand-alone GPU desktops to Beowulf-like clusters equipped with contemporary GPU cards. We observe speedups of at least one order of magnitude over CPU-only desktops and clusters. This demonstrates the potential of exploiting GPU clusters to efficiently solve massive document mining problems. Such speedups, combined with the scalability potential and accelerator-based parallelization, are unique in the domain of document-based data mining, to the best of our knowledge.
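For readers unfamiliar with TF-IDF, the following is a minimal CPU-side sketch in Python, not the authors' GPU implementation. It assumes the common textbook weighting tf × log(N / df) and plain whitespace tokenization; neither choice is taken from the paper, and the function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per input document.

    Assumption: weight = (term count / document length) * log(N / df),
    the common textbook form. The paper's exact variant may differ.
    """
    # Assumed tokenization: lowercase, split on whitespace.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


if __name__ == "__main__":
    for doc_weights in tf_idf(["gpu cluster gpu", "cpu cluster"]):
        print(doc_weights)
```

Once the document frequencies are known, each document's weights are computed independently of every other document, which is the kind of per-document independence that a data-parallel implementation can exploit.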
The current project further extends the authors’ work to a broader scope by implementing large-scale document clustering on GPU clusters. The authors show that GPU clusters outperform CPU clusters by a factor of 30x to 50x, reducing the execution time of massive document clustering from half a day to around ten minutes. They also show that the performance gains stem from three factors: (1) acceleration through GPU calculations, (2) parallelization over multiple nodes with GPUs in a cluster, and (3) a well-thought-out data-centric design that promotes data parallelism.
Zhang, Yongpeng; Mueller, Frank; Cui, Xiaohui; Potok, Thomas. Data-intensive document clustering on GPU clusters. Journal of Parallel and Distributed Computing, 2010, In Press. [DOI: 10.1016/j.jpdc.2010.08.002]