Tag: algorithm

Enhancing data parallelism for ant colony optimisation on GPUs

25 January, 2012

In this paper, we present a GPU implementation of Ant Colony Optimisation (ACO), a population-based optimisation method that comprises two major stages: tour construction and pheromone update.
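The two stages named above can be sketched on the CPU as follows. This is a minimal illustration of the classic Ant System applied to a small travelling-salesman instance, not the paper's GPU implementation; the function names (`construct_tour`, `update_pheromone`) and the parameters `alpha`, `beta` and `rho` are assumptions chosen for the example.

```python
import random


def construct_tour(dist, pheromone, alpha=1.0, beta=2.0, rng=random):
    """Stage 1: one ant builds a tour, picking each next city at random
    with weight pheromone^alpha * (1 / distance)^beta."""
    n = len(dist)
    start = rng.randrange(n)
    tour = [start]
    unvisited = set(range(n)) - {start}
    while unvisited:
        current = tour[-1]
        candidates = sorted(unvisited)
        weights = [
            (pheromone[current][j] ** alpha) * ((1.0 / dist[current][j]) ** beta)
            for j in candidates
        ]
        nxt = rng.choices(candidates, weights=weights, k=1)[0]
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour


def tour_length(dist, tour):
    """Length of the closed tour (returns to the starting city)."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def update_pheromone(pheromone, tours, dist, rho=0.5):
    """Stage 2: evaporate all trails by a factor (1 - rho), then let each
    ant deposit 1 / tour_length on every edge it used."""
    n = len(pheromone)
    for i in range(n):
        for j in range(n):
            pheromone[i][j] *= 1.0 - rho
    for tour in tours:
        deposit = 1.0 / tour_length(dist, tour)
        for k in range(len(tour)):
            a, b = tour[k], tour[(k + 1) % len(tour)]
            pheromone[a][b] += deposit
            pheromone[b][a] += deposit
```

Each ant's tour construction is independent of the others, which is one reason this stage is a natural candidate for parallel execution; the deposit step, by contrast, has many ants writing to the same matrix entries.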

9th Workshop on Practical Aspects of High-Level Parallel Programming (PAPP 2012)

24 January, 2012

The PAPP workshop focuses on practical aspects of high-level parallel programming: design, implementation and optimisation of high-level programming languages, semantics of parallel languages, formal verification, and the design or certification of libraries, middleware and tools.

Efficiently Computing Tensor Eigenvalues on a GPU

20 January, 2012

In this paper we present an implementation of SS-HOPM targeted for a GPU. We describe how to exploit symmetry to save both storage and computation in the two main computational kernels of the algorithm, and for the case of solving many small tensor eigenproblems we show how to map the computation onto a GPU.
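For readers unfamiliar with SS-HOPM (the shifted symmetric higher-order power method), the basic iteration can be sketched as below for an order-3 tensor. This is a plain CPU illustration under stated assumptions, not the paper's method: it stores the tensor as a full nested list and does not exploit symmetry, and the names `apply_tensor` and `ss_hopm` and the shift value `alpha` are chosen for the example. The iteration assumes a symmetric tensor and a sufficiently large shift.

```python
import math


def apply_tensor(A, x):
    """Return the vector A x^2 for an order-3 tensor:
    y[i] = sum over j, k of A[i][j][k] * x[j] * x[k]."""
    n = len(x)
    return [
        sum(A[i][j][k] * x[j] * x[k] for j in range(n) for k in range(n))
        for i in range(n)
    ]


def ss_hopm(A, x0, alpha=1.0, iters=200):
    """Shifted power iteration: x <- normalise(A x^2 + alpha * x).
    Returns the eigenvalue estimate lambda = x . (A x^2) and the unit vector x."""
    norm = math.sqrt(sum(v * v for v in x0))
    x = [v / norm for v in x0]
    for _ in range(iters):
        y = [a + alpha * v for a, v in zip(apply_tensor(A, x), x)]
        norm = math.sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]
    lam = sum(a * v for a, v in zip(apply_tensor(A, x), x))
    return lam, x
```

The two nested sums in `apply_tensor` and the dot product at the end are the kind of small dense kernels that dominate the cost of each iteration; when many small, independent eigenproblems are solved, each one could in principle be assigned to its own group of threads.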

QCD simulations with staggered fermions on GPUs

20 January, 2012

We report on our implementation of the RHMC algorithm for the simulation of lattice QCD with two staggered flavors on Graphics Processing Units, using the NVIDIA CUDA programming language. The main feature of our code is that the GPU is not used just as an accelerator, but instead the whole Molecular Dynamics trajectory is performed on it.

The faster-than-fast Fourier transform

19 January, 2012

For a large range of practically useful cases, MIT researchers have found a way to increase the speed of one of the most important algorithms in the information sciences.

CAMPAIGN: Library of GPU-accelerated data clustering algorithms

19 January, 2012

CAMPAIGN is a library of data clustering algorithms and tools, written in ‘C for CUDA’ for Nvidia GPUs. The library provides up to two orders of magnitude speed-up over respective CPU-based clustering algorithms and is intended as an open-source resource.

Efficient two-level preconditioned conjugate gradient method on the GPU

11 December, 2011

We present an implementation of the Two-Level Preconditioned Conjugate Gradient method for the GPU. We investigate a preconditioner based on a truncated Neumann series in combination with deflation and compare it with Block Incomplete Cholesky schemes.

Accelerating arithmetic coding on a graphic processing unit

10 December, 2011

We implement a block-parallel arithmetic encoder on an NVIDIA GPU using the Compute Unified Device Architecture (CUDA) programming model. The source data sequence is divided into small blocks.

FENZI: GPU-enabled Molecular Dynamics Simulations of Large Membrane Regions

8 December, 2011

This paper presents the design and implementation of an advanced GPU algorithm for Molecular Dynamics simulations of large membrane regions in the NVT, NVE, and NPT ensembles, using explicit solvent and the Particle Mesh Ewald (PME) method for treating the conditionally convergent electrostatic component of the classical force field.

GPApriori: GPU-Accelerated Frequent Itemset Mining

29 November, 2011

In this paper we describe GPApriori, a GPU-accelerated implementation of Frequent Itemset Mining (FIM). We tested our implementation on an Nvidia Tesla graphics processor and demonstrate up to 100x speedup compared with several state-of-the-art FIM algorithms on a CPU.
