Tag: parallelization
Using explicit platform descriptions to support programming of heterogeneous many-core systems
We show various usage scenarios of our platform description language (PDL) and demonstrate the effectiveness of our framework for a commonly used scientific kernel and a financial application on different configurations of a state-of-the-art CPU/GPU system.
GRace: A low-overhead mechanism for detecting data races in GPU programs
In this paper, we propose GRace, a new mechanism for detecting races in GPU programs that combines static analysis with a carefully designed dynamic checker for logging and analyzing information at runtime.
Call for papers: IWOMP 2011
The International Workshop on OpenMP 2011 is seeking submissions of unpublished technical papers detailing innovative, original research and development related to OpenMP. All topics related to OpenMP are of interest, including OpenMP applications in any domain.
Parallel-vector algorithms for particle simulations on shared-memory multiprocessors
Two novel algorithms for shared-memory concurrent computation of particle simulations were developed, and their efficiency, scalability, and compatibility with various shared-memory architectures were measured. It was verified that the algorithms enhance the parallel efficiency on most architectures with scalar, vector, and multithreading processors. The performance on a vector processor is particularly excellent: the vector operating ratio reaches 99.8% and the vector length is almost 256—near the architectural limit.
CUDA accelerated MJPEG 2000 encoder
This paper presents a novel, portable, fault-tolerant, parallelized software implementation of the Motion JPEG 2000 (MJPEG 2000) reference encoder using CUDA. Each major structural/computational unit of JPEG 2000 is discussed in the CUDA framework, and results are provided wherever required.





