Tag: CUDA
Advanced School of Parallel Computing: Video of talks is available
The Advanced School of Parallel Computing is an intensive five-day, graduate-level course in high-performance computing that covers CUDA, OpenCL, and GPGPU programming.
Maximizing MultiGPU Machines
Multi-GPU and hybrid CPU+GPU performance depends heavily on the vendor's implementation of the PCIe bus.
Seminar at Iowa State: GPU-Accelerated Grid Search Technique for Aerospace
Guidance, Control, and Astrodynamics Seminar at Iowa State University. Thursday, November 10, 2011
The Portland Group Events at SC11
The Portland Group invites you to join us at Supercomputing 2011 in Seattle, Washington, USA, starting November 12. A list of events and schedules follows.
Improved tracking technique for visual measurements of ionic polymer–metal composites (IPMC) actuators using CUDA
This paper presents the implementation of a real-time measurement system based on visual measurements of the displacement of an actuator–cantilever.
Copperhead: compiling an embedded data parallel language
Copperhead is a functional data-parallel Python dialect, along with a runtime that currently supports CUDA-enabled GPUs. The Copperhead programmer describes parallel computations via composition of familiar data parallel primitives supporting both flat and nested data parallel computation on arrays of data.
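To make "composition of data parallel primitives" concrete, here is a minimal sketch in plain Python. It does not use the Copperhead runtime or its API; the function names `axpy`, `dot`, and `spmv` are illustrative assumptions, and the code runs sequentially on the CPU. It only shows the shape of flat (a single map) and nested (a map whose body is itself a map plus a reduction) data-parallel code.

```python
from functools import reduce
from operator import add


def axpy(a, x, y):
    # Flat data parallelism: one independent multiply-add per element,
    # expressed as a map over the zipped input arrays.
    return [a * xi + yi for xi, yi in zip(x, y)]


def dot(x, y):
    # A map (elementwise product) composed with a reduction (sum).
    return reduce(add, map(lambda xi, yi: xi * yi, x, y), 0)


def spmv(values, columns, x):
    # Nested data parallelism: an outer map over the rows of a sparse
    # matrix, where each row performs its own inner map and reduction.
    # values[i] and columns[i] hold the nonzeros and column indices of row i.
    return [
        dot(row_vals, [x[c] for c in row_cols])
        for row_vals, row_cols in zip(values, columns)
    ]


if __name__ == "__main__":
    print(axpy(2, [1, 2, 3], [10, 20, 30]))
    print(spmv([[1, 2], [3]], [[0, 2], [1]], [10, 20, 30]))
```

Because every function is built only from maps and reductions with no side effects, a compiler is free to decide how each level of nesting is mapped onto parallel hardware; that scheduling step is not modelled here.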
Accelerating CUDA graph algorithms at maximum warp
Parallel execution of graph algorithms on GPUs suffers from workload imbalance for real-world graph instances. To address this issue, we propose a novel virtual warp-centric programming method, a general strategy that prevents branch divergence and unnecessary scattered memory accesses.
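The general idea of assigning a small group of threads, rather than a single thread, to each vertex can be illustrated with a sequential Python simulation. This is a sketch under stated assumptions and not the paper's code: the graph is assumed to be stored in CSR form (`row_ptr`, `col_idx`), the function name and the `warp_size` parameter are hypothetical, and the loop over lanes stands in for threads that would run in lockstep on a GPU.

```python
def neighbor_sums_virtual_warp(row_ptr, col_idx, values, warp_size=4):
    """Sum values[] over the neighbors of every vertex of a CSR graph.

    Each vertex is handled by one simulated "virtual warp" of warp_size
    lanes. The lanes stride over the vertex's edge list together, so a
    high-degree vertex is shared among several lanes instead of stalling
    a single thread, and neighboring lanes read neighboring edges.
    """
    num_vertices = len(row_ptr) - 1
    result = [0] * num_vertices
    for v in range(num_vertices):          # one virtual warp per vertex
        start, end = row_ptr[v], row_ptr[v + 1]
        partial = [0] * warp_size          # one partial sum per lane
        for lane in range(warp_size):      # lanes run in lockstep on a GPU
            # Lane k visits edges start+k, start+k+warp_size, ...
            for e in range(start + lane, end, warp_size):
                partial[lane] += values[col_idx[e]]
        result[v] = sum(partial)           # stands in for a warp-level reduction
    return result


if __name__ == "__main__":
    # Hypothetical 4-vertex graph: vertex 0 has three neighbors, vertex 2 none.
    row_ptr = [0, 3, 4, 4, 5]
    col_idx = [1, 2, 3, 0, 0]
    values = [1, 10, 100, 1000]
    print(neighbor_sums_virtual_warp(row_ptr, col_idx, values, warp_size=2))
```

The result does not depend on `warp_size`; in a real kernel that parameter would trade idle lanes on low-degree vertices against imbalance on high-degree ones.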
Achieving a single compute device image in OpenCL for multiple GPUs
In this paper, we propose an OpenCL framework that combines multiple GPUs and treats them as a single compute device. Providing a single virtual compute device image to the user makes an OpenCL application written for a single GPU portable to platforms that have multiple GPU devices.
Volume visualization: A technical overview with a focus on medical applications
In this paper, we review volumetric image visualization pipelines, algorithms, and medical applications. We also illustrate our algorithm implementation and evaluation results, and address the advantages and drawbacks of each algorithm in terms of image quality and efficiency.
GRace: A low-overhead mechanism for detecting data races in GPU programs
In this paper, we propose GRace, a new mechanism for detecting races in GPU programs that combines static analysis with a carefully designed dynamic checker for logging and analyzing information at runtime.






