Tag: nvidia
OpenACC Gains Momentum with Growing Developer Tool Support
The OpenACC standards group today announced growing support for OpenACC in development tools, and initial results from programmers who have been using the recently released OpenACC compilers to accelerate research.
GPUs Transform Content Creation Workflow with Adobe Creative Suite 6
NVIDIA GPUs Enable Dramatic New GPU-accelerated Features for Adobe After Effects CS6, Adobe Premiere Pro CS6, Adobe SpeedGrade CS6 and Adobe Photoshop CS6
NVIDIA’s 2012 GPU Technology Conference (GTC) Opens for Registration
Four-Day Event Gathers Experts From Around the Globe; Showcases Breakthroughs Across Broad Range of Scientific, Visual, Technology Fields
Exploring the limits of GPGPU scheduling in control flow bound applications
This work tracks the root causes of execution inefficiencies when running control-flow-intensive CUDA applications on NVIDIA GPGPU hardware.
Accelerating knowledge-based energy evaluation in protein structure modeling with GPU
We present an efficient implementation of knowledge-based energy functions by taking advantage of the recent Graphics Processing Unit (GPU) architectures
Comparison of GPU architectures for asynchronous communication with finite-differencing applications
We report on some practical algorithmic and data layout approaches and on performance data for a range of GPUs with CUDA. We focus on the use of multiple GPU devices with a single CPU host and the asynchronous CPU/GPU communication issues involved.
A GPU application for high-order compact finite difference scheme
A high-order compact finite difference scheme for the solution of fluid flow problems is implemented to run on a GPU using CUDA.
Generating optimal CUDA sparse matrix-vector product implementations for evolving GPU hardware
This paper explores the effect of these different options on the performance of a routine that evaluates sparse matrix–vector products (SpMV) across three different generations of NVIDIA GPU hardware
High Precision Integer Multiplication with a GPU
We have improved our prior implementation of Strassen’s algorithm for high performance multiplication of very large integers on a general purpose graphics processor (GPU).
Optimizing the multipole-to-local operator in the fast multipole method for GPU
This paper presents a number of algorithms to run the fast multipole method (FMM) on NVIDIA CUDA-capable graphics processing units. The FMM is a class of methods to compute pairwise interactions between N particles for a given error tolerance and with computational cost.





