Tag: algorithm
Webinar: CUDA 5 Math Library Performance Overview
Jonathan Cohen and the NVIDIA CUDA Library Team present the latest benchmark results using the extensive numerical libraries included with CUDA 5. This webinar will cover all the data points and the significance of the new Math Library Performance Report.
Pragmatic optimizations for better scientific utilization of large supercomputers
In this paper we describe our experiences in running simulations of the explosion phase of Type Ia supernovae on the largest available platforms. The simulations use FLASH, a modular, adaptive mesh, parallel simulation code with a wide user base. The simulations use multiple physics components: hydrodynamics, gravity, a sub-grid flame model, a three-stage burning model, and a degenerate equation of state.
SC12 Best Paper Award: Framework for Low-Communication 1D FFT
Intel’s Software & Intel Labs devised a new framework for distributed 1D FFT problems, which traditionally require three costly all-to-all inter-node data exchanges. The new approach delivers multiple 1D FFT algorithms requiring just a single all-to-all inter-node data exchange.
Fast box-counting algorithm on GPU
In this paper we present a fast parallel version of the box-counting algorithm, which has been coded in CUDA for execution on the Graphics Processing Unit (GPU).
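For readers unfamiliar with box counting, the following is a minimal sequential sketch in Python, not the paper's CUDA implementation. It assumes the input is a set of integer 2D points, counts the occupied boxes at each box size, and estimates the dimension as the least-squares slope of log N(s) against log(1/s). The function names `box_count` and `box_counting_dimension` are hypothetical and chosen only for illustration.

```python
import math


def box_count(points, box_size):
    """Count the boxes of side `box_size` that contain at least one point."""
    return len({(x // box_size, y // box_size) for x, y in points})


def box_counting_dimension(points, box_sizes):
    """Estimate the dimension as the slope of log N(s) versus log(1/s)."""
    xs = [math.log(1.0 / s) for s in box_sizes]
    ys = [math.log(box_count(points, s)) for s in box_sizes]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Ordinary least-squares slope through the (log(1/s), log N(s)) pairs.
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den


# Illustrative inputs (assumptions, not data from the paper):
# a filled 64x64 square and a 64-pixel horizontal line.
filled_square = [(x, y) for x in range(64) for y in range(64)]
straight_line = [(x, 0) for x in range(64)]
sizes = [1, 2, 4, 8, 16]

square_dim = box_counting_dimension(filled_square, sizes)
line_dim = box_counting_dimension(straight_line, sizes)
```

The count at each box size is independent of the others, and each point maps to its box without reference to any other point, which is what makes the method a natural candidate for parallel execution.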
Orthogonalization on a general purpose graphics processing unit
Using a massively parallel algorithm for the modified Gram-Schmidt orthogonalization on an NVIDIA Tesla C2050 Computing Processor, we can compensate for the cost of one extra level of precision, even for modest dimensions.
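As a reminder of what the modified Gram-Schmidt method computes, here is a minimal sequential sketch in plain Python. It is not the paper's GPU code and uses ordinary double precision rather than extended precision. It assumes the input vectors are linearly independent, and the name `modified_gram_schmidt` is hypothetical.

```python
import math


def modified_gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors given as lists of floats."""
    q = [list(v) for v in vectors]
    for i in range(len(q)):
        # Normalize the current vector.
        norm = math.sqrt(sum(c * c for c in q[i]))
        q[i] = [c / norm for c in q[i]]
        # Remove the q[i] component from every remaining vector right away.
        # This is the "modified" step; the updates for different j are
        # independent of each other.
        for j in range(i + 1, len(q)):
            proj = sum(a * b for a, b in zip(q[i], q[j]))
            q[j] = [b - proj * a for a, b in zip(q[i], q[j])]
    return q


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


# Illustrative input (an assumption, not data from the paper).
basis = modified_gram_schmidt([[1.0, 1.0, 0.0],
                               [1.0, 0.0, 1.0],
                               [0.0, 1.0, 1.0]])
```

Because the inner loop over `j` touches each remaining vector independently, those updates are the part a parallel implementation can distribute across threads.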
A multi-thread scheduling method for 3D CT image reconstruction using multi-GPU
In this method we use multiple threads to control the GPUs and a separate thread to handle data storage, so that calculation and data storage proceed simultaneously.
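The general pattern described here, with one control thread per device plus a dedicated storage thread, can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' scheduler: `reconstruct_slice` is a hypothetical placeholder for the GPU work, and storing into a dictionary stands in for writing to disk.

```python
import queue
import threading


def reconstruct_slice(device_id, slice_index):
    """Hypothetical stand-in for reconstructing one slice on one GPU."""
    return slice_index, slice_index * slice_index


def reconstruct_volume(num_devices, num_slices):
    tasks = queue.Queue()    # slice indices waiting to be computed
    results = queue.Queue()  # finished slices waiting to be stored
    stored = {}

    for index in range(num_slices):
        tasks.put(index)

    def compute_worker(device_id):
        # One control thread per device pulls work until none is left.
        while True:
            try:
                index = tasks.get_nowait()
            except queue.Empty:
                return
            results.put(reconstruct_slice(device_id, index))

    def storage_worker():
        # A separate thread stores results while computation continues.
        while True:
            item = results.get()
            if item is None:  # sentinel: all compute threads are done
                return
            index, value = item
            stored[index] = value

    writer = threading.Thread(target=storage_worker)
    writer.start()
    workers = [threading.Thread(target=compute_worker, args=(d,))
               for d in range(num_devices)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    results.put(None)
    writer.join()
    return stored


volume = reconstruct_volume(num_devices=2, num_slices=8)
```

The results queue decouples the two stages, so a compute thread never waits for a slice to be written before starting the next one.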
On the Future of High Performance Computing: How to Think for Peta and Exascale Computing
Jack Dongarra of Oak Ridge National Laboratory, speaking at the 41st SPEEDUP Workshop on High-Performance Computing, ETH Zurich, Switzerland, September 7, 2012.
Numerical Study of Geometric Multigrid Methods on CPU-GPU Heterogeneous Computers
In this work, we studied the performance of GMG on CPU-GPU heterogeneous computers. Our numerical results suggest that in the best-case scenario the GPU version of GMG can achieve 18.5 times speed-up in 2D and 16.0 times speed-up in 3D compared with an efficient implementation of multigrid methods on CPUs.
Collision Detection Method for High Resolution Objects Using Tessellation Unit on GPU
After tessellation for collision detection on the GPU, 200-million-face models cannot be processed in real time. Our algorithm proposes a high-potential-collision area selection.
Notable SIGGRAPH 2012 technical papers focused on GPU computing
We collected notable GPU computing talks at the upcoming SIGGRAPH 2012 meeting. All DOIs link to the corresponding technical papers.






