Category: Physical Science
We have presented a highly optimized LBM-based kernel for GPGPUs in CUDA and OpenCL. Since the LBM is mainly memory-bandwidth bound and a single lattice update takes 304 bytes on the GPU and 456 bytes on the CPU in double precision (DP), we initially established an upper performance limit based on sustainable memory bandwidth
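The bandwidth argument above can be turned into a quick estimate: if every lattice update takes a fixed number of bytes, the update rate cannot exceed the sustainable bandwidth divided by that byte count. The sketch below uses the 304-byte and 456-byte figures quoted above; the function name and the 100 GB/s bandwidth are hypothetical placeholders, not values from the work itself.

```python
# Bandwidth-bound ceiling for a lattice Boltzmann (LBM) kernel.
# Bytes per update are the figures quoted in the text (double precision).
GPU_BYTES_PER_UPDATE = 304
CPU_BYTES_PER_UPDATE = 456


def max_lattice_updates_per_second(bandwidth_bytes_per_s, bytes_per_update):
    """Upper limit on lattice updates per second for a memory-bound kernel."""
    if bytes_per_update <= 0:
        raise ValueError("bytes_per_update must be positive")
    return bandwidth_bytes_per_s / bytes_per_update


if __name__ == "__main__":
    # ASSUMPTION: 100 GB/s is an illustrative bandwidth, not a measurement.
    assumed_bandwidth = 100e9
    for label, nbytes in (("GPU", GPU_BYTES_PER_UPDATE),
                          ("CPU", CPU_BYTES_PER_UPDATE)):
        limit = max_lattice_updates_per_second(assumed_bandwidth, nbytes)
        # Report in million lattice updates per second (MLUPS).
        print(f"{label}: at most {limit / 1e6:.1f} MLUPS")
```

Substituting a measured sustainable bandwidth (for example from a stream-style copy benchmark) for the placeholder gives the ceiling against which a real kernel can be compared.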
We consider the use of commodity graphics processing units (GPUs) for the common task of numerically integrating ordinary differential equations (ODEs), achieving speedups of up to 115-fold over comparable serial CPU implementations, and 15-fold over multithreaded CPU code with SIMD intrinsics.
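One common way to map ODE integration onto data-parallel hardware is to advance many independent initial conditions in lockstep with the same step routine. The snippet above does not say which scheme or decomposition was used, so the following is only a sketch under that assumption: a classical fourth-order Runge-Kutta step applied to a small ensemble, written in plain Python for readability. The names `rk4_step`, `integrate_ensemble`, and `decay` are hypothetical.

```python
import math


def rk4_step(f, t, y, h):
    """Advance dy/dt = f(t, y) by one classical RK4 step of size h."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + (h / 6) * (k1 + 2 * k2 + 2 * k3 + k4)


def integrate_ensemble(f, initial_values, t0, t1, steps):
    """Integrate every initial value independently over [t0, t1].

    The inner list comprehension is the data-parallel part: each element
    is independent, so on a GPU each could be assigned to its own thread.
    """
    h = (t1 - t0) / steps
    ys = list(initial_values)
    t = t0
    for _ in range(steps):
        ys = [rk4_step(f, t, y, h) for y in ys]
        t += h
    return ys


def decay(t, y):
    """Test problem dy/dt = -y with exact solution y0 * exp(-t)."""
    return -y


if __name__ == "__main__":
    starts = [1.0, 2.0, 4.0]
    finals = integrate_ensemble(decay, starts, 0.0, 1.0, 100)
    for y0, y in zip(starts, finals):
        print(f"y0={y0}: numeric={y:.8f}, exact={y0 * math.exp(-1.0):.8f}")
```

Because each ensemble member is independent, the per-step work has no cross-element communication, which is the property that makes this pattern a natural fit for a GPU.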
In this paper, we have highlighted some of the benefits and limitations of early adoption of GPGPU for astronomy. While there are risks and significant effort may be required to prepare codes, in many cases the benefits will outweigh the limitations.
Table of Contents of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis Proceedings (SC11)
First-principles calculations of a silicon nanowire on the “K computer” Awarded the ACM Gordon Bell Prize
Research results obtained using the “K computer” were awarded the ACM Gordon Bell Prize, Peak-Performance at SC11, the International Conference for High Performance Computing
Application of Graphics Processing Units to the Study of Non-linear Dynamics of the Exciton Bose-Einstein Condensate
We have investigated the application of GPUs using NVIDIA’s CUDA programming environment to the numerical solution of the Gross-Pitaevskii equation, which describes the dynamics of the Bose-Einstein condensate of excitons in a semiconductor quantum well.
Implementation of GEOS-5 on GPUs provides a useful benchmark for the scalability of global atmospheric models on GPUs, and facilitates evaluation of future system architecture configurations.
In this paper, we have proposed a fast method for simulating star maps for a star sensor. The method selects the stars in the field of view (FOV) more quickly and achieves a high real-time simulation speed.
Porting Optimized GPU Kernels to a Multi-core CPU: Computational Quantum Chemistry Application Example
We investigate techniques for optimizing a multi-core CPU code back-ported from a highly optimized GPU kernel. We show that common sub-expression elimination and loop unrolling optimization techniques improve code performance
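The two transformations named above can be illustrated on a toy loop. This is a sketch of the transformations only, not the kernel from the work: the functions `weighted_sum_naive` and `weighted_sum_optimized` and the expression being summed are hypothetical, and Python is used for readability, so no speedup is implied here. In a compiled CPU code the same rewrite reduces redundant arithmetic and loop overhead.

```python
import math


def weighted_sum_naive(xs, a):
    """Reference loop: the product a*x*x is evaluated twice per element."""
    total = 0.0
    for x in xs:
        total += math.exp(-a * x * x) * (a * x * x)
    return total


def weighted_sum_optimized(xs, a):
    """Same sum after common sub-expression elimination and 2x unrolling."""
    total0 = 0.0
    total1 = 0.0
    n = len(xs)
    i = 0
    # Unrolled by two: each trip handles two elements with separate
    # accumulators, halving the number of loop-condition checks.
    while i + 1 < n:
        t0 = a * xs[i] * xs[i]          # shared sub-expression, computed once
        t1 = a * xs[i + 1] * xs[i + 1]
        total0 += math.exp(-t0) * t0
        total1 += math.exp(-t1) * t1
        i += 2
    # Remainder iteration when the length is odd.
    if i < n:
        t = a * xs[i] * xs[i]
        total0 += math.exp(-t) * t
    return total0 + total1


if __name__ == "__main__":
    data = [0.1 * k for k in range(11)]
    print(weighted_sum_naive(data, 0.5))
    print(weighted_sum_optimized(data, 0.5))
```

The two accumulators change the order of the floating-point additions, so the results agree to rounding error rather than bit for bit.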
Guidance, Control, and Astrodynamics Seminar at Iowa State University. Thursday, November 10, 2011