Tag: algorithm
Special Issue on “Aspects of Numerical Algorithms, Parallelization and Applications”
Numerical algorithms, their parallelization, and their applications have been a major thrust of research and are applied throughout computational science and engineering. Numerical algorithms are widely used by scientists working in many different fields.
Accelerating floating-point fitness functions in evolutionary algorithms: a FPGA-CPU-GPU performance comparison
The main objective of this work is to compare FPGA and CPU implementations of different fitness functions in evolutionary algorithms, in order to study the performance of the floating-point arithmetic that is often present in the optimization problems these algorithms tackle.
Efficient GPU-based time domain solver for the acoustic wave equation
An efficient algorithm for time-domain solution of the acoustic wave equation for the purpose of room acoustics is presented. It is based on adaptive rectangular decomposition of the scene and uses analytical solutions within the partitions that rely on spatially invariant speed of sound.
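As a rough illustration of why a spatially invariant speed of sound permits an analytical solution inside a rectangular partition, the sketch below evolves a 1-D pressure field in a single rigid-walled partition by letting each cosine mode oscillate at its own frequency. It is a toy under stated assumptions, not the paper's solver: the function names, the 1-D setting, the zero initial velocity, and the absence of sources and partition interfaces are all choices made here for illustration.

```python
import math


def dct(values):
    """Project samples onto cosine modes (unnormalized DCT-II)."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * i * (j + 0.5) / n) for j, v in enumerate(values))
        for i in range(n)
    ]


def idct(coeffs):
    """Inverse of dct() above (scaled DCT-III)."""
    n = len(coeffs)
    return [
        (coeffs[0]
         + 2 * sum(coeffs[i] * math.cos(math.pi * i * (j + 0.5) / n)
                   for i in range(1, n))) / n
        for j in range(n)
    ]


def pressure_at(initial, length, speed, t):
    """Pressure at time t in one partition, assuming zero initial velocity.

    Assumption: with a constant speed of sound, mode i behaves as an
    independent oscillator with angular frequency speed * pi * i / length.
    """
    modes = dct(initial)
    evolved = [
        m * math.cos(speed * math.pi * i / length * t)
        for i, m in enumerate(modes)
    ]
    return idct(evolved)
```

Because every mode is advanced in closed form, the field can be evaluated at any time without stepping through intermediate states; for example, `pressure_at(p0, 4.0, 340.0, 0.003)` evaluates a 4 m partition directly at 3 ms.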
Performance Characterization and Optimization of Atomic Operations on AMD GPUs
In this paper, we first quantify the performance impact of atomic instructions on application kernels on AMD GPUs. We then propose a novel software-based implementation of atomic operations that can significantly improve the overall kernel performance.
SIGGRAPH Asia 2011: GPU-efficient recursive filtering and summed-area tables
This video presents a new algorithmic framework for parallel evaluation. It partitions the image into 2D blocks, with a small band of additional data buffered along each block perimeter.
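The block-partitioning idea can be sketched with a summed-area table built from blocked prefix sums: each block is scanned independently, and only a small amount of per-block data (here, the block totals) is passed between blocks. This is a simplified sequential sketch, not the framework from the video; the function names and the `block` parameter are hypothetical.

```python
def blocked_scan(values, block):
    """Inclusive prefix sum computed in three phases over fixed-size blocks."""
    # Phase 1: scan every block independently (parallel across blocks on a GPU).
    local = []
    for start in range(0, len(values), block):
        acc, scanned = 0, []
        for v in values[start:start + block]:
            acc += v
            scanned.append(acc)
        local.append(scanned)
    # Phase 2: exclusive scan over the block totals (the data carried across blocks).
    offsets, total = [], 0
    for scanned in local:
        offsets.append(total)
        total += scanned[-1]
    # Phase 3: add each block's incoming offset to its local results.
    return [x + off for scanned, off in zip(local, offsets) for x in scanned]


def summed_area_table(image, block=2):
    """Summed-area table: blocked scans along rows, then along columns."""
    rows = [blocked_scan(list(r), block) for r in image]
    cols = [blocked_scan(list(c), block) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]
```

Entry `(y, x)` of the result holds the sum of all pixels in the rectangle from `(0, 0)` to `(y, x)`, so any box sum can later be read with four lookups.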
Gemma in April: A Matrix-like Parallel Programming Architecture on OpenCL
In this article, we propose a novel parallel computing architecture. The architecture includes Gemma, a general parallel programming model, and April, a programming framework based on Gemma and OpenCL. Gemma uses matrix operations, especially matrix multiplication, to describe general computing tasks.
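One way to picture how matrix multiplication can describe tasks beyond linear algebra is a product whose "multiply" and "add" steps are pluggable. The sketch below is an assumption about the general idea only; it is not Gemma's or April's API, and `generalized_matmul` and its parameters are hypothetical names.

```python
import operator


def generalized_matmul(a, b, combine, reduce_, identity):
    """Matrix-product pattern with user-supplied element operations.

    combine(x, y) replaces the scalar multiply; reduce_(acc, z) replaces
    the scalar add; identity is the starting value of each reduction.
    """
    cols = list(zip(*b))  # columns of b
    result = []
    for row in a:
        out_row = []
        for col in cols:
            acc = identity
            for x, y in zip(row, col):
                acc = reduce_(acc, combine(x, y))
            out_row.append(acc)
        result.append(out_row)
    return result


# Ordinary matrix product: multiply, then sum.
product = generalized_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]],
                             operator.mul, operator.add, 0)

# Min-plus product: one relaxation step of all-pairs shortest paths.
INF = float("inf")
dist = [[0, 1, INF], [INF, 0, 2], [INF, INF, 0]]
two_hop = generalized_matmul(dist, dist, operator.add, min, INF)
```

Every output element is an independent reduction over one row and one column, which is the property that makes this pattern map naturally onto data-parallel hardware.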
Fast multipole method on GPU
We propose GPU-friendly data structures and SIMD parallel algorithm flows to facilitate FMM-based 3-D capacitance extraction on GPUs. Effective GPU performance modeling methods are also proposed to properly balance the workload of each critical kernel in our FMMGpu implementation.
Unstructured grid applications on GPU
In this paper we analyze the algorithm for unstructured grid analysis on the basis of hardware occupancy and memory access efficiency. In general, the algorithm can be divided into three stages: cell-oriented analysis, edge-oriented analysis and information update, which present different memory access patterns.
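The three stages and their different access patterns can be sketched on a toy mesh. The diffusion-like update, the function name, and the parameters below are assumptions chosen for illustration; the paper's actual analysis kernels are not given in this excerpt.

```python
def diffuse_step(mass, volume, edges, rate):
    """One update on an unstructured mesh, split into three stages.

    mass, volume: per-cell lists; edges: (cell_a, cell_b) index pairs.
    """
    # Stage 1, cell-oriented: each cell reads only its own data (contiguous access).
    conc = [m / v for m, v in zip(mass, volume)]

    # Stage 2, edge-oriented: each edge gathers from two cells through indices
    # and scatters back (indirect, irregular access; needs atomics or edge
    # coloring when run in parallel).
    delta = [0.0] * len(mass)
    for a, b in edges:
        flux = rate * (conc[a] - conc[b])
        delta[a] -= flux
        delta[b] += flux

    # Stage 3, information update: contiguous per-cell write.
    return [m + d for m, d in zip(mass, delta)]
```

For example, `diffuse_step([4.0, 0.0, 0.0, 0.0], [1.0, 1.0, 2.0, 1.0], [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)], 0.1)` moves mass out of cell 0 along its three edges while keeping the total unchanged.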
26th IEEE International Parallel & Distributed Processing Symposium
IPDPS is an international forum for engineers and scientists from around the world to present their latest research findings in all aspects of parallel computation. In addition to technical sessions of submitted paper presentations, the meeting offers workshops, tutorials, and commercial presentations and exhibits.






