Category: Code Examples
GPU programming with Python
Why use GPUs from Python? This workshop provides a brief introduction to GPU programming with Python, including run-time code generation and the use of high-level tools such as PyCUDA, PyOpenCL, and Loo.py.
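Run-time code generation can be pictured with a minimal sketch: the host program builds kernel source as a string and specializes it for a data type and operation before handing it to a compiler. The snippet below only generates OpenCL C text and does not compile or run it; the template, the function name make_elementwise_kernel, and its parameters are assumptions for illustration, not taken from the workshop material.

```python
from string import Template

# Hypothetical OpenCL C template; the placeholders are filled in at run time.
KERNEL_TEMPLATE = Template("""
__kernel void ${name}(__global const ${dtype} *a,
                      __global const ${dtype} *b,
                      __global ${dtype} *out)
{
    int i = get_global_id(0);
    out[i] = a[i] ${op} b[i];
}
""")


def make_elementwise_kernel(name, dtype, op):
    """Return kernel source text specialized for one dtype and one binary operator."""
    if op not in {"+", "-", "*", "/"}:
        raise ValueError(f"unsupported operator: {op!r}")
    return KERNEL_TEMPLATE.substitute(name=name, dtype=dtype, op=op)


if __name__ == "__main__":
    # Print the generated source for a float addition kernel.
    print(make_elementwise_kernel("add_f32", "float", "+"))
```

In a real program the returned string would be passed to a GPU toolkit for compilation; that step is left out here so the sketch runs without a GPU.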
Introduction to PyOpenCL: Python interface to the Open Computing Language
This talk provides an introduction to PyOpenCL, the Python interface to the Open Computing Language. OpenCL is a framework for executing parallel programs across heterogeneous platforms consisting of both CPUs and GPUs.
Portable LDPC Decoding on Multicores Using OpenCL
This article addresses, in a tutorial style, the benefits of using the Open Computing Language (OpenCL) as a quick way for programmers to express and exploit parallelism in signal processing algorithms, such as those used in error-correcting code systems.
GPU Programming in MATLAB
This article demonstrates features in Parallel Computing Toolbox that enable you to run your MATLAB code on a GPU by making a few simple changes to your code.
CUDAfy Me: Traveling Salesman problem with CUDA from C#
CUDAfy is a set of libraries and tools that permit general-purpose programming of CUDA Graphics Processing Units (GPUs) from within the Microsoft .NET framework. John Michael Hauck wrote an excellent article on how to transfer your CPU code to the GPU, using the Traveling Salesman problem as an example.
Tesla K20 GPU Quicksort with Dynamic Parallelism
In a recent blog post, NVIDIA discussed the new Dynamic Parallelism feature of the upcoming Kepler K20 GPU, using Quicksort as an example. Dynamic Parallelism allows the GPU to operate more autonomously from the CPU by generating new work for itself at run-time, from inside a kernel.
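The idea of work that generates more work can be sketched on the CPU. The Python snippet below is only an analogy, not CUDA code and not NVIDIA's implementation: each partition step plays the role of a kernel that enqueues two child tasks for its sub-ranges instead of returning to a central coordinator. The function name and the work-queue structure are assumptions for illustration.

```python
from collections import deque


def quicksort_with_spawned_work(values):
    """Return a sorted copy; each partition step enqueues its own follow-up work."""
    data = list(values)
    # Ranges still waiting to be partitioned (stand-in for launched child kernels).
    pending = deque([(0, len(data) - 1)])
    while pending:
        lo, hi = pending.popleft()
        if lo >= hi:
            continue  # ranges of zero or one element are already sorted
        # Lomuto partition around the last element of the range.
        pivot = data[hi]
        i = lo
        for j in range(lo, hi):
            if data[j] < pivot:
                data[i], data[j] = data[j], data[i]
                i += 1
        data[i], data[hi] = data[hi], data[i]
        # The task itself creates the next tasks, one per sub-range.
        pending.append((lo, i - 1))
        pending.append((i + 1, hi))
    return data


if __name__ == "__main__":
    print(quicksort_with_spawned_work([5, 2, 9, 1, 5, 6]))
```

On a GPU the two appended ranges would correspond to child kernel launches that can run concurrently; here they are simply processed in queue order.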
VexCL: Vector expression template library for OpenCL
VexCL is a vector expression template library for OpenCL. It was created to ease C++-based OpenCL development. Multi-device (and multi-platform) computations are supported.
Trip over threads to trap multicore bugs with Maze
What makes debugging multiprocess and multithreaded applications so difficult? The first thing that comes to every concurrent programmer's mind is the lack of reproducible program execution. The reason for this behavior is the preemptive scheduling employed by real-time operating systems.
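The reproducibility problem can be illustrated with a small Python sketch that has nothing to do with Maze itself. Several threads increment a shared counter; without a lock, the read and the write can be separated by a thread switch, so the final value may or may not fall short of the expected total on any given run. The function name run_counter and its parameters are assumptions for illustration.

```python
import threading


def run_counter(num_threads, increments, use_lock):
    """Increment a shared counter from several threads and return the final value."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            if use_lock:
                # The lock makes the read-modify-write step atomic.
                with lock:
                    counter += 1
            else:
                # Unprotected read-modify-write: a thread switch between these
                # two statements can overwrite another thread's update.
                current = counter
                counter = current + 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


if __name__ == "__main__":
    print("with lock:   ", run_counter(4, 100_000, use_lock=True))
    print("without lock:", run_counter(4, 100_000, use_lock=False))
```

The locked variant always returns num_threads * increments. The unlocked variant can return a smaller number, and which number it returns depends on scheduling decisions the program does not control.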
Whitepaper: The Xcelerit Software Development Kit
The paper presents the Xcelerit SDK, a parallel programming toolkit that leverages the dataflow programming model to efficiently use multi-core CPUs, graphics processors (GPUs), and combinations of these in a cluster (or grid) from a single high-level source code.
Hands-on tutorial: An introduction to OpenCL for HPC programmers
This is a “programmer’s introduction” in which we cover the ideas behind OpenCL and also show how these ideas are translated into source code. We will do this through a series of progressively more challenging examples.






