Tag: performance

Webinar: CUDA 5 Math Library Performance Overview

29 January 2013

Jonathan Cohen and the NVIDIA CUDA Library Team present the latest benchmark results for the extensive numerical libraries included with CUDA 5. This webinar will cover all the data points and the significance of the new Math Library Performance Report.

Relativistic Hydrodynamics on Graphic Cards

31 July 2012

We show how to accelerate relativistic hydrodynamics simulations using graphics cards (graphics processing units, GPUs).

Evolutionary algorithms: a FPGA-CPU-GPU performance comparison

18 January 2012

The main objective of the work presented in this paper is to compare implementations on FPGAs and CPUs of different fitness functions in evolutionary algorithms in order to study the performance of the floating-point arithmetic in FPGAs and CPUs that is often present in the optimization problems tackled by these algorithms.

The challenges of writing portable, correct and high performance libraries for GPUs

18 January 2012

We aim to deliver working, efficient GPU code in a library that is downloaded and run by many different users. The issue is to deliver efficiency independent of the individual user parameters and without a priori knowledge of the hardware the user will employ.

Intel OpenCL Whitepapers: Events and CPU Performance

21 December 2011

This whitepaper on OpenCL™ describes how to best utilize the underlying Intel hardware architecture using OpenCL. It covers programming considerations for host-side device orchestration, as well as OpenCL kernels for the CPU.

A Comprehensive Performance Comparison of CUDA and OpenCL

14 December 2011

This paper presents a comprehensive performance comparison between CUDA and OpenCL. We have selected 16 benchmarks ranging from synthetic applications to real-world ones. We make an extensive analysis of the performance gaps taking into account programming models, optimization strategies, architectural details, and underlying compilers.

Efficient Synchronization Primitives for GPUs

22 October 2011

In this paper, we revisit the design of synchronization primitives (specifically barriers, mutexes, and semaphores) and how they apply to the GPU. Previous implementations are insufficient due to the discrepancies in hardware and programming model between the GPU and the CPU.

Performance and Power Analysis of AMD GPU cards

12 October 2011

We present a comprehensive study on the performance and power consumption of a recent ATI GPU. By employing a rigorous statistical model to analyze execution behaviors of representative general-purpose GPU (GPGPU) applications, we conduct insightful investigations on the target GPU architecture.

Performance impact on resource sharing among multiple CPU- and GPU-based applications

22 August 2011

During the last decade, the performance and capabilities of graphics processing units (GPUs) have improved drastically, mostly due to the demands of the visualisation and entertainment markets.

Debunking the 100X GPU vs CPU myth

16 July 2010

Recent advances in computing have led to an explosion in the amount of data being generated. Processing the ever-growing data in a timely manner has made throughput computing an important aspect for emerging applications.
