Tag: parallel programming
Proven Algorithmic Techniques for Many-core Processors Workshop
By studying many current GPU computing applications, we have learned that the limits of an application’s scalability are often related to some combination of memory bandwidth saturation, memory contention, imbalanced data distribution, or data structure/algorithm interactions.
Upcoming Book: C++ AMP: Accelerated Massive Parallelism with Microsoft Visual C++
With this practical book, experienced C++ developers will learn parallel programming fundamentals with C++ AMP through detailed examples, code snippets, and case studies.
GPU debugging in Visual Studio 2012 screencast
This screencast assumes knowledge of the C++ AMP API, for example that you fully understand the matrix multiplication implementation in C++ AMP. It shows which features are available in Visual Studio 2012 for debugging C++ AMP code.
Inside VC++ 2012 Auto-Vectorization
The VC++ 2012 auto-vectorizer tries to make loops in your code run faster by automatically vectorizing your code using the SSE instructions available in all current mainline Intel and AMD chips.
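As a rough illustration of the kind of loop an auto-vectorizer can typically handle, here is a minimal sketch. The function name `add_arrays` is hypothetical and does not come from the original post. Whether a given compiler actually vectorizes this loop depends on its settings and heuristics.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: element-wise sum of two arrays.
// Each iteration reads and writes only index i, so the iterations are
// independent of one another. That independence is what lets a
// vectorizer process several elements per SIMD instruction.
std::vector<float> add_arrays(const std::vector<float>& a,
                              const std::vector<float>& b) {
    // Use the shorter length so the loop never reads out of bounds.
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    std::vector<float> out(n);
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
    return out;
}
```

The loop has a simple counted form, unit-stride access, and no dependency between iterations. Loops that carry a value from one iteration to the next are generally harder to vectorize.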
ISC 12 Tutorial: Relative, Reverse & CUDA Debugging for Computationally Intensive Application Development
A significant challenge in developing, maintaining and porting numerical simulations is avoiding subtle errors that undermine the validity of the results without causing an obvious failure. This tutorial will share experiences, best practices and debugging techniques for identifying and resolving such defects in parallel applications.
Free Webinar: Getting Started with Intel SDK for OpenCL Applications
Developing parallel applications that take advantage of all the compute resources available on the underlying system is not a trivial task, and doing that across multiple devices in a standard manner is even more difficult.
On single-walk parallelization of the job shop problem solving algorithms
New parallel methods for determining the objective function of the job shop scheduling problem are proposed in this paper, considering the makespan and the sum of job execution times criteria. The proposed methods can also be applied to other popular objective functions, such as job tardiness or flow time.
High-performance computing tools for the integrated assessment and modelling of social-ecological systems
Integrated spatio-temporal assessment and modelling of complex social–ecological systems is required to address global environmental challenges.
A Parallel Front Propagation Method: Simulating geological folds on parallel architectures
In this thesis, a novel three-dimensional anisotropic front propagation algorithm for simulation of geological folds on parallel architectures is presented. The algorithm’s abundant parallelism is demonstrated on multi-core CPU and GPU architectures.






