Parallel and High Performance Computing offers techniques guaranteed
to boost your code's effectiveness.
Summary
Complex calculations, like training deep learning models or running
large-scale simulations, can take an extremely long time. Efficient
parallel programming can save hours, or even days, of computing time.
Parallel and High Performance Computing shows you how to bring
faster run times, greater scalability, and increased energy efficiency
to your programs by mastering parallel techniques for multicore
processor and GPU hardware.
About the technology
Write fast, powerful, energy-efficient programs that scale to tackle
huge volumes of data. Using parallel programming, your code spreads data
processing tasks across multiple CPUs for radically better performance.
With a little help, you can create software that maximizes both speed
and efficiency.
About the book
Parallel and High Performance Computing offers techniques guaranteed
to boost your code's effectiveness. You'll learn to evaluate hardware
architectures and work with industry-standard tools such as OpenMP and
MPI. You'll master the data structures and algorithms best suited for
high performance computing and learn techniques that save energy on
handheld devices. You'll even run a massive tsunami simulation across a
bank of GPUs.
What's inside
Planning a new parallel project
Understanding differences in CPU and GPU architecture
Addressing underperforming kernels and loops
Managing applications with batch scheduling
About the reader
For experienced programmers proficient with a high-performance computing
language like C, C++, or Fortran.
About the author
Robert Robey works at Los Alamos National Laboratory and has been
active in the field of parallel computing for over 30 years. Yuliana
Zamora is currently a PhD student and Siebel Scholar at the University
of Chicago, and has lectured on programming modern hardware at numerous
national conferences.
Table of Contents
PART 1 INTRODUCTION TO PARALLEL COMPUTING
1 Why parallel computing?
2 Planning for parallelization
3 Performance limits and profiling
4 Data design and performance models
5 Parallel algorithms and patterns
PART 2 CPU: THE PARALLEL WORKHORSE
6 Vectorization: FLOPs for free
7 OpenMP that performs
8 MPI: The parallel backbone
PART 3 GPUS: BUILT TO ACCELERATE
9 GPU architectures and concepts
10 GPU programming model
11 Directive-based GPU programming
12 GPU languages: Getting down to basics
13 GPU profiling and tools
PART 4 HIGH PERFORMANCE COMPUTING ECOSYSTEMS
14 Affinity: Truce with the kernel
15 Batch schedulers: Bringing order to chaos
16 File operations for a parallel world
17 Tools and resources for better code