Heterogeneous Computing with OpenCL 2.0, Edition 1
By David R. Kaeli, Perhaad Mistry, Dana Schaa and Dong Ping Zhang

Publication Date: 18 May 2015
Description

Heterogeneous Computing with OpenCL 2.0 teaches OpenCL and parallel programming for complex systems that may include a variety of device architectures: multi-core CPUs, GPUs, and fully integrated Accelerated Processing Units (APUs). This fully revised edition covers the latest enhancements in OpenCL 2.0, including:

• Shared virtual memory to increase programming flexibility and reduce data transfers that consume resources
• Dynamic parallelism, which reduces processor load and avoids bottlenecks
• Improved imaging support and integration with OpenGL 

Designed to work on multiple platforms, OpenCL will help you program more effectively for a heterogeneous future. Written by leaders in the parallel computing and OpenCL communities, this book explores memory spaces, optimization techniques, extensions, debugging, and profiling. Multiple case studies and examples illustrate high-performance algorithms, the distribution of work across heterogeneous systems, and embedded domain-specific languages, giving you hands-on OpenCL experience with a range of fundamental parallel algorithms.

Key Features

  • Updated content to cover the latest developments in OpenCL 2.0, including improvements in memory handling, parallelism, and imaging support
  • Explanations of principles and strategies to learn parallel programming with OpenCL, from understanding the abstraction models to thoroughly testing and debugging complete applications
  • Example code covering image analytics, web plugins, particle simulations, video editing, performance optimization, and more
About the authors
By David R. Kaeli, Northeastern University, Boston, MA, USA; Perhaad Mistry, Northeastern University, Boston, MA, USA; Dana Schaa, Northeastern University, Boston, MA, USA; and Dong Ping Zhang, AMD, Sunnyvale, California, USA
Table of Contents
  • List of Figures
  • List of Tables
  • Foreword
  • Acknowledgments
  • Chapter 1: Introduction
    • Abstract
    • 1.1 Introduction to Heterogeneous Computing
    • 1.2 The Goals of This Book
    • 1.3 Thinking Parallel
    • 1.4 Concurrency and Parallel Programming Models
    • 1.5 Threads and Shared Memory
    • 1.6 Message-Passing Communication
    • 1.7 Different Grains of Parallelism
    • 1.8 Heterogeneous Computing with OpenCL
    • 1.9 Book Structure
  • Chapter 2: Device architectures
    • Abstract
    • 2.1 Introduction
    • 2.2 Hardware Trade-offs
    • 2.3 The Architectural Design Space
    • 2.4 Summary
  • Chapter 3: Introduction to OpenCL
    • Abstract
    • 3.1 Introduction
    • 3.2 The OpenCL Platform Model
    • 3.3 The OpenCL Execution Model
    • 3.4 Kernels and the OpenCL Programming Model
    • 3.5 OpenCL Memory Model
    • 3.6 The OpenCL Runtime with an Example
    • 3.7 Vector Addition Using an OpenCL C++ Wrapper
    • 3.8 OpenCL for CUDA Programmers
    • 3.9 Summary
  • Chapter 4: Examples
    • Abstract
    • 4.1 OpenCL Examples
    • 4.2 Histogram
    • 4.3 Image Rotation
    • 4.4 Image Convolution
    • 4.5 Producer-Consumer
    • 4.6 Utility Functions
    • 4.7 Summary
  • Chapter 5: OpenCL runtime and concurrency model
    • Abstract
    • 5.1 Commands and the Queuing Model
    • 5.2 Multiple Command-Queues
    • 5.3 The Kernel Execution Domain: Work-Items, Work-Groups, and NDRanges
    • 5.4 Native and Built-In Kernels
    • 5.5 Device-Side Queuing
    • 5.6 Summary
  • Chapter 6: OpenCL host-side memory model
    • Abstract
    • 6.1 Memory Objects
    • 6.2 Memory Management
    • 6.3 Shared Virtual Memory
    • 6.4 Summary
  • Chapter 7: OpenCL device-side memory model
    • Abstract
    • 7.1 Synchronization and Communication
    • 7.2 Global Memory
    • 7.3 Constant Memory
    • 7.4 Local Memory
    • 7.5 Private Memory
    • 7.6 Generic Address Space
    • 7.7 Memory Ordering
    • 7.8 Summary
  • Chapter 8: Dissecting OpenCL on a heterogeneous system
    • Abstract
    • 8.1 OpenCL on an AMD FX-8350 CPU
    • 8.2 OpenCL on the AMD Radeon R9 290X GPU
    • 8.3 Memory Performance Considerations in OpenCL
    • 8.4 Summary
  • Chapter 9: Case study: Image clustering
    • Abstract
    • 9.1 Introduction
    • 9.2 The Feature Histogram on the CPU
    • 9.3 OpenCL Implementation
    • 9.4 Performance Analysis
    • 9.5 Conclusion
  • Chapter 10: OpenCL profiling and debugging
    • Abstract
    • 10.1 Introduction
    • 10.2 Profiling OpenCL Code Using Events
    • 10.3 AMD CodeXL
    • 10.4 Profiling Using CodeXL
    • 10.5 Analyzing Kernels Using CodeXL
    • 10.6 Debugging OpenCL Kernels Using CodeXL
    • 10.7 Debugging Using printf
    • 10.8 Summary
  • Chapter 11: Mapping high-level programming languages to OpenCL 2.0: A compiler writer’s perspective
    • Abstract
    • 11.1 Introduction
    • 11.2 A Brief Introduction to C++ AMP
    • 11.3 OpenCL 2.0 as a Compiler Target
    • 11.4 Mapping Key C++ AMP Constructs to OpenCL
    • 11.5 C++ AMP Compilation Flow
    • 11.6 Compiled C++ AMP Code
    • 11.7 How Shared Virtual Memory in OpenCL 2.0 Fits in
    • 11.8 Compiler Support for Tiling in C++ AMP
    • 11.9 Address Space Deduction
    • 11.10 Data Movement Optimization
    • 11.11 Binomial Options: A Full Example
    • 11.12 Preliminary Results
    • 11.13 Conclusion
  • Chapter 12: WebCL: Enabling OpenCL acceleration of Web applications
    • Abstract
    • 12.1 Introduction
    • 12.2 Programming with WebCL
    • 12.3 Synchronization
    • 12.4 Interoperability with WebGL
    • 12.5 Example Application
    • 12.6 Security Enhancement
    • 12.7 WebCL on the Server
    • 12.8 Status and Future of WebCL
    • Works Cited
  • Chapter 13: Foreign lands: Plugging OpenCL in
    • Abstract
    • 13.1 Introduction
    • 13.2 Beyond C and C++
    • 13.3 Haskell OpenCL
    • 13.4 Summary
  • Index
Book details
ISBN: 9780128014141
Page Count: 336
Retail Price: £58.99
Audience

Software engineers, programmers, hardware engineers, graduate students