Sunday, December 2, 2007

OpenMP

Intel C++ supports OpenMP, a standardized API for shared-memory multiprocessing in C, C++, and Fortran. By adding a single #pragma, a loop can be split up to run in parallel on multiple cores automatically. I made a small change to the loop that renders each screen line in my generic ray tracer:

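int screenX;
#ifdef USE_OMP
// Split the iterations across threads. firstprivate gives each thread
// its own copy of portPoint, initialized from the value before the loop.
#pragma omp parallel for firstprivate(portPoint)
#endif
for (screenX = 0;
screenX < SCREEN_WIDTH;
screenX++) {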

With this change the pixels of a line are rendered in parallel: OpenMP divides the loop iterations, one per pixel, among its worker threads, with a maximum of 4 (the number of cores) running simultaneously. This is an alternative to the explicit pthread threading tested previously, where a separate thread was used per line of the screen. In this case OpenMP is simpler than explicit threading, though it yields slightly lower performance.
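For anyone who wants to try the pattern outside the ray tracer, here is a minimal self-contained sketch. Point, shadePixel, and renderLine are hypothetical stand-ins invented for the example, not the ray tracer's real code, and the if(parallel) clause is only there so the same function can be run serially for comparison.

```cpp
#include <vector>

const int SCREEN_WIDTH = 640;

// Hypothetical stand-in for the per-line state (portPoint in the ray tracer).
struct Point { double x, y, z; };

// Hypothetical per-pixel work: a pure function of its inputs.
static int shadePixel(Point p, int screenX) {
    p.x += screenX;  // changes only this call's local copy
    return static_cast<int>(p.x * 31 + p.y * 17 + p.z) % 256;
}

// Renders one line. With parallel == false the if() clause makes
// OpenMP run the loop on a single thread.
std::vector<int> renderLine(Point portPoint, bool parallel) {
    std::vector<int> line(SCREEN_WIDTH);
    int screenX;  // the loop variable is made private to each thread
    #pragma omp parallel for firstprivate(portPoint) if(parallel)
    for (screenX = 0; screenX < SCREEN_WIDTH; screenX++) {
        // Each iteration writes a different element, so no locking is needed.
        line[screenX] = shadePixel(portPoint, screenX);
    }
    return line;
}
```

Calling renderLine(p, true) and renderLine(p, false) with the same Point should give identical vectors, which is a quick check that the loop body has no hidden shared state. With GCC the pragma only takes effect when compiling with -fopenmp; without that flag it is ignored and the loop simply runs serially.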

GCC 4.2 also supports OpenMP, but its performance is much worse than Intel's.