OpenMP
This page contains information on GCC's implementation of the OpenMP standard and related functionality like the auto parallelizer (-ftree-parallelize-loops).
As of GCC 4.2, the compiler implements version 2.5 of the OpenMP standard; as of GCC 4.4, it implements version 3.0. OpenMP 3.1 is supported since GCC 4.7.
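To check which OpenMP version a given compiler build targets, one option is to inspect the standard _OPENMP macro, which is defined only when OpenMP is enabled (with GCC, by -fopenmp) and holds the release date of the supported specification as yyyymm (200505 for 2.5, 200805 for 3.0, 201107 for 3.1). The following is a minimal sketch; the function names openmp_spec_version and print_openmp_version are illustrative only and are not part of GCC or libgomp.

```c
#include <stdio.h>

/* Map the _OPENMP macro (yyyymm of the specification release) to a label.
   Returns "none" when the file is compiled without -fopenmp.
   The function name is hypothetical, chosen for this example. */
const char *openmp_spec_version(void)
{
#ifdef _OPENMP
    if (_OPENMP >= 201107) return "3.1 or later";
    if (_OPENMP >= 200805) return "3.0";
    if (_OPENMP >= 200505) return "2.5";
    return "pre-2.5";
#else
    return "none";
#endif
}

/* Print the detected version; also a hypothetical helper. */
void print_openmp_version(void)
{
    printf("OpenMP specification: %s\n", openmp_spec_version());
}
```

Calling print_openmp_version() from a program built with and without -fopenmp shows whether the macro is defined in each case.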
OpenMP Documentation
Documentation on libgomp (OpenMP runtime for GCC).
OpenMP v3.1 standard (July 2011) (OpenMP 3.1 API forum)
Automatic Parallelization (-ftree-parallelize-loops)
- Streamization
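As an illustration of the kind of loop the auto parallelizer is aimed at, the sketch below shows a simple loop with independent iterations and no OpenMP directives. The function name saxpy is an assumption made for this example. Whether GCC actually parallelizes it depends on the GCC version, the optimization level, and the parallelizer's own profitability analysis.

```c
#include <stddef.h>

/* Hypothetical example kernel: y[i] = a * x[i] + y[i].
   Each iteration is independent of the others, and 'restrict' states
   that x and y do not alias, which simplifies dependence analysis. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

A possible invocation, assuming the code is saved as saxpy.c and four threads are wanted, is gcc -std=c99 -O2 -ftree-parallelize-loops=4 -c saxpy.c.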
Test Suites and Benchmarks
OpenMP Validation Suite by HLRS, Univ. Stuttgart and Univ. of Houston (2007 version at UH: README, Download)
OpenMP task test suite by BSC
Rodinia Benchmark suite (2): OpenMP, OpenCL, CUDA benchmark
TODO List
Feel free to add new items to this list as you run into issues or features that would be interesting to add. Send mail to the list and/or the GCC OpenMP maintainers if any item in this list sounds interesting but is hard to understand.
Fix PR 35423 (missing parts of OpenMP WORKSHARE).
Fine tune the auto scheduling feature for parallel loops.
Implement untied tasks (no compliance issue; needs to be well tuned to be actually faster; cf. page 53 of pdf) - see also next item
Tasks need some tuning in taskwait. Cf. GCC email, comparison. Algorithms: PDF 1, PDF 2, PDF 3
Note: There is a GSoC 2011 project regarding tasks and untied tasks. A comparison of tasks between libgomp, another library (gcc+libKOMP), and Intel is available (cf. benchmark on p. 10); see also another IOMP 2012 comparison.
taskyield is a stub, and mergeable task cloning could be optimized.
OpenMP 4.0 -- when available. Cf. the slides from IWOMP, the International Workshop on OpenMP (slides and tutorials), in June 2010 and the IWOMP 2010 proceedings. There is also a blog entry. See also IWOMP 2012's talks and the committee report. SC2011 (November 2011): OpenMP Lang Committee Report, CEO report
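For readers unfamiliar with the OpenMP 3.0 constructs that several TODO items mention (schedule(auto), untied tasks, and taskwait), here is a minimal sketch of how they appear in user code. The function names fib_task, fib, and sum_squares are assumptions made for this example. The code says nothing about how libgomp implements or tunes these constructs. It only shows the user-visible syntax.

```c
#include <assert.h>

/* Recursive Fibonacci using OpenMP 3.0 tasks.
   'untied' allows a suspended task to be resumed by a different thread;
   'taskwait' blocks until both child tasks have finished. */
static long fib_task(int n)
{
    long a, b;

    if (n < 2)
        return n;

    #pragma omp task untied shared(a) firstprivate(n)
    a = fib_task(n - 1);

    #pragma omp task untied shared(b) firstprivate(n)
    b = fib_task(n - 2);

    #pragma omp taskwait
    return a + b;
}

/* Start a parallel region and let one thread create the root task tree;
   the other threads in the team pick up the generated tasks. */
long fib(int n)
{
    long result = 0;

    assert(n >= 0);
    #pragma omp parallel
    {
        #pragma omp single
        result = fib_task(n);
    }
    return result;
}

/* Parallel loop that leaves the scheduling decision to the
   implementation via schedule(auto). */
long sum_squares(int n)
{
    long total = 0;

    #pragma omp parallel for schedule(auto) reduction(+:total)
    for (int i = 1; i <= n; i++)
        total += (long)i * i;
    return total;
}
```

Built with gcc -std=c99 -fopenmp, the pragmas take effect; without -fopenmp they are ignored and the functions run serially with the same results. Creating a task per recursion step down to n < 2 is deliberately naive, and a real program would typically add a cutoff below which it recurses serially.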