Differences between revisions 1 and 28 (spanning 27 versions)
Revision 1 as of 2008-03-12 18:48:17
Size: 1152
Editor: DiegoNovillo
Comment:
Revision 28 as of 2013-05-06 12:47:21
Size: 5651
Editor: 138
Comment:
Line 2: Line 2:
This page contains information on GCC's implementation of the [[http://www.openmp.org|OpenMP]] standard and related functionality like the auto parallelizer ({{{-ftree-parallelize-loops}}}).
This page contains information on GCC's implementation of the [[http://openmp.org/wp/|OpenMP]] standard and related functionality like the auto parallelizer ({{{-ftree-parallelize-loops}}}).
Line 4: Line 4:
As of GCC 4.2, the compiler implements version 2.5 of the OpenMP standard.  Work is underway to implement the new version 3.0 of the standard (branch {{{gomp-3_0-branch}}}).
As of GCC 4.2, the compiler implements version 2.5 of the OpenMP standard, and as of GCC 4.4 it implements version 3.0. OpenMP 3.1 has been supported since GCC 4.7.
Line 9: Line 9:
 * [[http://www.openmp.org/mp-documents/spec25.pdf|OpenMP v2.5 standard]]
 * [[http://www.openmp.org/mp-documents/spec30_draft.pdf|OpenMP v3.0 standard (draft)]]
 * [[http://www.openmp.org/mp-documents/spec25.pdf|OpenMP v2.5 specification]]
 * [[http://www.openmp.org/mp-documents/spec30.pdf|OpenMP v3.0 specification]]
 * [[http://www.openmp.org/mp-documents/omp3.1-2011.0203a.pdf|OpenMP v3.1 draft specification]] ([[http://openmp.org/forum/viewforum.php?f=9|forum for draft comments]])
 * [[http://www.openmp.org/mp-documents/OpenMP3.1.pdf|OpenMP v3.1 specification]] (July 2011) ([[http://openmp.org/forum/viewforum.php?f=10|OpenMP 3.1 API forum]])
 * [[http://www.openmp.org/mp-documents/OpenMP_4.0_RC2.pdf|OpenMP v4.0rc2 specification]] (March 2013) ([[http://openmp.org/forum/viewforum.php?f=12|OpenMP 4.0 API forum]], [[http://openmp.org/wp/2013/03/openmp-40-rc2/|rc2 changes]])
Line 12: Line 15:

= Automatic Parallelization =
({{{-ftree-parallelize-loops}}})

 * Streamization

= Test Suites and Benchmarks =

 * [[http://www.hlrs.de/organization/people/niethammer/projects/openmp-validation-suite/|OpenMP Validation Suite]] by HLRS, Univ. Stuttgart and Univ. of Houston (2007 version at UH: [[http://www2.cs.uh.edu/~openuh/OpenMPValidation_README|README]], [[http://www2.cs.uh.edu/~openuh/download/register.shtml|Download]])
 * [[https://pm.bsc.es/projects/bots|OpenMP task test suite]] by BSC
 * [[http://www.spec.org/omp/|SPEC OMP]]
 * [[http://www.epcc.ed.ac.uk/software-products/epcc-openmp-benchmarks|EPCC Microbenchmarks]]
 * [[http://www.nas.nasa.gov/Resources/Software/npb.html|NAS Benchmarks]] ([[http://www.hpcs.cs.tsukuba.ac.jp/omni-openmp/download/download-benchmarks.html|unofficial C version]])
 * [[https://computation.llnl.gov/casc/RTS_Report/openmp_perf.html|LLNL benchmark]]
 * [[https://www.cs.virginia.edu/~skadron/wiki/rodinia/index.php/Main_Page|Rodinia Benchmark suite]] [[http://lava.cs.virginia.edu/Rodinia/|(2)]]: OpenMP, OpenCL, CUDA benchmark
Line 17: Line 35:
 * Fix [[http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35243|PR 35423]].
 * Fix [[http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35423|PR 35423]] (missing parts of OpenMP WORKSHARE).
Line 19: Line 37:
 * Help out with the OpenMP 3.0 implementation effort ({{{gomp-3_0-branch}}}).
 * Implement `untied` tasks (not a compliance issue, but they need careful tuning to actually be faster; cf. page 53 of [[https://iwomp.zih.tu-dresden.de/downloads/omp30-tasks.pdf|pdf]]); see also the next item.
 * Tasks need some tuning in taskwait. Cf. [[http://gcc.gnu.org/ml/gcc/2011-04/msg00040.html|GCC email]], [[https://iwomp.zih.tu-dresden.de/downloads/runtime-olivier.pdf|comparison]]. Algorithms: [[http://www.sarc-ip.org/files/null/Workshop/1234128788173__TSchedStrat-iwomp08.pdf|PDF 1]], [[http://www.sarc-ip.org/files/xavier/Conference/1234121213301__cascon08-ibm-tasks.pdf|PDF 2]], [[http://capinfo.e.ac.upc.edu/PDFs/dir27/file003681.pdf|PDF 3]]
  A [[http://iwomp-2012.caspur.it/sites/iwomp-2012.caspur.it/files/Broquedis_libKOMP-iwomp2012.pdf|comparison for tasks between libgomp, another library (gcc+libKOMP) and Intel]] (cf. benchmark on p. 10); see also [[http://iwomp-2012.caspur.it/sites/iwomp-2012.caspur.it/files/Terboven-Assessing_OpenMP_Tasking_Implementations_on_NUMA_Architectures.pdf|another IWOMP 2012 comparison]]
 * [[http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00080.html|taskyield is a stub and mergeable task cloning could be optimized]]
 * OpenMP 4.0 -- when available. Cf. slides at the [[http://www.ccs.tsukuba.ac.jp/workshop/IWOMP2010/|IWOMP, the International Workshop on OpenMP]] ([[http://www.ccs.tsukuba.ac.jp/workshop/IWOMP2010/program.html|slides]] and [[http://www.ccs.tsukuba.ac.jp/workshop/IWOMP2010/tutorial.html|tutorials]]) in June 2010 and the [[http://www.springerlink.com/content/978-3-642-13216-2|IWOMP 2010 proceedings]]. There is also a [[http://www-949.ibm.com/software/rational/cafe/blogs/ccpp-parallel-multicore/2010/06/21/the-view-from-iwomp-2010-trip-report|blog entry]]. See also IWOMP 2011's [[http://www.ncsa.illinois.edu/Conferences/IWOMP11/program/program.html|talks]] and the [[http://www.ncsa.illinois.edu/Conferences/IWOMP11/program/presentations/supinski.pdf|committee report]]. SC2011 (November 2011): [[http://openmp.org/wp/presos/SC11_OpenMP_BoF.pdf|OpenMP Lang Committee Report]], [[http://openmp.org/wp/presos/SCBOF11.pdf|CEO report]]
  . And the 2012 slides: [[http://iwomp-2012.caspur.it/program/workshop-program|IWOMP program]], [[http://iwomp-2012.caspur.it/sites/iwomp-2012.caspur.it/files/IWOMP12_State_of_LC.pdf|Language Committee report]]
  . And the OpenMP 4.0 release candidate documents: [[http://www.openmp.org/mp-documents/OpenMP_4.0_RC2.pdf|OpenMP v4.0rc2 specification]] (March 2013) ([[http://openmp.org/forum/viewforum.php?f=12|OpenMP 4.0 API forum]], [[http://openmp.org/wp/2013/03/openmp-40-rc2/|rc2 changes]])
  . And a comment prior to the final release: [[https://www.ibm.com/developerworks/community/blogs/5894415f-be62-4bc0-81c5-3956e82276f3/entry/openmp_4_0_about_to_be_released_and_iwomp_2013?lang=en|OpenMP 4.0 about to be released and IWOMP 2013, by Michael Wong]] (May 2013)
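
For readers unfamiliar with the constructs named in the task-related items above, here is a minimal sketch of `untied` tasks and `taskwait`. The function names `fib` and `fib_parallel` are made up for illustration and do not come from libgomp; the example only shows the syntax and makes no claim about how well any runtime schedules these tasks.

```c
#include <stdio.h>

/* Recursive Fibonacci: each call spawns two child tasks and then
 * waits for both at the taskwait before combining their results. */
static long fib(int n)
{
    long a, b;

    if (n < 2)
        return n;

    /* 'untied' lets a suspended task resume on a different thread. */
    #pragma omp task shared(a) untied
    a = fib(n - 1);

    #pragma omp task shared(b) untied
    b = fib(n - 2);

    /* Wait for the two child tasks created above. */
    #pragma omp taskwait
    return a + b;
}

/* Hypothetical entry point: one thread creates the root of the
 * task tree, the other threads in the team execute tasks. */
long fib_parallel(int n)
{
    long result = 0;

    #pragma omp parallel
    {
        #pragma omp single
        result = fib(n);
    }
    return result;
}
```

Built with {{{-fopenmp}}} the pragmas take effect; without it they are ignored and the same code runs serially, which makes it easy to compare the two.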

OpenMP

This page contains information on GCC's implementation of the OpenMP standard and related functionality like the auto parallelizer (-ftree-parallelize-loops).

As of GCC 4.2, the compiler implements version 2.5 of the OpenMP standard, and as of GCC 4.4 it implements version 3.0. OpenMP 3.1 has been supported since GCC 4.7.
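
One way to check which of these versions a given compiler provides is the `_OPENMP` macro, which the OpenMP specification defines as the yyyymm release date of the supported version. The sketch below maps the values for the versions named above; the helper names `openmp_version_name` and `compiled_openmp_version` are made up for this example.

```c
#include <stdio.h>

/* Map an _OPENMP value (yyyymm of the specification) to a version
 * string. Only the versions mentioned on this page are listed. */
const char *openmp_version_name(long macro_value)
{
    switch (macro_value) {
    case 200505L: return "2.5";
    case 200805L: return "3.0";
    case 201107L: return "3.1";
    default:      return "unknown";
    }
}

/* Report the version this translation unit was compiled with.
 * _OPENMP is only defined when OpenMP is enabled (e.g. -fopenmp). */
const char *compiled_openmp_version(void)
{
#ifdef _OPENMP
    return openmp_version_name((long)_OPENMP);
#else
    return "disabled";
#endif
}
```

Calling `compiled_openmp_version()` from a program built with and without -fopenmp shows whether OpenMP support was active at compile time.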

OpenMP Documentation

Automatic Parallelization

(-ftree-parallelize-loops)

  • Streamization
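
As a rough illustration of the kind of loop the auto parallelizer targets, the sketch below has independent iterations and no OpenMP pragmas. The function name `saxpy` is chosen for this example only, and whether GCC actually parallelizes it depends on the compiler version, optimization level, and trip count.

```c
#include <stddef.h>

/* y[i] = a * x[i] + y[i] for all i. Each iteration touches only its
 * own elements, and 'restrict' tells the compiler x and y do not
 * overlap, so the iterations can in principle run in any order. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}
```

A possible invocation, assuming four threads are wanted, is `gcc -O2 -ftree-parallelize-loops=4 file.c`.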

Test Suites and Benchmarks

TODO List

Feel free to add new items to this list as you run into issues or features that would be interesting to add. Send mail to the list and/or the GCC OpenMP maintainers if any item in this list sounds interesting but is hard to understand.

None: openmp (last edited 2015-01-29 08:24:51 by tschwinge)