This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
[PATCH] updates for projects/ia64.html
- From: Janis Johnson <janis187 at us dot ibm dot com>
- To: gcc-patches at gcc dot gnu dot org
- Date: Tue, 30 Apr 2002 14:42:57 -0700
- Subject: [PATCH] updates for projects/ia64.html
This patch updates the "Projects to improve performance on IA-64"
section of the GCC projects list. It hadn't been updated for several
months, and some of the projects that were mentioned have been completed.
The patch also includes a couple of changes that Richard Henderson
requested long, long ago and that I had forgotten about.
I expect to see additional patches coming soon from IBM and HP. Anyone
working on something that is expected to improve the performance of the
code GCC generates for IA-64 is encouraged to add to this list to let us
know what you're doing and whether you'd like help. Some of us use this
list as a source of ideas for new projects.
OK to commit?
Janis
Index: projects/ia64.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/ia64.html,v
retrieving revision 1.6
diff -u -p -r1.6 ia64.html
--- projects/ia64.html 25 Jan 2002 09:57:30 -0000 1.6
+++ projects/ia64.html 30 Apr 2002 20:58:17 -0000
@@ -17,13 +17,16 @@
<!-- table of contents end -->
<p> This page lists projects that are expected to improve the performance
-of the code that GCC generates for IA-64. The lists originally came out
+of the code that GCC generates for IA-64, more properly known as IPF
+(Itanium Processor Family).
+The lists originally came out
of the GCC IA-64 Summit that was held June 6, 2001
(<a href="http://linuxia64.org/gcc_summit.2001.06.06.html">minutes
of the summit</a>), and many of the comments are from that summit.
-Later updates are from discussions among people working in this area.</p>
+Later updates are from discussions among people working in this area.
+Additions and corrections are always welcome.</p>
-<p>During that summit, developers of proprietary IA-64 compilers
+<p>During the June 2001 summit, developers of proprietary IA-64 compilers
stressed that interactions between optimizations for IA-64 can be very
significant, more so than with other architectures. People contributing
IA-64 improvements are highly encouraged to work closely with people
@@ -66,24 +69,19 @@ optimizations could use it?</em></p>
<p>At the GCC IA-64 Summit in June 2001, developers of other IA-64 compilers
said that optimizations involving compiler generated data prefetch are
-important for IA-64 performance. There are other GCC targets that would
-also benefit from this support, so a generalized framework for the support
-will encourage other GCC contributors to consider implementing these
-optimizations.</p>
-
-<p>Jan Hubicka of SuSE submitted a preliminary patch in May 2000 to prefetch
-arrays within loops, but his patch supported only x86 variants SSE and
-3DNow! and wasn't completed. Jan is interested in completing that patch
-to fit a general framework for data prefetch. He is also interested in
-implementing support for "greedy prefetch" of memory referenced by pointers.
-</p>
+important for IA-64 performance.</p>
-<p>Janis Johnson is defining a prefetch RTL pattern that will support data
-prefetch support on a variety of GCC targets and a
-<code>__builtin_prefetch</code> function. This and other work for data
-prefetch support are described in the
+<p>GCC 3.1 includes a prefetch RTL pattern that supports data prefetch on
+a variety of GCC targets, a <code>__builtin_prefetch</code> function, and
+the optimization <code>-fprefetch-loop-arrays</code>. General information
+about data prefetch and about data prefetch instructions supported by a
+variety of GCC targets is described in the
<a href="prefetch.html">Data Prefetch Support</a> section of the Projects
list.</p>
+
+<p>Janis Johnson is trying tweaks to the heuristics used for the
+<code>-fprefetch-loop-arrays</code> optimization to try to get better
+performance on IA-64.</p>
</li>
<li>Use existing dependence distance code
@@ -103,23 +101,26 @@ hooked up to the MEM info struct and use
<li>Make better use of dependence information in scheduling
</li>
-<li>Contribute to the DFA branch
-<p>This branch of the FSF CVS tree includes the Cygnus scheduler written by
-Vladimir Makarov. It uses a new pipeline description model and supports
-software pipelining. The Cygnus scheduler is a very large piece of work.
-Testing has shown only 1-2% performance improvement for Itanium.</p>
-
-<p>In theory, this scheduler should be a lot faster (for compile time) than
-the Haifa scheduler, but in practice it is not, perhaps because it uses
-a much larger model that takes longer to process.
-This scheduler might be the right way to go long-term, but it needs a lot
-of work first. As of November 2001 it's not a high priority for IA-64.</p>
+<li>Contribute to the DFA scheduler
+
+<p>As reported in (and cribbed from) the GCC news/announcements list,
+Vladimir Makarov has contributed a new scheme for describing processor
+pipelines (commonly referred to as the DFA scheduler). This new scheme
+can model certain pipeline architectures more effectively than the old
+scheme, which in turn can improve the code generated by the compiler.
+This work was merged to the GCC CVS mainline from the DFA branch
+in April 2002 and is expected to be in GCC 3.2.</p>
+
+<p>Vlad reported in
+<a href="http://gcc.gnu.org/ml/gcc/2002-04/msg00409.html">
+mail to the GCC list</a> in April 2002
+that he is working on a DFA-based scheduler for IA-64.</p>
</li>
<li>Contribute to the AST optimizer branch
<p>
-This has a larger promise of showing performance improvements for IA-64 in
-the short term, and might be in GCC 3.2. Work that would be particularly
+This has promise of showing performance improvements for IA-64 in
+the short term and might be in GCC 3.2. Work that would be particularly
useful in this area is to make it language-independent;
right now only C and C++ can provide input to it because they are the only
front ends that have tree representations for functions. Getting g95 to
@@ -142,8 +143,9 @@ but couldn't show that it made a differe
<li>Code locality: exploit existing profile-directed block ordering
<p>Jan Hubicka, together with Richard Henderson and Andreas Jaeger,
-recently made several changes to the profile-directed block ordering in GCC,
-which is available through <code>-fbranch-probabilities</code>
+made several changes to the profile-directed block ordering in GCC
+for GCC 3.1. This functionality
+is available through <code>-fbranch-probabilities</code>
using data generated by first compiling with <code>-fprofile-arcs</code>.
This is described in
<a href="http://gcc.gnu.org/news/profiledriven.html">Infrastructure for
@@ -154,9 +156,11 @@ investigate:</p>
<ul>
<li>GCC does not split a function into multiple regions, although that
has been mentioned as a possibility.</li>
-<li>Profile information could be used to improve linearization of the code,
-and for if-conversion to decide which side of the branch should be
-predicated. It could also be used for delay slots.</li>
+<li>Profile information could be used to improve linearization of the code;
+the CFG branch includes some support for trace scheduling.</li>
+<li>Profile information could be used for if-conversion to decide which side
+of the branch should be predicated.</li>
+<li>Profile information is used for predication and delay slots.</li>
</ul>
</li>
@@ -166,7 +170,7 @@ used with GCC.</p>
</li>
<li>Inlining: improve the heuristics used to guide inlining with -O3
-<p>Some of this was done in the summer of 2001 and will be in GCC 3.1.
+<p>Some of this was done in the summer of 2001 and is in GCC 3.1.
There might be more work that could be done here.</p>
</li>
@@ -238,8 +242,8 @@ transformation. Large functions are som
into regions for compilation, with the goal of reducing compile time."
</p>
-<p>Richard Henderson says we could rip out CFG detection, use regular data
-structures, and fix region detection.</p>
+<p>Richard Henderson says we could rip out the Haifa scheduler's CFG
+detection, use regular data structures, and fix region detection.</p>
</li>
<li>Language-independent tree optimizations
<p>Richard Henderson: Cool optimizations require more information than