This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH] updates for projects/ia64.html


This patch updates the "Projects to improve performance on IA-64"
section of the GCC projects list.  It hadn't been updated for several
months and some of the projects that were mentioned have been completed.
It also includes a couple of changes that Richard Henderson requested
long, long ago and that I had forgotten about.

I expect to see additional patches coming soon from IBM and HP.  Anyone
working on something that is expected to improve the performance of the
code GCC generates for IA-64 is encouraged to add to this list to let us
know what you're doing and whether you'd like help.  Some of us use this
list as a source of ideas for new projects.

OK to commit?

Janis

Index: projects/ia64.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/ia64.html,v
retrieving revision 1.6
diff -u -p -r1.6 ia64.html
--- projects/ia64.html	25 Jan 2002 09:57:30 -0000	1.6
+++ projects/ia64.html	30 Apr 2002 20:58:17 -0000
@@ -17,13 +17,16 @@
 <!-- table of contents end -->
 
 <p> This page lists projects that are expected to improve the performance
-of the code that GCC generates for IA-64.  The lists originally came out
+of the code that GCC generates for IA-64, more properly known as IPF
+(Itanium Processor Family).
+The lists originally came out
 of the GCC IA-64 Summit that was held June 6, 2001
 (<a href="http://linuxia64.org/gcc_summit.2001.06.06.html">minutes
 of the summit</a>), and many of the comments are from that summit.
-Later updates are from discussions among people working in this area.</p>
+Later updates are from discussions among people working in this area.
+Additions and corrections are always welcome.</p>
 
-<p>During that summit, developers of proprietary IA-64 compilers
+<p>During the June 2001 summit, developers of proprietary IA-64 compilers
 stressed that interactions between optimizations for IA-64 can be very
 significant, more so than with other architectures.  People contributing
 IA-64 improvements are highly encouraged to work closely with people
@@ -66,24 +69,19 @@ optimizations could use it?</em></p>
 
 <p>At the GCC IA-64 Summit in June 2001, developers of other IA-64 compilers
 said that optimizations involving compiler generated data prefetch are
-important for IA-64 performance.  There are other GCC targets that would
-also benefit from this support, so a generalized framework for the support
-will encourage other GCC contributors to consider implementing these
-optimizations.</p>
-
-<p>Jan Hubicka of SuSE submitted a preliminary patch in May 2000 to prefetch
-arrays within loops, but his patch supported only x86 variants SSE and
-3DNow! and wasn't completed.  Jan is interested in completing that patch
-to fit a general framework for data prefetch.  He is also interested in
-implementing support for "greedy prefetch" of memory referenced by pointers.
-</p>
+important for IA-64 performance.</p>
 
-<p>Janis Johnson is defining a prefetch RTL pattern that will support data
-prefetch support on a variety of GCC targets and a
-<code>__builtin_prefetch</code> function.  This and other work for data
-prefetch support are described in the
+<p>GCC 3.1 includes a prefetch RTL pattern that supports data prefetch on
+a variety of GCC targets, a <code>__builtin_prefetch</code> function, and
+the optimization <code>-fprefetch-loop-arrays</code>.  General information
+about data prefetch and about data prefetch instructions supported by a
+variety of GCC targets is described in the
 <a href="prefetch.html">Data Prefetch Support</a> section of the Projects
 list.</p>
+
+<p>Janis Johnson is tuning the heuristics used for the
+<code>-fprefetch-loop-arrays</code> optimization to get better
+performance on IA-64.</p>
 </li>
 
 <li>Use existing dependence distance code
@@ -103,23 +101,26 @@ hooked up to the MEM info struct and use
 <li>Make better use of dependence information in scheduling
 </li>
 
-<li>Contribute to the DFA branch
-<p>This branch of the FSF CVS tree includes the Cygnus scheduler written by
-Vladimir Makarov.  It uses a new pipeline description model and supports
-software pipelining.  The Cygnus scheduler is a very large piece of work.
-Testing has shown only 1-2% performance improvement for Itanium.</p>
-
-<p>In theory, this scheduler should be a lot faster (for compile time) than
-the Haifa scheduler, but in practice it is not, perhaps because it uses
-a much larger model that takes longer to process.
-This scheduler might be the right way to go long-term, but it needs a lot
-of work first.  As of November 2001 it's not a high priority for IA-64.</p>
+<li>Contribute to the DFA scheduler
+
+<p>As reported in (and cribbed from) the GCC news/announcements list,
+Vladimir Makarov has contributed a new scheme for describing processor
+pipelines (commonly referred to as the DFA scheduler).  This new scheme
+can model certain pipeline architectures more effectively than the old
+scheme, which in turn can improve the code generated by the compiler.
+This work was merged to the GCC CVS mainline from the DFA branch
+in April 2002 and is expected to be in GCC 3.2.</p>
+
+<p>Vlad reported in
+<a href="http://gcc.gnu.org/ml/gcc/2002-04/msg00409.html">
+mail to the GCC list</a> in April 2002
+that he is working on a DFA-based scheduler for IA-64.</p>
 </li>
 
 <li>Contribute to the AST optimizer branch
 <p>
-This has a larger promise of showing performance improvements for IA-64 in
-the short term, and might be in GCC 3.2.  Work that would be particularly
+This has promise of showing performance improvements for IA-64 in
+the short term and might be in GCC 3.2.  Work that would be particularly
 useful in this area is to make it language-independent;
 right now only C and C++ can provide input to it because they are the only
 front ends that have tree representations for functions.  Getting g95 to
@@ -142,8 +143,9 @@ but couldn't show that it made a differe
 
 <li>Code locality: exploit existing profile-directed block ordering
 <p>Jan Hubicka, together with Richard Henderson and Andreas Jaeger,
-recently made several changes to the profile-directed block ordering in GCC,
-which is available through <code>-fbranch-probabilities</code>
+made several changes to the profile-directed block ordering in GCC
+for GCC 3.1.  This functionality
+is available through <code>-fbranch-probabilities</code>
 using data generated by first compiling with <code>-fprofile-arcs</code>.
 This is described in
 <a href="http://gcc.gnu.org/news/profiledriven.html">Infrastructure for
@@ -154,9 +156,11 @@ investigate:</p>
 <ul>
 <li>GCC does not split a function into multiple regions, although that
 has been mentioned as a possibility.</li>
-<li>Profile information could be used to improve linearization of the code,
-and for if-conversion to decide which side of the branch should be
-predicated.  It could also be used for delay slots.</li>
+<li>Profile information could be used to improve linearization of the code;
+the CFG branch includes some support for trace scheduling.</li>
+<li>Profile information could be used for if-conversion to decide which side
+of the branch should be predicated.</li>
+<li>Profile information is used for predication and delay slots.</li>
 </ul>
 </li>
 
@@ -166,7 +170,7 @@ used with GCC.</p>
 </li>
 
 <li>Inlining: improve the heuristics used to guide inlining with -O3
-<p>Some of this was done in the summer of 2001 and will be in GCC 3.1.
+<p>Some of this was done in the summer of 2001 and is in GCC 3.1.
 There might be more work that could be done here.</p>
 </li>
 
@@ -238,8 +242,8 @@ transformation.  Large functions are som
 into regions for compilation, with the goal of reducing compile time."
 </p>
 
-<p>Richard Henderson says we could rip out CFG detection, use regular data
-structures, and fix region detection.</p>
+<p>Richard Henderson says we could rip out the Haifa scheduler's CFG
+detection, use regular data structures, and fix region detection.</p>
 </li>
 <li>Language-independent tree optimizations
 <p>Richard Henderson:  Cool optimizations require more information than
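
A side note for readers who have not used the data prefetch support
mentioned in the patch: the following is a minimal C sketch, not taken
from the GCC sources, of how __builtin_prefetch might be called by hand
in a loop.  The function name sum_with_prefetch and the value of
PREFETCH_DISTANCE are hypothetical and chosen only for illustration; a
useful distance depends on the target and would have to be measured.

```c
#include <stddef.h>

/* Hypothetical tuning value: how many elements ahead to prefetch.
   This is an assumption for illustration, not a recommended setting. */
#define PREFETCH_DISTANCE 16

/* Sum an array while hinting that an element further ahead will be
   read soon.  The prefetch is only a hint; the result is the same
   whether or not the target implements it. */
long sum_with_prefetch(const long *a, size_t n)
{
    long total = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        /* Second argument 0 = prefetch for read; third argument 3 =
           high temporal locality.  Stay inside the array bounds. */
        if (i + PREFETCH_DISTANCE < n)
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 3);
        total += a[i];
    }
    return total;
}
```

Compiling a plain version of the same loop with -fprefetch-loop-arrays
asks GCC to insert comparable prefetches itself, subject to its own
heuristics.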

