[wwwdocs] cfg branch homepage update

Mon Jan 28 08:36:00 GMT 2002

> Jan,
> 
> thanks for keeping this web-page up-to-date, this really is very useful.
> 
> However, may I suggest that in the future you use a spell-checker (I just
> verified, and Aspell for example really works fine for HTML documents)?
OK, I will try.
I do use the simple trick of teaching ispell all the HTML tags I use, but
I will take a look at Aspell as well.
> 
> Below there are a couple of comments, would you mind addressing these?

I've installed the attached patch that I hope fixes the problem.  I am
leaving for vacation soon, so in case there are further issues, just
fix the page yourself.

Thanks for the comments!

Honza

Index: cfg.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cfg.html,v
retrieving revision 1.4
diff -c -3 -p -r1.4 cfg.html
*** cfg.html	2002/01/24 21:07:28	1.4
--- cfg.html	2002/01/28 16:23:08
*************** optimization by other passes. It brings 
*** 69,75 ****
  on specint2000 with profile feedback.  Using static profile estimation
  it is pretty much hit or miss.</p>

! The code is on the branch, it needs to be tunned.

  <h3>Loop Peeling</h3>

--- 69,75 ----
  on specint2000 with profile feedback.  Using static profile estimation
  it is pretty much hit or miss.</p>

! The code is on the branch, it needs to be tuned.

  <h3>Loop Peeling</h3>

*************** the loop.
*** 88,94 ****
 The loop peeling optimization will be done as part of the
 superblock formation. The tracer should peel as many iterations as
 can be predicted. Zdenek has also implemented simple loop peeling
! as part of new loop optimizer.

  <h4>Status</h4>
--- 88,94 ----
  <p>The loop peeling optimization will be done as part of the
  superblock formation.  The tracer should peel as many iterations as
  can be predicted.  Zdenek has also implemented simple loop peeling
! as part of the new loop optimizer.
  </p>

  <h4>Status</h4>
*************** to peel more than one iteration.</p>
*** 105,112 ****

  <h4>Theory</h4>

! <p>Current loop optimizer do use information passed by frontend
! to discover loop construct to simplify flow analysis.
  It is dificult to keep the information up-to-date and nowday
  it is easy to implement the loop discovery code on CFG.
  </p>
--- 105,112 ----

  <h4>Theory</h4>

! The current loop optimizer uses information passed by frontend
! to discover loop constructs to simplify flow analysis.
 It is dificult to keep the information up-to-date and nowday
 it is easy to implement the loop discovery code on CFG.
 
*************** it is easy to implement the loop discove
*** 114,120 ****
 <h4>Implementation in GCC</h4>

  <p>
! We want to use the Michael Hayes loop discovery code and slowly
  replace existing features of loop optimizer by new one.
  So far Zdenek has modified the datastructure to allow easier
  updating and implemented loop unrolling, peeling and unswitching
--- 114,120 ----
  <h4>Implementation in GCC</h4>

! We want to use the Michael Hayes' loop discovery code and slowly
 replace existing features of loop optimizer by new one.
 So far Zdenek has modified the datastructure to allow easier
 updating and implemented loop unrolling, peeling and unswitching
*************** on the top of it.
*** 123,129 ****

  <h4>Status</h4>

! The main changed are on the branch, the optimizations itself
 are to come later.

  <p>The tracer peels loops with one predicted iteration. We should try
--- 123,129 ----

  <h4>Status</h4>

! The main changes are on the branch, the optimizations itself
 are to come later.

The tracer peels loops with one predicted iteration. We should try
*************** interprocedural optimizations are out of
*** 154,160 ****
 <h4>Status</h4>

The code is in the branch. The exact benefits needs to be measured
! but on non-PDO runs it brings 5% to eon benchamrk.

  <h3>Web Construction Pass</h3>

--- 154,160 ----
  <h4>Status</h4>

The code is in the branch. The exact benefits needs to be measured
! but on non-PDO runs it brings 5% to eon benchmark (from CPU2000).

  <h3>Web Construction Pass</h3>

*************** to work on updating debug information in
*** 183,199 ****

  <h3>Register coalescing Pass</h3>

! This pass coalesces multiple registers into single in order to
 avoid register to register copies that our register allocator is not
 able to deal with very well. It is a kind of temporary solution until
! the new register allocator is in place. The benefits after than are
 questionable, but still it is more effective (and probably less
 expensive) than the current copy propagation implementation.

  <h4>Implementation in GCC</h4>

It is designed as stand alone pass constructing conflict graph and
! coalescing register run after GCSE.

  <h4>Status</h4>

--- 183,199 ----

  <h3>Register coalescing Pass</h3>

! This pass coalesces multiple registers into single one in order to
 avoid register to register copies that our register allocator is not
 able to deal with very well. It is a kind of temporary solution until
! the new register allocator is in place. At that point, the benefits are
 questionable, but still it is more effective (and probably less
 expensive) than the current copy propagation implementation.

  <h4>Implementation in GCC</h4>

It is designed as stand alone pass constructing conflict graph and
! coalescing registers run after GCSE.

  <h4>Status</h4>

*************** Profile data is used to control the amou
*** 231,239 ****

  <h4>Status</h4>

! The code is on the branch. Benefits for Athlon CPU are not emasurable.
 This can be saved by adding more simplifications. Benefit for wider issue
! CPUs needs to be evaulated.

  <h3>High Level Branch Prediction</h3>

--- 231,239 ----

  <h4>Status</h4>

! The code is on the branch. Benefits for Athlon CPU are not measurable.
 This can be saved by adding more simplifications. Benefit for wider issue
! CPUs needs to be evaluated.

  <h3>High Level Branch Prediction</h3>

*************** href="mailto:bim@atrey.karlin.mff.cuni.c
*** 297,303 ****
 Currently the profile is fragile, since there is no verification
 that the compiled program matches the profiled data. Since the
 profiler eliminates the redundancy in data, the mismatch is often not
! discovered at all thereby making results to be completely
 nonsense.

  <h4>Implementation in GCC</h4>
--- 297,303 ----
  <p>Currently the profile is fragile, since there is no verification
  that the compiled program matches the profiled data.  Since the
  profiler eliminates the redundancy in data, the mismatch is often not
! discovered at all causing completely nonsensical results.
  nonsense.</p>

  <h4>Implementation in GCC</h4>
*************** in <a href="#3">[3]</a>.</p>
*** 316,322 ****

  <h4>Status</h4>

! <p>First version is included in the branch. Due to ability to compute
  overall summary of counters it allows detection of hot spots in program
  and improves to detect hot spots better reducing code size and improving
  performance occasionally.</p>
--- 316,322 ----

  <h4>Status</h4>

! First version is included in the branch. Due to the ability to compute
 overall summary of counters it allows detection of hot spots in program
 and improves to detect hot spots better reducing code size and improving
 performance occasionally.
*************** passes.
*** 385,391 ****

  <h4>Implementation in GCC</h4>

! <p>The following outline has been writen by Jeff Law:
  <PRE>
  Particularly for the task of building the machine description for the generic
  RTL and the translation from generic RTL into target specific RTL.  Those two
--- 385,391 ----

  <h4>Implementation in GCC</h4>

! The following outline has been written by Jeff Law:
 <PRE>
 Particularly for the task of building the machine description for the generic
 RTL and the translation from generic RTL into target specific RTL. Those two
*************** duplication.</li>
*** 470,477 ****
 <li>Avoid rebuilding the CFG for <code>toplev.c</code>.</li>
 </ul>

! <li>Opcode heuristics tweeks.</li>
! Avoid predicting equivality comparisons to 0 and floating point comparisons
 to increase hitrate of opcode heursitic.

  <h2>Branch Status</h2>
--- 470,477 ----
  <li>Avoid rebuilding the  CFG for <code>toplev.c</code>.</li>
  </ul>

! <li>Opcode heuristics tweak.</li>
! Avoid predicting comparisons for equality to 0 and floating point comparisons
 to increase hitrate of opcode heursitic.

  <h2>Branch Status</h2>