This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



RFA: merging dfa-branch into the main trunk


  Hello, I'd like to request approval for merging the dfa-branch into
the main trunk.  Currently the branch contains DFA descriptions for
ultrasparc and sh4.

David Edelsohn <dje@watson.ibm.com> wrote:

>         Does the new scheduler produce any significant performance
> improvement in the code GCC generates commensurate with its compile-time
> cost?
> 
>         Both my experience with the new DFA scheduler targeted at PowerPC
> delivered with GNUPro and the GCC for IA-64 Summit minutes report that the
> scheduler compile time was no faster than the Haifa Scheduler and the
> scheduler did not generate faster code.  The software pipelining did
> not seem to be effective either, mainly because of dependency information
> infrastructure still lacking in GCC.
> 
>         In theory, the DFA scheduler and software pipelining should be
> much better.  Until the new work demonstrates an improvement and is shown
> to be robust, I think it should be on a branch, as GCC's development
> policy specifies.
> 
>         What evidence shows that the DFA scheduler and software pipeliner
> have evolved beyond a work in progress?

  David Miller wrote a pipeline description for the ultrasparc
processor.  Dan Nicolaesku ran SPEC95 tests comparing gcc with the
traditional and the dfa-based schedulers:

http://gcc.gnu.org/ml/gcc/2001-11/msg00736.html

  The DFA-based scheduler generated better code on every SPEC95
test.  One SPECfp95 test was sped up by as much as 11%.

  Another important point is that sparc.c shrank by several hundred
lines, most of which were code tuning the scheduler for ultrasparc
to get better insn scheduling.

  Naveen Sharma wrote an sh4 dfa pipeline description and reported a
12-13% improvement on the SLALOM benchmark.

http://gcc.gnu.org/ml/gcc-patches/2001-12/msg02157.html

  Mike Meissner reported a 5-15 percent speedup from switching to the
DFA scheduler for a particular MIPS target on EEBC.

http://gcc.gnu.org/ml/gcc/2001-09/msg00061.html

  As for ppc, I've tried several times to improve the description for
ppc750/ppc7400 but saw no visible improvement on real tests like
gcc.  I did get some improvement on a few small tests, e.g. hanoi
(about 3%).  The first run below uses -mdfa; the second does not.

/home/vmakarov/build/gcc-dfa-branch1/toymac/bin/gcc -O2 -mcpu=7400
/home/vmakarov/aburto/hanoi/hanoi.c -DUNIX -o f1 -lm -g -mdfa;./f1
/home/vmakarov/aburto/hanoi/hanoi.c: In function `main':
/home/vmakarov/aburto/hanoi/hanoi.c:64: warning: return type of `main'
is not `int'

Towers of Hanoi Puzzle Test Program (27 Oct 94)

Disks     Moves     Time(sec)   Moves/25usec
 16       65535       0.00000         inf
 17      131071       0.02000    163.8388
 18      262143       0.02000    327.6788
 19      524287       0.04000    327.6794
 20     1048575       0.08000    327.6797
 21     2097151       0.16000    327.6798
 22     4194303       0.31000    338.2502
 23     8388607       0.62000    338.2503
 24    16777215       1.24000    338.2503
 25    33554431       2.47000    339.6197
 26    67108863       4.94000    339.6198
 27   134217727       9.88000    339.6198
 28   268435455      19.77000    339.4480
 29   536870911      39.64000    338.5916

Average Moves Per 25 usec =   337.7033

/home/vmakarov/build/gcc-dfa-branch1/toymac/bin/gcc -O2 -mcpu=7400
/home/vmakarov/aburto/hanoi/hanoi.c -DUNIX -o f1 -lm -g;./f1

/home/vmakarov/aburto/hanoi/hanoi.c: In function `main':
/home/vmakarov/aburto/hanoi/hanoi.c:64: warning: return type of `main'
is not `int'

Towers of Hanoi Puzzle Test Program (27 Oct 94)

Disks     Moves     Time(sec)   Moves/25usec
 16       65535       0.00000         inf
 17      131071       0.02000    163.8388
 18      262143       0.02000    327.6788
 19      524287       0.04000    327.6794
 20     1048575       0.08000    327.6797
 21     2097151       0.16000    327.6798
 22     4194303       0.32000    327.6799
 23     8388607       0.64000    327.6800
 24    16777215       1.27000    330.2601
 25    33554431       2.55000    328.9650
 26    67108863       5.10000    328.9650
 27   134217727      10.19000    329.2878
 28   268435455      20.37000    329.4495
 29   536870911      40.74000    329.4495

Average Moves Per 25 usec =   328.8241
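
  The "about 3%" figure can be checked against the two runs above.
The following is only a small Python sketch, not part of the original
measurements: the helper name percent_gain is hypothetical, and the
constants are copied by hand from the two outputs.

```python
def percent_gain(better, baseline):
    """Relative improvement of `better` over `baseline`, in percent."""
    return (better / baseline - 1.0) * 100.0

# Average moves per 25 usec, copied from the two runs above.
AVG_WITH_MDFA = 337.7033
AVG_WITHOUT_MDFA = 328.8241

# Time in seconds for 29 disks, copied from the two runs above.
TIME_WITH_MDFA = 39.64
TIME_WITHOUT_MDFA = 40.74

throughput_gain = percent_gain(AVG_WITH_MDFA, AVG_WITHOUT_MDFA)
# For times, lower is better, so the slower (non -mdfa) time is the numerator.
time_gain = percent_gain(TIME_WITHOUT_MDFA, TIME_WITH_MDFA)

print(f"throughput gain: {throughput_gain:.2f}%")
print(f"29-disk time gain: {time_gain:.2f}%")
```

Both numbers come out a little under 3%, consistent with the figure
quoted above.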

  The old description model cannot describe ppc750/ppc7400
reservations of the kind 2-1-1 (where the numbers are the cycles an
insn spends in each floating point pipeline stage) or the
serialization of cr* insns.  That is probably why the hanoi test
improved.  But overall, as I said, there is no visible advantage in
using the dfa-scheduler for the ppc processors.  The only explanation
I have is that the ppc processors are out-of-order execution
processors.
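
  To make the 2-1-1 notation concrete, here is a minimal Python
sketch of what such a reservation implies for issue timing.  It is a
plain reservation-table simulation written for illustration, not the
automaton the branch generates, and the function names (expand,
issue) are hypothetical.

```python
def expand(reservation):
    """Turn per-stage cycle counts, e.g. (2, 1, 1), into the list of
    (cycle_offset, stage) slots an insn occupies after it is issued."""
    slots, cycle = [], 0
    for stage, length in enumerate(reservation):
        for _ in range(length):
            slots.append((cycle, stage))
            cycle += 1
    return slots

def issue(busy, reservation, start=0):
    """Issue an insn at the earliest cycle >= start with no structural
    conflict, mark its slots in `busy`, and return that cycle."""
    slots = expand(reservation)
    cycle = start
    while any((cycle + off, stage) in busy for off, stage in slots):
        cycle += 1
    busy.update((cycle + off, stage) for off, stage in slots)
    return cycle

# Three back-to-back insns with a 2-1-1 reservation: the first stage is
# held for two cycles, so a new insn can start only every second cycle.
busy_211 = set()
cycles_211 = [issue(busy_211, (2, 1, 1)) for _ in range(3)]

# The same three insns on a fully pipelined 1-1-1 unit, for comparison.
busy_111 = set()
cycles_111 = [issue(busy_111, (1, 1, 1)) for _ in range(3)]

print(cycles_211, cycles_111)
```

A description that can only give one latency per insn cannot tell
these two cases apart, which is the kind of limitation described
above.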

  The dfa-based scheduler definitely has an advantage for classical
(not out-of-order/speculative) RISC processors.  The more irregular
the processor's pipelines are (e.g. sh4), the bigger the improvement.

  Actually, I did not expect such a big improvement for ultrasparc
and sh4.  As I already wrote, I originally positioned the DFA-based
pipeline hazard recognizer more as infrastructure: a simpler
interface, a more readable description, a faster pipeline hazard
recognizer, and finally a basis for better future insn schedulers
that can easily try more insn schedules in the same time and choose
the best one.
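
  The reason a DFA-based recognizer can be fast is that all
reachable pipeline states are enumerated ahead of time, so a hazard
query during scheduling becomes a table lookup.  The Python sketch
below is a toy illustration of that idea only; it makes no claim to
match the algorithm used on the branch, and the unit names, insn
classes, and the "<cycle>" symbol are all hypothetical.

```python
from collections import deque

# Hypothetical insn classes: the set of units used in each cycle after issue.
RESERVATIONS = {
    "fp":  [{"fp0"}, {"fp0"}, {"fp1"}, {"fp2"}],  # a 2-1-1 style reservation
    "int": [{"alu"}],
}

def advance(state):
    """One cycle passes: drop cycle-0 slots and shift the rest down."""
    return frozenset((c - 1, u) for c, u in state if c > 0)

def try_issue(state, insn):
    """Return the state after issuing `insn`, or None on a structural hazard."""
    need = {(c, u) for c, units in enumerate(RESERVATIONS[insn]) for u in units}
    if need & state:
        return None
    return state | frozenset(need)

def build_automaton():
    """Enumerate every reachable state and its transitions (breadth first)."""
    start = frozenset()
    table = {}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state in table:
            continue
        row = {insn: try_issue(state, insn) for insn in RESERVATIONS}
        row["<cycle>"] = advance(state)
        table[state] = row
        for nxt in row.values():
            if nxt is not None and nxt not in table:
                queue.append(nxt)
    return start, table

START, TABLE = build_automaton()

# At scheduling time a hazard check is just a dictionary lookup.
after_fp = TABLE[START]["fp"]
print("states:", len(TABLE))
print("second fp in same cycle allowed:", TABLE[after_fp]["fp"] is not None)
```

Because each query is a lookup, a scheduler built on such a table
could afford to evaluate several candidate schedules where a slower
recognizer would allow only one.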

  I think the current dfa-based scheduler has proved itself useful
and robust and deserves to be moved into the main trunk.

  After the dfa-branch code is moved into the main trunk, I could
commit RCSP to the branch and continue improving it there.

  I would like to thank Richard Henderson, Mike Meissner, David
Miller, Naveen Sharma, Bernd Schmidt, Jan Hubicka, Dan Nicolaesku and
all others (sorry if I missed you) for their help and useful
comments.

Vladimir Makarov

