This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Failed attempt to improve FP register allocation on alpha


I was trying to improve floating-point register allocation in
the 1/2/2000 mainline compiler, using the options

gcc -mcpu=ev6 -fno-math-errno -mieee -fPIC -O2

on the alpha.  On ev6 with IEEE code generation, the result register
cannot be one of the argument registers; this is noted with early
clobbers in alpha.md.  So rtl that basically looks like two-address code,
e.g., fadd r1, r2, r1, must eventually be rewritten.
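To make the constraint concrete, here is a minimal sketch in C of a toy
three-address instruction and the check described above.  The names
(struct fp_insn, violates_early_clobber, fix_with_scratch) are hypothetical
and are not part of gcc; integer register numbers merely stand in for rtl
registers, and the "fix" only models retargeting the result to a scratch
register followed by a copy (the fmov).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy model of a three-address FP instruction:
   dst = src1 op src2.  Not a gcc data structure.  */
struct fp_insn
{
  int dst;
  int src1;
  int src2;
};

/* True when the destination overlaps a source operand, which an
   early-clobber output constraint forbids.  */
static bool
violates_early_clobber (const struct fp_insn *insn)
{
  return insn->dst == insn->src1 || insn->dst == insn->src2;
}

/* If INSN violates the constraint, retarget it to SCRATCH and store the
   original destination in *COPY_DST, so the caller can emit a copy
   (an fmov) from SCRATCH to *COPY_DST.  Returns true when a copy is
   needed; otherwise leaves INSN and *COPY_DST untouched.  */
static bool
fix_with_scratch (struct fp_insn *insn, int scratch, int *copy_dst)
{
  assert (insn != NULL && copy_dst != NULL);
  if (!violates_early_clobber (insn))
    return false;
  /* The scratch register must not itself overlap a source.  */
  assert (scratch != insn->src1 && scratch != insn->src2);
  *copy_dst = insn->dst;
  insn->dst = scratch;
  return true;
}
```

In this model, fadd r1, r2, r1 (result in the last operand) becomes an add
into a scratch register plus one copy back to r1, which is the extra fmov
discussed below.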

This isn't noticed as a problem until pass 13, global register allocation,
which struck me as rather late in the compilation process.  It is fixed
there by adding fmovs where necessary, which, by my estimate, leads to a
20-25% slowdown in FP performance.  So I decided to try changing how the
rtl is generated, so that the problem could be dealt with earlier in the
compilation process, by changing expand_binop and expand_unop as follows:

===================================================================
RCS file: RCS/optabs.c,v
retrieving revision 1.1
diff -c -r1.1 optabs.c
*** optabs.c    2000/01/03 07:59:16     1.1
--- optabs.c    2000/01/03 08:16:11
***************
*** 608,613 ****
--- 608,616 ----
    rtx last;
  
    class = GET_MODE_CLASS (mode);
+   if (class == MODE_FLOAT && <compiling for alphaev6 with -mieee>
+       && (rtx_equal_p (target, op0) || rtx_equal_p (target, op1)))
+     target = 0;
  
    op0 = protect_from_queue (op0, 0);
    op1 = protect_from_queue (op1, 0);
***************
*** 1991,1996 ****
--- 1994,2002 ----
    rtx pat;
  
    class = GET_MODE_CLASS (mode);
+   if (class == MODE_FLOAT && <compiling for alphaev6 with -mieee>
+       && rtx_equal_p (target, op0))
+     target = 0;
  
    op0 = protect_from_queue (op0, 0);
  
With this change the offending instructions were no longer generated, and
the fmovs were added to the original rtl instead.  I was hoping that later
passes of the compiler would be able to remove them (you know, expose the
low-level problems and instructions early, and let the general algorithms
deal with them and fix them, that sort of philosophy).

Well, this failed miserably, and resulted in exactly the same number
of fmovs, only in different places.  The problem seemed to be that the
original rtl generation insisted on putting each C-level variable in
the same register each time it was used, instead of choosing a different
symbolic (pseudo?) register each time the variable was reintroduced
after being killed.  So I now see the problem as a symptom of
statement-level rtl generation without any knowledge of when variables
are killed; perhaps it can be fixed if C is moved to function-level rtl
generation.
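The difference between the two register-assignment schemes can be sketched
in C as follows.  This is only an illustration under stated assumptions: it
handles straight-line code only, it gives a fresh pseudo to every definition
(an SSA-style renaming, which is stronger than renaming only after a
variable is killed), and the names (struct stmt, conflicts_fixed,
conflicts_renamed, MAX_VARS) are hypothetical rather than anything in gcc.
It shows only that the destination/source overlap disappears at the pseudo
level, not that the allocator would then avoid the fmovs.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VARS 8  /* Hypothetical limit on distinct C-level variables.  */

/* Hypothetical statement: var[dst] = var[src1] op var[src2].  */
struct stmt
{
  int dst;
  int src1;
  int src2;
};

/* Count statements whose destination pseudo equals a source pseudo when
   every variable keeps one fixed pseudo (pseudo number == variable).  */
static int
conflicts_fixed (const struct stmt *s, size_t n)
{
  int count = 0;
  for (size_t i = 0; i < n; i++)
    if (s[i].dst == s[i].src1 || s[i].dst == s[i].src2)
      count++;
  return count;
}

/* Count the same overlaps when each definition gets a fresh pseudo.
   current[v] is the pseudo that presently holds variable v.  */
static int
conflicts_renamed (const struct stmt *s, size_t n)
{
  int current[MAX_VARS];
  int next_pseudo = MAX_VARS;
  int count = 0;

  for (int v = 0; v < MAX_VARS; v++)
    current[v] = v;

  for (size_t i = 0; i < n; i++)
    {
      assert (s[i].dst >= 0 && s[i].dst < MAX_VARS);
      assert (s[i].src1 >= 0 && s[i].src1 < MAX_VARS);
      assert (s[i].src2 >= 0 && s[i].src2 < MAX_VARS);

      int a = current[s[i].src1];
      int b = current[s[i].src2];
      int d = next_pseudo++;    /* Fresh pseudo for this definition.  */

      if (d == a || d == b)
        count++;
      current[s[i].dst] = d;
    }
  return count;
}
```

For a sequence such as x = x + y; z = x + y; y = y + y, the fixed scheme
has two statements whose destination is also a source, while the renamed
scheme has none by construction.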

(I've also tried to change expand_store so it doesn't generate
the illegal fp instructions, but I got the same result.)

As for my particular problem, I've decided that the best way to fix
it is to change the C code generation (the C code is actually the
output of a Scheme->C compiler) to work around the problem in gcc.

Brad Lucier       lucier@math.purdue.edu
