This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Fix for i386 APPLU regression (Was Re: Fix to i386 binarry operation predicates)
> Jan Hubicka wrote:
>
> > I wrote:
>
> > > Can this be tightened up, e.g., by Jan's own suggestion:
> > >
> > > "Perhaps I can go with only single nonimmediate operand in the operand 1
> > > and rewrite predicates as "=f" "fm" "0", but I feel somewhat
> > > unconfortable about such reversed order isntructions."
>
> > I am just running benchmark comparison of this at my home machine. I will
> > know results afternoon.
>
> Thanks for looking into this.
Hi,
the applu regression shows very common scenario of posimizing i387 code that is
caused by crazy corelation between regmove, reload and reg-stack passes. In
particular the following pattern:
(define_insn "*fop_sf_1_nosse"
[(set (match_operand:SF 0 "register_operand" "=f,f")
(match_operator:SF 3 "binary_fp_operator"
[(match_operand:SF 1 "nonimmediate_operand" "0,fm")
(match_operand:SF 2 "nonimmediate_operand" "fm,0")]))]
Always get which_alternative == 0 for regmove and non-strict matching thus
regmove believes this is the typical 2 address operation. What it does
is not always fruitfull then. All specFP benchmarks are compiled into shorter
code with -fno-regmove, while all specInt benchmarks, except for eon are
longer.
One particular problem is that regmove decides to emit the reg-reg copy if it
fails to rename registers to avoid the reload. In case the operand 1 does not
die, the result is unnecesarily increased register pressure.
After some thinking, I see point for doing the copy in other cases, but not
for instruction like (a=b op c) where no operands die. Then the resulting
copy instruction is just extra instruction and does not make the conflict
graph or register allocations different. Reload would create it anyway
if really required (not in this case). Perhaps this code has been fruitfull
before reload used full liveness information.
Just disabling the first case (creating copy when instruction has no REG_DEAD
notes) helps to many specFP benchmarks, but not to the applu.
Second case, where it may make some sense is is situation where c dies and a
requires to be the same as b. Such instruction will always require one reload,
but register allocation may manage c and a to be in the hard register. For
i387 reload will then match other alternative and win, but for "standard"
binary operation one extra copy of c will be needed.
I've done testing on i386 integer code (by compiling cc1) and found
that this sitation happends much less in practice than cases where the
regmove action pesimizes code.
Disabling the copy saves 10% of instructions for applu and as cross-check about
1Kb from cc1 binarry and 0.5% from mesa, so I believe we should just disable
the code. I also believe i386 is the extreme case and it will result in less
effect to code for other ports.
The patch has been rendered into strange form, it really just moves the
conditional in front of others and adds comment.
Honza
Sat Feb 16 20:41:45 CET 2002 Jan Hubicka <jh@suse.cz>
* regmove.c (regmove_optimize):
*** regmove.c.old Sat Feb 16 20:32:44 2002
--- ../gcc/regmove.c Sat Feb 16 20:35:31 2002
*************** regmove_optimize (f, nregs, regmove_dump
*** 1328,1346 ****
}
src_class = reg_preferred_class (REGNO (src));
dst_class = reg_preferred_class (REGNO (dst));
! if (! regclass_compatible_p (src_class, dst_class))
{
! if (!copy_src)
! {
! copy_src = src;
! copy_dst = dst;
! }
continue;
}
! /* Can not modify an earlier insn to set dst if this insn
! uses an old value in the source. */
! if (reg_overlap_mentioned_p (dst, SET_SRC (set)))
{
if (!copy_src)
{
--- 1328,1349 ----
}
src_class = reg_preferred_class (REGNO (src));
dst_class = reg_preferred_class (REGNO (dst));
!
! if (! (src_note = find_reg_note (insn, REG_DEAD, src)))
{
! /* We used to force the copy here like in other cases, but
! it produces worse code, as it eliminates no copy
! instructions and the copy emitted will be produced by
! reload anyway. On patterns with multiple alternatives,
! there may be better sollution availble.
!
! In particular this change produced slower code for numeric
! i387 programs. */
!
continue;
}
! if (! regclass_compatible_p (src_class, dst_class))
{
if (!copy_src)
{
*************** regmove_optimize (f, nregs, regmove_dump
*** 1350,1356 ****
continue;
}
! if (! (src_note = find_reg_note (insn, REG_DEAD, src)))
{
if (!copy_src)
{
--- 1353,1361 ----
continue;
}
! /* Can not modify an earlier insn to set dst if this insn
! uses an old value in the source. */
! if (reg_overlap_mentioned_p (dst, SET_SRC (set)))
{
if (!copy_src)
{
*************** regmove_optimize (f, nregs, regmove_dump
*** 1359,1365 ****
}
continue;
}
-
/* If src is set once in a different basic block,
and is set equal to a constant, then do not use
--- 1364,1369 ----