This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH][RFC] Fix PR71632, remove parts of TER


On Fri, 1 Jul 2016, Jakub Jelinek wrote:

> On Thu, Jun 30, 2016 at 03:51:20PM +0200, Richard Biener wrote:
> > The following patch fixes PR71632 by removing delayed expansion of
> > TERed defs.  Instead it adds code to apply the scheduling effect
> > to the GIMPLE IL (so you also get better interleaved GIMPLE stmt
> > / generated RTL dumps in .expand).
> 
> Does anything from TER survive after this patch?
> I thought the whole point was that the expansion can see through
> the SSA_NAMEs and optimize based on that, by not seeing through
> them it doesn't, or if it somewhere still uses get_gimple_for_ssa_name,
> if the definition will be already expanded, it might expand stuff multiple
> times.

Yes, get_gimple_for_ssa_name is what survives (also the scheduling
effect, though that is now applied on GIMPLE).  And yes, I noted the
issue of multiple expansions with get_gimple_for_ssa_name in 2)
(and that this is probably no worse than multiple expansion through
lazy expansion that ultimately fails).

And yes, it no longer sees through SSA names for the cases where
the lazy SSA name expansion returned something that is !REG_P.

Given the test results show some regressions and the patched cc1 is
0.2% larger, the patch obviously isn't ready yet.

Still, I believe it is ultimately the way to go, as in theory
fwprop/combine should have everything at hand to recover the original
expansion from the single-use reg defs.

gcc.target/i386/xorps-sse2.c, for example, shows a missing transform
on GIMPLE.  The comment in the testcase,
/* Test that we generate xorps when the result is used in FP math.  */
is not reflected by what we do at expansion, which decides locally
to rewrite the vector int f_int ^ g operation as a float
operation (and subregs the result into vector int).  So it would
do that even if the result is used in an integer operation:

vector int x(vector float f, vector int h)
{
  vector int g = { 0x80000000, 0, 0x80000000, 0 };
  vector int f_int = (vector int) f;
  return (f_int ^ g) + h;
}

turns into

x:
.LFB1:
        .cfi_startproc
        xorps   .LC0(%rip), %xmm0
        paddd   %xmm1, %xmm0
        ret

but by the testcase's logic it should have used pxor instead.
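
For anyone wanting to reproduce the comparison, here is a minimal
self-contained sketch.  It assumes the testcase's `vector` keyword can
be stood in for by GCC's vector_size attribute; the typedef names
(v4si, v4sf) and function names (xor_fp_use, xor_int_use) are
hypothetical and not taken from the testcase.  It puts an FP use and an
integer use of the same XOR side by side so the instruction chosen for
each can be inspected:

```c
#include <limits.h>

/* Assumed stand-ins for the testcase's "vector int" / "vector float".  */
typedef int   v4si __attribute__ ((vector_size (16)));
typedef float v4sf __attribute__ ((vector_size (16)));

/* XOR result feeds FP math: the case the testcase comment describes.  */
v4sf
xor_fp_use (v4sf f, v4sf h)
{
  v4si g = { INT_MIN, 0, INT_MIN, 0 };  /* sign bit set in lanes 0 and 2 */
  v4si f_int = (v4si) f;                /* reinterpret bits, no conversion */
  return (v4sf) (f_int ^ g) + h;
}

/* XOR result feeds integer math: the case from the example above.  */
v4si
xor_int_use (v4sf f, v4si h)
{
  v4si g = { INT_MIN, 0, INT_MIN, 0 };
  v4si f_int = (v4si) f;
  return (f_int ^ g) + h;
}
```

Compiling this with something like `gcc -O2 -msse2 -S` and comparing the
two function bodies should show which XOR instruction expansion picks
for each use.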

Richard.

