This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][RFC] Fix PR71632, remove parts of TER
- From: Richard Biener <rguenther at suse dot de>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Jeff Law <law at redhat dot com>, Michael Matz <matz at suse dot de>, Andrew MacLeod <amacleod at redhat dot com>, gcc-patches at gcc dot gnu dot org
- Date: Fri, 1 Jul 2016 10:02:44 +0200 (CEST)
- Subject: Re: [PATCH][RFC] Fix PR71632, remove parts of TER
- References: <alpine dot LSU dot 2 dot 11 dot 1606301502410 dot 29772 at t29 dot fhfr dot qr> <20160701073923 dot GT7387 at tucnak dot redhat dot com>
On Fri, 1 Jul 2016, Jakub Jelinek wrote:
> On Thu, Jun 30, 2016 at 03:51:20PM +0200, Richard Biener wrote:
> > The following patch fixes PR71632 by removing delayed expansion of
> > TERed defs. Instead it adds code to apply the scheduling effect
> > to the GIMPLE IL (so you also get better interleaved GIMPLE stmt
> > / generated RTL dumps in .expand).
>
> Does anything from TER survive after this patch?
> I thought the whole point was that the expansion can see through
> the SSA_NAMEs and optimize based on that, by not seeing through
> them it doesn't, or if it somewhere still uses get_gimple_for_ssa_name,
> if the definition will be already expanded, it might expand stuff multiple
> times.
Yes, get_gimple_for_ssa_name is what survives (along with the scheduling
effect, though that is now applied on GIMPLE). And yes, I noted the
issue of multiple expansions with get_gimple_for_ssa_name in 2)
(and that this is probably no worse than the multiple expansion caused
by lazy expansion that ultimately fails).
And yes, it no longer sees through SSA names in the cases where
the lazy SSA name expansion returned something that is !REG_P.
Given that the test results show some regressions and the patched cc1
is 0.2% larger, the patch obviously isn't ready yet.
Still, I believe it is ultimately the way to go, as in theory
fwprop/combine should have everything at hand to recover the original
expansion from the single-use reg defs.
gcc.target/i386/xorps-sse2.c, for example, shows a missing transform
on GIMPLE. The comment in the testcase
/* Test that we generate xorps when the result is used in FP math. */
is not reflected by what we do at expansion, which decides locally
for the vector int f_int ^ g operation to rewrite it as a float
operation (and subregs the result into vector int). So it would
do that even if the result is used in an integer operation:
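/* Assumed here, as in the testcase, so the snippet compiles on its own.  */
#define vector __attribute__ ((vector_size (16)))

vector int x(vector float f, vector int h)
{
  vector int g = { 0x80000000, 0, 0x80000000, 0 };
  vector int f_int = (vector int) f;
  return (f_int ^ g) + h;
}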
turns into
x:
.LFB1:
        .cfi_startproc
        xorps   .LC0(%rip), %xmm0
        paddd   %xmm1, %xmm0
        ret
but by the testcase's logic it should rather have used pxor.
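
To make the contrast concrete, here is a minimal sketch that puts the
FP-use and integer-use variants side by side. The typedef names
(v4si, v4sf) and function names (flip_then_fadd, flip_then_iadd) are
made up for illustration and are not taken from the testcase; the
instruction choice mentioned in the comments is what the testcase's
logic suggests, not something the code itself checks.

```c
#include <limits.h>

/* GNU C vector extension types; the names are illustrative only.  */
typedef int v4si __attribute__ ((vector_size (16)));
typedef float v4sf __attribute__ ((vector_size (16)));

/* Sign bit set in lanes 0 and 2, same bit pattern as 0x80000000.  */
static const v4si sign_mask = { INT_MIN, 0, INT_MIN, 0 };

/* XOR result feeds FP math: this is the case the testcase comment
   describes, where xorps is the wanted instruction.  */
v4sf flip_then_fadd (v4sf f, v4sf h)
{
  v4si f_int = (v4si) f;            /* bit-cast, no value conversion */
  return (v4sf) (f_int ^ sign_mask) + h;
}

/* XOR result feeds integer math: by the same logic pxor would be
   the better choice here.  */
v4si flip_then_iadd (v4sf f, v4si h)
{
  v4si f_int = (v4si) f;            /* bit-cast, no value conversion */
  return (f_int ^ sign_mask) + h;
}
```

Compiling this with something like gcc -O2 -msse2 -S and comparing the
two function bodies should show which XOR instruction the expander
picked in each case.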
Richard.