[PATCH] Fix PR79201 (half-way)
Richard Biener
rguenther@suse.de
Thu May 11 14:06:00 GMT 2017
On Thu, 11 May 2017, Uros Bizjak wrote:
> On Thu, May 11, 2017 at 2:48 PM, Richard Biener <rguenther@suse.de> wrote:
> > On Thu, 11 May 2017, Rainer Orth wrote:
> >
> >> Hi Richard,
> >>
> >> > On Mon, 24 Apr 2017, Richard Biener wrote:
> >> >>
> >> >> One issue in PR79201 is that we don't sink pure/const calls which is
> >> >> what the following simple patch fixes.
> >> >>
> >> >> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> >> >
> >> > Needed some gimple_assign_lhs -> gimple_get_lhs adjustments and
> >> > adjustment of gcc.target/i386/pr22152.c where we now sink the
> >> > assignment out of the pointless loop. Not sure what the original
> >> > bug was about (well, reg allocation) so I simply disabled sinking
> >> > for it.
> >> >
> >> > Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
> >> >
> >> > Richard.
> >> >
> >> > 2017-04-25 Richard Biener <rguenther@suse.de>
> >> >
> >> > PR tree-optimization/79201
> >> > * tree-ssa-sink.c (statement_sink_location): Handle calls.
> >> >
> >> > * gcc.dg/tree-ssa/ssa-sink-16.c: New testcase.
> >> > * gcc.target/i386/pr22152.c: Disable sinking.
> >>
> >> however, gcc.target/i386/pr22152.c FAILs now for 32-bit:
> >>
> >> FAIL: gcc.target/i386/pr22152.c scan-assembler-times movq[ \\\\t]+[^\\n]*%mm 1
> >
> > I remember seeing this and was not able to make sense of the testcase
> > which was added to fix some backend issue. Disabling sinking doesn't
> > work (IIRC) as it is required to generate the original code as well.
> >
> > Uros added the testcase in 2008 -- I think if we want to have a testcase
> > for the original issue we need a different one. Or simply remove
> > the testcase.
>
> No, there is something going on in the testcase:
>
> .L3:
>         movq    (%ecx,%eax,8), %mm1
>         paddq   (%ebx,%eax,8), %mm1
>         addl    $1, %eax
>         movq    %mm1, %mm0
>         cmpl    %eax, %edx
>         jne     .L3
>
>
> The compiler should allocate %mm0 to movq and paddq to avoid %mm1 ->
> %mm0 move. These are all movv1di patterns (they shouldn't interfere
> with movdi), and it is not clear to me why RA allocates %mm1 instead
> of %mm0.

In any case, the testcase is no longer testing what it originally tested,
as the input to RA is now different. The testcase doesn't make much sense:

__m64
unsigned_add3 (const __m64 * a, const __m64 * b, unsigned int count)
{
  __m64 sum;
  unsigned int i;
  for (i = 1; i < count; i++)
    sum = _mm_add_si64 (a[i], b[i]);
  return sum;
}

that's equivalent to

__m64
unsigned_add3 (const __m64 * a, const __m64 * b, unsigned int count)
{
  __m64 sum;
  unsigned int i;
  if (1 < count)
    sum = _mm_add_si64 (a[count-1], b[count-1]);
  return sum;
}

which means sum is possibly used uninitialized (whenever count <= 1), and
the loop itself is pointless since only its last iteration is observable.
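
To make the equivalence easy to check outside the testsuite, here is a
minimal sketch. It is not the testcase itself: m64_t and add64 are
hypothetical stand-ins for __m64 and _mm_add_si64 (a plain 64-bit add, so
it builds without MMX), and the extra init parameter is my addition so the
"never assigned" case returns a defined value instead of reading an
uninitialized sum.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for __m64 and _mm_add_si64: a plain 64-bit
   integer add, so the sketch does not depend on MMX.  */
typedef uint64_t m64_t;

static m64_t
add64 (m64_t x, m64_t y)
{
  return x + y;
}

/* Loop form, shaped like the testcase.  INIT is an assumption added
   here so that count <= 1 yields a defined result.  */
static m64_t
add_loop (const m64_t *a, const m64_t *b, unsigned int count, m64_t init)
{
  m64_t sum = init;
  unsigned int i;
  for (i = 1; i < count; i++)
    sum = add64 (a[i], b[i]);
  return sum;
}

/* Straight-line form: every iteration overwrites sum, so only the
   last one (i == count - 1) is observable.  */
static m64_t
add_last (const m64_t *a, const m64_t *b, unsigned int count, m64_t init)
{
  m64_t sum = init;
  if (1 < count)
    sum = add64 (a[count - 1], b[count - 1]);
  return sum;
}

/* Assert that both forms agree for every count from 0 to N inclusive.
   A and B must each hold at least N elements.  */
static void
check_equivalence (const m64_t *a, const m64_t *b, unsigned int n)
{
  unsigned int count;
  for (count = 0; count <= n; count++)
    assert (add_loop (a, b, count, 0) == add_last (a, b, count, 0));
}
```

Calling check_equivalence on two arrays of n elements exercises both the
count <= 1 case, where neither form assigns sum, and the general case.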
Richard.