This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug rtl-optimization/69710] performance issue with SP Linpack with Autovectorization
- From: "doug.gilmore at imgtec dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sat, 06 Feb 2016 23:24:29 +0000
- Subject: [Bug rtl-optimization/69710] performance issue with SP Linpack with Autovectorization
- Auto-submitted: auto-generated
- References: <bug-69710-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69710
--- Comment #5 from Doug Gilmore <doug.gilmore at imgtec dot com> ---
Thanks for checking on AArch64, Andrew.
BTW, I formed my (incorrect) hunch by running a test on gcc113, where
the installed 4.8 compiler showed problems for both DP and SP. (I
assumed that the problem had been addressed for DP, since we don't see
it on MIPS for DP at ToT with the MSA patch applied.)
For Neon after ivopts I see:
<bb 14>:
# vectp_dy.20_96 = PHI <vectp_dy.21_94(13), vectp_dy.20_97(21)>
# ivtmp.22_78 = PHI <0(13), ivtmp.22_77(21)>
# ivtmp.26_112 = PHI <ivtmp.26_110(13), ivtmp.26_111(21)>
# ivtmp.31_153 = PHI <ivtmp.31_155(13), ivtmp.31_154(21)>
vectp_dx.15_88 = (vector(4) float *) ivtmp.26_112;
_156 = (void *) ivtmp.31_153;
vect__12.14_85 = MEM[base: _156, offset: 0B];
ivtmp.31_154 = ivtmp.31_153 + 16;
vect__15.17_90 = MEM[(float *)vectp_dx.15_88];
vect__16.18_92 = vect_cst__91 * vect__15.17_90;
vect__17.19_93 = vect__12.14_85 + vect__16.18_92;
MEM[base: vectp_dy.20_96, offset: 0B] = vect__17.19_93;
vectp_dy.20_97 = vectp_dy.20_96 + 16;
ivtmp.22_77 = ivtmp.22_78 + 1;
ivtmp.26_111 = ivtmp.26_112 + 16;
if (ivtmp.22_77 < bnd.9_53)
  goto <bb 21>;
else
  goto <bb 16>;
...
<bb 21>:
goto <bb 14>;
So the problem is indeed exposed on Neon.
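
For readers who do not want to trace the dump by hand, here is a rough C sketch of the loop's shape. It is not compiler output and not a proposed fix. It assumes the kernel is the SP Linpack saxpy update dy[i] = dy[i] + da * dx[i], that ivtmp.31 is the pointer used for the dy load, and that n is a multiple of 4 so no epilogue is needed. The function names (saxpy_ref, saxpy_multi_iv, saxpy_single_iv, saxpy_variants_agree) are made up for illustration. The second function mirrors the dump: three pointers each advanced by one 4-float vector (16 bytes) plus a separate iteration counter compared against bnd. The third shows one possible form with a single index, only to make the difference in induction-variable count visible; whether that form is the best choice for a given target is not claimed here.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference: dy[i] = dy[i] + da * dx[i] (assumed source loop). */
void saxpy_ref(size_t n, float da, const float *dx, float *dy)
{
    for (size_t i = 0; i < n; i++)
        dy[i] = dy[i] + da * dx[i];
}

/* Mirrors the dump: four induction variables live across the loop. */
void saxpy_multi_iv(size_t n, float da, const float *dx, float *dy)
{
    assert(n % 4 == 0);            /* sketch only: no epilogue loop */
    const float *load_dy = dy;     /* plays the role of ivtmp.31 (assumed) */
    const float *load_dx = dx;     /* plays the role of ivtmp.26 */
    float *store_dy = dy;          /* plays the role of vectp_dy.20 */
    size_t bnd = n / 4;            /* plays the role of bnd.9 */
    size_t iter = 0;               /* plays the role of ivtmp.22 */

    if (bnd == 0)
        return;
    do {
        /* One 4-lane vector operation, written out lane by lane. */
        for (int k = 0; k < 4; k++)
            store_dy[k] = load_dy[k] + da * load_dx[k];
        load_dy += 4;              /* + 16 bytes */
        store_dy += 4;             /* + 16 bytes */
        iter += 1;
        load_dx += 4;              /* + 16 bytes */
    } while (iter < bnd);
}

/* Hypothetical alternative: one index serves all three accesses. */
void saxpy_single_iv(size_t n, float da, const float *dx, float *dy)
{
    assert(n % 4 == 0);
    for (size_t off = 0; off < n; off += 4)
        for (int k = 0; k < 4; k++)
            dy[off + k] = dy[off + k] + da * dx[off + k];
}

/* Returns 1 if all three variants produce identical results. */
int saxpy_variants_agree(void)
{
    enum { N = 16 };
    float dx[N], a[N], b[N], c[N];

    for (int i = 0; i < N; i++) {
        dx[i] = (float)i;
        a[i] = b[i] = c[i] = 1.0f;
    }
    saxpy_ref(N, 2.0f, dx, a);
    saxpy_multi_iv(N, 2.0f, dx, b);
    saxpy_single_iv(N, 2.0f, dx, c);
    for (int i = 0; i < N; i++)
        if (a[i] != b[i] || a[i] != c[i])
            return 0;
    return 1;
}
```

All three functions compute the same values; the only difference is how many variables have to be kept live and updated on every iteration.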