This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/48329] Missed vectorization of reduction due to PRE
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Tue, 29 Mar 2011 10:32:08 +0000
- Subject: [Bug tree-optimization/48329] Missed vectorization of reduction due to PRE
- Auto-submitted: auto-generated
- References: <bug-48329-4@http.gcc.gnu.org/bugzilla/>
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48329
Richard Guenther <rguenth at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
           Keywords|openmp                      |
   Last reconfirmed|                            |2011.03.29 10:31:56
          Component|middle-end                  |tree-optimization
                 CC|                            |rguenth at gcc dot gnu.org
     Ever Confirmed|0                           |1
            Summary|Program takes twice as long |Missed vectorization of
                   |*without* -fopenmp than     |reduction due to PRE
                   |with 1 OpenMP thread        |
--- Comment #1 from Richard Guenther <rguenth at gcc dot gnu.org> 2011-03-29 10:31:56 UTC ---
We vectorize the reduction if the function is outlined. I suppose something
confuses the vectorizer in the non-OMP path. Yep, it's PRE, so try
-fno-tree-pre. With PRE enabled, the loop
<bb 3>:
  # i_1 = PHI <1(2), i_22(4)>
  # sum_2 = PHI <0.0(2), sum_20(4)>
  # prephitmp.9_50 = PHI <5.66893424036281234980410020432668056299176519904892395524e-20(2), D.1586_48(4)>
  # ivtmp.12_10 = PHI <2100000000(2), ivtmp.12_11(4)>
  D.1574_17 = prephitmp.9_50 + 1.0e+0;
  D.1575_18 = ((D.1574_17));
  D.1576_19 = 4.0e+0 / D.1575_18;
  sum_20 = D.1576_19 + sum_2;
  ivtmp.12_11 = ivtmp.12_10 - 1;
  if (ivtmp.12_11 == 0)
    goto <bb 5>;
  else
    goto <bb 4>;

<bb 4>:
  i_22 = i_1 + 1;
  pretmp.8_44 = (real(kind=8)) i_22;
  pretmp.8_45 = pretmp.8_44 - 5.0e-1;
  pretmp.8_46 = ((pretmp.8_45));
  pretmp.8_47 = pretmp.8_46 * 4.76190476190476200439314681013558416822206709184683859348e-10;
  D.1586_48 = __builtin_pow (pretmp.8_47, 2.0e+0);
  goto <bb 3>;

is not detected as a reduction. The non-empty latch block (<bb 4>) is
probably not the only reason, but it is at least one of them.