[Bug tree-optimization/42108] [4.6/4.7/4.8 Regression] 50% performance regression

rguenth at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Wed Jan 16 12:37:00 GMT 2013


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42108

--- Comment #54 from Richard Biener <rguenth at gcc dot gnu.org> 2013-01-16 12:36:52 UTC ---
Re-confirmed on trunk.  The initial GFortran IL is still ... awkward.  Apart
from the issue of using a canonicalized IV at all we have

                            D.1910 = i;
                            D.1911 = *nnd;
                            D.1912 = *n;
                            k = D.1910;
                            if (D.1912 > 0)
                              {
                                if (D.1911 < D.1910) goto L.6;
                              }
                            else
                              {
                                if (D.1911 > D.1910) goto L.6;
                              }
                            countm1.6 = (unsigned int) ((D.1911 - D.1910) * (D.1912 < 0 ? -1 : 1))
                                        / (unsigned int) ((D.1912 < 0 ? -1 : 1) * D.1912);
                            while (1)
                              {
...
                                  do_exit.7 = countm1.6 == 0;
                                  countm1.6 = countm1.6 + 4294967295;
                                  if (do_exit.7) goto L.6;
                                }
                              }
                            L.6:;

In the computation of countm1.6 we have a redundant (D.1912 < 0 ? -1 : 1)
test which ends up complicating the CFG.  It also repeats the sign test on
D.1912 that was done just above.  In fact it cancels out completely
in the countm1 computation.  (The exit test via a do_exit temporary is
because of a local change in my tree ... eh)
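
To illustrate the redundancy, here is a minimal C sketch (not what the
frontend emits) that folds the count computation into the direction test that
already exists, so no second sign test is needed.  The function name
loop_countm1_branched and its out-parameter are made up for this example; it
assumes step != 0 and does the arithmetic in unsigned so the subtraction
cannot overflow.

```c
/* Hypothetical helper, for illustration only.
   Returns 0 if the loop body would not execute at all; otherwise returns 1
   and stores the trip count minus one in *countm1.
   Assumption: step != 0.  */
static int loop_countm1_branched(int from, int to, int step,
                                 unsigned int *countm1)
{
    if (step > 0) {
        if (to < from)
            return 0;                       /* corresponds to "goto L.6" */
        /* Sign is known to be +1 here, so no conditional is needed.  */
        *countm1 = ((unsigned int) to - (unsigned int) from)
                   / (unsigned int) step;
    } else {
        if (to > from)
            return 0;                       /* corresponds to "goto L.6" */
        /* Sign is known to be -1 here: swap the operands and negate the
           step in unsigned arithmetic.  */
        *countm1 = ((unsigned int) from - (unsigned int) to)
                   / (0u - (unsigned int) step);
    }
    return 1;
}
```

Each arm already knows the sign of the step, which is why the
(D.1912 < 0 ? -1 : 1) conditional adds nothing but extra control flow.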

Also note that the comment says

      /* Calculate the loop count.  to-from can overflow, so
         we cast to unsigned.  */

but we do

  (unsigned)(to * step_sign - from * step_sign) / (unsigned) (step * step_sign)

That avoids neither overflow of step * step_sign (step == INT_MIN,
step_sign == -1) nor overflow of to * step_sign - from * step_sign, which we
fold to (to - from) * step_sign anyway (signed overflow is undefined, heh).
I believe we need to do

  ((unsigned)to - (unsigned)from) * (unsigned)step_sign
    / ((unsigned)step * (unsigned)step_sign)

to avoid these issues.
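
As a sanity check of the all-unsigned form, here is a small C sketch.  The
function name loop_countm1 is made up for this example, and it assumes a
32-bit int, step != 0, and a loop that executes at least once (the zero-trip
case is handled by the earlier direction test).  Multiplying by
(unsigned)-1 is a negation modulo 2^32, so nothing here can hit signed
overflow.

```c
#include <limits.h>

/* Hypothetical helper, for illustration only: trip count minus one,
   computed entirely in unsigned arithmetic.
   Assumptions: step != 0 and the loop body executes at least once.  */
static unsigned int loop_countm1(int from, int to, int step)
{
    /* (unsigned)-1 is UINT_MAX; multiplying by it negates modulo 2^N.  */
    unsigned int step_sign = step < 0 ? (unsigned int) -1 : 1u;

    /* Distance from 'from' to 'to' in the direction of the step.  */
    unsigned int span = ((unsigned int) to - (unsigned int) from) * step_sign;

    /* Magnitude of the step; well defined even for step == INT_MIN.  */
    unsigned int ustep = (unsigned int) step * step_sign;

    return span / ustep;
}
```

For example, loop_countm1(INT_MAX, INT_MIN, INT_MIN) gives 1 (two
iterations: INT_MAX and -1), a case where the signed step * step_sign would
overflow.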


