This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug middle-end/64099] [5 Regression] ~15% runtime increase for fatigue.f90.


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64099

--- Comment #2 from Dominique d'Humieres <dominiq at lps dot ens.fr> ---
> I don't see this on any of our testers.  What CPU do you have and what default
> -march gets used for you?  (thus please show -v output)

My CPU is a 2.8 GHz Intel Core i7. All the versions reported in comment 0 have
been configured with

../p_work/configure --prefix=/opt/gcc/gcc4.10p-#rev.p#patch
--enable-languages=c,c++,lto,fortran,ada,objc,obj-c++ --with-gmp=/opt/mp
--with-system-zlib --enable-checking=release --with-isl=/opt/mp --enable-lto
--enable-plugin --with-arch=core2 --with-cpu=core2

except r216631, for which --enable-checking=release was omitted. In the prefix,
#rev. is the revision and #patch is the number of patches required to
bootstrap.

> Btw, -flto should be redundant for a single-file benchmark - -fwhole-program
> is enough.  

I know; however, in the past I have seen some regressions when -flto is added.
Since I can afford to double the compile time, I keep it in my reference
options.

> Does -ftree-loop-linear make a difference for you?

AFAICT it does not on fatigue.f90, but I see some (minor) improvements for
other tests in the suite.

> Our testers use -ffast-math -funroll-loops -O3.

Using '-O3 -ffast-math' instead of '-Ofast' almost doubles the runtime:

[Book15] lin/test% gfortran -O3 -ffast-math -fwhole-program fatigue.f90
[Book15] lin/test% time a.out > /dev/null
2.648u 0.002s 0:02.65 99.6%    0+0k 0+3io 38pf+0w
[Book15] lin/test% gfortran -Ofast -fwhole-program fatigue.f90
[Book15] lin/test% time a.out > /dev/null
1.385u 0.002s 0:01.38 100.0%    0+0k 0+1io 0pf+0w
[Book15] lin/test% gfc -O3 -ffast-math -fwhole-program fatigue.f90
[Book15] lin/test% time a.out > /dev/null
2.952u 0.002s 0:02.96 99.6%    0+0k 0+0io 40pf+0w
[Book15] lin/test% gfc -Ofast -fwhole-program fatigue.f90
[Book15] lin/test% time a.out > /dev/null
1.643u 0.001s 0:01.64 100.0%    0+0k 0+1io 0pf+0w

(gfortran is 4.9.2 and gfc is 5.0 r218134).
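
As a rough cross-check of the numbers above, the ratios can be worked out with
a short script. This is only a sketch in Python: the dictionary holds the user
CPU times copied from the `time` output above (single runs, so noise is not
accounted for), and the helper name `ratio` is just an illustrative choice.

```python
# User CPU times in seconds, copied from the `time` output above.
# Keys are (compiler, flags); these are single runs, not averages.
user_time = {
    ("4.9.2", "-O3 -ffast-math"): 2.648,
    ("4.9.2", "-Ofast"): 1.385,
    ("5.0 r218134", "-O3 -ffast-math"): 2.952,
    ("5.0 r218134", "-Ofast"): 1.643,
}


def ratio(slow, fast):
    """Return how many times longer the `slow` run took than the `fast` run."""
    return user_time[slow] / user_time[fast]


# Effect of the flags for each compiler.
for compiler in ("4.9.2", "5.0 r218134"):
    r = ratio((compiler, "-O3 -ffast-math"), (compiler, "-Ofast"))
    print(f"{compiler}: -O3 -ffast-math takes {r:.2f}x the -Ofast time")

# Slowdown of 5.0 relative to 4.9.2 for each set of flags.
for flags in ("-O3 -ffast-math", "-Ofast"):
    r = ratio(("5.0 r218134", flags), ("4.9.2", flags))
    print(f"{flags}: 5.0 is {100 * (r - 1):.1f}% slower than 4.9.2")
```

With these figures, '-O3 -ffast-math' takes roughly 1.8x to 1.9x the '-Ofast'
time with either compiler.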

The runtime increase with '-O3 -ffast-math' is ~0.3s between r217816 and
r217833:

[Book15] lin/test% /opt/gcc/gcc4.10p-217816p2/bin/gfortran -O3 -ffast-math
-fwhole-program fatigue.f90
[Book15] lin/test% time a.out > /dev/null
2.654u 0.002s 0:02.66 99.6%    0+0k 0+1io 41pf+0w
[Book15] lin/test% /opt/gcc/gcc4.10p-217833p1/bin/gfortran -O3 -ffast-math
-fwhole-program fatigue.f90
[Book15] lin/test% time a.out > /dev/null
2.962u 0.001s 0:02.97 99.6%    0+0k 0+1io 39pf+0w

> Can you bisect the regressions to a single commit?

I can do it for the range r217816-r217833 (the candidates are r217824 and
r217827, and maybe r217828 as well). As indicated by the p? in my naming
scheme, I cannot bootstrap in the range r216631-r216747 without at least two
patches, so bisecting this range will take much longer.
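
For illustration, the bisection over a revision range can be sketched as a
binary search. This is a minimal sketch in Python, not the procedure actually
used: `is_bad` stands for "build that revision, run fatigue.f90 and check
whether the runtime is the slower one", and the stand-in `pretend_is_bad` below
places the regression at an arbitrary revision purely to exercise the search.
It also assumes every revision number in the range can be built, which is not
true in practice (not every revision touches the trunk, and some need patches
to bootstrap).

```python
def first_bad_revision(good, bad, is_bad):
    """Binary search for the first bad revision in the range (good, bad].

    Assumes `good` is known to be fast, `bad` is known to be slow, and that
    every revision from the culprit onwards is slow.
    """
    lo, hi = good, bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid   # regression is at mid or earlier
        else:
            lo = mid   # regression is after mid
    return hi


def pretend_is_bad(rev):
    # Hypothetical stand-in for "build and time this revision".
    # The threshold is arbitrary and NOT a result of any measurement.
    return rev >= 217827


# r217816 was measured as fast and r217833 as slow in the timings above.
print(first_bad_revision(217816, 217833, pretend_is_bad))
```

For a range of 17 revisions this needs at most five builds, which is why the
second range, where each build also needs patching, is the expensive one.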

