This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug lto/51497] [4.7 Regression] The run time for the polyhedron test nf.f90 is ~10% slower with -flto after revision 182107

From: "dominiq at lps dot ens.fr" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Mon, 12 Dec 2011 13:09:49 +0000
Subject: [Bug lto/51497] [4.7 Regression] The run time for the polyhedron test nf.f90 is ~10% slower with -flto after revision 182107
Auto-submitted: auto-generated
References: <bug-51497-4@http.gcc.gnu.org/bugzilla/>

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51497

--- Comment #4 from Dominique d'Humieres <dominiq at lps dot ens.fr> 2011-12-12 13:09:49 UTC ---
> I can't see any vectorizer differences for the testcase in comment #2 and the
> patch you cite only (should) have debuginfo changes, no changes to the produced
> IL at statement level (eventually it has better type-based alias analysis).
>
> Not confirmed.

I have just done the following check:

(1) gfc -Ofast -funroll-loops nf.f90 -ftree-vectorizer-verbose=1 > & tmp1
(2) gfc -Ofast -funroll-loops nf.f90 -ftree-vectorizer-verbose=1 -flto > & tmp2
The tmp2 file contains two sets of annotations, likely one for the usual
vectorization (up to line 334) and a second one for the lto stage.
(3) I have split tmp2 into a new tmp2 keeping only the first 334 lines, and a
second file, tmp3, containing the remainder.
(4) I have used diff to compare the files: tmp1 and the new tmp2 are identical,
while tmp3 is missing some vectorizations:

--- tmp1    2011-12-12 13:49:06.000000000 +0100
+++ tmp3    2011-12-12 13:54:12.000000000 +0100
...
-206: LOOP VECTORIZED.
-nf.f90:204: note: vectorized 7 loops in function.
...
-nf.f90:256: note: vectorized 3 loops in function.
+nf.f90:256: note: vectorized 2 loops in function.
...
-nf.f90:288: note: vectorized 3 loops in function.
+nf.f90:288: note: vectorized 2 loops in function.

This confirms what I have seen in the disassembled executable.
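
For anyone who wants to repeat the split-and-compare in steps (3) and (4)
without doing it by hand, here is a minimal sketch. The helper names
split_log and changed_lines are hypothetical, and the sample log lines are
illustrative only, modeled on the notes quoted above; for the real files the
split point would be the 334 lines mentioned in step (2).

```python
import difflib


def split_log(lines, first_part_len):
    """Split a combined log into (first part, remainder).

    For the check above, first_part_len would be 334: the lines emitted
    before the second set of annotations starts.
    """
    return lines[:first_part_len], lines[first_part_len:]


def changed_lines(reference, candidate):
    """Return only the added/removed lines of a unified diff."""
    diff = list(difflib.unified_diff(reference, candidate, lineterm="", n=0))
    # When the inputs differ, the first two entries are the '---'/'+++'
    # file headers; skip them and keep only '+'/'-' content lines.
    return [d for d in diff[2:] if d.startswith(("+", "-"))]


# Illustrative data only (not the real logs).
tmp1 = ["nf.f90:256: note: vectorized 3 loops in function."]
combined = tmp1 + ["nf.f90:256: note: vectorized 2 loops in function."]

new_tmp2, tmp3 = split_log(combined, len(tmp1))
for line in changed_lines(tmp1, tmp3):
    print(line)
```

With the real logs one would read tmp1 and tmp2 via str.splitlines() and
call split_log(tmp2_lines, 334) before comparing each part against tmp1.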

Questions:
(1) Do you see the slowdown with -flto?
(2) Can you reproduce the above?

> The two else if blocks are related, not independent, independently reverting
> them makes no sense.

I am not suggesting removing one block. I was only interested in finding which
part of the patch caused/exposed the problem (which looks like yet another
instance of a bad choice of optimization for size: as pointed out in 51499, the
vectorization generates two loops, one vectorized and one not, hence ~doubling
the code size).

