This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug tree-optimization/25621] Missed optimisation



------- Comment #2 from jv244 at cam dot ac dot uk  2006-01-01 18:14 -------
(In reply to comment #1)
> What happens if you use -funroll-loops?  It should get about the same
> improvement.

I have the following timings (for N=1024, calling each subroutine a number of
times, plus some external initialisation):

Options                           S31             S32
-O2 -ffast-math -funroll-loops    0.0229959786    0.0119980276
-O2 -ffast-math                   0.0229960084    0.0119979978

So -funroll-loops alone makes no measurable difference here. I think the issue
is not pure unrolling but the fact that you have two independent sums in the
loop.
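
A minimal C sketch of that difference, for illustration only. The routines
S31 and S32 are Fortran subroutines that are not reproduced in this message, so
the names sum_single and sum_split and the dot-product loop body are
assumptions. The second version keeps two partial sums that do not depend on
each other, so consecutive additions need not wait on one long dependency
chain.

```c
#include <stddef.h>

/* Hypothetical analogue of a single-accumulator loop: every addition
   depends on the result of the previous one. */
double sum_single(const double *a, const double *b, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i] * b[i];
    return s;
}

/* Hypothetical hand-split version with two independent partial sums.
   Assumes n is even; a trailing odd element would be ignored. */
double sum_split(const double *a, const double *b, size_t n)
{
    double s0 = 0.0, s1 = 0.0;
    for (size_t i = 0; i + 1 < n; i += 2) {
        s0 += a[i] * b[i];          /* even-indexed terms */
        s1 += a[i + 1] * b[i + 1];  /* odd-indexed terms  */
    }
    return s0 + s1;
}
```

The two functions add the terms in a different order, so their results can
differ in the last bits. That is why a compiler can only make this change on
its own when reassociation is allowed, for example with -ffast-math.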

In fact, I now find that
  -O2 -ffast-math -funroll-loops -ftree-loop-ivcanon -fivopts
  -fvariable-expansion-in-unroller
yields much improved code:

S31             S32
0.0119979978    0.0079990029

The last option indeed seems to do what I did by hand; still, the routine S32
seems about 30% faster.

> Also your two loops are not equal if N is odd.
I've added at least the comment ;-)
! assume N is even
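
For reference, a loop that steps by two is only equivalent to the plain loop
for odd N if the leftover element is handled separately. A hedged C sketch of
one way to do that follows. The actual routines are Fortran and are not shown
in this message, so the name sum_split_any and the dot-product body are
assumptions.

```c
#include <stddef.h>

/* Hypothetical two-accumulator sum that is also correct for odd n:
   the paired loop covers the first n - (n % 2) elements and the
   leftover element, if any, is added afterwards. */
double sum_split_any(const double *a, const double *b, size_t n)
{
    double s0 = 0.0, s1 = 0.0;
    size_t i;
    for (i = 0; i + 1 < n; i += 2) {
        s0 += a[i] * b[i];
        s1 += a[i + 1] * b[i + 1];
    }
    if (i < n)                  /* n is odd: one element remains */
        s0 += a[i] * b[i];
    return s0 + s1;
}
```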


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=25621


