This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.

[Bug tree-optimization/69710] performance issue with SP Linpack with Autovectorization

From: "doug.gilmore at imgtec dot com" <gcc-bugzilla at gcc dot gnu dot org>
To: gcc-bugs at gcc dot gnu dot org
Date: Sat, 06 Feb 2016 21:45:40 +0000
Subject: [Bug tree-optimization/69710] performance issue with SP Linpack with Autovectorization
Auto-submitted: auto-generated
References: <bug-69710-4 at http dot gcc dot gnu dot org/bugzilla/>

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69710

--- Comment #1 from Doug Gilmore <doug.gilmore at imgtec dot com> ---
Created attachment 37615
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37615&action=edit
daxpy for double precision (DP); the previous attachment was for single precision (SP)

Compilation example:

arm-linux-gnueabihf-gcc -O3 -save-temps daxpy.c saxpy.c -c -mfpu=neon \
  -fdump-tree-{vect,ivopts}-{verbose,details} -fdump-tree-{slp1,optimized} \
  -fsched-verbose=9 \
  -fdump-rtl-sched{1,2} -marm -funsafe-math-optimizations -funroll-all-loops

Note that NEON on 32-bit ARM does not support double-precision (DP)
vector arithmetic, so daxpy.s won't contain autovectorized code.

I haven't built a top-of-tree (ToT) compiler for aarch64-linux-gnu, but
I suspect that there you will see autovectorized code in daxpy.s, with
reasonable schedules being produced (loads moved above stores).
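
For readers without access to the attachments, the kernels under discussion can be sketched roughly as below. This is only an illustrative, unit-stride version written for this note: the actual daxpy.c and saxpy.c attached to the bug are not reproduced here and may differ (the Linpack originals, for instance, take stride arguments and are partly hand-unrolled). The function signatures and the use of `restrict` are assumptions.

```c
#include <stddef.h>

/* Hypothetical single-precision axpy: y[i] += a * x[i].
 * With -mfpu=neon and -funsafe-math-optimizations, this is the kind of
 * loop the vectorizer is expected to turn into NEON code on 32-bit ARM. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}

/* Hypothetical double-precision counterpart. 32-bit ARM NEON has no
 * double-precision vector arithmetic, so this loop stays scalar there. */
void daxpy(size_t n, double a, const double *restrict x, double *restrict y)
{
    for (size_t i = 0; i < n; i++)
        y[i] += a * x[i];
}
```

Compiling two files like these with the command line above and comparing saxpy.s against daxpy.s should show the difference described: vector instructions in the SP version only.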

References:
- [Bug tree-optimization/69710] New: performance issue with SP Linpack with Autovectorization
  - From: doug.gilmore at imgtec dot com
