This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
- From: "rguenth at gcc dot gnu.org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 10 Apr 2015 09:10:52 +0000
- Subject: [Bug ipa/65701] r221530 makes 187.facerec drop with -Ofast -flto
- Auto-submitted: auto-generated
- References: <bug-65701-4 at http dot gcc dot gnu dot org/bugzilla/>
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65701
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |Ganesh.Gopalasubramanian@amd.com
--- Comment #12 from Richard Biener <rguenth at gcc dot gnu.org> ---
I notice some (obvious) differences (just glancing at -fopt-info). The loops at
graphRoutines.f90:393
graphRoutines.f90:359
are not peeled for alignment when vectorized in the good case.
But it seems that's ok (well, we're peeling too much for alignment IMHO...).
In the fast variant we vectorize strided loads while in the slow variant
we can use vector loads for one of the loads (and we made sure to use
aligned loads by peeling).
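For readers unfamiliar with the term, peeling for alignment can be sketched in C roughly as follows. This is a hand-written illustration, not the vectorizer's output and not code from 187.facerec: the function name dot_strided_peeled, its signature, and the 16-byte alignment target are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical kernel: sum of a[i*stride] * b[i].
   Name and signature are illustrative only. */
float dot_strided_peeled(const float *a, size_t stride,
                         const float *b, size_t n)
{
    assert(stride >= 1);
    float sum = 0.0f;
    size_t i = 0;

    /* Prologue (the "peeled" iterations): stay scalar until b + i is
       16-byte aligned, so the main loop can use aligned vector loads. */
    while (i < n && ((uintptr_t)(b + i) % 16) != 0) {
        sum += a[i * stride] * b[i];
        i++;
    }

    /* Main loop: four elements per iteration. b[i..i+3] is one contiguous,
       now aligned, 16-byte block; a is still read element by element
       because of the stride. */
    float lane[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    for (; i + 4 <= n; i += 4)
        for (size_t k = 0; k < 4; k++)
            lane[k] += a[(i + k) * stride] * b[i + k];
    sum += (lane[0] + lane[1]) + (lane[2] + lane[3]);

    /* Epilogue: whatever is left after the last full block of four. */
    for (; i < n; i++)
        sum += a[i * stride] * b[i];
    return sum;
}
```

The prologue costs a few scalar iterations up front; in exchange the contiguous load in the main loop is known to be aligned.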
  1.11 |3682:    mov    0x60(%rsp),%rdx
  9.32 |3687:+-> vmovss (%rax,%r12,2),%xmm5
  1.44 |     |   vmovss (%rax),%xmm6
  4.46 |     |   inc    %rdi
  0.01 |     |   add    $0x10,%rcx
  1.17 |     |   vinser $0x10,(%rax,%r13,1),%xmm5,%xmm0
  1.92 |     |   vinser $0x10,(%rax,%r12,1),%xmm6,%xmm1
  0.28 |     |   add    %r14,%rax
  0.07 |     |   vmovlh %xmm0,%xmm1,%xmm0
  2.48 |     |   vfmadd %xmm3,-0x10(%rcx),%xmm0,%xmm3
  5.15 |     |   cmp    %rdi,%rdx
  0.01 |     +-- ja     3687
so maybe the vfmadd with a memory operand is just bad for the pipeline
(I suspect bad for the decoder at least).
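As a reading aid, here is my interpretation of one trip through the loop above, written out in C. It is a sketch under assumptions the listing does not confirm: that %r12 holds the stride in bytes, %r13 is three times that, %r14 is four times that, and %xmm3 is a four-lane accumulator. The function name dot_strided_by4 is made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Processes nblocks groups of four elements; a is strided, b is contiguous. */
float dot_strided_by4(const float *a, size_t stride,
                      const float *b, size_t nblocks)
{
    assert(stride >= 1);
    float acc[4] = {0.0f, 0.0f, 0.0f, 0.0f};  /* role of %xmm3 */

    for (size_t blk = 0; blk < nblocks; blk++) {
        float lo[2], hi[2], va[4];

        hi[0] = a[2 * stride];  /* vmovss (%rax,%r12,2),%xmm5 */
        lo[0] = a[0];           /* vmovss (%rax),%xmm6 */
        hi[1] = a[3 * stride];  /* vinser (%rax,%r13,1): assumes r13 = 3*r12 */
        lo[1] = a[1 * stride];  /* vinser (%rax,%r12,1) */

        /* vmovlh: low halves of the two pairs form one 4-float vector. */
        va[0] = lo[0]; va[1] = lo[1]; va[2] = hi[0]; va[3] = hi[1];

        /* vfmadd with a memory operand: multiply by four contiguous
           floats of b and add into the accumulator. */
        for (size_t k = 0; k < 4; k++)
            acc[k] += va[k] * b[k];

        a += 4 * stride;        /* add %r14,%rax: assumes r14 = 4*r12 */
        b += 4;                 /* add $0x10,%rcx */
    }
    /* Horizontal reduction; this happens after the loop, outside the listing. */
    return (acc[0] + acc[1]) + (acc[2] + acc[3]);
}
```

Four scalar loads and three shuffles feed a single multiply-add here, so the strided side does most of the work per iteration.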
To me it really looks like trunk generates better code but we run into
a very odd bdver2 architectural issue (if the above loop is really the issue).
You could try disabling peeling for alignment with --param
vect-max-peeling-for-alignment=0 (so you get an unaligned load and a vfmadd
without a memory operand).
I don't think this is a register allocation (RA) issue.
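One way to try that outside the benchmark is a reduced kernel built twice, with and without the parameter. The file below is a hypothetical stand-in, not the facerec source; the file name, function name, and loop shape are assumptions, and there is no guarantee GCC peels or vectorizes this reduced loop the same way it does in the benchmark.

```c
/* kernel.c: hypothetical reduced test case.
 *
 * Build twice and compare the vectorizer remarks and the assembly:
 *
 *   gcc -Ofast -march=bdver2 -fopt-info-vec -S kernel.c -o default.s
 *   gcc -Ofast -march=bdver2 -fopt-info-vec \
 *       --param vect-max-peeling-for-alignment=0 -S kernel.c -o nopeel.s
 */
#include <assert.h>
#include <stddef.h>

/* Dot product of a strided array with a contiguous one. */
float dot_strided(const float *a, size_t stride,
                  const float *b, size_t n)
{
    assert(stride >= 1);
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += a[i * stride] * b[i];
    return sum;
}
```

If the second build shows the multiply-add taking a register operand fed by a separate unaligned load, timing both variants on bdver2 would show whether the memory operand is what hurts.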
Ganesh?