This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug c/53967] New: GCC produces slow code for convolution algorithm with -mfpmath=sse (the AMD_64 default)
- From: "bfriesen at simple dot dallas.tx.us" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Sat, 14 Jul 2012 20:52:22 +0000
- Subject: [Bug c/53967] New: GCC produces slow code for convolution algorithm with -mfpmath=sse (the AMD_64 default)
- Auto-submitted: auto-generated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53967
Bug #: 53967
Summary: GCC produces slow code for convolution algorithm with
-mfpmath=sse (the AMD_64 default)
Classification: Unclassified
Product: gcc
Version: 4.6.2
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
AssignedTo: unassigned@gcc.gnu.org
ReportedBy: bfriesen@simple.dallas.tx.us
Created attachment 27792
--> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27792
Convolution example C file, pre-processed version, build log, assembler output
The classic convolution algorithm (as implemented in GraphicsMagick) is
observed to run 2X slower with -mfpmath=sse than with -mfpmath=387.
Unfortunately, -mfpmath=sse is the default for -m64 builds on AMD64 (x86-64), so
this has a large impact on users.
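For readers without the attachment at hand, the kind of inner loop at issue can be sketched roughly as follows. This is a minimal, hypothetical 1-D convolution written for illustration only. It is not the GraphicsMagick code and not the attached test case, and the function name, the edge-clamping policy, and the use of double throughout are all assumptions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical sketch (not the GraphicsMagick implementation):
 * convolve a row of n samples with an odd-width kernel, clamping
 * reads that fall outside the row to the nearest edge sample.
 */
void convolve_row(const double *src, double *dst, size_t n,
                  const double *kernel, size_t kwidth)
{
    assert(kwidth % 2 == 1);              /* kernel must have a center tap */
    const ptrdiff_t half = (ptrdiff_t)(kwidth / 2);

    for (size_t x = 0; x < n; x++) {
        double sum = 0.0;                 /* floating-point accumulator */
        for (size_t k = 0; k < kwidth; k++) {
            ptrdiff_t i = (ptrdiff_t)x + (ptrdiff_t)k - half;
            if (i < 0)
                i = 0;                    /* clamp at the left edge */
            if (i >= (ptrdiff_t)n)
                i = (ptrdiff_t)n - 1;     /* clamp at the right edge */
            sum += kernel[k] * src[i];    /* multiply-accumulate */
        }
        dst[x] = sum;
    }
}
```

One way to compare the two code generation modes on a loop like this is to compile it twice, for example with `gcc -O2 -m64 -mfpmath=sse -S` and `gcc -O2 -m64 -mfpmath=387 -S`, and compare the assembler output. The -O2 level here is an assumption, not taken from the attached build log.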
Even when GCC is built with -mfpmath=387, other compilers (LLVM, Open64, and
Oracle Studio) produce faster code by default. Some of these compilers deliver
up to 3X better overall run-time performance, and all of them are at least 2X
faster than the GCC default for x86-64.
This issue has been verified under Solaris 10, OpenIndiana, and Ubuntu Linux on
Opteron and several modern Xeon CPUs.
Please note that AMD Opteron 6200 family CPUs were not observed to suffer from
this issue.