This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug tree-optimization/38328] Massive performance regression for jpeg_idct_islow
- From: "sgunderson at bigfoot dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 30 Nov 2008 15:06:44 -0000
- Subject: [Bug tree-optimization/38328] Massive performance regression for jpeg_idct_islow
- References: <bug-38328-3483@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #2 from sgunderson at bigfoot dot com 2008-11-30 15:06 -------
OK, I looked at the source. The issue here seems to be that GCC 4.4 likes to
compile this:
z3 = ((z3) * (- ((INT32) 16069)));
into this:
    10  0.0403 :  805cc87:  lea    (%ecx,%ecx,4),%ebx
               :  805cc8a:  lea    (%ebx,%ebx,4),%ebx
    20  0.0805 :  805cc8d:  lea    (%ebx,%ebx,4),%ebx
     7  0.0282 :  805cc90:  lea    (%ecx,%ebx,2),%ebx
     3  0.0121 :  805cc93:  shl    $0x4,%ebx
    38  0.1530 :  805cc96:  add    %ecx,%ebx
     8  0.0322 :  805cc98:  lea    (%ecx,%ebx,4),%esi
GCC 4.3 uses a single imul here, which is a lot faster.
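
For reference, here is a rough C sketch of what the quoted sequence appears to
compute. It mirrors the seven instructions one by one and compares the result
with a plain multiply by 16069. The function names are made up for this
illustration and are not from libjpeg or GCC. Unsigned arithmetic is used so
that wraparound matches the 32-bit registers, and the sign is left out because
the excerpt ends with what looks like +16069*z3 in %esi. The negation is
assumed to happen in instructions outside the quoted lines.

```c
#include <assert.h>
#include <stdint.h>

/* Plain multiply by the constant from the quoted source line.
   Hypothetical helper; the sign is omitted (see the note above). */
uint32_t mul_direct(uint32_t z)
{
    return z * 16069u;
}

/* Step-by-step mirror of the quoted lea/shl/add sequence.
   z plays the role of %ecx, b the role of %ebx. */
uint32_t mul_shift_add(uint32_t z)
{
    uint32_t b;
    b = z + z * 4;      /* lea (%ecx,%ecx,4),%ebx :     5*z */
    b = b + b * 4;      /* lea (%ebx,%ebx,4),%ebx :    25*z */
    b = b + b * 4;      /* lea (%ebx,%ebx,4),%ebx :   125*z */
    b = z + b * 2;      /* lea (%ecx,%ebx,2),%ebx :   251*z */
    b <<= 4;            /* shl $0x4,%ebx          :  4016*z */
    b += z;             /* add %ecx,%ebx          :  4017*z */
    return z + b * 4;   /* lea (%ecx,%ebx,4),%esi : 16069*z */
}

/* Returns 1 if both routines agree on every input in [lo, hi]. */
int agree_on_range(uint32_t lo, uint32_t hi)
{
    for (uint32_t z = lo; ; z++) {
        if (mul_direct(z) != mul_shift_add(z))
            return 0;
        if (z == hi)
            break;
    }
    return 1;
}

/* Small self-check over the first 65537 inputs. */
void self_check(void)
{
    assert(agree_on_range(0u, 1u << 16));
}
```

Calling self_check() from a main() should pass silently. The chain is seven
dependent instructions, each waiting on the previous result, which is
consistent with the samples piling up on it in the profile above.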
--
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=38328