Bug 8871 - [x86 MMX] Inefficient zero_extendsidi2 for MMX
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 3.1
Importance: P3 normal
Target Milestone: 3.4.0
Assignee: Jan Hubicka
Keywords: missed-optimization
Depends on:
Reported: 2002-12-07 22:06 UTC by otaylor
Modified: 2003-08-23 23:14 UTC
CC List: 2 users

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2003-07-10 00:44:08

zero_extend.patch (732 bytes, text/plain)
2003-05-21 15:17 UTC, otaylor

Description otaylor 2002-12-07 22:06:01 UTC
When moving a 32-bit quantity into an MMX register,
GCC first zero-extends it as if doing 64-bit arithmetic
emulation, then uses movq to move it into the register.
This produces code like:

        xorl    %edx, %edx
        movl    %eax, -16(%ebp)
        movl    %edx, -12(%ebp)
        movq    -16(%ebp), %mm1

Instead of simply:

        movd    %eax, %mm1
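
A minimal sketch in C of why the two sequences are interchangeable: movd from a 32-bit register zero-fills the upper half of the MMX register, which is the same value the xorl/movl/movl/movq sequence builds through memory. The function names below are hypothetical and only model the register contents; they are not part of GCC or of the attached patch.

```c
#include <assert.h>
#include <stdint.h>

/* Models the slow sequence: a zeroed high half (xorl %edx, %edx) and the
   low half (%eax) are combined into one 64-bit value, as the two movl
   stores followed by movq do. */
static uint64_t widen_via_two_halves(uint32_t lo)
{
    uint32_t hi = 0;
    return ((uint64_t)hi << 32) | lo;
}

/* Models movd %eax, %mm1: the 32-bit source is zero-extended to 64 bits. */
static uint64_t widen_via_movd(uint32_t lo)
{
    return (uint64_t)lo;
}

/* Hypothetical check that both models yield the same 64-bit value. */
static void check_sequences_agree(uint32_t x)
{
    assert(widen_via_two_halves(x) == widen_via_movd(x));
}
```

Calling check_sequences_agree with values such as 0 and 0xFFFFFFFF exercises both ends of the 32-bit range; the high half stays zero in either model.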

This (and the associated overhead) causes a pretty big
hit for the typical uses of MMX. The attached
demonstration patch improved one alpha-compositing 
routine from 29 million pixels/sec to 51 million
pixels/sec. (With the patch, results for a range
of routines were comparable to hand-written assembly.)

The attached patch just replaces the existing 
patterns for zero_extendsidi2 with a pattern using
movd. This is clearly wrong, but my minimal GCC
hacking skills proved unequal to integrating it
in properly.

CVS Head, 7 December 2002


A simple example demonstrating the code generation:

typedef int di __attribute__ ((mode(DI)));

di foo (unsigned int a, unsigned int b)
{
  return __builtin_ia32_por (a, b);
}
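
For comparison, here is a rough sketch of the same operation written with the mmintrin.h intrinsics instead of the raw builtin. The helper name or_widened is hypothetical, and the fallback branch is an assumption added so that the sketch also builds where MMX is unavailable. _mm_cvtsi32_si64 is the intrinsic that typically maps to movd, though the instructions actually emitted depend on the compiler version and target flags.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__MMX__)
#include <mmintrin.h>
#endif

/* Hypothetical helper: zero-extend two 32-bit values to 64 bits and OR them. */
static uint64_t or_widened(uint32_t a, uint32_t b)
{
#if defined(__MMX__)
    __m64 va = _mm_cvtsi32_si64((int)a);  /* upper 32 bits are zeroed */
    __m64 vb = _mm_cvtsi32_si64((int)b);
    __m64 r = _mm_or_si64(va, vb);        /* 64-bit bitwise OR (por) */
    uint64_t out;
    memcpy(&out, &r, sizeof out);         /* copy the 8 result bytes out */
    _mm_empty();                          /* leave MMX state before returning */
    return out;
#else
    /* Portable fallback with the same result (assumption for non-MMX targets). */
    return (uint64_t)a | (uint64_t)b;
#endif
}

/* Hypothetical self-check: the high half of the result must stay zero. */
static void check_or_widened(void)
{
    assert(or_widened(0x80000000u, 0x00000001u) == 0x0000000080000001ULL);
}
```

Inspecting the assembly for or_widened (for example with gcc -O2 -S) is one way to see whether a given compiler goes through memory or moves the value directly.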
Comment 1 Wolfgang Bangerth 2002-12-20 21:04:29 UTC
Responsible-Changed-From-To: unassigned->hubicka
Responsible-Changed-Why: Jan, you are probably best acquainted with the MMX patterns
Comment 2 Dara Hazeghi 2003-07-10 00:44:08 UTC
Jan, have you been able to look at the patch for this PR that was included 
with the original report?
Comment 3 Andrew Pinski 2003-07-23 20:19:17 UTC
Might be related to bug 11628.
Comment 4 Andrew Pinski 2003-07-23 20:38:43 UTC
*** Bug 11628 has been marked as a duplicate of this bug. ***
Comment 5 CVS Commits 2003-08-23 21:19:05 UTC
Subject: Bug 8871

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	hubicka@gcc.gnu.org	2003-08-23 21:18:58

Modified files:
	gcc            : expr.c ChangeLog 
	gcc/config/i386: i386.c i386.h i386.md 

Log message:
	PR target/11369
	* i386.c (ix86_expand_carry_flag_compare): Validate operand.
	PR target/11031
	* i386.c (const_0_to_3_operand, const_0_to_7_operand,
	const_0_to_15_operand, const_0_to_255_operand): New predicates.
	* i386.h (PREDICATE_CODES): Add these.
	* i386.c (pinsrw and pextrw patterns): Use them.
	PR target/10984
	* i386.c (ix86_expand_binop_builtin): Behave sanely for VOIDmodes.
	PR target/8869
	* expr.c (convert_modes): Deal properly with integer to vector
	constant conversion.
	PR target/8871
	* i386.md (zero_extendsidi2*): Add MMX and SSE alternatives.


Comment 6 Andrew Pinski 2003-08-23 23:14:00 UTC
Fixed by the patch above.