Bug 8871 - [x86 MMX] Inefficient zero_extendsidi2 for MMX
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 3.1
Importance: P3 normal
Target Milestone: 3.4.0
Assignee: Jan Hubicka
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2002-12-07 22:06 UTC by otaylor
Modified: 2003-08-23 23:14 UTC
CC List: 2 users

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2003-07-10 00:44:08


Attachments
zero_extend.patch (732 bytes, text/plain)
2003-05-21 15:17 UTC, otaylor

Description otaylor 2002-12-07 22:06:01 UTC
When moving a 32-bit quantity into an MMX register,
GCC first zero-extends it as if doing 64-bit arithmetic
emulation, then uses movq to move it into the register.
So it generates code like:

===
        xorl    %edx, %edx
        movl    %eax, -16(%ebp)
        movl    %edx, -12(%ebp)
        movq    -16(%ebp), %mm1
===

Instead of simply:

===
        movd    %eax, %mm1
===

This (and the associated overhead) causes a pretty big
hit for typical uses of MMX: the attached
demonstration patch improved one alpha-compositing
routine from 29 million pixels/sec to 51 million
pixels/sec. (With the patch, results for a range
of routines were comparable to hand-written assembly.)

The attached patch just replaces the existing 
patterns for zero_extendsidi2 with a pattern using
movd. This is clearly wrong, but my minimal GCC
hacking skills proved unequal to integrating it
in properly.
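
For readers less familiar with the two instruction sequences, the
following is a rough C model of what each one computes. It is only an
illustrative sketch and not part of the attached patch: the function
names zext_via_stack and zext_direct are made up for this example, and
the stack-slot version assumes a little-endian target such as ia32.

```c
#include <stdint.h>
#include <string.h>

/* Models the xorl/movl/movl/movq sequence: build the 64-bit value in a
   two-word stack slot, then load all eight bytes at once.
   Assumption: little-endian layout, as on ia32. */
static uint64_t zext_via_stack(uint32_t a)
{
    uint32_t slot[2];
    uint64_t wide;

    slot[0] = a;                        /* movl %eax, -16(%ebp) */
    slot[1] = 0;                        /* xorl %edx, %edx; movl %edx, -12(%ebp) */
    memcpy(&wide, slot, sizeof wide);   /* movq -16(%ebp), %mm1 */
    return wide;
}

/* Models a single movd from a general register: the low 32 bits are
   copied and the high 32 bits of the destination are cleared. */
static uint64_t zext_direct(uint32_t a)
{
    return (uint64_t)a;
}
```

Both functions return the same value for every input; the difference
in the generated code is the extra register clear and the round trip
through memory in the first form.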

Release:
CVS Head, 7 December 2002

Environment:
Linux/ia32

How-To-Repeat:
A simple example demonstrating the code generation
is:

===
typedef int di __attribute__ ((mode(DI)));

di foo (unsigned int a, unsigned int b)
{
  return __builtin_ia32_por (a, b);
}
===
Comment 1 Wolfgang Bangerth 2002-12-20 21:04:29 UTC
Responsible-Changed-From-To: unassigned->hubicka
Responsible-Changed-Why: Jan, you are probably best acquainted with the MMX patterns
Comment 2 Dara Hazeghi 2003-07-10 00:44:08 UTC
Jan, have you been able to look at the patch for this PR that was included 
with the original report?
Comment 3 Andrew Pinski 2003-07-23 20:19:17 UTC
Might be related to bug 11628.
Comment 4 Andrew Pinski 2003-07-23 20:38:43 UTC
*** Bug 11628 has been marked as a duplicate of this bug. ***
Comment 5 CVS Commits 2003-08-23 21:19:05 UTC
Subject: Bug 8871

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	hubicka@gcc.gnu.org	2003-08-23 21:18:58

Modified files:
	gcc            : expr.c ChangeLog 
	gcc/config/i386: i386.c i386.h i386.md 

Log message:
	PR target/11369
	* i386.c (ix86_expand_carry_flag_compare): Validate operand.
	
	PR target/11031
	* i386.c (const_0_to_3_operand, const_0_to_7_operand,
	const_0_to_15_operand, const_0_to_255_operand): New predicates.
	* i386.h (PREDICATE_CODES): Add these.
	* i386.c (pinsrw and pextrw patterns): Use them.
	
	PR target/10984
	* i386.c (ix86_expand_binop_builtin): Behave sanely for VOIDmodes.
	
	PR target/8869
	* expr.c (convert_modes): Deal properly with integer to vector
	constant conversion.
	
	PR target/8871
	* i386.md (zero_extendsidi2*): Add MMX and SSE alternatives.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/expr.c.diff?cvsroot=gcc&r1=1.577&r2=1.578
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.864&r2=2.865
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.c.diff?cvsroot=gcc&r1=1.595&r2=1.596
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.h.diff?cvsroot=gcc&r1=1.350&r2=1.351
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/config/i386/i386.md.diff?cvsroot=gcc&r1=1.480&r2=1.481

Comment 6 Andrew Pinski 2003-08-23 23:14:00 UTC
Fixed by the patch above.