First Last Prev Next    No search results available      Search page      Enter new bug
Bug#: 13724
Product:  
Component:  
Status: RESOLVED
Resolution: FIXED
Assigned To: Not yet assigned to anyone <unassigned@gcc.gnu.org>
Host:
Reported against  
Priority:  
Severity:  
Target Milestone:  
 
 
Target:
Reporter: Paolo Bonzini <bonzini@gnu.org>
Add CC:
CC:
Remove selected CCs
Build:
URL:
Summary:
Keywords:
Known to work:
Known to fail:

Attachment Description Type Created Size Actions
cse-zero-extend-subreg.c One-line function exhibiting bad code generation text/plain 2004-01-17 23:47 71 bytes Edit
Create a New Attachment (proposed patch, testcase, etc.) View All

Bug 13724 depends on: Show dependency tree
Show dependency graph
Bug 13724 blocks: 19721 35281

Additional Comments:






View Bug Activity   |   Format For Printing   |   Clone This Bug


Description:   Last confirmed: 2004-01-18 17:16 Opened: 2004-01-17 23:45
The following function (which by the way is meant to compute the
unsigned remainder by 3), produces this really awful code on x86
(gcc 3.3.1) at any optimization level:

int f(unsigned int x)
{
  return ((x * 0x2AAAAAAABULL) >> 33) & 3;
}

	...
	movl	$0xAAAAAAAB, %eax
	movl	8(%ebp), %ebx
	mull	%ebx
	xorl	%esi, %esi			(1)
	leal	(%edx,%ebx,2), %ecx
	movl	%esi, %ebx
	imull	$0xAAAAAAAB, %ebx, %ebx		(2)
	leal	(%ebx,%ecx), %edx
	movl	%edx, %eax
	shrl	$1, %eax
	andl	$3, %eax
	...

The insns I marked with (1) and (2) are respectively, after RTL
generation,

(insn 11 10 12 (nil) (parallel [
            (set (reg:DI 61)
                (zero_extend:DI (reg/v:SI 59)))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil))

(insn 17 16 18 (nil) (parallel [
            (set (reg:SI 66)
                (mult:SI (subreg:SI (reg:DI 61) 4)
                    (const_int -1431655765 [0xaaaaaaab])))
            (clobber (reg:CC 17 flags))
        ]) -1 (nil)
    (nil))

CSE does not recognize that (subreg:SI (reg:DI 61) 4) must be
zero since it is created with a (zero_extend:DI (reg/v:SI 59)).
I've made a quick patch for 3.2.1:

2004-01-17  Paolo Bonzini  <bonzini@gnu.org>

	* gcc/cse.c (fold_rtx): Simplify a SUBREG to zero if it
	is the high part of the source, and the source is a REG
	equivalent to a ZERO_EXTEND.

The important part of the patch is:

,-----
+  if (GET_CODE (folded_arg0) == REG
+      && GET_MODE_SIZE (mode) < GET_MODE_SIZE (GET_MODE (folded_arg0))
+      && !subreg_lowpart_p (x))
+    {
+      struct table_elt *elt;
+
+      /* We can use HASH here since we know that canon_hash won't be
+         called.  */
+      elt = lookup (folded_arg0,
+		    HASH (folded_arg0, GET_MODE (folded_arg0)),
+		    GET_MODE (folded_arg0));
+
+      if (elt)
+        elt = elt->first_same_value;
+
+      /* If this is a SUBREG representing the high part of a REG, check if
+         the register is the result of a zero extension.  We can fold
+         the SUBREG to zero if the zero-extended expression's mode is
+         smaller or as big as the SUBREG's (for example, a subreg:HI of
+         a zero-extended reg:SI could be non-zero).  */
+      for (; elt; elt = elt->next_same_value)
+        {
+          if (GET_CODE (elt->exp) == ZERO_EXTEND
+    	      && GET_MODE_SIZE (GET_MODE (XEXP (elt->exp, 0)))
+    	         <= GET_MODE_SIZE (mode))
+    	    return const0_rtx;
+        }
+    }
`-----

It fixes this problem and bootstraps cleanly (C, C++, Java); please
tell me if it is the right way, so I can port to mainline and submit
properly.  Maybe it is better to put the transformation in simplify_rtx.c
instead?  It looks like a good deal of SUBREG manipulation done in cse.c
after looking up the operand in the available expression table is not
pass-dependent, but this is probably too much for me to do without any
kind of guidance (it is my first foray into patching gcc).

Paolo

------- Comment #1 From Paolo Bonzini 2004-01-17 23:47 -------
Created an attachment (id=5510) [edit]
One-line function exhibiting bad code generation

(This is the function included in the bugreport)

------- Comment #2 From Andrew Pinski 2004-01-18 17:16 -------
Confirmed.  I cannot comment on the patch at all though.  CSE is a mess and I
would not touch it 
with my life though.

------- Comment #3 From Paolo Bonzini 2004-01-22 17:59 -------
Approved patch pending

http://gcc.gnu.org/ml/gcc-patches/2004-01/msg02113.html

------- Comment #4 From CVS Commits 2004-01-23 02:03 -------
Subject: Bug 13724

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	sayle@gcc.gnu.org	2004-01-23 02:03:26

Modified files:
	gcc            : ChangeLog cse.c 

Log message:
	2004-01-22  Paolo Bonzini  <bonzini@gnu.org>
	
	PR optimization/13724
	* cse.c (fold_rtx) <SUBREG>:  Fold a SUBREG to zero if it
	represents the zero bits produced by a ZERO_EXTEND operation.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.2424&r2=2.2425
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/cse.c.diff?cvsroot=gcc&r1=1.279&r2=1.280


------- Comment #5 From Andrew Pinski 2004-01-23 02:10 -------
Fixed for 3.5.0.

------- Comment #6 From Steven Bosscher 2005-04-13 09:23 -------
Paolo Bozini mentioned this bug as an example of the 64bits arith 
on 32bits host "issue". 

First Last Prev Next    No search results available      Search page      Enter new bug