Bug 22072 - bizarre code for int*int/2 for -Os
Summary: bizarre code for int*int/2 for -Os
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 3.4.4
: P2 enhancement
Target Milestone: 4.4.3
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization, ra
: 27856 (view as bug list)
Depends on:
Blocks: 27856 42258
  Show dependency treegraph
 
Reported: 2005-06-15 06:04 UTC by felix-gcc
Modified: 2009-12-31 09:32 UTC (History)
4 users (show)

See Also:
Host:
Target: i686-pc-linux-gnu
Build:
Known to work:
Known to fail:
Last reconfirmed: 2006-09-18 01:24:44


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description felix-gcc 2005-06-15 06:04:39 UTC
int triangle(int a,int b) { int c; c=a*b/2; return c; }
emits this very bizarre code (at -O, -O2):
  mov a,%edx
  mov b,%eax
  imull %edx,%eax
  movl %eax,%edx
  shrl $31,%edx
  addl %edx,%eax
  sarl %eax
  ret
Why are the two instructions after the imull emitted?  Shouldn't this become
simply imull and sarl?  This code extracts the most significant bit of a*b and
adds it to a*b, then shifting the result right.  It almost looks as if trying to
round or something?  There is probably something obvious I'm overlooking here.
The analog code is generated for ppc, x86_64, sparc and mips.  Please explain.
I also tried with -Os, and the code becomes cltd (sign-extend 32 to 64 bits)
plus idivl with 2.  Could it be that the peephole optimizer converts the idivl
to a shift but forgets to remove the sign extend code?
Comment 1 felix-gcc 2005-06-15 06:12:37 UTC
by the way, -Os generates an unnecessary register move:

        pushl   %ebp
        movl    $2, %edx
        movl    %esp, %ebp
        movl    12(%ebp), %eax
        movl    %edx, %ecx
        imull   8(%ebp), %eax
        popl    %ebp
        cltd
        idivl   %ecx
        ret

gcc should put the $2 directly in %ecx here.  Or it should note that "mov
$2,%ecx; idiv %ecx" is 7 bytes, while "sar %eax" is only two bytes, and emit the
latter.
Comment 2 Falk Hueffner 2005-06-15 11:43:52 UTC
(In reply to comment #0)

> Why are the two instructions after the imull emitted?  Shouldn't this become
> simply imull and sarl?

No. A signed division by two is not a shift.

The complaint about $2 not propagating seems valid though. Maybe somebody
can try 4.x on it.
Comment 3 Andrew Pinski 2005-06-15 11:50:07 UTC
Yes /2 in signed does not equal >> 1 but "(+ signbit)>>1 ".
Comment 4 Pawel Sikora 2005-06-15 14:47:28 UTC
(In reply to comment #2)
> (In reply to comment #0)
> 
> > Why are the two instructions after the imull emitted?  Shouldn't this become
> > simply imull and sarl?
> 
> No. A signed division by two is not a shift.

what about *arithmetic shift* instruction (e.g. SAR on ix86) ?
Comment 5 Andrew Pinski 2005-06-15 14:56:26 UTC
(In reply to comment #4)
> what about *arithmetic shift* instruction (e.g. SAR on ix86) ?
Nope, try that with a negative number and you will notice that it will not work.
Comment 6 felix-gcc 2005-06-15 21:05:52 UTC
(In reply to comment #5)
> (In reply to comment #4)
> > what about *arithmetic shift* instruction (e.g. SAR on ix86) ?
> Nope, try that with a negative number and you will notice that it will not work.

Actually it does work.  5 SAR 1 == 2; -5 SAR 1 == -2.  That's exactly what SAR
is for, after all.  See the Intel manuals.  Or look here:

http://faydoc.tripod.com/cpu/sar.htm

 SAR r/m32, 1 	 Signed divide* r/m32 by 2, once
Comment 7 Andrew Pinski 2005-06-15 21:18:04 UTC
(In reply to comment #6)
>  SAR r/m32, 1    Signed divide* r/m32 by 2, once

Huh, I think that is wrong, witness:
#include <stdio.h>
int f(int a)
{
  return a >> 1;
}
int main(void)
{
  int g = f(-5);
  printf("%d\n", g);
}

prints -3.
Comment 8 Pawel Sikora 2005-06-17 00:53:14 UTC
(In reply to comment #7)  
> (In reply to comment #6)  
> >  SAR r/m32, 1    Signed divide* r/m32 by 2, once  
>   
> Huh, I think that is wrong,  
 
/* 
 * specifically, the division algorithm states that given 
 * two integers a and d, with d != 0, there exists unique 
 * integers q and r such that a = qd + r and 0 <= r < |d|, 
 * where |d| denotes the absolute value of d. 
 * the integer q is the quotient, r is the remainder, 
 * d is the divisor, and a is the dividend. 
 * 
 * examples: 
 * 
 * if a =  7 and d =  3, then q =  2 and r = 1. 
 * if a =  7 and d = -3, then q = -2 and r = 1. 
 * if a = -7 and d =  3, then q = -3 and r = 2. 
 * if a = -7 and d = -3, then q =  3 and r = 2. 
 * 
 */ 
 
#include <stdio.h> 
 
const int a[4] = { 5, -5 }; 
 
void div2_wrong(int* q, int*r, const int a) 
{ 
    *q = a / 2; 
    *r = a % 2; 
} 
 
void div2_correct(int* q, int*r, const int a) 
{ 
    *q = a >> 1; 
    *r = a - (*q << 1); 
} 
 
int main() 
{ 
    int i; 
    for (i = 0; i < 2; i++) 
    { 
        int q, r; 
        div2_wrong(&q, &r, a[i]); 
        printf("[w] a=%d, q=%d, r=%d\n", a[i], q, r); 
        div2_correct(&q, &r, a[i]); 
        printf("[c] a=%d, q=%d, r=%d\n", a[i], q, r); 
    } 
    return 0; 
} 
 
$ ./a.out (in this case d = 2) 
[w] a= 5, q= 2, r= 1 
[c] a= 5, q= 2, r= 1 
[w] a=-5, q=-2, r=-1   <=  wrong, vide 0 <= r < |d| 
[c] a=-5, q=-3, r= 1 
 
gcc div algorithm produces wrong result. 
 
> (...) witness:  
> #include <stdio.h>  
> int f(int a)  
> {  
>   return a >> 1;  
> }  
> int main(void)  
> {  
>   int g = f(-5);  
>   printf("%d\n", g);  
> }  
>   
> prints -3.  
  
-3 looks fine. 
 
a = -5, d = 2  ->  q = -3, r = 1  ->  qd+r = -3*2+1 = -5 = a. 
 
did i miss something in my math? 
Comment 9 Andrew Pinski 2005-06-17 01:08:13 UTC
(In reply to comment #8)
> did i miss something in my math? 

Yes in C, % (remainder and not mod) is negative iff one of the operands is negative.
so it is -2*2 + -1 = -5. , integer divide in C is truncated.
Comment 10 Pawel Sikora 2005-06-17 09:33:47 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > did i miss something in my math? 
> 
> Yes in C, % (remainder and not mod) is negative iff one of the operands is
negative.
> so it is -2*2 + -1 = -5. , integer divide in C is truncated.

[ cite ]

"The New ISO Standard for C (C9X)"

25).
The integer division and modulus operators are defined to perform
truncation *towards zero*. (In C89, it was implementation-defined whether
truncation was done towards zero or -infinity. This is (obviously)
important only if one or both operands are negative. Consider:

-22 / 7 = -3     	truncation towards zero
-22 % 7 = -1

-22 / 7 = -4    	truncation towards -infinity
-22 % 7 =  6

Both satisfy the required equation (a/b)*b + a%b == a.
The second has the advantage that the modulus is always positive
-- but they decided on the other (more Fortran-like, less Pascal-like) variant...)

[ /cite ]

so it's all clear now.
Comment 11 Andrew Pinski 2006-09-18 01:24:44 UTC
So the only bug here is that -Os produces an extra move.  That comes from the register allocator/reload:

Reloads for insn # 13
Reload 0: reload_in (SI) = (reg:SI 1 dx [65])
        GENERAL_REGS, RELOAD_FOR_INPUT (opnum = 3)
        reload_in_reg: (reg:SI 1 dx [65])
        reload_reg_rtx: (reg:SI 2 cx)
Comment 12 Andrew Pinski 2008-09-14 04:28:42 UTC
    a1 (r65,l0) best GENERAL_REGS, cover GENERAL_REGS

If IRA chose CREG instead of GENERAL_REGS, it would have been right.
Comment 13 Andrew Pinski 2008-09-14 04:30:46 UTC
*** Bug 27856 has been marked as a duplicate of this bug. ***
Comment 14 Vladimir Makarov 2009-10-06 21:59:21 UTC
IRA does not create a conflict for p66 and p67 (in function triangle).  One pseudo is earlyclobber.  They should have a conflict.  Therefore p67 gets wrong hard register and reload fixes this by generation of additional move.

Actually I found a reason for this.  It is a typo in ira-lives.c::check_and_make_def_conflict (continue stmts should be instead of returns in the loop).   I'll submit the patch for a review soon.
Comment 15 Vladimir Makarov 2009-10-07 17:18:55 UTC
Subject: Bug 22072

Author: vmakarov
Date: Wed Oct  7 17:18:38 2009
New Revision: 152533

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=152533
Log:
2009-10-07  Vladimir Makarov  <vmakarov@redhat.com>

	PR middle-end/22072
	* ira-lives.c (check_and_make_def_conflict): Process all operands.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ira-lives.c

Comment 16 Uroš Bizjak 2009-10-08 08:17:28 UTC
Vlad, is it OK if I backport this patch to 4.4? I have tested it on 4.4 without problems.
Comment 17 uros 2009-10-15 18:03:36 UTC
Subject: Bug 22072

Author: uros
Date: Thu Oct 15 18:03:20 2009
New Revision: 152856

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=152856
Log:
        Backport from mainline:
        2009-10-07  Vladimir Makarov  <vmakarov@redhat.com>

        PR middle-end/22072
        * ira-lives.c (check_and_make_def_conflict): Process all operands.


Modified:
    branches/gcc-4_4-branch/gcc/ChangeLog
    branches/gcc-4_4-branch/gcc/ira-lives.c

Comment 18 Uroš Bizjak 2009-10-15 18:04:50 UTC
Backport approved offline by Vlad. Fixed.