Bug 3995 - i386 optimisation: joining tests
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: rtl-optimization
Version: 3.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2001-08-11 00:46 UTC by 105309-quiet
Modified: 2003-08-10 15:36 UTC
CC List: 4 users

See Also:
Host: i386-linux
Target: i386-linux
Build: i386-linux
Known to work:
Known to fail:
Last reconfirmed:


Description 105309-quiet 2001-08-11 00:46:00 UTC
[ Reported to the Debian BTS as report #105309.
  Please CC 105309-quiet@bugs.debian.org on replies.
  Log of report can be found at http://bugs.debian.org/105309 ]
 	
Here's another segment that needs to have an assembler optimiser run over it:

int foo(char c) {
	if (c && !(c & 0x80)) {
		a();
	} else {
		b();
	}
}

produces with -O2:

   0:	55                   	push   %ebp
   1:	89 e5                	mov    %esp,%ebp
   3:	83 ec 08             	sub    $0x8,%esp
   6:	8a 45 08             	mov    0x8(%ebp),%al
   9:	84 c0                	test   %al,%al
   b:	74 04                	je     11 <foo+0x11>
   d:	84 c0                	test   %al,%al
   f:	79 07                	jns    18 <foo+0x18>

The instructions at offsets 9-f can be rewritten as:

test	%al, %al
jg	18

  11:	e8 fc ff ff ff       	call   12 <foo+0x12>
			12: R_386_PC32	b
  16:	c9                   	leave  
  17:	c3                   	ret    
  18:	e8 fc ff ff ff       	call   19 <foo+0x19>
			19: R_386_PC32	a
  1d:	eb f7                	jmp    16 <foo+0x16>

And what purpose does this jmp serve? Surely it can be replaced with

leave
ret

  1f:	90                   	nop

Release:
3.0 (Debian GNU/Linux)

Environment:
System: Debian GNU/Linux (testing/unstable)
Architecture: i386
	
host: i386-linux
build: i386-linux
target: i386-linux
configured with: ../src/configure -v --enable-languages=c,c++,java,f77,proto,objc --prefix=/usr --infodir=/share/info --mandir=/share/man --enable-shared --with-gnu-as --with-gnu-ld --with-system-zlib --enable-long-long --enable-nls --without-included-gettext --disable-checking --enable-threads=posix --enable-java-gc=boehm --with-cpp-install-dir=bin --enable-objc-gc i386-linux
Comment 1 Richard Henderson 2002-04-02 16:26:51 UTC
State-Changed-From-To: open->analyzed
State-Changed-Why: Fold needs to recognize !(x & sign) as x >= 0.
    At which point the regular simplifications take care of it.
    
    As for leave+ret, that may be handled by Jan's block duplication
    code on cfg-branch.  At present we only issue multiple returns
    without a stack frame.
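
The identity behind this analysis can be checked exhaustively for an 8-bit type. The following is a minimal sketch, not taken from GCC: the helper names are hypothetical, the argument is declared signed char explicitly (plain char is signed on i386-linux, but that is implementation-defined), and a two's complement representation is assumed.

```c
#include <assert.h>
#include <stdbool.h>

/* The test as written in the report, with the char made explicitly signed. */
static bool original_test(signed char c) {
    return c && !(c & 0x80);
}

/* The single signed comparison the two tests collapse into
   (what "test %al,%al; jg" checks). */
static bool joined_test(signed char c) {
    return c > 0;
}

/* Walk all 256 values and count inputs where the two forms disagree. */
static int count_mismatches(void) {
    int mismatches = 0;
    for (int v = -128; v <= 127; v++) {
        signed char c = (signed char)v;
        /* Sign-bit identity from the analysis: !(x & sign) <=> x >= 0. */
        assert(!(c & 0x80) == (c >= 0));
        if (original_test(c) != joined_test(c))
            mismatches++;
    }
    return mismatches;
}
```

Calling count_mismatches() from a small main should return 0 if the two forms agree on every input, which is why one test followed by one signed conditional jump can replace the test/je/test/jns sequence.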
Comment 2 Roger Sayle 2002-05-06 20:40:16 UTC
State-Changed-From-To: analyzed->closed
State-Changed-Why: This problem has been fixed on mainline by the recent patch:
    
    2002-05-06  Roger Sayle  <roger@eyesopen.com>
    
            * fold-const.c (sign_bit_p): New function.
            (fold) [EQ_EXPR]: Use this to convert (A & C) == 0 into A >= 0 and
            (A & C) != 0 into A < 0, when constant C is the sign bit of A's type.
            Reapply fold when converting (A & C) == C into (A & C) != 0.
    
    We now join the two tests together and generate better code.
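
The condition in the ChangeLog entry, "when constant C is the sign bit of A's type", can be illustrated with a rough sketch. This is not the code in fold-const.c: is_sign_bit and and_eq_zero_folded are hypothetical names, and the example is restricted to an 8-bit operand to keep it small.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper, NOT GCC's sign_bit_p: is `c` exactly the sign bit
   of an integer type that is `bits` wide (1..64)? */
static bool is_sign_bit(uint64_t c, unsigned bits) {
    if (bits == 0 || bits > 64)
        return false;
    return c == (UINT64_C(1) << (bits - 1));
}

/* Evaluate (a & c) == 0 for an 8-bit signed `a`. When `c` is the sign
   bit the mask test is replaced by the comparison a >= 0; any other
   constant falls back to the plain mask test. */
static bool and_eq_zero_folded(int8_t a, uint64_t c) {
    if (is_sign_bit(c, 8))
        return a >= 0;
    return ((uint8_t)a & c) == 0;
}
```

With c equal to 0x80 the rewritten comparison gives the same answer as the mask test for every 8-bit value; with any other constant the rewrite is not valid and is not attempted.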
Comment 3 Debian GCC Maintainers 2003-08-10 15:32:48 UTC
Herbert Xu writes:

This bug is mostly fixed, but one small detail remains.  The final 
jump back to the leave+ret sequence is still there. 
 
Since leave+ret is a two-byte sequence, it makes sense to replace the 
jump with a leave+ret. 
 
Here is the -S -O2 output with gcc 3.3: 

 
foo: 
        pushl   %ebp 
        movl    %esp, %ebp 
        subl    $8, %esp 
        cmpb    $0, 8(%ebp) 
        jle     .L2 
        call    a 
.L3: 
        leave 
        ret 
        .p2align 2,,3 
.L2: 
        call    b 
        jmp     .L3 
 
The last jmp should become 
 
        leave 
        ret 
Comment 4 Andrew Pinski 2003-08-10 15:36:32 UTC
This is already fixed on the mainline; GCC now produces this:
foo:
        pushl   %ebp
        movl    %esp, %ebp
        subl    $8, %esp
        cmpb    $0, 8(%ebp)
        jle     .L2
        call    a
        leave
        ret
        .p2align 2,,3
.L2:
        call    b
        leave
        ret