Bug 9814 - gcc fails to optimise if (l&2) l|=2 away
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 3.2.2
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization, TREE
Depends on: 25290 29797
Blocks:
Reported: 2003-02-23 07:46 UTC by 181096
Modified: 2011-05-22 14:54 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2006-11-18 01:34:36


Description 181096 2003-02-23 07:46:00 UTC
[ Reported to the Debian BTS as report #181096.
  Please CC 181096@bugs.debian.org on replies.
  Log of report can be found at http://bugs.debian.org/181096 ]
	

Checked with current 3.2 and 3.3 branches (20030221)

The following function doesn't get optimised away as a noop:

int k(int l)
{
	if (l & 2)
		l |= 2;
	return l;
}

$ gcc-3.2 -O2 -S b.c
$ cat b.s
	.file	"b.c"
	.text
	.p2align 2,,3
.globl k
	.type	k,@function
k:
	pushl	%ebp
	movl	%esp, %ebp
	movl	8(%ebp), %eax
	testl	$2, %eax
	je	.L2
	orl	$2, %eax
.L2:
	leave
	ret
.Lfe1:
	.size	k,.Lfe1-k
	.ident	"GCC: (GNU) 3.2.3 20030210 (Debian prerelease)"
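
The claim that k() is a no-op can be checked directly: whenever the branch is taken, bit 1 is already set, so OR-ing it in again changes nothing. The following is only a minimal sketch of that check. The helper name check_k_is_identity and the sampled input range are assumptions made for illustration and are not part of the original report.

```c
#include <assert.h>
#include <limits.h>

/* The function from the report, unchanged. */
int k(int l)
{
	if (l & 2)
		l |= 2;
	return l;
}

/* Hypothetical helper: asserts that k() behaves like the identity
   function over a sample of inputs, including the extremes of int. */
void check_k_is_identity(void)
{
	for (int l = -70000; l <= 70000; l++)
		assert(k(l) == l);
	assert(k(INT_MIN) == INT_MIN);
	assert(k(INT_MAX) == INT_MAX);
}
```

Calling check_k_is_identity() from a small main() and building without -DNDEBUG should complete silently, which is consistent with the expectation that the test and the orl in the listing above are redundant.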

Release:
3.2.2 (Debian) (Debian unstable)

Environment:
System: Debian GNU/Linux (unstable)
Architecture: i686

	
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Installed/Config-files/Unpacked/Failed-config/Half-installed
|/ Err?=(none)/Hold/Reinst-required/X=both-problems (Status,Err: uppercase=bad)
||/ Name           Version        Description
+++-==============-==============-============================================
ii  gcc-3.2        3.2.3-0pre1    The GNU C compiler
ii  g++-3.2        3.2.3-0pre1    The GNU C++ compiler
ii  libstdc++5     3.2.3-0pre1    The GNU Standard C++ Library v3
ii  libstdc++5-dev 3.2.3-0pre1    The GNU Standard C++ Library v3 (development
ii  binutils       2.13.90.0.18-1 The GNU assembler, linker and binary utiliti
ii  libc6          2.3.1-13       GNU C Library: Shared libraries and Timezone
host: i386-linux
Comment 1 Wolfgang Bangerth 2003-03-22 18:45:49 UTC
State-Changed-From-To: open->analyzed
State-Changed-Why: Confirmed.
Comment 2 Andrew Pinski 2004-06-04 05:41:25 UTC
I think this can be done on the tree using
<http://gcc.gnu.org/ml/gcc-patches/2004-06/msg00153.html> and not changing
(a&2) == 0 into (a>>1) & 1 until late.
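
As background for the comment above, a single-bit test can be written either as a mask-and-compare or as a shift-and-mask, and the two forms carry the same information. The sketch below is only my reading of the two forms being discussed. The function names are hypothetical, and unsigned operands are used to sidestep the implementation-defined right shift of negative values.

```c
#include <assert.h>

/* Mask-and-compare form: is bit 1 of a set? */
int bit1_set_mask(unsigned a)
{
	return (a & 2u) != 0;
}

/* Shift-and-mask form of the same test. */
int bit1_set_shift(unsigned a)
{
	return (int)((a >> 1) & 1u);
}

/* Hypothetical helper: asserts that both forms agree, and that the
   "== 0" comparison is exactly the negation of the shifted bit. */
void check_bit_test_forms(void)
{
	for (unsigned a = 0; a < 1024; a++) {
		assert(bit1_set_mask(a) == bit1_set_shift(a));
		assert(((a & 2u) == 0) == !bit1_set_shift(a));
	}
}
```

Once the test has been rewritten into the shifted form, the constant 2 no longer appears in the condition, which is presumably why a pass looking for "if (x & C) x |= C" would prefer that the rewrite happen late.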
Comment 3 Steven Bosscher 2005-01-23 15:25:05 UTC
This is a NOP for me on AMD64 but not on i686. 
 
Comment 4 roger 2005-05-22 19:25:23 UTC
I posted a patch here: http://gcc.gnu.org/ml/gcc-patches/2005-03/msg01956.html
to implement this in the RTL optimizers.  Better to get it linked to the PR
than to let it slip through the cracks.  The proposed change to
noce_emit_move_insn is also related to another missed-optimization PR, whose
number I can no longer remember.  It had something to do with synthesized insns
not getting recognized without the needed clobbers.

I agree with Kazu that these transformations should also be implemented at the
tree-ssa level. I think once I commit the RTL solution to mainline CVS (so these
optimizations are performed somewhere), I'll unassign the PR (from myself), and
leave the tree-optimization PR open as an enhancement request.
Comment 5 GCC Commits 2005-05-27 02:46:08 UTC
Subject: Bug 9814

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	sayle@gcc.gnu.org	2005-05-27 02:46:01

Modified files:
	gcc            : ChangeLog ifcvt.c 
	gcc/testsuite  : ChangeLog 
Added files:
	gcc/testsuite/gcc.dg: pr9814-1.c 

Log message:
	PR tree-optimization/9814
	* ifcvt.c (noce_emit_move_insn): If we fail to recognize the move
	instruction, add the necessary clobbers by re-expanding the RTL
	for arithmetic operations via optab.c's expand_unop/expand_binop.
	(noce_try_bitop): New function to optimize bit manipulation idioms
	of the form "if (x & C) x = x op C" and "if (!(x & C)) x = x op C".
	(noce_process_if_block): Call noce_try_bitop.
	
	* gcc.dg/pr9814-1.c: New test case.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.8916&r2=2.8917
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ifcvt.c.diff?cvsroot=gcc&r1=1.187&r2=1.188
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/ChangeLog.diff?cvsroot=gcc&r1=1.5540&r2=1.5541
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/testsuite/gcc.dg/pr9814-1.c.diff?cvsroot=gcc&r1=NONE&r2=1.1
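
For reference, the idioms named in the log message reduce to branch-free expressions when C has exactly one bit set. The sketch below only illustrates those algebraic identities at the C level. It is not the RTL code in ifcvt.c, the function names are hypothetical, and the log does not say which combinations of "op" noce_try_bitop actually handles.

```c
#include <assert.h>

/* Branchy forms with the condition "bit is set". */
unsigned if_set_or(unsigned x, unsigned c)    { if (x & c) x |= c;  return x; }
unsigned if_set_xor(unsigned x, unsigned c)   { if (x & c) x ^= c;  return x; }
unsigned if_set_clear(unsigned x, unsigned c) { if (x & c) x &= ~c; return x; }

/* Branchy forms with the condition "bit is clear". */
unsigned if_clear_or(unsigned x, unsigned c)    { if (!(x & c)) x |= c;  return x; }
unsigned if_clear_xor(unsigned x, unsigned c)   { if (!(x & c)) x ^= c;  return x; }
unsigned if_clear_clear(unsigned x, unsigned c) { if (!(x & c)) x &= ~c; return x; }

/* Hypothetical helper: asserts each branch-free equivalent for every
   single-bit constant in the low byte and every 8-bit value of x. */
void check_bitop_identities(void)
{
	for (unsigned bit = 0; bit < 8; bit++) {
		unsigned c = 1u << bit;	/* assumption: exactly one bit set */
		for (unsigned x = 0; x < 256; x++) {
			assert(if_set_or(x, c) == x);            /* no-op      */
			assert(if_set_xor(x, c) == (x & ~c));    /* clears bit */
			assert(if_set_clear(x, c) == (x & ~c));  /* clears bit */
			assert(if_clear_or(x, c) == (x | c));    /* sets bit   */
			assert(if_clear_xor(x, c) == (x | c));   /* sets bit   */
			assert(if_clear_clear(x, c) == x);       /* no-op      */
		}
	}
}
```

The first identity is the case from this PR: "if (x & C) x |= C" leaves x unchanged.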

Comment 6 roger 2005-05-27 02:55:01 UTC
This optimization is now performed at the RTL-level, but it would be nice if
this (and several other of ifcvt.c's noce_try_foo optimizations) could be
caught earlier during tree-ssa.
Comment 7 Andrew Pinski 2005-12-07 03:12:13 UTC
Once fold simplifies (a&b)!=0 ? a|b : a to a, and PR 25290 is fixed, this will be caught at the tree level. There are most likely other cases like this too.
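
One caveat worth spelling out for the fold rule above: the comment does not state any constraint on b, but the simplification to a is only valid when b has at most one bit set, as in the constant 2 of the original test case. The sketch below assumes that this is the intended case. The names cond_or, is_single_bit and check_cond_or are hypothetical.

```c
#include <assert.h>

/* The expression from the comment, written out as a function. */
unsigned cond_or(unsigned a, unsigned b)
{
	return (a & b) != 0 ? (a | b) : a;
}

/* True when b has exactly one bit set. */
int is_single_bit(unsigned b)
{
	return b != 0 && (b & (b - 1)) == 0;
}

/* Hypothetical helper: the fold to "a" holds for every single-bit b,
   but a multi-bit b gives a counterexample. */
void check_cond_or(void)
{
	for (unsigned a = 0; a < 256; a++)
		for (unsigned b = 0; b < 256; b++)
			if (is_single_bit(b))
				assert(cond_or(a, b) == a);

	/* a = 1, b = 3: the masks overlap, yet a | b is 3, not a. */
	assert(cond_or(1u, 3u) == 3u);
}
```

So a tree-level fold along these lines would need to check that b is a power of two (or otherwise known to be a single bit) before replacing the whole expression with a.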
Comment 8 Steven Bosscher 2006-11-18 01:27:39 UTC
Shouldn't this be fixed by Roger Sayle's recent fold-const.c patch?
Comment 9 Andrew Pinski 2006-11-18 01:34:36 UTC
(In reply to comment #8)
> Shouldn't this be fixed by Roger Sayle's recent fold-const.c patch?

No, in fact the generic case (the one where 2 is turned into a variable) is not optimized either.
Comment 10 Steven Bosscher 2011-05-22 14:54:53 UTC
Works overall (the final assembly is optimal), but still fails at the GIMPLE level:

$ ./cc1 -quiet -m32 -fomit-frame-pointer -O2 t.c -fdump-tree-optimized
$ cat t.s
	.file	"t.c"
	.text
	.p2align 4,,15
	.globl	k
	.type	k, @function
k:
.LFB0:
	.cfi_startproc
	movl	4(%esp), %eax
	ret
	.cfi_endproc
.LFE0:
	.size	k, .-k
	.ident	"GCC: (GNU) 4.6.0 20110312 (experimental) [trunk revision 170907]"
	.section	.note.GNU-stack,"",@progbits
$ cat t.c.143t.optimized 

;; Function k (k)

k (int l)
{
  int D.1979;

<bb 2>:
  D.1979_3 = l_2(D) & 2;
  if (D.1979_3 != 0)
    goto <bb 3>;
  else
    goto <bb 4>;

<bb 3>:
  l_4 = l_2(D) | 2;

<bb 4>:
  # l_1 = PHI <l_2(D)(2), l_4(3)>
  return l_1;

}

But missed bit-folding optimizations on GIMPLE are already covered by one or more other bug reports.
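
To make the remaining GIMPLE-level miss concrete, the dump can be transcribed back into C with one variable per SSA name. This is only an illustrative sketch: the function name k_ssa is hypothetical, the variable d stands in for D.1979_3, and the assert documents the fact a GIMPLE pass would need to prove in order to collapse the PHI.

```c
#include <assert.h>

/* Hand transcription of the .optimized dump, one C variable per SSA name. */
int k_ssa(int l_2)
{
	int d = l_2 & 2;		/* D.1979_3 = l_2(D) & 2 */
	int l_1;

	if (d != 0) {
		int l_4 = l_2 | 2;	/* <bb 3> */
		/* Bit 1 is known to be set on this path, so the OR is a no-op. */
		assert(l_4 == l_2);
		l_1 = l_4;		/* PHI argument coming from bb 3 */
	} else {
		l_1 = l_2;		/* PHI argument coming from bb 2 */
	}
	return l_1;			/* <bb 4> */
}
```

Since both PHI arguments are equal to l_2 on their respective paths, the whole function could be reduced to "return l_2" before RTL expansion, which is what the RTL if-conversion pass ends up achieving later.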