Bug 105835 - [13 Regression] Dead Code Elimination Regression at -O1 (trunk vs. 12.1.0)
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 13.0
Importance: P3 normal
Target Milestone: 13.0
Assignee: Roger Sayle
Reported: 2022-06-03 13:31 UTC by Theodoros Theodoridis
Modified: 2022-06-20 17:10 UTC
Last reconfirmed: 2022-06-03 00:00:00


Description Theodoros Theodoridis 2022-06-03 13:31:36 UTC
cat case.c
void foo();

static int b;

static short a(short c, unsigned short d) { return c - d; }

int main() {
    int e = -(0 < b);
    if (a(1, e))
        b = 0;
    else
        foo();
}

`gcc-1982fe2692b6c3b7f969ffc4edac59f9d4359e91 (trunk) -O1` cannot eliminate the call to `foo`, but `gcc-releases/gcc-12.1.0 -O1` can.

`gcc-1982fe2692b6c3b7f969ffc4edac59f9d4359e91 (trunk) -O1 -S -o /dev/stdout case.c`
--------- OUTPUT ---------
main:
.LFB1:
	.cfi_startproc
	cmpl	$0, b(%rip)
	movl	$65535, %eax
	movl	$0, %edx
	cmovle	%edx, %eax
	cmpw	$1, %ax
	je	.L2
	movl	$0, b(%rip)
	movl	$0, %eax
	ret
.L2:
	subq	$8, %rsp
	.cfi_def_cfa_offset 16
	movl	$0, %eax
	call	foo
	movl	$0, %eax
	addq	$8, %rsp
	.cfi_def_cfa_offset 8
	ret
---------- END OUTPUT ---------


`gcc-releases/gcc-12.1.0 -O1 -S -o /dev/stdout case.c`
--------- OUTPUT ---------
main:
.LFB1:
	.cfi_startproc
	movl	$0, b(%rip)
	movl	$0, %eax
	ret
---------- END OUTPUT ---------


Bisects to: https://gcc.gnu.org/git/?p=gcc.git;a=commit;h=8fb94fc6097c0a934aac0d89c9c5e2038da67655
Comment 1 Roger Sayle 2022-06-03 15:27:48 UTC
Hmm.  There might be something missing in CCP (with -O1)...

  # RANGE [0, 1] NONZERO 1
  _3 = (intD.6) _2;
  # RANGE [0, 65535] NONZERO 65535
  _10 = _3 * 65535;
  d_11 = (short unsigned intD.18) _10;
  if (d_11 != 1)
    goto <bb 3>; [67.00%]
  else
    goto <bb 4>; [33.00%]

So it knows _10 is either 0 or 65535. If it also knew that d_11 is
limited to {0, 65535} (i.e. {0, -1}, or [-1,0], when viewed as a signed
short), it would know d_11 != 1, and hence always goto <bb 3>.
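
The range argument above can be checked by hand with a small C sketch. It reuses `a` from the report; `takes_foo_branch` is a hypothetical helper added only for illustration, and the sketch assumes a 16-bit `short` and GCC's modulo behaviour for out-of-range conversions to `short`.

```c
#include <assert.h>

/* Same helper as in the report: the subtraction is done in int and the
   result is truncated to short on return. */
static short a(short c, unsigned short d) { return c - d; }

/* Hypothetical helper (not part of the report): returns 1 if the else
   branch, i.e. the call to foo, would be taken for this value of b. */
int takes_foo_branch(int b)
{
    int e = -(0 < b);                       /* e is 0 or -1 */
    unsigned short d = (unsigned short)e;   /* d is 0 or 65535 */

    /* This is the fact the optimizer needs: d is never 1. */
    assert(d == 0 || d == 65535);

    /* a(1, d) is zero only when d == 1, so this is always 0. */
    return a(1, d) == 0;
}
```

Under those assumptions `takes_foo_branch` returns 0 for every `b`, which is why the call to `foo` is dead code.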
Comment 2 Roger Sayle 2022-06-05 19:05:12 UTC
Patch proposed
https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596200.html
Comment 3 GCC Commits 2022-06-18 08:10:26 UTC
The master branch has been updated by Roger Sayle <sayle@gcc.gnu.org>:

https://gcc.gnu.org/g:9991d84d2a84355fd3fc9afc89a963f45991bfa9

commit r13-1162-g9991d84d2a84355fd3fc9afc89a963f45991bfa9
Author: Roger Sayle <roger@nextmovesoftware.com>
Date:   Sat Jun 18 09:06:20 2022 +0100

    PR tree-optimization/105835: Two narrowing patterns for match.pd.
    
    This patch resolves PR tree-optimization/105835, which is a code quality
    (dead code elimination) regression at -O1 triggered/exposed by a recent
    change to canonicalize X&-Y as X*Y.  The new (shorter) form exposes some
    missed optimization opportunities that can be handled by adding some
    extra simplifications to match.pd.
    
    One transformation is to simplify "(short)(x ? 65535 : 0)" into the
    equivalent "x ? -1 : 0", or more accurately "x ? (short)-1 : (short)0",
    as INTEGER_CSTs record their type, and integer conversions can be
    pushed inside COND_EXPRs reducing the number of gimple statements.
    
    The other transformation is to simplify (short)(X * 65535), where X is
    [0,1], into the equivalent (short)X * -1 (or again (short)-1, as tree's
    INTEGER_CSTs encode their type).  This is valid because multiplications
    where one operand is [0,1] are guaranteed not to overflow, and hence
    integer conversions can also be pushed inside these multiplications.
    
    These narrowing conversion optimizations can be identified by range
    analyses, such as EVRP, but these are only performed at -O2 and above,
    which is why this regression is only visible with -O1.
    
    2022-06-18  Roger Sayle  <roger@nextmovesoftware.com>
    
    gcc/ChangeLog
            PR tree-optimization/105835
            * match.pd (convert (mult zero_one_valued_p@1 INTEGER_CST@2)):
            Narrow integer multiplication by a zero_one_valued_p operand.
            (convert (cond @1 INTEGER_CST@2 INTEGER_CST@3)): Push integer
            conversions inside COND_EXPR where both data operands are
            integer constants.
    
    gcc/testsuite/ChangeLog
            PR tree-optimization/105835
            * gcc.dg/pr105835.c: New test case.
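
The identities behind the commit message can be sanity-checked at the C level. The following is only a sketch, not the match.pd patterns or the new test case: the three function names are hypothetical, and it assumes a 16-bit `short`, two's complement `int`, and GCC's modulo behaviour when converting out-of-range values to `short`.

```c
#include <assert.h>

/* Pattern 1: (short)(x ? 65535 : 0) versus x ? (short)-1 : (short)0. */
int cond_narrowing_agrees(int x)
{
    short wide = (short)(x ? 65535 : 0);       /* convert after selecting */
    short narrow = x ? (short)-1 : (short)0;   /* convert each constant */
    return wide == narrow;
}

/* Pattern 2: (short)(X * 65535) versus (short)X * (short)-1,
   valid only when X is known to be 0 or 1. */
int mult_narrowing_agrees(int x01)
{
    assert(x01 == 0 || x01 == 1);
    short wide = (short)(x01 * 65535);
    short narrow = (short)((short)x01 * (short)-1);
    return wide == narrow;
}

/* The canonicalization that exposed the regression: X & -Y equals
   X * Y when Y is 0 or 1, because -Y is then all zeros or all ones. */
int and_neg_equals_mult(int x, int y01)
{
    assert(y01 == 0 || y01 == 1);
    return (x & -y01) == (x * y01);
}
```

Each function returns 1 when the wide and narrow forms agree for the given input, which they do for all inputs that satisfy the stated preconditions.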
Comment 4 Roger Sayle 2022-06-20 17:10:45 UTC
This should now be fixed on mainline.