Bug 42893 - [4.3/4.4/4.5/4.6 Regression] Missed conditionally dead store elimination
Summary: [4.3/4.4/4.5/4.6 Regression] Missed conditionally dead store elimination
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.5.0
: P2 normal
Target Milestone: 4.6.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
: 42899 (view as bug list)
Depends on:
Blocks: 16996
  Show dependency treegraph
 
Reported: 2010-01-28 07:17 UTC by Steven Bosscher
Modified: 2011-02-08 18:55 UTC (History)
5 users (show)

See Also:
Host:
Target: i?86-*-*
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-01-28 11:00:23


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Steven Bosscher 2010-01-28 07:17:28 UTC
Taken from http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/8A/8AB0B238.shtml:

struct frame_info;
void tui_registers_changed_hook (void);
extern struct frame_info *deprecated_selected_frame;
int tui_refreshing_registers = 0;
void
tui_registers_changed_hook (void)
{
  struct frame_info *fi;

  fi = deprecated_selected_frame;
  if (fi)
    {
      if (tui_refreshing_registers == 0)
	{
	  tui_refreshing_registers = 1;
	  tui_refreshing_registers = 0;
	}
    }
  return;
}



Assembler output from GCC 3.4.6 at -O3 on x86_64:

.globl tui_registers_changed_hook
	.type	tui_registers_changed_hook, @function
tui_registers_changed_hook:
.LFB2:
	rep ; ret
.LFE2:
	.size	tui_registers_changed_hook, .-tui_registers_changed_hook



Assembler output from GCC 4.5.0 (r155776) at -O3 on x86_64:

.globl tui_registers_changed_hook
	.type	tui_registers_changed_hook, @function
tui_registers_changed_hook:
.LFB0:
	.cfi_startproc
	cmpq	$0, deprecated_selected_frame(%rip)
	je	.L1
	movl	tui_refreshing_registers(%rip), %eax
	testl	%eax, %eax
	jne	.L1
	movl	$0, tui_refreshing_registers(%rip)
.L1:
	rep
	ret
	.cfi_endproc
.LFE0:
	.size	tui_registers_changed_hook, .-tui_registers_changed_hook


This also confuses the tree optimizers (e.g. GCC now fails to note that the function is pure+const). The .optimized dump:

;; Function tui_registers_changed_hook (tui_registers_changed_hook)

tui_registers_changed_hook ()
{
  struct frame_info * fi;
  int tui_refreshing_registers.0;

<bb 2>:
  fi_1 = deprecated_selected_frame;
  if (fi_1 != 0B)
    goto <bb 3>;
  else
    goto <bb 5>;

<bb 3>:
  tui_refreshing_registers.0_2 = tui_refreshing_registers;
  if (tui_refreshing_registers.0_2 == 0)
    goto <bb 4>;
  else
    goto <bb 5>;

<bb 4>:
  tui_refreshing_registers = 0;

<bb 5>:
  return;

}
Comment 1 rguenther@suse.de 2010-01-28 09:46:37 UTC
Subject: Re:   New: [4.3/4.4/4.5 Regression]
 Missed conditionally dead store elimination

On Thu, 28 Jan 2010, steven at gcc dot gnu dot org wrote:

> Taken from
> http://embed.cs.utah.edu/embarrassing/jan_10/harvest/source/8A/8AB0B238.shtml:
> 
> struct frame_info;
> void tui_registers_changed_hook (void);
> extern struct frame_info *deprecated_selected_frame;
> int tui_refreshing_registers = 0;
> void
> tui_registers_changed_hook (void)
> {
>   struct frame_info *fi;
> 
>   fi = deprecated_selected_frame;
>   if (fi)
>     {
>       if (tui_refreshing_registers == 0)
>         {
>           tui_refreshing_registers = 1;
>           tui_refreshing_registers = 0;
>         }
>     }
>   return;
> }

We can only see that tui_refreshing_registers = 1 is dead in DSE.
DOM in theory could see the equivalency but I think it does not
track loads/stores properly enough.  This needs predicated
value-numbering I think.

No idea why we caught this on RTL in 3.x but not now - that's for
you to answer ;)  I suppose RTL jump threading might have caught it?
Comment 2 Steven Bosscher 2010-01-28 11:00:22 UTC
The RTL .ce1 pass catches it in GCC 3.4.6. The codes goes from this:

;; Start of basic block 1, registers live: (nil)
(note 36 13 17 1 [bb 1] NOTE_INSN_BASIC_BLOCK)

(insn 17 36 18 1 (set (reg:SI 60 [ tui_refreshing_registers ])
        (mem/f:SI (symbol_ref:DI ("tui_refreshing_registers") [flags 0x2] <var_decl 0x63f012aee680 tui_refreshing_registers>) [2 tui_refreshing_registers+0 S4 A32])) 43 {*movsi_
    (nil))

(insn 18 17 19 1 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 60 [ tui_refreshing_registers ])
            (const_int 0 [0x0]))) 3 {*cmpsi_ccno_1} (nil)
    (nil))

(jump_insn 19 18 37 1 (set (pc)
        (if_then_else (ne (reg:CCZ 17 flags)
                (const_int 0 [0x0]))
            (label_ref 34)
            (pc))) 494 {*jcc_1} (nil)
    (expr_list:REG_BR_PROB (const_int 5000 [0x1388])
        (nil)))
;; End of basic block 1, registers live:
 (nil)

;; Start of basic block 2, registers live: (nil)
(note 37 19 25 2 [bb 2] NOTE_INSN_BASIC_BLOCK)

(insn 25 37 32 2 (set (mem/f:SI (symbol_ref:DI ("tui_refreshing_registers") [flags 0x2] <var_decl 0x63f012aee680 tui_refreshing_registers>) [2 tui_refreshing_registers+0 S4 A32]
        (const_int 0 [0x0])) 43 {*movsi_1_nointernunit} (nil)
    (nil))
;; End of basic block 2, registers live:
 (nil)



...to this...:

note 36 13 17 1 [bb 1] NOTE_INSN_BASIC_BLOCK)

(insn 17 36 18 1 (set (reg:SI 60 [ tui_refreshing_registers ])
        (mem/f:SI (symbol_ref:DI ("tui_refreshing_registers") [flags 0x2] <var_decl 0x63f012aee680 tui_refreshing_registers>) [2 tui_refreshing_registers+0 S4 A32])) 43 {*movsi_
    (nil))

(insn 18 17 42 1 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 60 [ tui_refreshing_registers ])
            (const_int 0 [0x0]))) 3 {*cmpsi_ccno_1} (nil)
    (nil))

(insn 42 18 44 1 (set (reg:SI 63 [ tui_refreshing_registers ])
        (mem/f:SI (symbol_ref:DI ("tui_refreshing_registers") [flags 0x2] <var_decl 0x63f012aee680 tui_refreshing_registers>) [2 tui_refreshing_registers+0 S4 A32])) -1 (nil)
    (nil))

(insn 44 42 43 1 (set (reg:SI 64)
        (const_int 0 [0x0])) -1 (nil)
    (nil))

(insn 43 44 45 1 (set (reg:CCZ 17 flags)
        (compare:CCZ (reg:SI 60 [ tui_refreshing_registers ])
            (const_int 0 [0x0]))) -1 (nil)
    (nil))

(insn 45 43 46 1 (set (reg:SI 62)
        (if_then_else:SI (ne (reg:CCZ 17 flags)
                (const_int 0 [0x0]))
            (reg:SI 63 [ tui_refreshing_registers ])
            (reg:SI 64))) -1 (nil)
    (nil))

(insn 46 45 32 1 (set (mem/f:SI (symbol_ref:DI ("tui_refreshing_registers") [flags 0x2] <var_decl 0x63f012aee680 tui_refreshing_registers>) [2 tui_refreshing_registers+0 S4 A32]
        (reg:SI 62)) -1 (nil)
    (nil))
;; End of basic block 1, registers live:
 (nil)



This does not happen in GCC 4.5.0 (r156286) anymore because the constant 0 is stored directly to a MEM, instead of through a register:

(note 7 6 8 3 [bb 3] NOTE_INSN_BASIC_BLOCK)

(insn 8 7 9 3 t.c:13 (set (reg:CCZ 17 flags)
        (compare:CCZ (mem/c/i:SI (symbol_ref:DI ("tui_refreshing_registers") [flags 0x2]  <var_decl 0x20000000006080a0 tui_refreshing_registers>) [2 tui_refreshing_registers+0 S4 A32])
            (const_int 0 [0x0]))) 2 {*cmpsi_ccno_1} (nil))

(jump_insn 9 8 10 3 t.c:13 (set (pc)
        (if_then_else (ne (reg:CCZ 17 flags)
                (const_int 0 [0x0]))
            (label_ref:DI 14)
            (pc))) 615 {*jcc_1} (expr_list:REG_DEAD (reg:CCZ 17 flags)
        (expr_list:REG_BR_PROB (const_int 3900 [0xf3c])
            (nil)))
 -> 14)

(note 10 9 11 4 [bb 4] NOTE_INSN_BASIC_BLOCK)

(insn 11 10 14 4 t.c:16 (set (mem/c/i:SI (symbol_ref:DI ("tui_refreshing_registers") [flags 0x2]  <var_decl 0x20000000006080a0 tui_refreshing_registers>) [2 tui_refreshing_registers+0 S4 A32])
        (const_int 0 [0x0])) 47 {*movsi_1} (nil))

(code_label 14 11 15 5 1 "" [2 uses])

Comment 3 Steven Bosscher 2010-01-29 19:58:58 UTC
*** Bug 42899 has been marked as a duplicate of this bug. ***
Comment 4 Andrew Pinski 2010-01-29 20:03:02 UTC
Note, I think this works on targets where movs cannot have a mem with a constant like most RISC targets.
Comment 5 Andrew Pinski 2010-01-29 20:07:47 UTC
Yep:
tui_registers_changed_hook:
        blr

Which means this is the same as PR 23488 which was caused by:
2005-07-30  Jan Hubicka  <jh@suse.cz>

        * expr.c (expand_expr_real_1): Do not load mem targets into register.
        * i386.c (ix86_fixup_binary_operands): Likewise.
        (ix86_expand_unary_operator): Likewise.
        (ix86_expand_fp_absneg_operator): Likewise.
        * optabs.c (expand_vec_cond_expr): Validate dest.
Comment 6 Steven Bosscher 2010-01-29 20:17:27 UTC
I think the issue here is more that we should look for a way to optimize this early on. I'm guessing it's one of the ce[123] passes that cleans this up for you on your RISCy machine? IMHO it would be better even in your case to nuke dumb code like this in GIMPLE (enabling other optimizations, etc.).
Comment 7 Andrew Pinski 2010-01-29 20:19:14 UTC
(In reply to comment #6)
> I think the issue here is more that we should look for a way to optimize this
> early on. I'm guessing it's one of the ce[123] passes that cleans this up for
> you on your RISCy machine? IMHO it would be better even in your case to nuke
> dumb code like this in GIMPLE (enabling other optimizations, etc.).

Note I tested on powerpc-linux-gnu but it should work on most other targets; just x86 is the crazy popular target which supports constants in memory stores :).
Comment 8 Richard Biener 2010-01-29 22:37:12 UTC
(In reply to comment #6)
> I think the issue here is more that we should look for a way to optimize this
> early on. I'm guessing it's one of the ce[123] passes that cleans this up for
> you on your RISCy machine? IMHO it would be better even in your case to nuke
> dumb code like this in GIMPLE (enabling other optimizations, etc.).

Predicated value-numbering with redundant store elimination should fix this.
Matz is working on that for 4.6.
Comment 9 Richard Biener 2010-05-22 18:13:50 UTC
GCC 4.3.5 is being released, adjusting target milestone.
Comment 10 Jeffrey A. Law 2011-01-27 02:27:36 UTC
DSE1 removes tui_refreshing_registers = 1 as it's a dead store.

DOM2 then removes tui_refreshing_registers = 0 because it's able to prove the memory location already holds the value 0.

That ultimately leaves an empty function as we leave the tree-ssa optimizers, which is precisely what we want.
Comment 11 Steven Bosscher 2011-02-06 12:24:10 UTC
The test case should be added to the test suite.
Comment 12 Jeffrey A. Law 2011-02-08 18:54:21 UTC
Author: law
Date: Tue Feb  8 18:54:12 2011
New Revision: 169933

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=169933
Log:

	PR tree-optimization/42893
	* gcc.tree-ssa/pr42893.c: New test.


Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/pr42893.c
Modified:
    trunk/gcc/testsuite/ChangeLog
Comment 13 Jeffrey A. Law 2011-02-08 18:55:44 UTC
Testcase added.