Filed under unknown since there is no version tag for the tree-cleanup-branch. The problem probably also exists on mainline, but I have not tried it there. Consider this code, ----------------------------------------------------- extern void foo (void); extern int i; void bar (void) { switch (i) { case 0: foo (); break; case 1: break; } switch (i) { case 0: foo (); break; case 1: break; } } ----------------------------------------------------- This gives the following .dse2 dump: ;; Function bar (bar) bar () { int i.0; <bb 0>: # VUSE <i_2>; i.0_3 = i; switch (i.0_3) { case 0: goto <L0>; default : goto <L1>; } <L0>:; # i_6 = V_MAY_DEF <i_2>; foo (); # i_1 = PHI <i_2(0), i_6(1)>; <L1>:; # VUSE <i_1>; i.0_4 = i; switch (i.0_4) { case 0: goto <L4>; default : goto <L5>; } <L4>:; # i_5 = V_MAY_DEF <i_1>; foo (); <L5>:; return; } If the first switch goes through the default case, the switch condition "i" is not call clobbered by the call to "foo ()" so in the second switch "i" is unchanged and we can thread the jump. We don't catch this jump threading opportunity on trees, but we *do* catch it on RTL. This is *the* major source of missed jump threads on the tree-cleanup-branch (we catch rougly a thousand of these on the cc1-i file set). If we teach the tree optimizers to catch this case, we can almost certainly kill the RTL thread_jump junk on the tree-cleanup-branch.
Confirmed. To summarize what the code should look like: extern void foo (void); extern int i; void bar (void) { switch (i) { case 0: foo (); break; default: goto other_block; } switch (i) { case 0: foo (); break; default: other_block: break; } }
Might I propose we don't deal with this as an enhancement request but as a "normal" bug? Killing the jump threader in cfgcleanup.c would be a mighty feat, it's one of the slowest parts of the cfgcleanup on RTL (and IIRC it's one of the quadratic bottleneckt). Otherwise, perhaps I should (have) add(ed) the compile time hog keyword...
Diego told me to bug Law. Obedient as I always am, I hereby do so :-) Jeff, this is a missed jump threading opportunity, the default case can be threaded here. Any ideas how to fix this?
Subject: Re: Missed jump threading optimization On Mon, 2004-10-18 at 11:30, steven at gcc dot gnu dot org wrote: > ------- Additional Comments From steven at gcc dot gnu dot org 2004-10-18 17:30 ------- > Diego told me to bug Law. Obedient as I always am, I hereby > do so :-) > > Jeff, this is a missed jump threading opportunity, the default > case can be threaded here. Any ideas how to fix this? I don't see a good way to fix this. There's lots of interconnected issues that would need to be addressed. Clearly we need better range information so that we can determine that i != 0 on the default case leaving the first switch statement (or we would need to avoid collapsing empty cases to the default label until expansion). Second, we need to rewrite the jump threading selection code; that's on the queue. Specifically we need to drop the requirement that the statements at the start of the intermediate block are nops. The SSA update code is already prepared to handle this change, so it really ought to be isolated in the jump threading selection code. Given those two improvements we'd have a chance of threading the default case out of the first switch statement. jeff
Subject: Re: Missed jump threading optimization Hmm, threading the default case sounds interesting, but the real reason why the RTL threader catches this and the tree threader does not is because on RTL the test case basically looks like this: extern void foo (void); extern int i; void bar (void) { if (i == 0) foo (); if (i == 0) foo (); } Hey, I can thread that! :-) So perhaps we should consider lowering SWITCH_EXPRs with only two targets to COND_EXPRs after all...? That would be quite easy to do.
Subject: Re: Missed jump threading optimization On Mon, 2004-10-18 at 16:50, stevenb at suse dot de wrote: > ------- Additional Comments From stevenb at suse dot de 2004-10-18 22:50 ------- > Subject: Re: Missed jump threading optimization > > Hmm, threading the default case sounds interesting, but the real > reason why the RTL threader catches this and the tree threader does > not is because on RTL the test case basically looks like this: > > extern void foo (void); > extern int i; > void > bar (void) > { > if (i == 0) > foo (); > > if (i == 0) > foo (); > } > > Hey, I can thread that! :-) > > So perhaps we should consider lowering SWITCH_EXPRs with only two > targets to COND_EXPRs after all...? That would be quite easy to > do. Jan and maybe others have talked about lowering SWITCH_EXPRs earlier. I don't recall if it ever got implemented. jeff
I'm going to implement lowering of some SWITCH_EXPRs at the tree level. At least the ones that we do not produce a decision tree for now in stmt.c...
I don't have time to work on these (new job), so unassigning.
We've got zero chance of threading the jump in this case until the partially redundant load from "i" is removed. Daniel -- there's a pretty obvious redundant load from the global variable "i" in this testcase. I haven't investigated why PRE is missing this obvious redundancy. Jeff
(In reply to comment #9) > Daniel -- there's a pretty obvious redundant load from the global > variable "i" in this testcase. I haven't investigated why PRE > is missing this obvious redundancy. Because tree level load PRE does not handle global variables, there is another bug about this, PR 23455.
Subject: Re: Missed jump threading optimization On Tue, 2006-03-21 at 15:57 +0000, law at redhat dot com wrote: > > ------- Comment #9 from law at redhat dot com 2006-03-21 15:57 ------- > We've got zero chance of threading the jump in this case until the > partially redundant load from "i" is removed. > > Daniel -- there's a pretty obvious redundant load from the global > variable "i" in this testcase. I haven't investigated why PRE > is missing this obvious redundancy. It doesn't deal with loads from global variables because we need to place a value number on each "instance" that occurs in the program, but can't easily because they are all shared. I will get to it eventually. > > Jeff > >
tree PRE now *does* handle the partially redundant global variable load. This is the .final_cleanup dump: ;; Function bar (bar) bar () { int prephitmp.13; <bb 2>: prephitmp.13 = i; switch (prephitmp.13) <default: <L1>, case 0: <L0>> <L0>: foo (); prephitmp.13 = i; <L1>: switch (prephitmp.13) <default: <L5>, case 0: <L4>> <L4>: foo (); [tail call] <L5>: return; } But we still miss the jump threading opportunity.
Subject: Re: Missed jump threading optimization steven at gcc dot gnu dot org wrote: > ------- Comment #12 from steven at gcc dot gnu dot org 2008-09-21 13:58 ------- > tree PRE now *does* handle the partially redundant global variable load. This > is the .final_cleanup dump: > > > ;; Function bar (bar) > > bar () > { > int prephitmp.13; > > <bb 2>: > prephitmp.13 = i; > switch (prephitmp.13) <default: <L1>, case 0: <L0>> > > <L0>: > foo (); > prephitmp.13 = i; > > <L1>: > switch (prephitmp.13) <default: <L5>, case 0: <L4>> > > <L4>: > foo (); [tail call] > > <L5>: > return; > > } > > > But we still miss the jump threading opportunity. > Thanks for the update. One blocking issue out of the way.... Things have changed a lot since that original bug report. I believe the best solution for this particular case is to lower the switch statements early enough to expose the conditionals to DOM & VRP. The 2nd best approach would be to extend VRP to create a range for the default case of a SWITCH and extend the jump threading code in tree-vrp.c to handle switch statements (they're currently ignored). The biggest difficulty here would be to avoid dropping to varying too quickly. I think you'd want to sort the cases, then build up a range containing all the cases. If you get a gap in the range, you drop to varying. If after extracting all the ranges you haven't dropped to varying, then you invert the range and create the appropriate ASSERT_EXPRs. I'm not currently working on either solution. jeff
Still not fixed. Still the major source of RTL jump threads.
Still not fixed with r162134: ;; Function bar (bar) bar () { int prephitmp.4; <bb 2>: prephitmp.4_1 = i; switch (prephitmp.4_1) <default: <L2>, case 0: <L0>> <L0>: foo (); prephitmp.4_7 = i; # prephitmp.4_8 = PHI <prephitmp.4_1(2), prephitmp.4_7(3)> <L2>: switch (prephitmp.4_8) <default: <L6>, case 0: <L4>> <L4>: foo (); [tail call] <L6>: return; }
If VRP inserts into the default case the assert that i.1 cannot be what is other cases are, then I think jump threading could handle this.
Will be addressed for GCC 4.9 by moving switch lowering to GIMPLE.
(In reply to Andrew Pinski from comment #16) > If VRP inserts into the default case the assert that i.1 cannot be what is > other cases are, then I think jump threading could handle this. I'm working on a patch that does this.
Author: ppalka Date: Tue Jul 26 15:19:58 2016 New Revision: 238761 URL: https://gcc.gnu.org/viewcvs?rev=238761&root=gcc&view=rev Log: Teach VRP to register assertions along default switch labels (PR18046) gcc/ChangeLog: PR tree-optimization/18046 * genmodes.c (emit_mode_size_inline): Emit an assert that verifies that mode is a valid array index. (emit_mode_nuinits_inline): Likewise. (emit_mode_inner_inline): Likewise. (emit_mode_unit_size_inline): Likewise. (emit_mode_unit_precision_inline): Likewise. * tree-vrp.c: Include params.h. (find_switch_asserts): Register edge assertions for the default label which correspond to the anti-ranges of each case label. * params.def (PARAM_MAX_VRP_SWITCH_ASSERTIONS): New. * doc/invoke.texi: Document it. gcc/testsuite/ChangeLog: PR tree-optimization/18046 * gcc.dg/tree-ssa/ssa-dom-thread-6.c: Bump FSM count to 5. * gcc.dg/tree-ssa/vrp103.c: New test. * gcc.dg/tree-ssa/vrp104.c: New test. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/vrp103.c trunk/gcc/testsuite/gcc.dg/tree-ssa/vrp104.c Modified: trunk/gcc/ChangeLog trunk/gcc/doc/invoke.texi trunk/gcc/genmodes.c trunk/gcc/params.def trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-dom-thread-6.c trunk/gcc/tree-vrp.c
Author: ppalka Date: Fri Aug 5 23:29:53 2016 New Revision: 239181 URL: https://gcc.gnu.org/viewcvs?rev=239181&root=gcc&view=rev Log: Improve forward jump threading of switch statements (PR18046) gcc/ChangeLog: PR tree-optimization/18046 * tree-ssa-threadedge.c: Include cfganal.h. (simplify_control_statement_condition): If simplifying a GIMPLE_SWITCH, replace the index operand of the GIMPLE_SWITCH with the dominating ASSERT_EXPR before handing it off to VRP. Mention that a CASE_LABEL_EXPR may be returned. (thread_around_empty_blocks): Adjust to handle simplify_control_statement_condition() returning a CASE_LABEL_EXPR. (thread_through_normal_block): Likewise. * tree-vrp.c (simplify_stmt_for_jump_threading): Simplify a switch statement by trying to determine which case label will be taken. gcc/testsuite/ChangeLog: PR tree-optimization/18046 * gcc.dg/tree-ssa/vrp105.c: New test. * gcc.dg/tree-ssa/vrp106.c: New test. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/vrp105.c trunk/gcc/testsuite/gcc.dg/tree-ssa/vrp106.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-threadedge.c trunk/gcc/tree-vrp.c
I guess this can be considered fixed now.