cc1plus -v GNU C++ version 3.5-tree-ssa 20040123 (merged 20040102) (i686-pc-linux-gnu) compiled by GNU C version 3.5-tree-ssa 20040123 (merged 20040102). generate-3.4.ii is the famous code from PR8361 cc1plus -quiet -fpreprocessed -O2 generate-3.4.ii -fdump-tree-optimized Looking at the function HeuristicCounters::~HeuristicCounters() in generate-3.4.ii.t35.optimized, it seems that this function has not been optimized, there are obvious jump threading opportunities. It is strange that if -fno-inline is added to the command line, then HeuristicCounters::~HeuristicCounters() is optimized!
Confirmed. I have seen some weird things but this takes the cake.
Here is the simpler testcase, self-contained example: struct COST { unsigned *cost; ~COST(); }; COST::~COST() { if (cost) delete [] cost; } In fact it has nothing to do with inlining at all.
Note it is almost impossiable to make a C testcase.
The fundamental problem is the two conditionals have different types for their arguments. If I twiddle the C++ front-end to generate code with the proper types, then we properly eliminate the second conditional (at least in the small simple testcase provided). I haven't looked, but I strongly suspect the fact that the C++ FE mucking up the types is likely the cause of of missing many optimization opportunities.
I've checked in a patch to force the C++ front-end to use proper types which allows us to optimize the simple test. I would appreciate it if y'all could look at generate-3.4.ii in more detail and see if there are other opportunities we're missing. I'm going to leave this bug open until we're sure the optimizer is doing a better job at the larger testcase.
The patch is here: http://gcc.gnu.org/ml/gcc-patches/2004-03/msg00054.html (looks like Jeff forgot to mention the PR number in the ChangeLog)
As per Jeff's request, a brief look at generate-3.4.ii.t54.vars function: void MODEL_GENERATOR::assumePTLiteral(bool&, bool, const GLITERAL&, GINTERPRET&) [snip] T.8294 = (bool)(int)ptl->neg; if (T.8294 == 0) goto <L18>; else goto <L17>; <L17>:; if (reallyPT) goto <L21>; else goto <L18>; <L18>:; T.8297 = !T.8294; if (T.8297 == 0) goto <L33>; else goto <L20>; <L20>:; if (reallyPT == 0) goto <L21>; else goto <L33>; <L21>:; if (TraceLevel > 1) goto <L22>; else goto <L31>; see some jump threading opportunities missed, the T.8297 variable could be eliminated. Other observations: -- casts like "if ((bool)(int)(bool) ....)" -- lots of code will be eliminated after PR13954 and PR13761 are fixed
Subject: Re: [tree-ssa] missed jump thread optimization on the tree-level In message <20040303181405.32100.qmail@sources.redhat.com>, "dann at godzilla d ot ics dot uci dot edu" writes: > >------- Additional Comments From dann at godzilla dot ics dot uci dot edu 20 >04-03-03 18:14 ------- >As per Jeff's request, a brief look at generate-3.4.ii.t54.vars function: > >void MODEL_GENERATOR::assumePTLiteral(bool&, bool, const GLITERAL&, GINTERPRE >T&) >[snip] > > T.8294 = (bool)(int)ptl->neg; > if (T.8294 == 0) goto <L18>; else goto <L17>; > ><L17>:; > if (reallyPT) goto <L21>; else goto <L18>; > ><L18>:; > T.8297 = !T.8294; > if (T.8297 == 0) goto <L33>; else goto <L20>; > ><L20>:; > if (reallyPT == 0) goto <L21>; else goto <L33>; > ><L21>:; > if (TraceLevel > 1) goto <L22>; else goto <L31>; > >see some jump threading opportunities missed, the T.8297 variable >could be eliminated. Can you send me the .cddce dump. If T.8297_XXXX is used more than once, then that's going to block most/all jump threading opportunities in this code fragment. > >Other observations: > -- casts like "if ((bool)(int)(bool) ....)" From a .vars dump? I'm not terribly surprised. Not particularly important. It's actually easier to look at the cddce dumps since you've still got SSA version information which makes it much clearer where/how values are being used. jeff
Subject: Re: [tree-ssa] missed jump thread optimization on the tree-level In message <20040303181405.32100.qmail@sources.redhat.com>, "dann at godzilla d ot ics dot uci dot edu" writes: > >------- Additional Comments From dann at godzilla dot ics dot uci dot edu 20 >04-03-03 18:14 ------- >As per Jeff's request, a brief look at generate-3.4.ii.t54.vars function: > >void MODEL_GENERATOR::assumePTLiteral(bool&, bool, const GLITERAL&, GINTERPRE >T&) >[snip] > > T.8294 = (bool)(int)ptl->neg; > if (T.8294 == 0) goto <L18>; else goto <L17>; > ><L17>:; > if (reallyPT) goto <L21>; else goto <L18>; > ><L18>:; > T.8297 = !T.8294; > if (T.8297 == 0) goto <L33>; else goto <L20>; > ><L20>:; > if (reallyPT == 0) goto <L21>; else goto <L33>; > ><L21>:; > if (TraceLevel > 1) goto <L22>; else goto <L31>; > >see some jump threading opportunities missed, the T.8297 variable >could be eliminated. > >Other observations: > -- casts like "if ((bool)(int)(bool) ....)" > -- lots of code will be eliminated after PR13954 and PR13761 are fixed OK. It turned out to be easier to go ahead and propagate booleans more aggressively. Given: x = !y; if (x == 0) ... if (x != 0) ... if (x == 0) Even though x is used more than, it is advantageous to go ahead and forward propagate Y into the conditionals since we won't introduce any new expression evaluations. ie the above turns into if (y != 0) ... if (y == 0) ... if (y != 0) Additionally code which was supposed to record an SSA_NAME = <const> equivalence due to following a COND_EXPR was instead recording a less precise equivalence for the expression in the COND_EXPR due to a minor buglet. This is preventing some constant propagations from occurring which are critical to fully optimizing the code in question. The tree-ssa-forwprop.c looks larger than it really is... There's some reformatting going on as a hunk of code was moved outside a loop. Bootstrapped and regression tested i686-pc-linux-gnu. * tree-ssa-dom.c: (get_eq_expr_value): Fix typo when comparing a boolean against a constant. * tree-ssa-forwprop.c (record_single_argument_cond_exprs): Do not record the same SSA_NAME more than once. Only record the SSA_NAME tested, not the COND_EXPR. (substitute_single_use_vars): Substitute booleans which are set from a TRUTH_NOT_EXPR even if they have more than one use site. Index: tree-ssa-dom.c =================================================================== RCS file: /cvs/gcc/gcc/gcc/Attic/tree-ssa-dom.c,v retrieving revision 1.1.2.145 diff -c -p -r1.1.2.145 tree-ssa-dom.c *** tree-ssa-dom.c 2 Mar 2004 18:41:49 -0000 1.1.2.145 --- tree-ssa-dom.c 8 Mar 2004 01:55:40 -0000 *************** get_eq_expr_value (tree if_stmt, *** 2889,2895 **** tree op1 = TREE_OPERAND (cond, 1); /* Special case comparing booleans against a constant as we know ! the value of op0 on both arms of the branch. */ if ((TREE_CODE (cond) == EQ_EXPR || TREE_CODE (cond) == NE_EXPR) && TREE_CODE (op0) == SSA_NAME && TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE --- 2889,2896 ---- tree op1 = TREE_OPERAND (cond, 1); /* Special case comparing booleans against a constant as we know ! the value of OP0 on both arms of the branch. ie, we can record ! an equivalence for OP0 rather than COND. */ if ((TREE_CODE (cond) == EQ_EXPR || TREE_CODE (cond) == NE_EXPR) && TREE_CODE (op0) == SSA_NAME && TREE_CODE (TREE_TYPE (op0)) == BOOLEAN_TYPE *************** get_eq_expr_value (tree if_stmt, *** 2907,2913 **** else retval.src = boolean_false_node; } ! retval.dst = cond; return retval; } --- 2908,2914 ---- else retval.src = boolean_false_node; } ! retval.dst = op0; return retval; } Index: tree-ssa-forwprop.c =================================================================== RCS file: /cvs/gcc/gcc/gcc/Attic/tree-ssa-forwprop.c,v retrieving revision 1.1.2.4 diff -c -p -r1.1.2.4 tree-ssa-forwprop.c *** tree-ssa-forwprop.c 5 Mar 2004 20:54:47 -0000 1.1.2.4 --- tree-ssa-forwprop.c 8 Mar 2004 02:05:38 -0000 *************** record_single_argument_cond_exprs (void) *** 108,114 **** vars = BITMAP_XMALLOC (); ! VARRAY_TREE_INIT (retval, 10, "forward propagation objects"); /* The first pass over the blocks gathers the set of variables we need immediate uses for as well as the set of interesting COND_EXPRs. --- 108,114 ---- vars = BITMAP_XMALLOC (); ! VARRAY_TREE_INIT (retval, 10, "forward propagation vars"); /* The first pass over the blocks gathers the set of variables we need immediate uses for as well as the set of interesting COND_EXPRs. *************** record_single_argument_cond_exprs (void) *** 146,151 **** --- 146,156 ---- else test_var = TREE_OPERAND (cond, 0); + /* If we have already recorded this SSA_NAME as interesting, + do not do so again. */ + if (bitmap_bit_p (vars, SSA_NAME_VERSION (test_var))) + continue; + /* Now get the defining statement for TEST_VAR and see if it something we are interested in. */ def = SSA_NAME_DEF_STMT (test_var); *************** record_single_argument_cond_exprs (void) *** 211,219 **** else continue; ! /* All the tests passed, record COND_EXPR and TEST_VAR ! as interesting. */ ! VARRAY_PUSH_TREE (retval, last); VARRAY_PUSH_TREE (retval, test_var); bitmap_set_bit (vars, SSA_NAME_VERSION (test_var)); } --- 216,222 ---- else continue; ! /* All the tests passed, record TEST_VAR as interesting. */ VARRAY_PUSH_TREE (retval, test_var); bitmap_set_bit (vars, SSA_NAME_VERSION (test_var)); } *************** record_single_argument_cond_exprs (void) *** 223,230 **** return retval; } ! /* Given FORWPROP_DATA containing pairs of potentially optimizable COND_EXPRs ! and the SSA_NAME used in the COND_EXPR, attempt to rewrite the condition in each COND_EXPR to use the RHS of the statement which defines the SSA_NAME used in the COND_EXPR. */ --- 226,233 ---- return retval; } ! /* Given FORWPROP_DATA containing SSA_NAMEs which are used in COND_EXPRs ! that we may be able to optimize, attempt to rewrite the condition in each COND_EXPR to use the RHS of the statement which defines the SSA_NAME used in the COND_EXPR. */ *************** substitute_single_use_vars (varray_type *** 233,261 **** { unsigned int i; ! for (i = 0; i < VARRAY_ACTIVE_SIZE (forwprop_data); i += 2) { ! tree test_var = VARRAY_TREE (forwprop_data, i + 1); tree def = SSA_NAME_DEF_STMT (test_var); dataflow_t df; ! int num_uses; /* Now compute the immediate uses of TEST_VAR. */ df = get_immediate_uses (def); num_uses = num_immediate_uses (df); ! /* If TEST_VAR is used precisely one time, then we may have ! an optimizable case. */ ! if (num_uses == 1) { ! tree cond_stmt = VARRAY_TREE (forwprop_data, i); ! tree cond = COND_EXPR_COND (cond_stmt); ! enum tree_code cond_code = TREE_CODE (cond); ! tree def_rhs = TREE_OPERAND (def, 1); ! enum tree_code def_rhs_code = TREE_CODE (def_rhs); ! block_stmt_iterator bsi; tree new_cond; /* If the definition of the single use variable was from an arithmetic operation, then we just need to adjust the constant in the COND_EXPR_COND and update the variable tested. */ --- 236,288 ---- { unsigned int i; ! for (i = 0; i < VARRAY_ACTIVE_SIZE (forwprop_data); i++) { ! tree test_var = VARRAY_TREE (forwprop_data, i); tree def = SSA_NAME_DEF_STMT (test_var); dataflow_t df; ! int j, num_uses, propagated_uses; ! block_stmt_iterator bsi; /* Now compute the immediate uses of TEST_VAR. */ df = get_immediate_uses (def); num_uses = num_immediate_uses (df); + propagated_uses = 0; ! /* If TEST_VAR is used more than once and is not a boolean set ! via TRUTH_NOT_EXPR with another SSA_NAME as its argument, then ! we can not optimize. */ ! if (num_uses == 1 ! || (TREE_CODE (TREE_TYPE (test_var)) == BOOLEAN_TYPE ! && TREE_CODE (TREE_OPERAND (def, 1)) == TRUTH_NOT_EXPR ! && (TREE_CODE (TREE_OPERAND (TREE_OPERAND (def, 1), 0)) ! == SSA_NAME))) ! ; ! else ! continue; ! ! /* Walk over each use and try to forward propagate the RHS of ! DEF into the use. */ ! for (j = 0; j < num_uses; j++) { ! tree cond_stmt; ! tree cond; ! enum tree_code cond_code; ! tree def_rhs; ! enum tree_code def_rhs_code; tree new_cond; + cond_stmt = immediate_use (df, j); + + /* For now we can only propagate into COND_EXPRs. */ + if (TREE_CODE (cond_stmt) != COND_EXPR) + continue; + + cond = COND_EXPR_COND (cond_stmt); + cond_code = TREE_CODE (cond); + def_rhs = TREE_OPERAND (def, 1); + def_rhs_code = TREE_CODE (def_rhs); + /* If the definition of the single use variable was from an arithmetic operation, then we just need to adjust the constant in the COND_EXPR_COND and update the variable tested. */ *************** substitute_single_use_vars (varray_type *** 333,353 **** /* Replace the condition. */ COND_EXPR_COND (cond_stmt) = new_cond; modify_stmt (cond_stmt); ! ! /* Now delete the defining statement, unfortunately ! we have to find the defining statement in whatever ! block it might be in. */ ! for (bsi = bsi_start (bb_for_stmt (def)); ! !bsi_end_p (bsi); ! bsi_next (&bsi)) ! { ! if (def == bsi_stmt (bsi)) ! { ! bsi_remove (&bsi); ! break; ! } ! } } } } --- 360,382 ---- /* Replace the condition. */ COND_EXPR_COND (cond_stmt) = new_cond; modify_stmt (cond_stmt); ! propagated_uses++; } + + /* If we propagated into all the uses, then we can delete DEF. + Unfortunately, we have to find the defining statement in + whatever block it might be in. */ + if (num_uses && num_uses == propagated_uses) + for (bsi = bsi_start (bb_for_stmt (def)); + !bsi_end_p (bsi); + bsi_next (&bsi)) + { + if (def == bsi_stmt (bsi)) + { + bsi_remove (&bsi); + break; + } + } } }
I just checked in another batch of changes which ought to improve this code. If you could take another looksie at the dumps and see if we're missing anything obvious, it would be appreciated.
Here's a simplified test derived from generate-3.4.ii that shows one more optimization opportunity: extern void foo1 (void); extern void foo2 (void); extern void foo3 (void); extern void foo4 (void); extern void foo5 (void); extern void __assert (const char *, int, const char *); void test (bool a, bool b, int c) { if ((a && b) || (!a && !b )) { if (c >= 2) { foo1 (); if (a && b) foo4 (); else foo5 (); } } else { if (!a && b) { foo2 (); } else { ((a && !b) ? (void)0 : __assert ("generate.C", 6154, "a && !b")); foo3 (); } } } "if (a && b) foo4 (); else foo5 ();" can be simplified to "if (a) foo4 (); else foo5 ();"
The problem here (I think) is that DOM cannot read inside a && yet, there is already a testcase on the branch which is XFAIL'ed for the problem.
Subject: Re: [tree-ssa] missed jump thread optimization on the tree-level In message <20040308235126.24783.qmail@sources.redhat.com>, "pinskia at gcc dot gnu dot org" writes: > >------- Additional Comments From pinskia at gcc dot gnu dot org 2004-03-08 2 >3:51 ------- >The problem here (I think) is that DOM cannot read inside a && yet, there is >already a testcase on the branch which is XFAIL'ed for the problem. I'm not sure what you mean here, and optimizing this test really isn't a job for the dominator optimizer. The way I see to optimize this particular case would require us to determine that there is an equivalence between a and b at the head of block #3 by knowing that a & b were equivalent on each incoming edge to block #3. We would then need to use that equivalence to thread the edge from 4->5 to instead reach 6. Note that block #3 is a merge point which really means this isn't a job for the dominator optimizer. A smart GVN pass _might_ be able to discover the equivalence. Jeff
I am not sure that this belongs here, or if I should open another PR. Also from generate-3.4.ii from PR8361 the .vars dump for function bool PT<GPROGRAM>::isReallyPossible() const looks like: int T.3219; int iftmp.14163; <bb 0>: if (this->allowsPositive) goto <L0>; else goto <L11>; <L0>:; T.3219 = (int)*(((struct GINTERPRET *)this->I)->array + (TruthValue *)(((struct GATOM *)(struct GATOM &)((struct __normal_iterator<constGATOM*,std::vector<GATOM, std::allocator<GATOM> > > *)this + 12B)->_M_current)->index * 4)); if ((bool)(T.3219 == 0) != 0) goto <L17>; else goto <L4>; <L17>:; iftmp.14163 = 1; goto <bb 4> (<L10>); <L4>:; if ((bool)(T.3219 == 2) != 0) goto <L8>; else goto <L18>; <L18>:; iftmp.14163 = 0; goto <bb 4> (<L10>); <L8>:; iftmp.14163 = 1; <L10>:; return iftmp.14163; <L11>:; return (int)(bool)((int)*(((struct GINTERPRET *)this->I)->array + (TruthValue *)(((struct GATOM *)(struct GATOM &)(struct GLITERAL *)(struct GLITERAL &)((struct __normal_iterator<constTLITERAL<GATOM>*,std::vector<TLITERAL<GATOM>, std::allocator<TLITERAL<GATOM> > > > *)this + 16B)->_M_current)->index * 4)) == 0); The 2 ifs that test T.3219 look strange.
You do not have to file a new bug, I have a fix already, the issue has to do with casts: bool PT<GPROGRAM>::isReallyPossible() const [with GPROGRAM = GRULES] (this) { TruthValue T.3218; int iftmp.14093; <bb 0>: if (this->allowsPositive) goto <L0>; else goto <L11>; <L0>:; T.3218 = *(((struct GINTERPRET *)this->I)->array + (TruthValue *)(((struct __normal_iterator<constGATOM*,std::vector<GATOM, std::allocator<GATOM> > > *)this + 12B)- >_M_current->index * 4)); if (T.3218 == 0) goto <L17>; else goto <L4>; <L17>:; iftmp.14093 = 1; goto <bb 4> (<L10>); <L4>:; if (T.3218 == 2) goto <L8>; else goto <L18>; <L18>:; iftmp.14093 = 0; goto <bb 4> (<L10>); <L8>:; iftmp.14093 = 1; <L10>:; return iftmp.14093; <L11>:; return (int)*(((struct GINTERPRET *)this->I)->array + (TruthValue *)(((struct GATOM *)(struct GATOM &)((struct __normal_iterator<constTLITERAL<GATOM>*,std::vector<TLITERAL<GATOM>, std:: allocator<TLITERAL<GATOM> > > > *)this + 16B)->_M_current)->index * 4)) == 0; }
Subject: Re: [tree-ssa] missed jump thread optimization on the tree-level In message <20040406012711.14808.qmail@sources.redhat.com>, "dann at godzilla d ot ics dot uci dot edu" writes: > >------- Additional Comments From dann at godzilla dot ics dot uci dot edu 20 >04-04-06 01:27 ------- >I am not sure that this belongs here, or if I should open another PR. > >Also from generate-3.4.ii from PR8361 >the .vars dump for function >bool PT<GPROGRAM>::isReallyPossible() const >looks like: > > int T.3219; > int iftmp.14163; > ><bb 0>: > if (this->allowsPositive) goto <L0>; else goto <L11>; > ><L0>:; > T.3219 = (int)*(((struct GINTERPRET *)this->I)->array + (TruthValue >*)(((struct GATOM *)(struct GATOM &)((struct >__normal_iterator<constGATOM*,std::vector<GATOM, std::allocator<GATOM> > > >*)this + 12B)->_M_current)->index * 4)); > if ((bool)(T.3219 == 0) != 0) goto <L17>; else goto <L4>; > ><L17>:; > iftmp.14163 = 1; > goto <bb 4> (<L10>); > ><L4>:; > if ((bool)(T.3219 == 2) != 0) goto <L8>; else goto <L18>; > ><L18>:; > iftmp.14163 = 0; > goto <bb 4> (<L10>); > ><L8>:; > iftmp.14163 = 1; > ><L10>:; > return iftmp.14163; > ><L11>:; > return (int)(bool)((int)*(((struct GINTERPRET *)this->I)->array + (TruthVal >ue >*)(((struct GATOM *)(struct GATOM &)(struct GLITERAL *)(struct GLITERAL >&)((struct __normal_iterator<constTLITERAL<GATOM>*,std::vector<TLITERAL<GATOM >>, >std::allocator<TLITERAL<GATOM> > > > *)this + 16B)->_M_current)->index * 4)) >== 0); > >The 2 ifs that test T.3219 look strange. What's strange about them? The first is testing if T.3219 != 0 and the second is testing of T.3219 != 2. They're rendered kind-of funny in the pretty printer, but they appear to be correct to me. Maybe if you provided more details about what you mean by "look strange", we could look further into the potential problem. jeff
> What's strange about them? The first is testing if T.3219 != 0 and the second > is testing of T.3219 != 2. They're rendered kind-of funny in the pretty > printer, but they appear to be correct to me. I was not sure if those tests are just rendered funny by the pretty printer or if it's something that implies that 2 comparisons are going to be performed for each "if". Other similar tests are not rendered that way, so maybe that is what was confusing for me. > Maybe if you provided more details about what you mean by "look strange", > we could look further into the potential problem. Sorry for being so terse.
Subject: Re: [tree-ssa] missed jump thread optimization on the tree-level In message <20040408233646.30099.qmail@sources.redhat.com>, "dann at godzilla d ot ics dot uci dot edu" writes: > >------- Additional Comments From dann at godzilla dot ics dot uci dot edu 20 >04-04-08 23:36 ------- >> What's strange about them? The first is testing if T.3219 != 0 and the sec >ond >> is testing of T.3219 != 2. They're rendered kind-of funny in the pretty >> printer, but they appear to be correct to me. > >I was not sure if those tests are just rendered funny by the pretty printer o >r >if it's something that implies that 2 comparisons are going to be performed >for each "if". Other similar tests are not rendered that way, so maybe that >is what was confusing for me. We should get one test for each. Though that's probably something that ought to be verified, probably by looking at the RTL dump. If you send me the .rtl dump I'd be more than happy to take a looksie and verify that the RTL expands DTRT. [ If the RTL expanders don't DTRT, it ought to be trivial to have a pass which cleans up these kinds of comparisons after TER is complete. ] jeff
The non-optimal code at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13875#c14 seems to be caused by the way the fron-end handles bools, specifically all bools are cast to int before being used in an "if" and the optimizers are not able to remove these casts. (see PR15017, comment #5)
(In reply to comment #19) > The non-optimal code at > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=13875#c14 > seems to be caused by the way the fron-end handles bools, specifically > all bools are cast to int before being used in an "if" and the > optimizers are not able to remove these casts. > (see PR15017, comment #5) And this problem should be fixed now with the improvement to forward propagation.
Dann, is there anything else you can see here?
There are some missed PHI-OPT but that is already fixed on the tcb branch. there are some missed forward propagation opertunies though (I don't know if they are fixed on the tcb or not).
I don't see any more obvios ones, can you? Though I did see a missed optimization (see PR 19431).
Hmm, I have not seen any more, could you double check this?
Subject: Re: [tree-ssa] missed jump thread optimization on the tree-level On Sun, 2005-06-12 at 20:41 +0000, pinskia at gcc dot gnu dot org wrote: > ------- Additional Comments From pinskia at gcc dot gnu dot org 2005-06-12 20:41 ------- > Hmm, I have not seen any more, could you double check this? FWIW, it is still possible to miss jump threading opportunities, it's just a lot harder now :-) The most common ways we miss threading opportunities are: 1. Cascading jump threading opportunities. We're more conservative than we need to be with the iteration step. This can cause us to miss jump threading opportunities because we're not iterating. A great example of this would be a loop where the backedge is always threadable. There's an example of this in one of the testcases in the various open PRs. I'm working on resolving some of the issues necessary to address this limitation right now. 2. We're not recording all the expressions we encounter when attempting to thread through a basic block. ie, if we encounter x = a + b and "a + b" does not simplify, then we do not record anything in our hash tables for "x". It is possible that recording the equivalence between x and a + b can result in more threading opportunities. I believe there is also a testcase which exposes this limitation in one of the still open PRs. jeff
I have looked at the various test cases in this PR, and all of them work for me.