This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/21829] [4.1/4.2 Regression] missed jump threading after unroller
- From: "law at redhat dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 9 Feb 2006 03:19:59 -0000
- Subject: [Bug tree-optimization/21829] [4.1/4.2 Regression] missed jump threading after unroller
- References: <bug-21829-6528@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #4 from law at redhat dot com 2006-02-09 03:19 -------
I'll note this really isn't a jump threading issue. This is a fundamental
weakness in a dominator based optimizer vs a truly global optimizer.
What we've got is a block which looks something like this:
# u_18 = PHI <u_13(4), u_26(3)>;
<L1>:;
D.1528_10 = u_18 % 2;
if (D.1528_10 == 1) goto <L2>; else goto <L3>;
Note carefully that this block is at a dominance frontier (thus the PHI node).
At a dominance frontier, a dominator walk does _not_ guarantee that the CFG
parents are visited before the node itself. ie, we may visit this block before
its CFG parents. That is in fact precisely what happens. Thus we do not
discover that u_13 and u_26 both have the value 1 until after we have processed
this block.
Possible solutions:
1. Iterating DOM. I don't want to do that.
2. Run CCP before DOM. It's something I've pondered, mostly because I want
to remove constant propagation from DOM. Compile-time issues worry me.
3. Simple constant/copy propagation in the renaming phase. I've always
been a fan of this to avoid clutter in the IL, but I've been overruled
on this as a design decision.
4. Somehow arrange to sort the et-tree dominance information so that we
visit immediate descendants of a CFG node which are dominated by the
node before visiting more distant CFG descendants. This could be
horribly ugly.
5. Somehow get really really smart in the block duplication code and arrange
for it to propagate constants. I've pondered this, but I don't think
it's really that feasible both from a design and implementation
standpoint.
6. Post-dom cleanup phase using worklists seeded by degenerate PHI nodes to
identify statements that can be simplified as a result of discovery of
the degenerate PHI. [Would replace the mini-copyprop pass we do after
DOM. ]
I suspect #2 or #6 are the most likely long term solutions.
jeff
--
law at redhat dot com changed:
What |Removed |Added
----------------------------------------------------------------------------
CC| |law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21829