This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH] COND_EXPRs in GIMPLE code and vectorizer
- From: Roberto COSTA <roberto dot costa at st dot com>
- To: Roger Sayle <roger at eyesopen dot com>
- Cc: Paolo Bonzini <paolo dot bonzini at lu dot unisi dot ch>, Daniel Berlin <dberlin at dberlin dot org>, gcc-patches at gcc dot gnu dot org
- Date: Wed, 18 Oct 2006 12:48:49 +0200
- Subject: Re: [PATCH] COND_EXPRs in GIMPLE code and vectorizer
- References: <Pine.LNX.email@example.com>
Roger Sayle wrote:
On Tue, 17 Oct 2006, Roberto COSTA wrote:
In the meantime, there are a few changes I made to some of the
middle-end passes that I think are useful even when the gimplifier
expands COND_EXPR expressions (the current behavior).
* fold-const.c (fold_cond_expr_with_comparison): Bug fix, avoid
infinite recursion when folding COND_EXPRs.
Whilst this hunk is OK, it's really just papering over a deeper problem
with tree-ssa. Indeed, an issue whose fix may be more beneficial than
adding support for gimple COND_EXPRs.
The red-flag is that this infinite recursion can only possibly occur for
trees of the form "(c1 == c2) ? c2 : c3" where all of c1, c2 and c3 are
integer constants. The problem is that the sub-expression "c1 == c2" is
not folded, and if I had my way I'd like to claim its a invalid tree
(and any tree containing an invalid subtree is itself invalid).
fold and friends don't recurse on trees, instead assuming that all of a
tree's subtrees/operands have previously been folded. Indeed, much of
the pattern matching in fold-const relies on the fact that sub-expressions
have been folded to a canonical form. For example, in this particular
transformation, the code assumes the integer constant in an equality
operation appears as the second argument, i.e. it only checks
for "TREE_CODE (arg01) == INTEGER_CST" (and not arg00).
Ok, I see the point.
I've checked how that unfolded comparison is generated and it looks to
me that this is a bug already in the compiler, which my patch about
COND_EXPRs just make more likely to emerge.
What's happening is that store_ccp pass is propagating constant values;
in our example in particular, it is propagating a constant into c1.
Then, since there is at least one use that is replaced in the statement
including that COND_EXPR, substitute_and_fold (...) in
tree-ssa-propagate.c calls fold_stmt (...) on that statement.
fold_ternary (...) from fold-const.c assumes that all operands of a
COND_EXPR have been previously folded. This is consistent with what
you're saying ("fold and friends don't recurse on trees...").
In our example, this assumption doesn't hold, since we have a comparison
(c1 == c2) as operand 0 of a COND_EXPR.
I see two possible ways to fix this:
a) if anyone propagates a value anywhere, she should check whether the
propagated value is part of a comparison in a COND_EXPR (and explicitly
fold the comparison, if so).
b) in case of a COND_EXPR, fold_ternary (...) in fold-const.c folds the
comparison before doing anything else (currently, it doesn't do it at all).
I'd go for the second alternative.
It needs an amendment in your statement "fold and friends don't recurse
on trees..." for the special case of the comparison of a COND_EXPR, but
it avoids explicit checks and folding every time a pass propagates
anything into a COND_EXPR.
What do you think?
Given that all of the fold_buildN interfaces only return "valid" trees,
the only way we can have a "c1 == c2" in the middle-end is if we
incorrectly used the low-level "build" functions, or have destructively
modified a tree-node. This destructive modification of tree nodes is
one of the reasons we (i) have to unshare trees to keep our memory
usage high, (ii) occasionally need to inefficiently call "fold"
(during expand and elsewhere) on trees that should already be folded,
just so we can keep our compile-times high, (iii) makes it impossible
to set break points in fold_build* or build*_stat to identify how/when
such invalid trees get constructed, so the compiler is less debugable,
(iv) makes it impossible/harder to have immutable static/read-only trees,
such as for builtins, that would reduce start-up time/overhead.
In an ideal world, we'd use a function such as "fold_replace_tree",
that would take an expression and constructively substitute all
occurences of the subexpression "old_expr" with "new_expr", calling
fold if the tree changes. This follows the idiom/pattern already
used in simplify-rtx.c:simplify_replace_rtx.
If you could find where the "C1 == C2" is coming from, that'd be a
much better fix. Indeed, if "C1 == C2" was correctly constructed,
we'd never allocate the COND_EXPR! [if it's correctly constructed].
Well, see above.