This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH][RFA][P1 PR tree-optimization/83298] Avoid over-optimistic result range for COND_EXPR


On 12/08/2017 04:17 AM, Richard Biener wrote:
> On Fri, Dec 8, 2017 at 1:18 AM, Jeff Law <law@redhat.com> wrote:
>>
>> So the underlying issue here is quite simple.  Given something like
>>
>> x = (cond) ? res1 : res2;
>>
>> EVRP analysis will compute the resultant range using vrp_meet of the
>> ranges for res1 and res2.  Seems pretty natural.
>>
>> vrp_meet makes optimistic assumptions if either range is VR_UNDEFINED
>> and will set the resultant range to the range of the other operand.
>>
>> Some callers explicitly mention this is the desired behavior (PHI
>> processing).  Other callers avoid calling vrp_meet when one of the
>> ranges is VR_UNDEFINED and do something sensible
>> (extract_range_from_unary_expr, extract_range_from_binary_expr_1).
>>
>> extract_range_from_cond_expr neither mentions that it wants the
>> optimistic behavior nor does it avoid calling vrp_meet with a
>> VR_UNDEFINED range.  It naturally seems to fit in better with the other
>> extract_range_from_* routines.
>>
>> I'm not at all familiar with the ipa-cp bits, but from a quick look they
>> also seems to fall into the extract_* camp.
>>
>>
>> Anyway, normally in a domwalk the only place where we're going to see
>> VR_UNDEFINED would be in the PHI nodes.  It's one of the nice properties
>> of a domwalk :-)
>>
>> However, for jump threading we look past the dominance frontier;
>> furthermore, we do not currently record ranges for statements we process
>> as part of the jump threading.  But we do try to extract the range each
>> statement generates -- we're primarily looking for cases where the
>> statement generates a singleton range.
>>
>> While the plan does include recording ranges as we look past the
>> dominance frontier, I strongly believe some serious code cleanup in DOM
>> and jump threading needs to happen first.  So I don't want to go down
>> that path for gcc-8.
>>
>> So we're kind-of stuck with the fact that we might query for a resultant
>> range when one or more input operands may not have recorded range
>> information.  Thankfully that's easily resolved by making
>> extract_range_from_cond_expr work like the other range extraction
>> routines and avoid calling vrp_meet when one or more operands is
>> VR_UNDEFINED.
>>
>> Bootstrapped and regression tested on x86_64.  OK for the trunk?
> 
> But that does regress the case of either arm being an uninitialized
> variable.
> 
> I'm not convinced that when you look forward past the dominance frontier
> and do VRP analysis on stmts without analyzing all intermediate
> stmts on the path (or at least push all defs on that path temporarily
> to VR_VARYING) is fixed by this patch.  It merely looks like a wrong
> workaround for a fundamental issue in how DOM now uses the
> interface?I'm going back and pulling together the bits to fix this in a more
consistent way.  Specifically, recording ranges as we process statements
outside the dominance frontier so we don't ever see a VR_UNDEFINED range.

It's not bad as long as I resist the urge to pull in cleanups along the
way :-)

The only thing really painful is that normally when we generate a range
for a statement's output, we can just update the VR and reflect the
updated range into the global table -- that range is globally
applicable.    There's no need to push stuff onto the unwind stack or
the like.  The update_value_range API is sufficient here as NEW_VR is
just a temporary -- we copy the relevant bits from NEW_VR into the
actual tables.


When threading we must add entries to the unwinding stack.  Worse yet,
we can't use the existing VR because it's a stack local and obviously
does not persist.  We have to allocate a new object and copy from the
stack temporary into the new object.  Ugh.

We don't see any of this complexity with the other tables (like
const_and_copies) because we always update the global state and always
unwind it.  The VRP bits are a bit more efficient because they don't
bother with unwinding entries in cases where the result is globally
applicable.

Jeff
jeff


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]