[PATCH 3/4 v3] ivopts: Consider cost_step on different forms during unrolling

Wed Sep 2 03:50:19 GMT 2020

Hi Bin,

>> 2) This case makes me think we should exclude ainc candidates in function
>> mark_reg_offset_candidates.  The justification is that: ainc candidate
>> handles step update itself and when we calculate the cost for it against
>> its ainc_use, the cost_step has been reduced. When unrolling happens,
>> the ainc computations are replicated and it doesn't save step updates
>> like normal reg_offset_p candidates.
> Though auto-inc candidate embeds stepping operation into memory
> access, we might want to avoid it in case of unroll if there are many
> sequences of memory accesses, and if the unroll factor is big.  The
> rationale is embedded stepping is a u-arch operation and does have its
> cost.
> 

Thanks for the comments!  Agree!  Excluding them from reg_offset_p
candidates here is consistent with this expectation, it makes us
consider the unroll factor effect when checking the corresponding
step cost and the embedded stepping cost (in group/candidate cost,
minus step cost and use the cost from the address_cost hook).

>>
>> I've updated the patch to punt ainc_use candidates as below:
>>
>>> +         /* Skip AINC candidate since it contains address update itself,
>>> +            the replicated AINC computations when unrolling still have
>>> +            updates, unlike reg_offset_p candidates can save step updates
>>> +            when unrolling.  */
>>> +         if (cand->ainc_use)
>>> +           continue;
>>> +
>>
>> For this new attached patch, it's bootstrapped and regress-tested without
>> explicit unrolling, while the only one failure has been identified as
>> rs6000 specific address_cost issue in bootstrapping and regression testing
>> with explicit unrolling.
>>
>> By the way, with above simple hack of address_cost, I also did one
>> bootstrapping and regression testing with explicit unrolling, the above
>> sms-4.c did pass as I expected but had two failures instead:
>>
>>   PASS->FAIL: gcc.dg/sms-compare-debug-1.c (test for excess errors)
>>   PASS->FAIL: gcc.dg/tree-ssa/ivopts-lt.c scan-tree-dump-times ivopts "PHI" 2
>>
>> By further investigation, the 2nd one is expected due to the adddress_cost hook
>> hack, while the 1st one one exposed -fcompare-debug issue in sms.  The RTL
>> sequence starts to off from sms, just some NOTE_INSN_DELETED positions change.
>> I believe it's just exposed by this combination unluckily/luckily ;-) I will
>> send a patch separately for it once I got time to look into it, but it should
>> be unrelated to this patch series for sure.
> This is the kind of situation I intended to avoid before.  IMHO, this
> isn't a neat change (it can't be given we are predicting the future
> transformation in compilation pipeline), accumulation of such changes
> could make IVOPTs break in one way or another.  So as long as you make
> sure it doesn't have functional impact in case of no-rtl_unroll, I am
> fine.

Yeah, I admit it's not neat, but the proposals in the previous discussions
without predicting unroll factor can not work well for all scenarios with
different unroll factors, they could over-blame some kind of candidates.
For the case of no-rtl_unroll, unroll factor estimation should set
loop->estimated_unroll to zero, all these changes won't take effect. The
estimation function follows the same logics as that of RTL unroll factor
calculation, I did test with explicit unrolling disablement before, it
worked expectedly.

BR,
Kewen