[PATCH] Loop unswitching: support gswitch statements.

Martin Liška mliska@suse.cz
Wed Nov 10 13:29:31 GMT 2021


On 11/10/21 09:59, Richard Biener wrote:
> On Tue, Nov 9, 2021 at 5:44 PM Martin Liška <mliska@suse.cz> wrote:
>>
>> On 11/9/21 14:37, Richard Biener wrote:
>>> On Mon, Nov 8, 2021 at 8:45 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>>>>
>>>> On 11/8/21 10:05 AM, Martin Liška wrote:
>>>>> On 9/28/21 22:39, Andrew MacLeod wrote:
>>>>>> In Theory, modifying the IL should be fine, it happens already in
>>>>>> places, but its not extensively tested under those conditions yet.
>>>>>
>>>>> Hello Andrew.
>>>>>
>>>>> I've just tried using a global gimple_ranger and it crashes when loop
>>>>> unswitching duplicates
>>>>> some BBs.
>>>>>
>>>>> Please try the attached patch for:
>>>>
>>>> hey Martin,
>>>>
>>>> try using this in your tree.  Since nothing else is using a growing BB
>>>> right now, I'll let you work with it and see if everything works as
>>>> expected before checking it in, just in case we need more tweaking.
>>>> With this,
>>>>
>>>> make RUNTESTFLAGS=dg.exp=loop-unswitch*.c check-gcc
>>>>
>>>> runs clean.
>>>>
>>>>
>>>> basically, I tried to grow it by either a factor of 10% for the current
>>>> BB size when the grow is requested, or some double the needed extra
>>>> size, or 128... whichever value is "maximum"    That means it shoudnt be
>>>> asking for tooo much each time, but also not a minimum amount.
>>>>
>>>> Im certainly open to suggestion on how much to grow it each time.
>>>> Note the vector being grown is ONLY fo the SSA_NAme being asked for.. so
>>>> it really an on-demand thing just for specific names, in your case,
>>>> mostly just the switch index.
>>>>
>>>> Let me know how this works for you, and if you have any other issues.
>>>
>>> So I think in the end we shouldn't need the growing.  Ideally we'd do all
>>> the analysis before the first transform, but for that we'd need ranger to
>>> be able to "simplify" conditions based on a known true/false predicate
>>> that's not yet in the IL.  Consider
>>>
>>>    for (;;)
>>>      {
>>>           if (invariant < 3) // A
>>>             {
>>> ...
>>>             }
>>>           if (invariant < 5) // B
>>>             {
>>> ...
>>>             }
>>>      }
>>>
>>> unswitch analysis will run into the condition 'A' and determine the loop
>>> can be unswitched with the condition 'invariant < 3'.  To be able to
>>> perform cost assessment and to avoid redundant unswitching we
>>> want to determine that if we unswitch with 'invariant < 3' being
>>> true then the condition at 'B' is true as well before actually inserting
>>> the if (invariant < 3) outside of the loop.
>>>
>>> So I'm thinking of assigning a gimple_uid to each condition we want to
>>> unswitch on and have an array indexed by the uid with meta-data on
>>> the unswitch opportunity, the "related" conditions could be marked with
>>> the same uid (or some other), and the folding result recorded so that
>>> at transform time we can just do the appropriate replacement without
>>> invoking ranger again.
>>
>> Calculating all this before transformation is quite ambitious based on the code
>> we have now.
>>
>> Note one can have in a loop:
>>
>> if (a > 100)
>>      ...
>>
>> switch (a)
>>      case 1000:
>>        ...
>>      case 20:
>>        ...
>>      case 200:
>>        ...
>>
>> which means the first predicate effectively makes some cases unreachable. Moreover
>> one can have
>>
>> if (a > 100 && b < 300)
>>      ...
>>
>> and more complex conditions.
> 
> True - I guess we should do two things.

All right.

> 
>   1) keep simplify_using_entry_checks like code for symbolic conditions
>   2) add integer ranges for unswitch conditions producing them, that
>       includes all unswitching of switch stmts - we might be able to use
>       the ranger queries (with global ranges) to simplify stmts with the
>       known ranges as noted by Andrew
> 
> I do think that pre-computing the simplifications is what we should do
> to be able to make the cost modeling sane.  What we can avoid
> trying is evaluating multiple unswitch possibilities to pick the "best".

So the first step would be taking all unswitching candidates (gconds basically)
and grouping them (all items in a group would fold to true edge in the unswitched loop).
Is it something we want to do combining simplify_using_entry_checks and fold_range ranger
capability?

> 
> I think changing the code do to the analysis first should be done
> before wiring in gcond support, even adding the additional 'range'

s/gcond/switch, right?

> capability will be useful without that since the current code
> wont figure out a > 5 is true when we unswitch on a > 3.

Agree that gswitch support should be added later.

Martin

> 
>>>
>>> Now, but how do we arrange for the ranger analysis here?
>>
>> That's likely something we need support from ranger, yes.
>>
>>>
>>> We might also somehow want to remember that on the
>>> 'invariant < 3' == false copy of the loop there's still the
>>> unswitching opportunity on 'invariant < 5', but not on the
>>> 'invariant < 5' == true copy.
>>>
>>> Currently unswitching uses a custom simplify_using_entry_checks
>>> which tries to do simplification only after the fact (and so costing
>>> also is far from costing the true cost and ordering of the opportunities
>>> to do the best first is not implemented either).
>>
>> I'm sending updated version of the patch where I changed:
>> - simplify_using_entry_checks is put back for the floating point expressions
>> - all scans utilize scan-tree-dump-times
>> - some new tests were added
>> - global ranger is used (I rely on the growing patch provided by Andrew)
>> - better ranger API is used for gcond expression: ranger.range_of_stmt (r, stmt) && r.singleton_p (&result))
>> - auto_edge_flag is used now
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Thoughts?
>> Martin
>>
>>>
>>> Richard.
>>>
>>>> Andrew
>>>>



More information about the Gcc-patches mailing list