Fix PR85726 (div-div suboptimization) and a rant on match.pd :s-flag

Thu May 10 09:28:00 GMT 2018

(not a review)

On Thu, 10 May 2018, Hans-Peter Nilsson wrote:

> Replacing a division feeding a division helps only when the
> second division is the only user, and "fusing" the divisions is

Well, that's not quite true.
int x, y;
void f(int n){
   int c = 3 << 20;
   x = n / c;
   y = n / c;
}

Here we can optimize the last division to y = 0. After your patch, we 
likely need VRP to do that simplification. There are probably more 
complicated transformations this disables.

> downright bad if another user of the result of first division is
> a modulus of the same value as the second division, forming a
> divmod pair.  See the test-case, where for the tested
> architectures (which all fail the test-case before the patch)
> the div and mod are implemented using the high-part of a widened
> multiplication and shift, emitted separately but combined as
> late as in rtl, with the multiplicaton and shift re-used.  That
> of course does not happen if later passes see (y / 48; y % 3).
> While in match.pd, I noticed the corresponding mul-mul match,
> which I believe should be guarded the same way.

Did you notice bad codegen because of the multiplication? You are only 
adding a test for divisions. I am asking because I know such a change will 
help some cases and hurt others...

> Now a rant on the match.pd ":s" flag, which reasonable people
> may reasonably suggest I should have used in the patch instead
> of the (if (single_use ...)).
>
> Initially, I got the match.pd-language (which deserves a proper
> name) all wrong.  Then I read the documentation and still got it
> wrong.  I "misunderstood" that the ":s" on an operand "O" was
> supposed to have the effect of conditionalize the replacement
> "R" by wrapping it in "(if (single_use (O)) R)" as in the
> suggested patch (above).  To wit, this does not work; it will
> *not* stop the replacement as seen in the test-case (THIS IS NOT
> A SUGGESTED PATCH):
>
> (for div (trunc_div exact_div)
>  (simplify
> -  (div (div @0 INTEGER_CST@1) INTEGER_CST@2)
> +  (div (div:s @0 INTEGER_CST@1) INTEGER_CST@2)
>   (with {
>     bool overflow_p;
>     wide_int mul = wi::mul (wi::to_wide (@1), wi::to_wide (@2),
>
> In PR69556, it seems other people seem to have read the
> documentation of ":s" the same way, but are corrected by other
> comments there, so I guess it's not my reading that's flawed.
>
> I suggest preferably (1) correcting the semantics of ":s" to do
> as the documentation says because I don't understand the
> explanation in PR69556 comment #4 that the replacement "is still
> allowed if it is a single operation as that replaces at least
> one other (the one we are simplifying)"; I see that as a
> complete nullification of the :s flag, making it a nop.
> Alternatively (2), if the :s is *not* a nop in some case add an
> example of that case and sufficient explanation to
> match-and-simplify.texi *and* the suggested ":S" flag
> (i.e. *really* conditionalize the replacement by a single-use of
> that operand).

To simplify, the goal of :s is to avoid increasing the number of 
instructions. Normally, the transformation output is smaller (or the same 
size but cheaper, simpler, more canonical) than the input. But if we can't 
get rid of some of the input intermediate results, the size may still 
increase. :s does test single_use, but it has a special case. If the 
output (possibly after several rounds of resimplifications) is at most one 
instruction, then it cannot be larger than the input (we are at least 
getting rid of the instruction we called the simplification on), so the 
transformation does happen even if !single_use. Originally this was only 
done when the simplification led to an SSA_NAME or a constant, IIRC.

Then people start wanting single_use restrictions to reduce register 
pressure, reduce the size / number of constants, etc. And not all of those 
want exactly the same conditions.

It is useful for high-level transformations to push the canonicalization 
as far as possible, to notice equivalent quantities or constant bounds in 
particular. So on a case by case basis, we use :s or single_use or 
whatever...

If we use both y=x/3 and z=x/15 in the same function, should we make an 
effort to detect it and rewrite to z=y/5?

-- 
Marc Glisse