This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Fix PR85726 (div-div suboptimization) and a rant on match.pd :s-flag


On Thu, 10 May 2018, Hans-Peter Nilsson wrote:

Date: Thu, 10 May 2018 10:33:39 +0200 (CEST)
From: Marc Glisse <marc.glisse@inria.fr>

On Thu, 10 May 2018, Hans-Peter Nilsson wrote:

Replacing a division feeding a division helps only when the
second division is the only user, and "fusing" the divisions is

Well, that's not quite true.
int x, y;
void f(int n){
   int c = 3 << 20;
   x = n / c;
   y = n / c;

Ah, this was supposed to be

  y = x / c;

sorry.

There are probably more
complicated transformations this disables.

I'm providing an example from *real* code where the
transformation is bad (admittedly just for the div-div case).
Please provide the same in your counterargument.

I don't have this kind of example at hand. Easiest would be to try some large benchmark (SPEC?) with and without, but that's a pain... I agree that the whole question is what helps most in practice, and I don't have the answer, I can just provide some not-so-realistic examples to show that the answer is not obvious in the other direction either (I am not trying to reject your patch).

int x,y,z;
void f(int n){
  x = n / 3;
  y = x / 5;
  z = n / 15;
}

we would like to notice that y and z are equivalent to skip one division.

int x,y,z;
void f(int n){
  x = n / 4;
  y = x / 4;
  z = y * 16 | 15;
}

we can replace the multiplication with a bit_and, which then simplifies with the bit_ior (surprisingly only in RTL, we may be missing something in gimple).

downright bad if another user of the result of first division is
a modulus of the same value as the second division, forming a
divmod pair.  See the test-case, where for the tested
architectures (which all fail the test-case before the patch)
the div and mod are implemented using the high-part of a widened
multiplication and shift, emitted separately but combined as
late as in rtl, with the multiplicaton and shift re-used.  That
of course does not happen if later passes see (y / 48; y % 3).
While in match.pd, I noticed the corresponding mul-mul match,
which I believe should be guarded the same way.

Did you notice bad codegen because of the multiplication? You are only
adding a test for divisions. I am asking because I know such a change will
help some cases and hurt others...

I can take that part out for lack of evidence, but first I'll
argue that e.g. a multiplication by 5 won't be helped to be
transformed into a multiplication by 15; multiplying or dividing
by a larger constant is not by itself a simplification or
canonicalization.

It is a canonicalization, because it makes the result depend on a more primitive variable. If you make several computations based on an int n, if you can express many variables in terms of n, you are more likely to notice duplicates or simplifications, or back-propagate ranges.

To simplify, the goal of :s is to avoid increasing the number of
instructions. Normally, the transformation output is smaller (or the same
size but cheaper, simpler, more canonical) than the input.

That may be the intent, but the second sentence is not generally
true.  Any such transformation must be carefully inspected to
have that property *on their own*; it's not true for the div-div
and mul-mul case for example.  After a quick look at uses of :s
in match.pd I'd say that's actually rare and for most codes not
true, carefully remembering we're talking about the context
where the intermediate still lives after the transformation.

Note that I was talking about the case where the intermediate results can be eliminated, and moved on to when this isn't the case in the next sentence below.

But if we can't
get rid of some of the input intermediate results, the size may still
increase. :s does test single_use, but it has a special case. If the
output (possibly after several rounds of resimplifications) is at most one
instruction, then it cannot be larger than the input (we are at least
getting rid of the instruction we called the simplification
on),

(not true in the div-div or mul-mul case)

???

We start from
y = x / 5;
z = y / 3; // calling simplification on this insn

and replace it with
y = x / 5;
z = x / 15;
(we get rid of the division by 3)

the output is not larger than the input.


so the
transformation does happen even if !single_use. Originally this was only
done when the simplification led to an SSA_NAME or a constant, IIRC.

Can you please provide an example?  I had a look at a couple of
the :s uses in match.pd imagining intermediate use of the :s-ed
operand and it didn't seem that they'd help where :s doesn't
mean single_use.

The first :s in match.pd is

 (simplify
  (rdiv (rdiv:s @0 @1) @2)
  (rdiv @0 (mult @1 @2)))

normally, with:
y=a/b; // y has multiple uses
z=y/c;

producing
y=a/b;
t=b*c;
z=a/t;

is bad because it has one extra instruction. But if for instance b and c are constants, then that's not the case anymore and the transformation does not increase the number of instructions. You are probably going to argue that the transformation is useless anyway... The following might be more convincing:

 (simplify
  (rdiv @0 (rdiv:s @1 @2))
  (mult (rdiv @0 @1) @2)))

4. / (2. / X) is better as 2. * X even if 2. / X has multiple uses (replaced a division with a multiplication). But that's only the case because the output simplifies enough that the (new) division is not emitted.

Then people start wanting single_use restrictions to reduce register
pressure, reduce the size / number of constants, etc. And not all of those
want exactly the same conditions.

It is useful for high-level transformations to push the canonicalization
as far as possible, to notice equivalent quantities or constant bounds in
particular. So on a case by case basis, we use :s or single_use or
whatever...

Again, you're speaking as if match.pd naturally contains just
canonicalizations, but that's not true.

I didn't say that, but canonicalizations are often more controversial because their benefit is less obvious (replacing one multiplication by another, what's the point?).

--
Marc Glisse


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]