This is the mail archive of the
mailing list for the GCC project.
Re: Expansion of narrowing math built-ins into power instructions
On Tue, Aug 20, 2019 at 08:41:29AM +0100, Richard Sandiford wrote:
> Tejas: given the controversy, I agree unspecs sound like a good approach
> for now. We can always go back and add the rtx codes later once there's
> agreement on what they should look like.
> Segher Boessenkool <email@example.com> writes:
> > On Sat, Aug 17, 2019 at 09:21:00AM +0100, Richard Sandiford wrote:
> >> In lisp terms we're saying that the operand to the float_narrow is
> >> implicitly quoted:
> >> (float_narrow:m '(plus:n a b))
> >> so that when float_narrow is evaluated, the argument is the unevaluated
> >> rtl expression "(plus a b)" rather than the evaluated result a + b.
> >> float_narrow then does its own evaluation of a and b and performs a
> >> fused addition and narrowing on the result.
> > RTL isn't Lisp.
> Right. But it's heavily influenced by lisp, so I was using quoting to
> explain why I don't think the code is a good fit.
> > RTL doesn't have quotations.
> I'd like to keep it that way for rvalues :-)
> > RTL doesn't have *evaluation*.
> But we can (and do) evaluate some rtxes without target help.
We do? Other than constant folding; there is nothing to evaluate in
constants anyway. Or do you mean simplification?
There are rules about what kinds of transformations are allowed. Many of
them unwritten, of course :-/
> > RTL is just a data structure that describes your program instructions.
> > A large part of what means what is system-specific. Rounding of floating
> > point is not defined, for example.
> Some of the semantics are target-specific, sure, with some of the details
> controlled by hooks/macros and some left undefined. But that's true to a
> lesser extent of gimple too.
Yes, gimple and RTL are not very different, at the core of things.
> > And yes, various parts of GCC can manipulate RTL, doing substitution and
> > algebraic simplification and whatnot. All within the rules of RTL. And
> > that means nothing ever can "pass" a float_narrow, because there are no
> > rules that allow it to.
> You mean create a new float_narrow out of thin air, with no justification?
> Sure, but I don't think that was ever the issue.
No. I mean that if you have
... (float_narrow:M (x:N))
it will always stay in that form, with just x changed. Nothing can
change the float_narrow.
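
To illustrate why the operation inside a float_narrow cannot simply be
evaluated in mode N and narrowed afterwards, here is a minimal sketch in
Python (not GCC code). It assumes N is IEEE binary64 and M is binary32,
round-to-nearest-even, and results in the normal range; the helper names
are made up for this example. The two-step path rounds twice, the fused
path rounds once, and the results can differ:

```python
import struct
from fractions import Fraction


def to_float32(x: float) -> float:
    """Round a binary64 value to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def round_exact_to_float32(q: Fraction) -> float:
    """Round an exact rational once to binary32, ties-to-even.

    Sketch only: overflow and subnormal results are not handled.
    """
    if q == 0:
        return 0.0
    sign = -1 if q < 0 else 1
    q = abs(q)
    # Find e with 2**e <= q < 2**(e + 1).
    e = q.numerator.bit_length() - q.denominator.bit_length()
    if Fraction(2) ** e > q:
        e -= 1
    # Scale so the 24-bit significand is an integer, then round it once.
    scale = Fraction(2) ** (e - 23)
    n = round(q / scale)  # round() on a Fraction rounds half to even
    return sign * float(n * scale)


def narrow_after_add(a: float, b: float) -> float:
    """Two roundings: add in binary64, then narrow to binary32."""
    return to_float32(a + b)


def fused_narrowing_add(a: float, b: float) -> float:
    """One rounding: exact sum rounded directly to binary32."""
    return round_exact_to_float32(Fraction(a) + Fraction(b))


a = 1.0
b = 2.0 ** -24 + 2.0 ** -60
print(narrow_after_add(a, b), fused_narrowing_add(a, b))
```

With these inputs the separate add rounds the sum to 1 + 2^-24, which then
ties to 1.0 when narrowed, whereas rounding the exact sum once gives
1 + 2^-23.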
> Or do you mean that target-independent code couldn't just use GET_RTX_FORMAT
> to recurse on a float_narrow without first noting that it's a float_narrow
> (and thus special)? If so, then yeah, I agree that they wouldn't be
> allowed to do that, which is essentially why I think it's a bad idea.
No, they can do that just fine.
> >> No other rtx rvalue works like this.
> > A lot of unspecs are used like this, for example.
> Unspecs don't have a quoting effect though. I agree it's common to match
> things like:
> (unspec:m [(plus:m ...)] UNSPEC_FOO)
> But that doesn't have any quoting effect on the plus. If the optimisers see:
> (unspec:m [(plus:m x y)] UNSPEC_FOO)
> and know what x and y are, they can certainly fold this to:
> (unspec:m [(const_int N)] UNSPEC_FOO)
And the exact same is true for the proposed float_narrow!
The compiler should not do this if FP_CONTRACT is off, which it has to
be for fadd etc. to make sense at all, to not be optimised to a plain
> This is similar to things like (from mips.md):
> (define_insn_and_split "<su>mulsi3_highpart_internal"
Yeah, I did that for rs6000. Lots and lots and lots of special cases :-P
(RTL represents things differently for BE and LE, and there are the various
sizes of operation, both with and without 64-bit insns).
> [(set (match_operand:SI 0 "register_operand" "=d")
(this is optimised to a subreg, in many cases, for example).
> Going back to the unspec example: if at some point we added a target
> hook for evaluating unspecs in the same way that we evaluate basic
> arithmetic (might be useful!), the handling of UNSPEC_FOO wouldn't be
> able to assert that the plus or whatever is there. At best it could
> punt evaluation when the plus isn't there, at the cost of losing
> potentially useful optimisation.
Yes. And as far as I can see float_narrow will still work.
> float_narrow is different in that the plus (or whatever operation
> it's quoting) has to be kept in-place rather than folded away,
> otherwise the rtx itself is malformed and could trigger an ICE,
> just like the zero_extend of a const_int that I mentioned.
Yes, it will not pass recog. Structurally it is just hunky-dory though.
> > And you need many many more RTX codes, which you will not handle in
> > almost all places, because there are too many.
> > I agree this construct is not as nice as could be hoped for. I don't
> > agree that 60 new RTX codes is an acceptable solution (or that that will
> > ever really work out, even).
> 60 sounds a high number. :-) Do we really have that many rtx codes with
> a floating-point rounding effect?
It was meant to sound high, heh. If things need a variant A, and also a
variant B, then before you know it there is a variant A+B as well, and
you have unbridled growth.
plus minus neg mult div mod smin smax abs sqrt fma I think? And let's
hope we never ever have to do saturating versions of FP :-)
> Whatever the number is, we'll still be listing them individually for
> built-in enumerations, internal_fn, and (I assume) optabs. But maybe
> after a certain point it does become too unwieldy for rtx codes.
> We have to keep it within 16 bits at least...
My main concern is all the (simplification) code that parses RTL. All of
that will have to handle all variant versions as well.
> > It would be nice if somehow we could make a variant of RTL codes, so that
> > we could have nice and simple code that applies to all variants of some
> > code. Not sure how that would work out. Maybe we don't have to do this
> > very generically, how often will we need this anyway?
> > I have three examples so far:
> > 1) Saturating arithmetic;
> > 2) This float_narrow thing;
> > 3) Ordered compares, that is, fp compares that set an exception on NaNs.
> > Something that works for all three would be nice!
> Yeah, agree that sounds good. Maybe we could bundle the code with some
> flags. Storage-wise, there should be room for that in the u2 field.
> But there might still be cases in which it's useful to view the code+flags
> as a combined supercode, e.g. for switch statements.
Yeah... Whether to make "code" or "code+flags" the more usual version is
the biggest design question then. Oh, and what the rest of the interface
to this looks like ;-)
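
As a rough illustration of the "code" versus "code+flags" question, here
is a minimal Python model (not GCC code, and not a proposal for the real
interface). All names, the three-code subset, and the bit layout are
assumptions made up for this sketch; the only constraint taken from the
thread is that the combined value stays within 16 bits. Generic code can
look at the base code alone and so cover every variant, while code that
cares can still dispatch on the combined supercode:

```python
from enum import IntEnum, IntFlag


class BaseCode(IntEnum):
    """Hypothetical subset of arithmetic codes."""
    PLUS = 0
    MINUS = 1
    MULT = 2


class Variant(IntFlag):
    """Hypothetical variant flags, one bit per example from the thread."""
    NONE = 0
    SATURATING = 1
    NARROWING = 2
    ORDERED = 4


FLAG_SHIFT = 8  # assumption: low 8 bits hold the base code


def supercode(code: BaseCode, flags: Variant = Variant.NONE) -> int:
    """Pack a base code and its flags into one integer."""
    return (int(flags) << FLAG_SHIFT) | int(code)


def split(sc: int):
    """Unpack a supercode into (base code, flags)."""
    return BaseCode(sc & 0xFF), Variant(sc >> FLAG_SHIFT)


def is_commutative(sc: int) -> bool:
    """Property of the base code, so one test covers all variants."""
    code, _ = split(sc)
    return code in (BaseCode.PLUS, BaseCode.MULT)


def is_plain(sc: int) -> bool:
    """Only flag-free codes may use the ordinary simplification rules."""
    _, flags = split(sc)
    return flags == Variant.NONE


# Dispatching on the combined value, as a switch statement would.
NAMES = {
    supercode(BaseCode.PLUS): "plus",
    supercode(BaseCode.PLUS, Variant.NARROWING): "narrowing plus",
    supercode(BaseCode.PLUS, Variant.SATURATING): "saturating plus",
}
print(NAMES[supercode(BaseCode.PLUS, Variant.NARROWING)])
```

Which of the two views is the default one that existing code sees is
exactly the open design question; the sketch only shows that both can be
derived from the same stored value.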