This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
RE: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.
- From: Richard Biener <rguenther at suse dot de>
- To: Tamar Christina <Tamar dot Christina at arm dot com>
- Cc: Joseph Myers <joseph at codesourcery dot com>, Christophe Lyon <christophe dot lyon at linaro dot org>, Markus Trippelsdorf <markus at trippelsdorf dot de>, Jeff Law <law at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>, Michael Meissner <meissner at linux dot vnet dot ibm dot com>, nd <nd at arm dot com>
- Date: Wed, 30 Aug 2017 15:26:17 +0200 (CEST)
- Subject: RE: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like numbers in GIMPLE.
- Authentication-results: sourceware.org; auth=none
- References: <VI1PR0801MB20313CD2538258D3AAA60F5DFF910@VI1PR0801MB2031.eurprd08.prod.outlook.com> <20170608103058.GA285@x4> <CAKdteOZ6T8OB-R7HHkUuWgoVrxA-D6rPnH1U3n=9VAU5h4W0eg@mail.gmail.com>,<VI1PR0801MB2031575376F996A705CE73C4FFC90@VI1PR0801MB2031.eurprd08.prod.outlook.com> <VI1PR0801MB2031411786EE5517959DA13DFFC90@VI1PR0801MB2031.eurprd08.prod.outlook.com> <E2E86FA4-2E97-454B-895D-BF405DB01DBC@suse.de> <alpine.DEB.2.20.1706081639520.15290@digraph.polyomino.org.uk> <0E0A7BBD-39C1-4749-9FB4-8E319BAD888A@suse.de>,<DB6PR0802MB2309A3486072D28BA4FA1898FF830@DB6PR0802MB2309.eurprd08.prod.outlook.com> <DB6PR0802MB23094D855B09143E48F096EAFF850@DB6PR0802MB2309.eurprd08.prod.outlook.com> <alpine.LSU.2.20.1708231420490.14191@zhemvz.fhfr.qr> <DB6PR0802MB23090AD8F3CA1C07C4976CFFFF850@DB6PR0802MB2309.eurprd08.prod.outlook.com> <alpine.LSU.2.20.1708241052460.14191@zhemvz.fhfr.qr> <DB6PR0802MB23095448CE3AF8A3AE9FA517FF9F0@DB6PR0802MB2309.eurprd08.prod.outlook.com>
On Tue, 29 Aug 2017, Tamar Christina wrote:
>
>
> > -----Original Message-----
> > From: Richard Biener [mailto:rguenther@suse.de]
> > Sent: 24 August 2017 10:16
> > To: Tamar Christina
> > Cc: Joseph Myers; Christophe Lyon; Markus Trippelsdorf; Jeff Law; GCC
> > Patches; Wilco Dijkstra; Michael Meissner; nd
> > Subject: RE: [PATCH][GCC][PATCHv3] Improve fpclassify w.r.t IEEE like
> > numbers in GIMPLE.
> >
> > On Wed, 23 Aug 2017, Tamar Christina wrote:
> >
> > > Hi Richard,
> > >
> > > Thanks for the feedback,
> > >
> > > >
> > > > I think you should split up your work a bit. The pieces from
> > > > fold_builtin_* you remove and move to lowering that require no
> > > > control flow like __builtin_isnan (x) -> x UNORD x should move to
> > > > match.pd patterns (aka foldings that will then be applied both on
> > GENERIC and GIMPLE).
> > > >
> > >
> > > But emitting them as an UNORD wouldn't solve the performance or
> > > correctness issues The patch is trying to address. For correctness,
> > > for instance an UNORD would still signal When -fsignaling-nans is used
> > (PR/66462).
> >
> > Well, then address the correctness issues separately. I think lumping
> > everything into a single patch just makes your live harder ;)
>
> But the patch was made to fix the correctness issues. Which unless I'm
> mistaken can only be solved using integer operations. To give a short
> history on the patch:
>
> My original patch had this in builtins.c where the current code is and
> had none of this problem. The intention of this patch was just address
> the correctness issues with isnan etc.
>
> Since we had to do this using integer operations we figured we'd also
> replace the operations with faster ones so the code was tweaked in order
> to e.g. make the common paths of fpclassify exit earlier. (I had
> provided benchmark results to show that the integer operations are
> faster than the floating point ones, also on x86).
>
> It was then suggested by upstream review that I do this as a gimple lowering
> phase. I pushed back against this but ultimately lost and submitted a new version
> that moved from builtins.c to gimple-low.c, which was why late expansion for C++
> broke, because the expansion is now done very early.
>
> The rational for this move was that it would give the rest of the pipeline a chance
> to optimize the code further.
>
> After a few revisions this was ultimately accepted (the v3 version), but was reverted
> because it broke IBM's long double format due to the special case code for it violating SSA.
>
> That was trivially fixed (once I finally had access to a PowerPC hardware months later) and I changed the
> patch to leave the builtin as a function call if at lowering time it wasn't known to be a builtin.
> This then "fixes" the C++ testcase, in that the test doesn't fail anymore, but it doesn't generate
> the same code as before.
> Though the test is quite contrived and I don't know if actual user code often has this.
>
> While checking to see if this behavior is OK, I was suggested to do it
> as a gimple-fold instead. I did, but then turns out it won't work for
> non-constants as I can't generate new control flow from a fold (which
> makes sense in retrospect but I didn't know this before).
>
> This is why the patch is more complex than what it was before. The
> original version only emitted simple tree.
The correctness issue is only about -fsignaling-nans, correct? So a
simple fix is to guard the existing builtins.c code with
!flag_singalling_nans (and emit a library call for -fsignalling-nans).
> >
> > I was saying that if I write isunordered (x, y) and use -fno-signaling-nans (the
> > default) then I would expect the same code to be generated as if I say isnan
> > (x) || isnan (y). And I'm not sure if isnan (x) || isnan (y) is more efficient if
> > both are implemented with integer ops compared to using an unordered FP
> > compare.
> >
>
> But isnan (x) || isnan (y) can be rewritten using a match.pd rule to isunordered (x, y)
> If that should really generate the same code. So I don’t think it's an issue for this patch.
If isnan is lowered to integer ops than we'll have a hard time recognizing
it.
> > A canonical high-level representation on GIMPLE is equally important as
> > maybe getting some optimization on a lowered "optimized" form.
> >
> > And a good canonical representation is short (x UNORD x or x UNORD y
> > counts as short here).
> >
> > > > As fallback the expanders should simply emit a call (as done for
> > > > isinf when not folded for example).
> > >
> > > Yes the patch currently already does this.
> >
> > Ah, I probably looked at an old version then.
> >
> > > My question had more to do if
> > > the late expansion when you alias one of these functions in C++ was
> > > really an issue, since It used to (sometimes, only when you got the
> > > type of the argument correct) expand before whereas now it'll always
> > > just emit a function call in this edge case.
> >
> > The answer here is really that we should perform lowering at a point where
> > we do not omit possible optimizations. For the particular case this would
> > mean lowering after IPA optimizations (also think of accelerator offloading
> > which uses the early optimization phase of the host). This also allows early
> > optimization to more accurately estimate code size (when optimizing for
> > size). And I still think whether to expose fp classification as FP or integer ops
> > should be a target decision and done late.
> >
> > This still means canonicalizations like isnan -> UNORD should happen early
> > and during folding.
> >
>
> This feels to me like it's a different patch, whatever transformations you do
> to get isnan or any of the other builtins don't seem relevant. Only that when you
> do generate isnan, you get better and more correct code than before.
>
> If the integer operations are of a concern though, I could add a target hook
> to make this opt-in. The backup code is still there as it gets used when the floating point
> type is not IEEE like.
I am concerned about using integer ops without having a way for the
targets to intervene. This choice would be naturally made at
RTL expansion time (like in your original patch I guess). There
are currently no optabs for isnan() and friends given we do lower
some of them early (to FP ops, and with bogus sNaN behavior).
> > >
> > > >
> > > > That said, split out the uncontroversical part of moving existing
> > > > foldings from builtins.c to match.pd where they generate no control-
> > flow.
> > > >
>
> If that's the direction I have take I will, I am just not convinced the patch will be
> simpler nor smaller...
It makes the intent more visible, a wrong-code fix (adding flag_* checks)
compared to a added optimization. Adding the flags checks is sth
we'd want to backport for example.
Richard.
> Thanks,
> Tamar
>
> > > > Thanks,
> > > > Richard.
> > > >
> > > > > Originally the patch was in expand, which would have covered the
> > > > > late expansion case similar to what it's doing now in trunk. I was
> > > > > however asked to move this to a GIMPLE lowering pass as to give
> > > > > other
> > > > optimizations a chance to do further optimizations on the generated
> > code.
> > > > >
> > > > > This of course works fine for C since these math functions are a
> > > > > Macro in C
> > > > but are functions in C++.
> > > > >
> > > > > C++ would then allow you to do stuff like take the address of the
> > > > > C++ function so
> > > > >
> > > > > void foo(float a) {
> > > > > int (*xx)(...);
> > > > > xx = isnan;
> > > > > if (xx(a))
> > > > > g++;
> > > > > }
> > > > >
> > > > > is perfectly legal. My current patch leaves "isnan" in as a call
> > > > > as by the time we're doing GIMPLE lowering the alias has not been
> > > > > resolved yet, whereas the version currently committed is able to
> > > > > expand it as it's late
> > > > in expand.
> > > > >
> > > > > Curiously the behaviour is somewhat confusing.
> > > > > if xx is called with a non-constant it is expanded as you would
> > > > > expect
> > > > >
> > > > > .LFB0:
> > > > > .cfi_startproc
> > > > > fcvt d0, s0
> > > > > fcmp d0, d0
> > > > > bvs .L4
> > > > >
> > > > > but when xx is called with a constant, say 0 it's not
> > > > >
> > > > > .LFB0:
> > > > > .cfi_startproc
> > > > > mov w0, 0
> > > > > bl isnan
> > > > > cbz w0, .L1
> > > > >
> > > > > because it infers the type of the 0 to be an integer and then
> > > > > doesn't
> > > > recognize the call.
> > > > > using 0.0 works, but the behaviour is really counter intuitive.
> > > > >
> > > > > The question is what should I do now, clearly it would be useful
> > > > > to handle the late expansion as well, however I don't know what
> > > > > the best
> > > > approach would be.
> > > > >
> > > > > I can either:
> > > > >
> > > > > 1) Add two new implementations, one for constant folding and one
> > > > > for
> > > > expansion, but then one in expand would
> > > > > be quite similar to the one in the early lowering phase. The
> > > > > constant
> > > > folding one could be very simple since
> > > > > it's a constant I can just call the buildins and evaluate the
> > > > > value
> > > > completely.
> > > > >
> > > > > 2) Use the patch as is but make another one to allow the renaming
> > > > > to be applied quite early on. e.g while still in Tree or GIMPLE
> > > > > resolve
> > > > >
> > > > > int (*xx)(...);
> > > > > xx = isnan;
> > > > > if (xx(a))
> > > > >
> > > > > to
> > > > >
> > > > > int (*xx)(...);
> > > > > xx = isnan;
> > > > > if (isnan(a))
> > > > >
> > > > > This seems like it would be the best approach and the more
> > > > > useful one in
> > > > general.
> > > > >
> > > > > Any thoughts?
> > > > >
> > > > > Thanks,
> > > > > Tamar
> > > > >
> > > > > >
> > > > > > Richard.
> > > > >
> > > > >
> > > >
> > > > --
> > > > Richard Biener <rguenther@suse.de>
> > > > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham
> > > > Norton, HRB 21284 (AG Nuernberg)
> > >
> > >
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton,
> > HRB 21284 (AG Nuernberg)
>
>
--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)