This is the mail archive of the
mailing list for the GCC project.
Re: RFC: Add of type-demotion pass
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Jeff Law <law at redhat dot com>
- Cc: Kai Tietz <ktietz at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Jakub Jelinek <jakub at redhat dot com>
- Date: Fri, 18 Oct 2013 12:06:35 +0200
- Subject: Re: RFC: Add of type-demotion pass
- Authentication-results: sourceware.org; auth=none
- References: <155227895 dot 847667 dot 1373305519786 dot JavaMail dot root at redhat dot com> <525E2B2E dot 6010505 at redhat dot com> <CAFiYyc2czZAkSnuU6KU-z+bUbZXOQKg4BnBtc4d27OPzvmvk_A at mail dot gmail dot com> <525EBFD9 dot 4000509 at redhat dot com> <CAFiYyc1mrkLbOzu0t-bT_XwQ4y1R8XBHWJNkj3wkewwwOWBXHA at mail dot gmail dot com> <52603B61 dot 7020101 at redhat dot com>
On Thu, Oct 17, 2013 at 9:32 PM, Jeff Law <law at redhat dot com> wrote:
> On 10/17/13 04:41, Richard Biener wrote:
>>> I don't see this as the major benefit of type demotion. Yes, there is
>>> value in shrinking constants and the like, but in my experience the
>>> are relatively small and often get lost in things like partial register
>>> stalls on x86, the PA and probably others (yes, the PA has partial register
>>> stalls, it's just that nobody used that term).
>>> What I really want to get at here is avoiding having a large number of
>>> optimizers looking back through the use-def chains and attempting to
>>> typecasts in the middle of a chain of statements of interest.
>> Hmm, off the top of my head only forwprop and VRP look back through
>> use-def chains to elide typecasts. And they do that to optimize those
>> casts, thus it is their job ...? Other cases are around, but those
>> of the sort "is op1 available in type X and/or can I safely cast
>> it to type X?", and that code isn't going to be simplified by generic
>> promotion / demotion because that code isn't going to know what
>> type pass Y in the end wants.
> I strongly suspect if we were to look hard at why various optimizations
> weren't being applied in cases where intuitively we think they should, we'd
> find that type conversions are often the culprit.
> And so we'd go off fixing the vectorizer, DOM, and god knows what else to
> start looking through the type conversions. I want to stop this before it
> I'm *certain* that to do this well, we're going to need a mess of additional
> cases in tree-ssa-forwprop.c based on my prior investigations. A large part
> of the reason I stopped with that work was I could already see the code was
> ultimately going to be an utter mess.
>> Abstracting functions that can answer those questions instead of
>> repeating N variants of it would of course be nice.
> Or we can move the type conversions out of the way so they don't impact our
You can't move type conversions "out of the way" in most cases, as
GIMPLE is strongly typed
and data sources and sinks can obviously not be "promoted" (nor can
So you'll very likely not be able to remove the code from the
optimizers; it will only maybe trigger less often.
>> Likewise reducing the number of places we perform promotion / demotion
>> (remove it from frontend code and fold, add it in the GIMPLE combiner).
>> Also making the GIMPLE combiner available as a utility to apply
>> to a single statement (see my very original GIMPLE-fold proposal)
>> would be very useful.
> I strongly believe the gimple combiner is not the place to handle
> promotion/demotion based on already working through some of these issues
> privately. It was that investigative work which led me to look more closely
> at what Kai was doing with the promotion/demotion work.
>> As for promotion / demotion (if you are not talking about applying
>> PROMOTE_MODE which rather forces promotion of variables and
>> requires inserting compensation code), you want to optimize
>> op1 = (T) op1';
>> op2 = (T) op2';
>> x = op1 OP op2; (*)
>> y = (T2) x;
>> to either carry out OP in type T2 or in a type derived from the types
>> of op1' and op2'.
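
To make the quoted sequence concrete, here is a minimal C sketch of the transformation under discussion. It assumes T is uint64_t, op1' and op2' are uint32_t, and T2 is either uint32_t or uint8_t; the function names are hypothetical and none of this is GCC code. It shows that an addition can be carried out directly in the narrow type, while a division cannot simply be demoted below the width of its operands, which is the sort of per-operation legality check a demotion transform has to make.

```c
#include <stdint.h>

/* Wide form of the quoted sequence: T = uint64_t, T2 = uint32_t (assumed). */
uint32_t add_wide(uint32_t a, uint32_t b)
{
    uint64_t op1 = (uint64_t) a;   /* op1 = (T) op1'   */
    uint64_t op2 = (uint64_t) b;   /* op2 = (T) op2'   */
    uint64_t x = op1 + op2;        /* x = op1 OP op2   */
    return (uint32_t) x;           /* y = (T2) x       */
}

/* Demoted form: OP carried out in the type of op1' and op2'.
   Truncation commutes with modular addition, so the result is identical. */
uint32_t add_narrow(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Counterexample with T2 = uint8_t (assumed): division in the wide type. */
uint8_t div_wide(uint32_t a, uint32_t b)
{
    uint64_t x = (uint64_t) a / (uint64_t) b;
    return (uint8_t) x;
}

/* Truncating the operands to 8 bits before dividing changes the result,
   so this is NOT a valid demotion of div_wide. */
uint8_t div_narrow(uint32_t a, uint32_t b)
{
    return (uint8_t) ((uint8_t) a / (uint8_t) b);
}
```

For example, add_wide and add_narrow agree even when the sum wraps, whereas div_wide(256, 2) and div_narrow(256, 2) differ because the low 8 bits of 256 are zero.
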
> That's part of the benefit, but you also want to be able to look at where
> op1' and op2' came from and possibly do something even more significant than
> just changing the type of OP. Getting the casts out of the way makes that a
> lot easier. And that's one of the reasons why you want both promotion and
> demotion: both expose those kinds of opportunities.
>> For the simple case combine-like pattern matching is ok. It gets
>> more complicated if there are a series of statements here (*), but
>> even that case is handled by iteratively applying the combiner
>> patterns (which forwprop does).
> Right, but you're still missing the point that every time a type conversion
> appears in a stream of interesting statements, you have to special case
> the optimization to deal with the type conversions.
> With Kai's work that special casing goes away and thus our existing
> reassociation & forwprop passes do a better job without needing a ton of
> special cases.
See above - you can't remove the special casing.
>> If you split out promotion / demotion into a separate pass then
>> you introduce pass ordering issues as combining may introduce
>> promotion / demotion opportunities and the other way around.
> Right, which is why you promote, optimize, demote, optimize. Both promotion
> and demotion have the potential to expose optimizable sequences.
> It's not perfect, but it's a hell of a lot better than what we do now.
I'm not sure ;) Keep an eye on compile-time.
>> If we remove the ad-hoc frontend code and strip down fold then an
>> early combine phase (before CSE wrecks single-use cases) will
>> more reliably handle what frontends and fold do. Conveniently the first
>> forwprop is already placed very early.
> But again, you're burdening every transformation in forwprop with being
> aware that there may be type conversions mid-stream and having to deal with
> them. So consider a slightly different approach where we promote, run
> forwprop, demote, run forwprop, all before PRE/DOM, etc. wreck the single use
> cases.
The fact is that conversions mid-stream cannot simply be ignored. If they can
be removed, then a combiner pattern can possibly remove them, which will make
the transform that only works without them trigger subsequently.
The proposed patch doesn't add a single testcase, nor does it remove any
special code from other optimizations, so it is hard to see what it
tries to enable that doesn't already work.
>>> As far as dealing with the target dependencies, there's no clear "this is
>>> best". I vaguely recall discussions with Kai where we decided that
>>> PROMOTE_MODE was relatively easy from a coding standpoint -- it's more a
>>> matter of where does that fit into the entire optimization pipeline. I
>>> could make arguments either way.
>> One thing is honoring PROMOTE_MODE for deciding what types
>> to promote/demote to, another thing is applying PROMOTE_MODE
>> at some point during GIMPLE optimizations with the goal to remove
>> its handling from RTL expansion (I'd really like to move most of
>> RTL expansion's side effects such as PROMOTE_MODE or
>> strict-align bitfield memory stuff to GIMPLE).
> Can we please deal with PROMOTE_MODE independently from Kai's initial work?
> Kai's work may make it easier to implement what you want, but Kai's work has
> significant value independently of using it to reimplement PROMOTE_MODE in a
> better place in the pipeline.
I think it is related in a way because PROMOTE_MODE has the issue that it
introduces tons of unnecessary casts if done naively. So the pass, if it works
properly, has to show that if we apply PROMOTE_MODE as a "cost model" it
will remove most of the unnecessary sign-/zero-extensions (and you'll quickly
find out that with strongly typed GIMPLE this gets interesting).
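
As a rough illustration of the unnecessary extensions mentioned above, here is a minimal C sketch. It assumes a target that keeps 8-bit values in 32-bit registers and uses a mask with 0xFF to stand in for a zero-extension; the function names are hypothetical and this is not how GCC represents PROMOTE_MODE. The naive version re-extends after every operation, while the second version extends once at the end and computes the same value.

```c
#include <stdint.h>

/* Naive promotion (assumed model): the 8-bit intermediate is
   zero-extended back into its 32-bit register after every addition. */
uint32_t sum3_naive(uint8_t a, uint8_t b, uint8_t c)
{
    uint32_t t = ((uint32_t) a + (uint32_t) b) & 0xFFu;  /* extension 1 */
    t = (t + (uint32_t) c) & 0xFFu;                      /* extension 2 */
    return t;
}

/* Same computation with the intermediate extension removed: only the
   final result is reduced to 8 bits. The low 8 bits of a sum depend only
   on the low 8 bits of its operands, so extension 1 was redundant. */
uint32_t sum3_single_ext(uint8_t a, uint8_t b, uint8_t c)
{
    uint32_t t = (uint32_t) a + (uint32_t) b + (uint32_t) c;
    return t & 0xFFu;
}
```

Both functions return the same value for every input; with a longer chain of operations the naive form pays one extension per statement, which is the overhead a promotion/demotion pass would have to show it can remove.
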