This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: RFC: Add of type-demotion pass

From: Richard Biener <richard dot guenther at gmail dot com>
To: Jeff Law <law at redhat dot com>
Cc: Kai Tietz <ktietz at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Jakub Jelinek <jakub at redhat dot com>
Date: Thu, 17 Oct 2013 12:41:55 +0200
Subject: Re: RFC: Add of type-demotion pass
Authentication-results: sourceware.org; auth=none
References: <155227895 dot 847667 dot 1373305519786 dot JavaMail dot root at redhat dot com> <525E2B2E dot 6010505 at redhat dot com> <CAFiYyc2czZAkSnuU6KU-z+bUbZXOQKg4BnBtc4d27OPzvmvk_A at mail dot gmail dot com> <525EBFD9 dot 4000509 at redhat dot com>

On Wed, Oct 16, 2013 at 6:33 PM, Jeff Law <law@redhat.com> wrote:
> On 10/16/13 03:31, Richard Biener wrote:
>>>
>>> I see two primary effects of type sinking.
>>
>>
>> Note it was called type demotion ;)
>
> ;)  It's a mental block of mine; it's been called type hoisting/sinking in
> various contexts and I see parallels between the code motion algorithms and
> how the type promotion/demotion exposes unnecessary type conversions.  So I
> keep calling them hoisting/sinking.  I'll try to use promotion/demotion.
>
>>
>>>   First and probably the most
>>> important in my mind is by sinking a cast through its uses the various
>>> transformations we already perform are more likely to apply *without*
>>> needing to handle optimizing through typecasts explicitly.
>>
>>
>> I would say it is desirable to express arithmetic in the smallest possible
>> types (see what premature optimization the C family frontends do
>> to narrow operations again after C integer promotion has been applied).
>
> I don't see this as the major benefit of type demotion.  Yes, there is some
> value in shrinking constants and the like, but in my experience the benefits
> are relatively small and often get lost in things like partial register
> stalls on x86, the PA and probably others (yes, the PA has partial register
> stalls, it's just that nobody used that term).
>
> What I really want to get at here is avoiding having a large number of
> optimizers looking back through the use-def chains and attempting to elide
> typecasts in the middle of a chain of statements of interest.

Hmm, off the top of my head only forwprop and VRP look back through
use-def chains to elide typecasts.  And they do that to optimize those
casts, thus it is their job ...?  Other cases exist, but those are of
the sort "is op1 available in type X and/or can I safely cast it to
type X?".  That code isn't going to be simplified by generic
promotion / demotion, because generic promotion / demotion cannot
know what type pass Y wants in the end.

Abstracting functions that can answer those questions instead of
repeating N variants of it would of course be nice.
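
To make the "can I safely cast it to type X?" question concrete, here is
a rough sketch of what one such shared helper could look like.  It is
not GCC code: struct value_range and range_fits_type are hypothetical
names, and the range is assumed to come from something like the
preserved VRP information.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the value range a pass has recorded. */
struct value_range {
    int64_t min;
    int64_t max;
};

/* Return true if every value in R is representable in an integer type
   of BITS bits with the given signedness.  BITS is limited to 1..63 so
   that the shifts below stay well defined. */
static bool range_fits_type(struct value_range r, unsigned bits,
                            bool is_unsigned)
{
    int64_t lo, hi;

    if (bits == 0 || bits > 63 || r.min > r.max)
        return false;

    if (is_unsigned) {
        lo = 0;
        hi = (int64_t) ((UINT64_C(1) << bits) - 1);
    } else {
        hi = (int64_t) ((UINT64_C(1) << (bits - 1)) - 1);
        lo = -hi - 1;
    }
    return r.min >= lo && r.max <= hi;
}

/* A few sanity checks on the helper. */
static void check_range_fits_type(void)
{
    assert(range_fits_type((struct value_range){0, 255}, 8, true));
    assert(!range_fits_type((struct value_range){0, 256}, 8, true));
}
```

A caller would ask, for example, whether a range of [0, 255] fits an
unsigned 8-bit type before narrowing an operand, instead of carrying
its own copy of the bounds arithmetic.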

Likewise reducing the number of places we perform promotion / demotion
(remove it from frontend code and fold, add it in the GIMPLE combiner).

Also making the GIMPLE combiner available as a utility to apply
to a single statement (see my very original GIMPLE-fold proposal)
would be very useful.

As for promotion / demotion (if you are not talking about applying
PROMOTE_MODE which rather forces promotion of variables and
requires inserting compensation code), you want to optimize

 op1 = (T) op1';
 op2 = (T) op2';
 x = op1 OP op2; (*)
 y = (T2) x;

to either carry out OP in type T2 or in a type derived from the types
of op1' and op2'.
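
A source-level sketch of that pattern, written in C rather than GIMPLE.
The choices here are assumptions for illustration only: T is unsigned
long long, T2 and the types of op1' and op2' are unsigned int, and
unsigned long long is assumed to be wider than unsigned int.  The
function names are made up.

```c
#include <assert.h>
#include <limits.h>

/* Original shape: widen both operands to T, carry out OP ('*') in T,
   then convert the result to T2. */
static unsigned int mul_wide(unsigned int op1p, unsigned int op2p)
{
    unsigned long long op1 = (unsigned long long) op1p;
    unsigned long long op2 = (unsigned long long) op2p;
    unsigned long long x = op1 * op2;
    return (unsigned int) x;
}

/* Demoted shape: OP carried out directly in T2.  This is valid for '*'
   because unsigned arithmetic wraps modulo 2^N, so the low bits agree. */
static unsigned int mul_narrow(unsigned int op1p, unsigned int op2p)
{
    return op1p * op2p;
}

/* A series of statements in the wide type: add, then divide. */
static unsigned int avg_wide(unsigned int a, unsigned int b)
{
    unsigned long long x = (unsigned long long) a + (unsigned long long) b;
    x = x / 2;
    return (unsigned int) x;
}

/* Naive demotion of the whole chain.  NOT equivalent: the add can wrap
   in the narrow type before the division sees the value. */
static unsigned int avg_narrow(unsigned int a, unsigned int b)
{
    return (a + b) / 2;
}

/* The single-operation case agrees; the two-statement case does not. */
static void check_demotion(void)
{
    assert(mul_wide(UINT_MAX, UINT_MAX) == mul_narrow(UINT_MAX, UINT_MAX));
    assert(avg_wide(UINT_MAX, 1u) != avg_narrow(UINT_MAX, 1u));
}
```

The multiply can be narrowed unconditionally, while the add-then-divide
chain can only be narrowed when something (range information, for
instance) shows the intermediate sum cannot wrap.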

For the simple case combine-like pattern matching is ok.  It gets
more complicated if there is a series of statements at (*), but
even that case is handled by iteratively applying the combiner
patterns (which forwprop does).

If you split out promotion / demotion into a separate pass then
you introduce pass ordering issues as combining may introduce
promotion / demotion opportunities and the other way around.

That wouldn't apply to a pass lowering GIMPLE to fully honor
PROMOTE_MODE.

>> You need some kind of range information to do this, thus either integrate
>> it into VRP (there is already code that does this there) or use range
>> information from VRP which we now preserve.
>
> If the primary goal is to shrink types, then yes, you want to use whatever
> information you can, including VRP.  But that's not the primary goal in my
> mind, at least not at this stage.
>
> There's no reason why this pass couldn't utilize VRP information to provide
> more opportunities to demote types and achieve the goal you want.  But I'd
> consider that a follow-on opportunity.
>
>>
>>> The second primary effect is, given two casts where the first indirectly
>>> feeds the second (ie, the first feeds some statement, which then feeds the
>>> second cast), if we're able to sink the first cast, we end up with the first
>>> cast directly feeding the second cast.  When this occurs one of the two
>>> casts can often be eliminated.   Sadly, I didn't keep any of those test
>>> files, but I regularly saw them in GCC bootstraps.
>>
>>
>> This transformation is applied both by fold-const.c and by SSA forwprop
>> (our GIMPLE combiner).  Doing it in yet another pass looks wrong
>> (and it isn't type demotion but also can be promotion).
>
> Yes, I know.  And we need to get this back down to a single implementation.
> I don't much care which of the 3 implementations we keep, but it really
> should just be one and it needs to be reusable.
>
> I probably should have stated this differently -- the second primary effect
> is to expose more cases where type conversions can be eliminated via type
> promotion/demotion.  I don't much care which of the 3 blobs of code to
> eliminate the conversions we use -- I do care that we've got a consistent
> way to promote/demote conversions to expose the unnecessary type
> conversions.

Sure.
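
For readers without a test file at hand, a small C sketch of the kind of
chain meant above: a widening cast feeds a statement, which feeds a
narrowing cast.  The types and function names are assumptions chosen
for illustration (unsigned int as the narrow type, unsigned long long
as the wide one); this is not taken from a GCC bootstrap.

```c
#include <assert.h>

typedef unsigned long long wide_t;

/* Before: the first cast feeds the xor, which feeds the second cast. */
static unsigned int xor_before(unsigned int a, unsigned int b)
{
    wide_t w = (wide_t) a;        /* first cast */
    wide_t t = w ^ (wide_t) b;    /* statement between the two casts */
    return (unsigned int) t;      /* second cast */
}

/* After sinking the first cast below the xor, the two casts are adjacent. */
static unsigned int xor_sunk(unsigned int a, unsigned int b)
{
    unsigned int t = a ^ b;       /* operation now in the narrow type */
    wide_t w = (wide_t) t;        /* first cast, sunk below the xor */
    return (unsigned int) w;      /* second cast, fed directly by the first */
}

/* Widening followed by narrowing back is a no-op, so both casts go away. */
static unsigned int xor_final(unsigned int a, unsigned int b)
{
    return a ^ b;
}

/* All three shapes compute the same value. */
static void check_cast_chain(void)
{
    assert(xor_before(0xF0F0u, 0x0FF0u) == xor_sunk(0xF0F0u, 0x0FF0u));
    assert(xor_sunk(0xF0F0u, 0x0FF0u) == xor_final(0xF0F0u, 0x0FF0u));
}
```

Whichever implementation ends up removing the adjacent pair, the point
is that the pair only becomes visible once the first cast has moved.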

>> In contrast to the desire of expressing operations in the smallest required
>> type there is the desire of exposing the effect of PROMOTE_MODE on
>> GIMPLE instead of only during RTL expansion.  This is because the
>> truncations (sext and zext) PROMOTE_MODE introduced are
>> easier to optimize away when range information is available (see the
>> attempts to address this at RTL expansion time from Kugan from Linaro).
>
> Right.  I'm aware of this work and the problem he's trying to solve and have
> been loosely watching it -- primarily for the persistent VRP information.
>
>>> Similarly, I know there's a type hoisting patch that's also queued up. I
>>> think it should be handled separately as well.
>>
>>
>> I think we need to paint a picture of the final result - what is the
>> main objective of the various(?!) passes in question?  Where do
>> we do the same kind of transformation already?
>
> I thought we'd done this at a high level already.  At the heart of this work
> is to:
>
>   1. Isolate, to the fullest extent possible, code which promotes and
>      demotes types.  We have this stuff all over the place right now
>      and it's very ad-hoc.
>
>   2. Promote/demote types to allow our optimizers to not concern
>      themselves with walking back through type conversions when applying
>      optimizations.
>
>   3. Promote/demote types to expose unnecessary type conversions.
>
>
> If we look at #2 and #3 we can expect that we'd want a structure which
> allows for a simplification/optimization step to occur after types are
> promoted or demoted.  ie, a pipeline that looks like:
>
> promote types -> optimize1 -> demote types -> optimize2
>
> Now where that little mini pipeline lands is still a big question to me.
> optimize1 may be a fairly significant hunk of our pipeline.  optimize2
> probably isn't (may just be a final tree-ssa-forwprop pass).
>
>
>>
>> We have no pass that tries to promote or demote the types of
>> variables using a data-flow approach (VRP comes closest,
>> but the transform is again pattern-matching, thus combine-like).
>> I do not object to adding this kind of pass, but I suggest
>> looking at the target's desires when implementing it - which eventually
>> means to honor PROMOTE_MODE (be careful about pass
>> placement here - you want this after loop optimizations like
>> vectorization but possibly before induction variable optimization).
>
> Placement is one of the biggest questions in my mind.  If I think about
> something like the old SGI compiler, they did a very early promotion, then
> lowered/demoted and got reasonable results with it.

If we remove the ad-hoc frontend code and strip down fold then an
early combine phase (before CSE wrecks single-use cases) will
more reliably handle what frontends and fold do.  Conveniently the first
forwprop is already placed very early.

> As far as dealing with the target dependencies, there's no clear "this is
> best".  I vaguely recall discussions with Kai where we decided that handling
> PROMOTE_MODE was relatively easy from a coding standpoint -- it's more a
> matter of where does that fit into the entire optimization pipeline.  I
> could make arguments either way.

One thing is honoring PROMOTE_MODE for deciding what types
to promote/demote to, another thing is applying PROMOTE_MODE
at some point during GIMPLE optimizations with the goal to remove
its handling from RTL expansion (I'd really like to move most of
RTL expansion's side effects such as PROMOTE_MODE or
strict-align bitfield memory stuff to GIMPLE).
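
As a rough model of why range information helps with the extensions
PROMOTE_MODE introduces, consider the C sketch below.  It is only an
analogy under stated assumptions: an 8-bit unsigned value is modelled
as living in a 32-bit register, the explicit mask stands in for the
zext that would be inserted after the operation, and the [0, 100]
input range is a made-up example of what VRP might know.

```c
#include <assert.h>
#include <stdint.h>

/* 8-bit add on a promoted value: the mask models the zero extension
   needed to keep the 32-bit register a valid 8-bit value. */
static uint32_t add_u8_promoted(uint32_t a, uint32_t b)
{
    return (a + b) & 0xFFu;
}

/* Same add with the extension dropped.  Only valid when both inputs
   are known to lie in [0, 100], so the sum is at most 200 < 256. */
static uint32_t add_u8_promoted_ranged(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Count inputs in [0, 100] for which the two forms disagree. */
static int count_mismatches_in_range(void)
{
    int mismatches = 0;
    for (uint32_t a = 0; a <= 100; a++)
        for (uint32_t b = 0; b <= 100; b++)
            if (add_u8_promoted(a, b) != add_u8_promoted_ranged(a, b))
                mismatches++;
    return mismatches;
}

/* Inside the known range the extension is redundant. */
static void check_promoted_add(void)
{
    assert(count_mismatches_in_range() == 0);
}
```

Outside the assumed range the forms differ (200 + 100 gives 44 with the
mask and 300 without), which is why the extension cannot be dropped
without the range.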

Richard.

> Jeff
>

