This is the mail archive of the
mailing list for the GCC project.
Re: RFC: Add of type-demotion pass
- From: Jeff Law <law at redhat dot com>
- To: Richard Biener <richard dot guenther at gmail dot com>
- Cc: Kai Tietz <ktietz at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Jakub Jelinek <jakub at redhat dot com>
- Date: Wed, 16 Oct 2013 10:33:29 -0600
- Subject: Re: RFC: Add of type-demotion pass
- References: <155227895 dot 847667 dot 1373305519786 dot JavaMail dot root at redhat dot com> <525E2B2E dot 6010505 at redhat dot com> <CAFiYyc2czZAkSnuU6KU-z+bUbZXOQKg4BnBtc4d27OPzvmvk_A at mail dot gmail dot com>
On 10/16/13 03:31, Richard Biener wrote:
;) It's a mental block of mine; it's been called type hoisting/sinking
in various contexts and I see parallels between the code motion
algorithms and how the type promotion/demotion exposes unnecessary type
conversions. So I keep calling them hoisting/sinking. I'll try to use
I see two primary effects of type sinking.
Note it was called type demotion ;)
I don't see this as the major benefit of type demotion. Yes, there is
some value in shrinking constants and the like, but in my experience the
benefits are relatively small and often get lost in things like partial
register stalls on x86, the PA and probably others (yes, the PA has
partial register stalls, it's just that nobody used that term).
First and probably the most
important in my mind is by sinking a cast through its uses the various
transformations we already perform are more likely to apply *without*
needing to handle optimizing through typecasts explicitly.
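To make that first effect concrete, here is a minimal C sketch. It is not taken from the patch, and the function names are made up for illustration. In the first function the add is done in a wide type and truncated, so recognizing that the result is just `a` means looking back through the casts. Once the truncation has been pushed onto the operands, the ordinary `(a + b) - b` simplification applies with no cast handling at all. This assumes unsigned (wrapping) arithmetic.

```c
#include <stdint.h>

/* Before demotion: the add happens in 64 bits and is truncated, so the
   subtraction below only simplifies if the optimizer looks through the
   casts on the way back up the use-def chain.  */
uint32_t before_demotion(uint32_t a, uint32_t b)
{
    uint64_t sum = (uint64_t)a + (uint64_t)b;  /* widened add */
    uint32_t t = (uint32_t)sum;                /* truncating cast */
    return t - b;
}

/* After demotion: the truncation has been sunk onto the operands, the add
   is done directly in 32 bits, and (a + b) - b is visible as plain `a`.  */
uint32_t after_demotion(uint32_t a, uint32_t b)
{
    uint32_t t = a + b;  /* wraps modulo 2^32, same low bits as above */
    return t - b;
}
```

Both functions return `a` for every input, including the ones where the 32-bit add wraps.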
I would say it is desirable to express arithmetic in the smallest possible
types (see what premature optimization the C family frontends do
to narrow operations again after C integer promotion applied).
What I really want to get at here is avoiding having a large number of
optimizers looking back through the use-def chains and attempting to
elide typecasts in the middle of a chain of statements of interest.
If the primary goal is to shrink types, then yes, you want to use
whatever information you can, including VRP. But that's not the primary
goal in my mind, at least not at this stage.
You need some kind of range information to do this, thus either integrate
it into VRP (there is already code that does this there) or use range
information from VRP which we now preserve.
There's no reason why this pass couldn't utilize VRP information to
provide more opportunities to demote types and achieve the goal you
want. But I'd consider that a follow-on opportunity.
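As an illustration of why range information matters for some demotions, here is a small C sketch. The `value_range` struct and `can_demote_udiv` helper are hypothetical and are not GCC's VRP data structures. Unlike addition, an unsigned division cannot be narrowed just because its result is truncated; it is only safe when both operands are known to fit in the narrow type.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a range computed by something like VRP.  */
struct value_range {
    uint64_t min;
    uint64_t max;
};

static bool fits_in_u32(struct value_range r)
{
    return r.max <= UINT32_MAX;
}

/* A 64-bit unsigned division may be done in 32 bits only when both
   operands are known to fit in 32 bits.  */
bool can_demote_udiv(struct value_range dividend, struct value_range divisor)
{
    return fits_in_u32(dividend) && fits_in_u32(divisor);
}

/* The division as originally written (caller guarantees b != 0).  */
uint32_t udiv_wide(uint64_t a, uint64_t b)
{
    return (uint32_t)(a / b);
}

/* The demoted form (caller guarantees the low 32 bits of b are nonzero);
   equal to udiv_wide only when can_demote_udiv holds.  */
uint32_t udiv_narrow(uint64_t a, uint64_t b)
{
    return (uint32_t)a / (uint32_t)b;
}
```

With a dividend of 2^32 and a divisor of 2, the wide form yields 0x80000000 while the narrow form yields 0, which is the kind of case the range check has to rule out.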
Yes, I know. And we need to get this back down to a single
implementation. I don't much care which of the 3 implementations we
keep, but it really should just be one and it needs to be reusable.
The second primary effect is, given two casts where the first indirectly
feeds the second (i.e., the first feeds some statement, which then feeds the
second cast), if we're able to sink the first cast, we end up with the first
cast directly feeding the second cast. When this occurs one of the two
casts can often be eliminated. Sadly, I didn't keep any of those test
files, but I regularly saw them in GCC bootstraps.
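Since the original test files were not kept, the following is only a hand-written C sketch of the shape being described, with made-up function names. A widening cast feeds an add, which feeds a narrowing cast. Sinking the widening cast below the add is valid here only because the sole use of the sum is the truncation. The two casts then sit back to back and cancel.

```c
#include <stdint.h>

/* Before: the first cast feeds the add, and the add feeds the second cast.  */
uint32_t casts_separated(uint32_t x, uint32_t y)
{
    uint64_t wx = (uint64_t)x;  /* first cast */
    uint64_t t = wx + y;        /* intervening statement */
    return (uint32_t)t;         /* second cast */
}

/* After sinking the first cast through the add: the casts are adjacent.
   Only the low 32 bits of the sum were ever used, so this is equivalent.  */
uint32_t casts_adjacent(uint32_t x, uint32_t y)
{
    uint32_t t = x + y;
    uint64_t w = (uint64_t)t;   /* first cast, now feeding ...  */
    return (uint32_t)w;         /* ... the second cast directly */
}

/* After cleanup: widen-then-truncate to the same type is a no-op.  */
uint32_t casts_eliminated(uint32_t x, uint32_t y)
{
    return x + y;
}
```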
This transformation is applied both by fold-const.c and by SSA forwprop
(our GIMPLE combiner). Doing it in yet another pass looks wrong
(and it isn't type demotion but also can be promotion).
I probably should have stated this differently -- the second primary
effect is to expose more cases where type conversions can be eliminated
via type promotion/demotion. I don't much care which of the 3 blobs of
code to eliminate the conversions we use -- I do care that we've got a
consistent way to promote/demote conversions to expose the unnecessary
Right. I'm aware of this work and the problem he's trying to solve and
have been loosely watching it -- primarily for the persistent VRP
In contrast to the desire of expressing operations in the smallest required
type there is the desire of exposing the effect of PROMOTE_MODE on
GIMPLE instead of only during RTL expansion. This is because the
truncations (sext and zext) PROMOTE_MODE introduced are
easier to optimize away when range information is available (see the
attempts by Kugan of Linaro to address this at RTL expansion time).
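A rough C model of that point, under stated assumptions: the target keeps 8-bit values in 32-bit registers, and the zero-extension that PROMOTE_MODE would introduce is modelled as a mask. The function names are invented for illustration and nothing here is GCC code. When the operand ranges guarantee the sum fits in 8 bits, the extension does nothing and can be dropped; otherwise it cannot.

```c
#include <stdint.h>

/* 8-bit add on a promoting target: the result is re-normalized with a
   zero-extension, modelled here as a mask.  */
uint32_t add_u8_with_zext(uint32_t a, uint32_t b)
{
    return (a + b) & 0xFFu;
}

/* The same add with the extension removed; only valid when the ranges of
   a and b guarantee a + b <= 255.  */
uint32_t add_u8_without_zext(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Counts operand pairs in [0, max_a] x [0, max_b] for which dropping the
   extension changes the result.  Intended for small bounds only.  */
unsigned long count_mismatches(uint32_t max_a, uint32_t max_b)
{
    unsigned long mismatches = 0;
    for (uint32_t a = 0; a <= max_a; a++)
        for (uint32_t b = 0; b <= max_b; b++)
            if (add_u8_with_zext(a, b) != add_u8_without_zext(a, b))
                mismatches++;
    return mismatches;
}
```

With ranges [0, 100] and [0, 155] the sum never exceeds 255, so there are no mismatches; widening the first range to [0, 200] introduces some.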
I thought we'd done this at a high level already. At the heart of this
work is to:
Similarly, I know there's a type hoisting patch that's also queued up. I
think it should be handled separately as well.
I think we need to paint a picture of the final result - what is the
main objective of the various(?!) passes in question? Where do
we do the same kind of transformation already?
1. Isolate, to the fullest extent possible, code which promotes and
demotes types. We have this stuff all over the place right now
and it's very ad-hoc.
2. Promote/demote types to allow our optimizers to not concern
themselves with walking back through type conversions when applying
3. Promote/demote types to expose unnecessary type conversions.
If we look at #2 and #3 we can expect that we'd want a structure which
allows for a simplification/optimization step to occur after types are
promoted or demoted, i.e., a pipeline that looks like:
promote types -> optimize1 -> demote types -> optimize2
Now where that little mini pipeline lands is still a big question to me.
optimize1 may be a fairly significant hunk of our pipeline. optimize2
probably isn't (may just be a final tree-ssa-forwprop pass).
Placement is one of the biggest questions in my mind. If I think about
something like the old SGI compiler, they did a very early promotion,
then lowered/demoted and got reasonable results with it.
We have no pass that tries to promote or demote the types of
variables using a data-flow approach (VRP comes closest,
but the transform is again pattern-matching, thus combine-like).
I do not object to adding this kind of pass, but I suggest
looking at the target's desires when implementing it - which eventually
means honoring PROMOTE_MODE (be careful about pass
placement here - you want this after loop optimizations like
vectorization but possibly before induction variable optimization).
As far as dealing with the target dependencies, there's no clear "this
is best". I vaguely recall discussions with Kai where we decided that
handling PROMOTE_MODE was relatively easy from a coding standpoint --
it's more a matter of where does that fit into the entire optimization
pipeline. I could make arguments either way.