This is the mail archive of the mailing list for the GCC project.


Re: RFC: Add of type-demotion pass

On Tue, Oct 22, 2013 at 4:27 PM, Jakub Jelinek <> wrote:
> On Fri, Oct 18, 2013 at 12:06:35PM +0200, Richard Biener wrote:
>> You can't move type conversions "out of the way" in most cases, as
>> GIMPLE is strongly typed and data sources and sinks obviously cannot be
>> "promoted" (nor can function arguments).  So you'll very likely not be
>> able to remove the code from the optimizers; it will just trigger less
>> often.
> My take on the type demotion and promotion is that we badly need it and the
> question is just in which pass to do it.
> The benefit of type demotion is code canonicalization and the removal of
> unnecessary computation that e.g. only affects the upper bits, which are
> going to be thrown away anyway; the disadvantage of demoting signed
> operations is that we need to perform them in an unsigned type instead, and
> thus we can't perform some loop optimizations based on undefined behavior,
> etc.  See e.g.
> for some testcases where type demotion can improve the generated code.
> If types are demoted, upper bits of constants go away, SCCVN can find
> equivalences between SSA_NAMEs that wouldn't be considered before, etc.

Indeed, demotion for this reason is good and important (we may also be able
to remove the code in the front ends and in fold that "shortens" operations).

> But given the issue with demoting signed operations, I think before the
> loop optimizations we should only do type demotions that don't turn
> previously undefined-behavior operations into defined ones.

But the demotion pass could fill in range information, which might allow us
to recover part of the undefinedness.

>  I guess passes like
> forwprop, gimple-fold etc. could easily handle the easy cases, where there
> is a tree of has_single_use SSA_NAMEs that can be demoted, but handling
> a more complicated web would be harder.  Say in:
> unsigned int a, b, c, d, e, f; unsigned char h, i, j;
> void
> foo (void)
> {
>   unsigned int k = a * 2 + b + 0x12340000;
>   unsigned int l = c * 4 + d + 0x23456700;
>   unsigned int m = e * 5 + f, n = k + l - m, o = k - l + m, p = -k + 1;
>   h = n; i = o; j = p;
> }
> k, l, m all have multiple imm uses, but still pretty much everything in
> this function could be demoted to unsigned char; the two large constants
> could go away as additions of zero, etc.  Perhaps that can be seen as
> little benefit, but what if the above is all
> s/unsigned int/unsigned long long/;s/unsigned char/unsigned int/ on a
> 32-bit target?  The RTL subreg pass might help a little bit, but that is
> too late.

Yeah, I'd like to see testcases like this with the expected outcome.

> For the demotion that changes undefined-overflow operations to defined
> ones, I wonder which is the last pass that usefully makes use of that
> information, i.e. whether we could do the full type demotion already before
> vectorization, somewhere in the loop optimization queue, or whether that is
> still too early.

The most important user is the number-of-iterations analysis.

> Where type demotion and promotion are IMHO very important is vectorization:
> the code we generate for mixed-type vectorization is just huge and
> terrible.  If we can help it by not computing useless upper bits, or, on
> the other hand, by sometimes not doing parts of the computation in smaller
> types (which forces all the other computations, on wider types, to be done
> with a bigger vectorization factor), we could improve the generated code
> quality.
> I wonder if for vectorization we couldn't use the same approach I wrote
> recently for if-conversion, for bbs potentially suitable for vectorization
> (with the right loop form etc.): if we don't do full type demotion before
> vectorization, check whether we'd demote anything, and if so work only on
> the vectorization-only loop copy (or create it), then try to do some type
> promotion to minimize the number of distinct type sizes in the loop; see
> the (admittedly artificial) testcase for what I mean.  After demotion, we
> could replace the casts from short to char and back just with an AND (for
> zero extension) or a shift left + signed shift right (for sign extension),
> etc.
> And, finally, the question is whether we generate good code if we just
> expand to RTL from the demoted types (we'd better, because the user could
> have written the code in the narrower types from the beginning; well, C's
> implicit promotions make that harder, but fold-const already demotes some
> computations that appear within a single statement), or whether there are
> advantages to promoting some types, and if so, what algorithm to use for
> that, what cost model, what target hooks, etc.

I guess experiments will show that not doing the promotion again will
regress things.  Ideally we'd promote in a target-specific way, just like
expand would do (and then adjust targets that don't do aggressive promotion
to cope with the aggressive demotion now done earlier).


>         Jakub
