This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 2/3][AArch64] Improve zero extend

Richard Earnshaw wrote:
> I'm not sure about this, while rtx_cost is called recursively as it
> walks the RTL, I'd normally expect the outer levels of the recursion to
> catch the cases where zero-extend is folded into a more complex
> operation.  Hitting a case like this suggests that something isn't doing
> that correctly.

As mentioned, the query is about an non-existent instruction, so the existing
rtx_cost code won't handle it. In fact there is no other check for "outer" anywhere
in aarch64_rtx_cost. We either assume outer == SET or know that if it isn't, the
expression will be split.

> So what was the top-level RTX passed into rtx_cost?  I'd like to get a
> better understanding about the use case before acking this patch.

An example would be:

long f(unsigned x) { return (long)x * 20; }

Combine tries to merge the constant into the multiply, so we get this cost query:

(mult:DI (zero_extend:DI (reg/v:SI 74 [ x ]))
    (const_int 20 [0x14]))

Given this is not a legal multiply, rtx_mult_cost recurses, assuming both the
zero_extend and the immediate are going to be split off. But then the zero_extend
is a SET, ie. a zero-cost operation. So not checking outer is correct.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]