This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] Tighten memory type assumption in RTL combiner pass.
- From: Venkataramanan Kumar <venkataramanan dot kumar at linaro dot org>
- To: Jeff Law <law at redhat dot com>, rth at redhat dot com
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Maxim Kuvyrkov <maxim dot kuvyrkov at linaro dot org>, segher at kernel dot crashing dot org, Richard Earnshaw <rearnsha at arm dot com>, bernds at codesourcery dot com
- Date: Fri, 16 Jan 2015 12:43:38 +0530
- Subject: Re: [RFC] Tighten memory type assumption in RTL combiner pass.
- Authentication-results: sourceware.org; auth=none
- References: <CAJK_mQ3R90TsoKhoA5WMkPjn31UO8bkqc5Bwhaj9t+SyQ982Cg at mail dot gmail dot com> <54B6E243 dot 7040404 at redhat dot com>
Hi jeff and Richard
On 15 January 2015 at 03:10, Jeff Law <law@redhat.com> wrote:
> On 01/14/15 04:27, Venkataramanan Kumar wrote:
>>
>> Hi all,
>>
>> When trying to debug GCC combiner pass with the test case in PR63949
>> ref https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63949 I came across this
>> code.
>>
>> This code in "make_compound_operation" assumes that all PLUS and MINUS
>> RTX are "MEM" type for scalar int modes and tries to optimize based on
>> that assumption.
>>
>> /* Select the code to be used in recursive calls. Once we are inside an
>> address, we stay there. If we have a comparison, set to COMPARE,
>> but once inside, go back to our default of SET. */
>>
>> next_code = (code == MEM ? MEM
>> : ((code == PLUS || code == MINUS)
>> && SCALAR_INT_MODE_P (mode)) ? MEM
>> : ((code == COMPARE || COMPARISON_P (x))
>> && XEXP (x, 1) == const0_rtx) ? COMPARE
>> : in_code == COMPARE ? SET : in_code);
>>
>> next_code is passed as in_code via recursive calls to
>> "make_compound_operation". Based on that we are converting shift
>> pattern to MULT pattern.
>>
>> case ASHIFT:
>> /* Convert shifts by constants into multiplications if inside
>> an address. */
>> if (in_code == MEM && CONST_INT_P (XEXP (x, 1))
>> && INTVAL (XEXP (x, 1)) < HOST_BITS_PER_WIDE_INT
>> && INTVAL (XEXP (x, 1)) >= 0
>> && SCALAR_INT_MODE_P (mode))
>> {
>>
>> Now I tried to tighten it further by adding check to see in_code is
>> also MEM type.
>> Not sure if this right thing to do. But this assumption about MEM
>> seems to be very relaxed before.
>>
>> diff --git a/gcc/combine.c b/gcc/combine.c
>> index 101cf35..1353f54 100644
>> --- a/gcc/combine.c
>> +++ b/gcc/combine.c
>> @@ -7696,7 +7696,8 @@ make_compound_operation (rtx x, enum rtx_code
>> in_code)
>>
>> next_code = (code == MEM ? MEM
>> : ((code == PLUS || code == MINUS)
>> - && SCALAR_INT_MODE_P (mode)) ? MEM
>> + && SCALAR_INT_MODE_P (mode)
>> + && (in_code == MEM)) ? MEM
>> : ((code == COMPARE || COMPARISON_P (x))
>> && XEXP (x, 1) == const0_rtx) ? COMPARE
>> : in_code == COMPARE ? SET : in_code);
>>
>>
>> This passed bootstrap on x86_64 and GCC tests are not regressing.
>> On Aarch64 passed bootstrap tests, test case in PR passed, but few
>> tests failed (failed to generate adds and subs), because there are
>> patterns (extended adds and subs) based on multiplication only in
>> Aarch64 backend.
>>
>> if this change is correct then I may need to add patterns in Aarch64
>> based on shifts. Not sure about targets also.
>>
>> Requesting further comments/help about this.
>>
>> I am looking to get it fixed in stage 1.
>
> So the first question I would ask here is what precisely are you trying to
> accomplish? Is there some code where making this change is important or is
> it strictly a theoretical problem? If the latter, can we make it concrete
> with a well crafted testcase?
>
> jeff
The below test case which I am working on is from the PR63949
int subsi_sxth (int a, short i)
{
/* { dg-final { scan-assembler "sub\tw\[0-9\]+,.*sxth #?1" } } */
return a - ((int)i << 1);
}
The expression "a - ((int)i << 1)" is not a memory expression.
But combiner assumes that MINUS RTX as memory, and process all RTX
sub expressions with the memory assumption.
While processing ((int)i << 1) in the combiner, it first hits this code.
/* See if we have operations between an ASHIFTRT and an ASHIFT.
If so, try to merge the shifts into a SIGN_EXTEND. We could
also do this for some cases of SIGN_EXTRACT, but it doesn't
seem worth the effort; the case checked for occurs on Alpha. */
if (!OBJECT_P (lhs)
&& ! (GET_CODE (lhs) == SUBREG
&& (OBJECT_P (SUBREG_REG (lhs))))
&& CONST_INT_P (rhs)
&& INTVAL (rhs) < HOST_BITS_PER_WIDE_INT
&& INTVAL (rhs) < mode_width
&& (new_rtx = extract_left_shift (lhs, INTVAL (rhs))) != 0)
new_rtx = make_extraction
(mode,make_compound_operation(
new_rtx,next_code),
0, NULL_RTX, mode_width - INTVAL (rhs),
code == LSHIFTRT, 0, in_code == COMPARE);
The input RTX is
(ashiftrt:SI (ashift:SI (reg:SI 1 x1 [ i ])
(const_int 16 [0x10]))
(const_int 15 [0xf]))
And the function extract_left_shift generates
ashift:SI (reg:SI 1 x1 [ i ])
(const_int 16 [0x10])
Before we call make_extraction, there is another call to
"make_compound_operation" again which converts ASHIFT to mult RTX.
In "make_compound_operation"
switch (code)
{
case ASHIFT:
/* Convert shifts by constants into multiplications if inside
an address. */
if (in_code == MEM && CONST_INT_P (XEXP (x, 1))
&& INTVAL (XEXP (x, 1)) < HOST_BITS_PER_WIDE_INT
&& INTVAL (XEXP (x, 1)) >= 0
&& SCALAR_INT_MODE_P (mode))
{
After extraction what we get a sign_extract and an outer SUBREG expression.
Combiner Says Failed to mismatch this
set (reg/i:SI 0 x0)
(minus:SI (reg:SI 0 x0 [ a ])
(subreg:SI (sign_extract:DI (mult:DI (reg:DI 1 x1 [ i ])
(const_int 2 [0x2]))
(const_int 17 [0x11])
(const_int 0 [0])) 0)))
But when we don't convert ASHIFT to MULT then we have a code in
make_extraction that works on ASHIFT RTX
else if (GET_CODE (inner) == ASHIFT
&& CONST_INT_P (XEXP (inner, 1))
&& pos_rtx == 0 && pos == 0
&& len > UINTVAL (XEXP (inner, 1)))
{
/* We're extracting the least significant bits of an rtx
(ashift X (const_int C)), where LEN > C. Extract the
least significant (LEN - C) bits of X, giving an rtx
whose mode is MODE, then shift it left C times. */
new_rtx = make_extraction (mode, XEXP (inner, 0),
0, 0, len - INTVAL (XEXP (inner, 1)),
unsignedp, in_dest, in_compare);
Now combiner Successfully matched this instruction
(set (reg/i:SI 0 x0)
(minus:SI (reg:SI 0 x0 [ a ])
(ashift:SI (sign_extend:SI (reg:HI 1 x1 [ i ]))
(const_int 1 [0x1]))))
So MEM assumptions on MINUS and PLUX RTX seems to be wrong for me.
Please let me know your suggestions on this.
regards,
Venkat.