This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [Patch/combine] PR64304 wrong bitmask passed to force_to_mode in combine_simplify_rtx
- From: Jiong Wang <wong dot kwongyuan dot tools at gmail dot com>
- To: Andrew Pinski <pinskia at gmail dot com>
- Cc: Jeff Law <law at redhat dot com>, Jiong Wang <jiong dot wang at arm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 9 Jan 2015 21:59:12 +0000
- Subject: Re: [Patch/combine] PR64304 wrong bitmask passed to force_to_mode in combine_simplify_rtx
- Authentication-results: sourceware.org; auth=none
- References: <54AFDA2D dot 4060303 at arm dot com> <54B03CCC dot 8090604 at redhat dot com> <CA+=Sn1mfW=0bp44KeN=+uh5v+UD5cff7N277nHE1+W9W4iV3fA at mail dot gmail dot com>
2015-01-09 21:29 GMT+00:00 Andrew Pinski <pinskia@gmail.com>:
> On Fri, Jan 9, 2015 at 12:40 PM, Jeff Law <law@redhat.com> wrote:
>> On 01/09/15 06:39, Jiong Wang wrote:
>>>
>>> as reported at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64304
>>>
>>> given the following test:
>>>
>>> unsigned char byte = 0;
>>>
>>> void set_bit(unsigned int bit, unsigned char value) {
>>>   unsigned char mask = (unsigned char)(1 << (bit & 7));
>>>   if (!value) {
>>>     byte &= (unsigned char)~mask;
>>>   } else {
>>>     byte |= mask;
>>>   }
>>> }
>>>
>>> we should generate something like:
>>>
>>> set_bit:
>>> and w0, w0, 7
>>> mov w2, 1
>>> lsl w2, w2, w0
>>>
>>> while we are generating
>>> mov w2, 1
>>> lsl w2, w2, w0
>>>
>>>
>>> the necessary "and w0, w0, 7" is wrongly deleted.
>>>
>>> That is because
>>>
>>> (insn 2 5 3 2 (set (reg/v:SI 82 [ bit ])
>>> (reg:SI 0 x0 [ bit ])) bug.c:3 38 {*movsi_aarch64}
>>> (expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
>>> (nil)))
>>> (insn 7 4 8 2 (set (reg:SI 84 [ D.1482 ])
>>> (and:SI (reg/v:SI 82 [ bit ])
>>> (const_int 7 [0x7]))) bug.c:4 399 {andsi3}
>>> (expr_list:REG_DEAD (reg/v:SI 82 [ bit ])
>>> (nil)))
>>> (insn 9 8 10 2 (set (reg:SI 85 [ D.1482 ])
>>> (ashift:SI (reg:SI 86)
>>> (subreg:QI (reg:SI 84 [ D.1482 ]) 0))) bug.c:4 539
>>> {*aarch64_ashl_sisd_or_int_si3}
>>> (expr_list:REG_DEAD (reg:SI 86)
>>> (expr_list:REG_DEAD (reg:SI 84 [ D.1482 ])
>>> (expr_list:REG_EQUAL (ashift:SI (const_int 1 [0x1])
>>> (subreg:QI (reg:SI 84 [ D.1482 ]) 0))
>>> (nil)))))
>>>
>>> are wrongly combined into
>>>
>>> (insn 9 8 10 2 (set (reg:QI 85 [ D.1482 ])
>>> (ashift:QI (subreg:QI (reg:SI 86) 0)
>>> (reg:QI 0 x0 [ bit ]))) bug.c:4 556 {*ashlqi3_insn}
>>> (expr_list:REG_DEAD (reg:SI 0 x0 [ bit ])
>>> (expr_list:REG_DEAD (reg:SI 86)
>>> (nil))))
>>>
>>> thus the generated assembly lacks the necessary "and w0, w0, 7".
>>>
>>> The root cause is that at one place in the combine pass we are passing
>>> the wrong bitmask to force_to_mode.
>>>
>>> In this particular case, for QImode, we should pass (1 << 8) - 1, while
>>> we are passing (1 << 3) - 1, so the combiner thinks we only need the
>>> lower 3 bits and that the X & 7 is unnecessary, while for QImode we
>>> want the lower 8 bits. We should remove the exp operator.
>>>
>>> This may be a long-standing bug in the combine pass, though it is only
>>> triggered on targets where SHIFT_COUNT_TRUNCATED is true. It has been
>>> hiding for a long time, mostly because x86/arm do not trigger this
>>> part of the code.
>>>
>>> Bootstrapped on x86, gcc check OK.
>>> Bootstrapped on aarch64, bare-metal regression OK.
>>> OK for trunk?
>>>
>>> gcc/
>>> PR64304
>>> * combine.c (combine_simplify_rtx): Correct the bitmask passed to
>>> force_to_mode.
>>> gcc/testsuite/
>>> PR64304
>>> * gcc.target/aarch64/pr64304.c: New testcase.
>>
>> I don't think this is correct.
>>
>> When I put a breakpoint on the code in question I see the following RTL
>> prior to the call to DO_SUBST:
>>
>> (ashift:QI (const_int 1 [0x1])
>> (subreg:QI (and:SI (reg:SI 0 x0 [ bit ])
>> (const_int 7 [0x7])) 0))
>>
>>
>> Note carefully the QImode for the ASHIFT. That clearly indicates that just
>> the low 8 bits are meaningful and on a SHIFT_COUNT_TRUNCATED target the
>> masking of the count with 0x7 is redundant as the only valid shift counts
>> are 0-7 (again because of the QImode for the ASHIFT).
Thanks for the explanation; I had misunderstood some key points here.
> Jeff is correct here. SHIFT_COUNT_TRUNCATED cannot be true if the
> aarch64 back-end has shift patterns for smaller than 32/64-bit, but the
> aarch64 target only has shifts for 32 and 64 bits.
Pinski, thanks for pointing this out.
Agreed. AArch64 defining shift patterns for short types should be the problem.
Regards,
Jiong
> The middle-end is doing the correct thing: with SHIFT_COUNT_TRUNCATED
> true and a pattern for a QImode shift, the shift count does not need to
> be truncated before use.
>
> Thanks,
> Andrew Pinski
>
>>
>> Jeff
>>
>>