This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH][RFC][match.pd] optimize (X & C) == N when C is power of 2
- From: Ramana Radhakrishnan <ramana dot gcc at googlemail dot com>
- To: Kyrill Tkachov <kyrylo dot tkachov at arm dot com>
- Cc: Richard Biener <rguenther at suse dot de>, Jakub Jelinek <jakub at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Fri, 24 Jul 2015 10:11:13 +0100
- Subject: Re: [PATCH][RFC][match.pd] optimize (X & C) == N when C is power of 2
- Authentication-results: sourceware.org; auth=none
- References: <55B1F2C3 dot 2000903 at arm dot com> <20150724082319 dot GI1780 at tucnak dot redhat dot com> <55B1FBDE dot 9040501 at arm dot com> <alpine dot LSU dot 2 dot 11 dot 1507241057400 dot 19642 at zhemvz dot fhfr dot qr> <55B1FFAA dot 7060004 at arm dot com>
- Reply-to: ramrad01 at arm dot com
On Fri, Jul 24, 2015 at 10:04 AM, Kyrill Tkachov <kyrylo.tkachov@arm.com> wrote:
>> It arrives as SSA_NAME == N and you can use get_gimple_for_ssa_name
>> or get_def_for_expr to get at the defining stmt if that is possible
>> (it's still unexpanded and thus TERed) and expand a different
>> expression.
>
>
> Thanks, so it's where we expand compares... (what's TERed?)
Temporary Expression Replacement. IIRC it is done just as you come
out of SSA; see tree-ssa-ter.[hc].
Ramana
>
>>
>> But why can't simplify-rtx via combine handle this - it should have
>> access to target costs.
>
>
> That would require the target to expand to an SMOD rtx, which would
> be somewhat awkward if the target has no direct instruction for it.
>
> Thanks,
> Kyrill
>
>
>>
>>>> Because the ((1 << (size - 1)) | (C - 1)) constant might be very
>>>> expensive while C is cheap, and % might not be expensive enough
>>>> compared to & to offset that.
>>>>
>>>> E.g. on x86_64, for 32-bit and smaller X the constant is as cheap
>>>> as any other (well, if we don't take instruction size into
>>>> account), but a 64-bit constant is at least 3 times more expensive
>>>> (movabsq is needed, with its latency). In the x86_64 case
>>>> supposedly the divmod is still more expensive, but there are many
>>>> other targets. On sparc64 for 64-bit constants, you might need
>>>> many instructions to create the constants, etc.
>>>
>>> Ok, I am not familiar with sparc64. The constant is just a 1
>>> in the sign bit ORed with a contiguous string of ones.
>>> That's usually cheap on aarch64 but may not be so on other targets.
>>
>> On GIMPLE we might still want to canonicalize to one form. I'd
>> canonicalize to the form with "smaller" constants if the number
>> of operations is the same.
>>
>> Richard.
>>
>>> Thanks,
>>> Kyrill
>>>
>>>> Jakub
>>>>
>>>
>
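
For readers following the quoted discussion, here is a minimal sketch of the constant ((1 << (size - 1)) | (C - 1)) in use. It assumes the rewrite under discussion turns a signed X % C == N into (X & mask) == N, with C a positive power of two and N strictly positive; those preconditions are an assumption, not something spelled out in the text above. The function names are hypothetical and the type is fixed at 32 bits for illustration.

```c
#include <stdint.h>

/* Mask with the sign bit and the low log2(c) bits set, i.e.
   (1 << (size - 1)) | (c - 1) for size == 32.  Computed in unsigned
   arithmetic so that nothing is shifted into a signed sign bit.  */
static uint32_t
sign_and_low_bits_mask (uint32_t c)
{
  return (UINT32_C (1) << 31) | (c - 1);
}

/* Original form: signed remainder compared against n.  */
static int
mod_form (int32_t x, int32_t c, int32_t n)
{
  return x % c == n;
}

/* Rewritten form: one AND and one compare.  Assumes c is a positive
   power of two and n > 0.  For negative x the sign bit survives the
   mask, so the result can never equal a positive n, which matches
   the non-positive remainder that % yields for negative x.  */
static int
and_form (int32_t x, int32_t c, int32_t n)
{
  return ((uint32_t) x & sign_and_low_bits_mask ((uint32_t) c))
	 == (uint32_t) n;
}

/* Count the values of x in [lo, hi] on which the two forms disagree.  */
static long
count_mismatches (int32_t lo, int32_t hi, int32_t c, int32_t n)
{
  long bad = 0;
  for (int64_t x = lo; x <= hi; x++)
    if (mod_form ((int32_t) x, c, n) != and_form ((int32_t) x, c, n))
      bad++;
  return bad;
}
```

Calling count_mismatches (-100000, 100000, 8, 3) is a quick way to compare the two forms over a range. With n == 0 the forms differ for negative multiples of c (for example x == -4, c == 4), which is why the sketch requires n > 0. Whether the AND form is actually cheaper depends on how expensive the mask is to materialize on the target, which is the cost question raised in the quoted text.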