This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Using particular register class (like floating point registers) as spill register class
- From: Andrew Haley <aph at redhat dot com>
- To: Kugan <kugan dot vivekanandarajah at linaro dot org>, pinskia at gmail dot com
- Cc: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>, Vladimir Makarov <vmakarov at redhat dot com>
- Date: Fri, 16 May 2014 13:57:18 +0100
- Subject: Re: Using particular register class (like floating point registers) as spill register class
- References: <5375E730 dot 20309 at linaro dot org> <CE1A054F-0FBD-4368-82E3-9BCE35509078 at gmail dot com> <5375F0E8 dot 70109 at linaro dot org>
On 05/16/2014 12:05 PM, Kugan wrote:
>
>
> On 16/05/14 20:40, pinskia@gmail.com wrote:
>>
>>
>>> On May 16, 2014, at 3:23 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>>>
>>> I would like to know if there is any way we can use registers from
>>> a particular register class purely as spill registers (in places where
>>> the register allocator would normally spill to the stack, and nothing
>>> more), when that would be useful.
>>>
>>> On AArch64, in some cases, compiling with -mgeneral-regs-only produces
>>> better performance compared to not using it. The difference here is that
>>> when -mgeneral-regs-only is not used, floating point registers are also
>>> used in register allocation. Then IRA/LRA has to move them to core
>>> registers before performing operations as shown below.
>>
>> Can you show the code with fp registers disabled? Does it use the stack to spill? Normally this is due to register-to-register class move costs compared to register-to-memory move costs. Also I think it depends on the processor rather than the target. For thunder, using the fp registers might actually be better than using the stack, depending on whether the stack was in L1.
> Not all the LDR/STR combinations map to fmov. In the test case I have:
>
> aarch64-none-linux-gnu-gcc sha_dgst.c -O2 -S -mgeneral-regs-only
> grep -c "ldr" sha_dgst.s
> 50
> grep -c "str" sha_dgst.s
> 42
> grep -c "fmov" sha_dgst.s
> 0
>
> aarch64-none-linux-gnu-gcc sha_dgst.c -O2 -S
> grep -c "ldr" sha_dgst.s
> 42
> grep -c "str" sha_dgst.s
> 31
> grep -c "fmov" sha_dgst.s
> 105
>
> I am not saying that we shouldn’t use floating point registers here. But
> from the above, it seems like the register allocator is using them more
> like core registers (even though the cost model gives them a higher cost)
> and then moving the values to core registers before operations. If that
> is the case, my question is: how do we make this just a spill register
> class, so that we replace ldr/str with an equal number of fmov when that
> is possible?
I'm also seeing stuff like this:
=> 0x7fb72a0928 <ClassFileParser::parse_constant_pool_entries(int, Thread*)+2500>:	add x21, x4, x21, lsl #3
=> 0x7fb72a092c <ClassFileParser::parse_constant_pool_entries(int, Thread*)+2504>:	fmov w2, s8
=> 0x7fb72a0930 <ClassFileParser::parse_constant_pool_entries(int, Thread*)+2508>:	str w2, [x21,#88]
I guess GCC doesn't know how to store an SImode value held in an FP register
directly to memory? This is GCC 4.8.1.
Andrew.
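
A note on the instruction counts quoted above: `grep -c` counts matching lines, and a bare pattern such as "ldr" or "str" also matches longer mnemonics (ldrb, strh, ...) and any directive or label that happens to contain those letters. The following is only a rough sketch of a stricter count that compares the first field of each line against the exact mnemonic. The helper name count_mnemonic and the contents of sample.s are made up for illustration; they are not taken from sha_dgst.s.

```shell
#!/bin/sh
# Count lines in assembly file $2 whose first field is exactly mnemonic $1.
# (count_mnemonic is a hypothetical helper, not part of any toolchain.)
count_mnemonic() {
    awk -v m="$1" '$1 == m { n++ } END { print n + 0 }' "$2"
}

# Hypothetical stand-in for compiler output; NOT real sha_dgst.s content.
cat > sample.s <<'EOF'
    .string "strcpy"
    ldr   w0, [x1]
    ldrb  w2, [x1, 4]
    fmov  s8, w0
    fmov  w3, s8
    str   w3, [x21, 88]
EOF

# Print the exact-mnemonic count next to the plain `grep -c` count.
for m in ldr str fmov; do
    printf '%s: exact=%s grep=%s\n' \
        "$m" "$(count_mnemonic "$m" sample.s)" "$(grep -c "$m" sample.s)"
done
```

On real compiler output the two counts may or may not differ; the point is only that the exact-mnemonic count makes an ldr/str versus fmov comparison less sensitive to unrelated matches.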