This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
RE: Using particular register class (like floating point registers) as spill register class
- From: "Ian Bolton" <ian dot bolton at arm dot com>
- To: "'Andrew Haley'" <aph at redhat dot com>, "Kugan" <kugan dot vivekanandarajah at linaro dot org>, <pinskia at gmail dot com>
- Cc: <gcc at gcc dot gnu dot org>, "Vladimir Makarov" <vmakarov at redhat dot com>
- Date: Fri, 16 May 2014 17:20:21 +0100
- Subject: RE: Using particular register class (like floating point registers) as spill register class
- Authentication-results: sourceware.org; auth=none
- References: <5375E730 dot 20309 at linaro dot org> <CE1A054F-0FBD-4368-82E3-9BCE35509078 at gmail dot com> <5375F0E8 dot 70109 at linaro dot org> <53760B2E dot 1000802 at redhat dot com>
> On 05/16/2014 12:05 PM, Kugan wrote:
> >
> >
> > On 16/05/14 20:40, pinskia@gmail.com wrote:
> >>
> >>
> >>> On May 16, 2014, at 3:23 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
> >>>
> >>> I would like to know if there is any way we can use registers from a
> >>> particular register class purely as spill registers (that is, only in
> >>> places where the register allocator would normally spill to the stack),
> >>> when that is useful.
> >>>
> >>> On AArch64, compiling with -mgeneral-regs-only in some cases produces
> >>> better performance than compiling without it. The difference is that
> >>> when -mgeneral-regs-only is not used, floating-point registers are also
> >>> used in register allocation. IRA/LRA then has to move the values to core
> >>> registers before performing operations, as shown below.
> >>
> >> Can you show the code with the fp registers disabled? Does it use the
> >> stack to spill? Normally this is due to the register-to-register class
> >> move costs compared to the register-to-memory move cost. Also, I think it
> >> depends on the processor rather than the target. For thunder, using the fp
> >> registers might actually be better than using the stack, depending on
> >> whether the stack was in L1.
> > Not all the LDR/STR combinations map to fmov. In the testcase I have:
> >
> > aarch64-none-linux-gnu-gcc sha_dgst.c -O2 -S -mgeneral-regs-only
> > grep -c "ldr" sha_dgst.s
> > 50
> > grep -c "str" sha_dgst.s
> > 42
> > grep -c "fmov" sha_dgst.s
> > 0
> >
> > aarch64-none-linux-gnu-gcc sha_dgst.c -O2 -S
> > grep -c "ldr" sha_dgst.s
> > 42
> > grep -c "str" sha_dgst.s
> > 31
> > grep -c "fmov" sha_dgst.s
> > 105
> >
> > I am not saying that we shouldn't use floating-point registers here. But
> > from the above, it seems the register allocator is using them more like
> > core registers (even though the cost model gives them a higher cost) and
> > then moving the values to core registers before operations. If that is
> > the case, my question is: how do we make this just a spill register class,
> > so that we replace ldr/str with an equal number of fmov where possible?
>
> I'm also seeing stuff like this:
>
> => 0x7fb72a0928 <ClassFileParser::parse_constant_pool_entries(int, Thread*)+2500>: add x21, x4, x21, lsl #3
> => 0x7fb72a092c <ClassFileParser::parse_constant_pool_entries(int, Thread*)+2504>: fmov w2, s8
> => 0x7fb72a0930 <ClassFileParser::parse_constant_pool_entries(int, Thread*)+2508>: str w2, [x21,#88]
>
> I guess GCC doesn't know how to store an SImode value in an FP register
> into memory? This is 4.8.1.
>
Please can you try that on trunk and report back.
Thanks,
Ian
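
For anyone repeating the ldr/str/fmov comparison quoted above on trunk, here is a minimal sketch of a counter that looks only at instruction mnemonics. Note that grep -c counts every line containing the substring, so a line such as "bl strlen" would be counted as a "str". The helper names (count_mnemonics, family_total) and the SAMPLE text are made up for illustration and are not from the thread; only the add/fmov/str sequence is taken from the disassembly above. The sketch assumes GCC-style output with labels on their own lines and "//" comments.

```python
from collections import Counter


def count_mnemonics(asm_text):
    """Count instruction mnemonics in GCC-style assembly text."""
    counts = Counter()
    for line in asm_text.splitlines():
        # Drop trailing comments ("//" is assumed as the comment marker).
        code = line.split("//", 1)[0].strip()
        # Skip blank lines, assembler directives (".text", ".L3:")
        # and labels that sit on their own line ("foo:").
        if not code or code.startswith(".") or code.endswith(":"):
            continue
        counts[code.split()[0]] += 1
    return counts


def family_total(counts, prefix):
    """Sum all mnemonics that start with prefix (ldr, ldrb, ldrsw, ...)."""
    return sum(n for name, n in counts.items() if name.startswith(prefix))


# Illustrative input only; not output from the sha_dgst.c testcase.
SAMPLE = """\
foo:
    add x21, x4, x21, lsl #3
    fmov w2, s8
    str w2, [x21,#88]
    ldr w0, [x21,#88]
    ldrb w1, [x0]
    ret
"""

if __name__ == "__main__":
    counts = count_mnemonics(SAMPLE)
    for prefix in ("ldr", "str", "fmov"):
        print(prefix, family_total(counts, prefix))
```

To use it on real output, read the two .s files (with and without -mgeneral-regs-only) and pass their contents to count_mnemonics; comparing the family totals side by side shows how many loads and stores were traded for fmov.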