Improve VFP code generation
Paul Brook
paul@codesourcery.com
Wed Mar 29 19:15:00 GMT 2006
Under some circumstances, particularly when accessing arrays, gcc generates
poor code for VFP memory accesses. It will typically generate:

    ldrd  r6, [r3, r5]
    fmdrr d7, r6, r7

when it would be much better to use:

    add  r6, r3, r5
    fldd d7, [r6]

The second sequence uses fewer registers and transfers less data over the
coprocessor interface.
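For illustration (the function and its name are my own, not taken from the
patch), the kind of C source that exposes this is a simple indexed load from
an array of doubles; compiled for a VFP target, the element must end up in a
d register, which is where the load/transfer sequence above comes from:

```c
/* Hypothetical example of source triggering the pattern discussed:
   an indexed 64-bit load whose result lives in a VFP d register.  */
double get_element(const double *a, unsigned i)
{
    /* Before the patch, this could compile to ldrd + fmdrr;
       afterwards the allocator prefers add + fldd.  */
    return a[i];
}
```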
The solution is to disparage core registers in the floating point move
patterns. This is sufficient to make the register allocator pick the second
code sequence above.
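In GCC machine descriptions, an alternative is "disparaged" by prefixing its
constraint with '?', which adds a small cost when the register allocator
weighs that alternative. A simplified, hypothetical sketch of the idea for
the SFmode pattern (not the actual patch; the real constraint lists in
vfp.md have more alternatives):

```
;; Hypothetical, simplified sketch: the '?' on the core-register ('r')
;; alternatives makes the allocator prefer keeping SFmode values in
;; VFP ('w') registers rather than transferring them through 'r'.
(define_insn "*movsf_vfp"
  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,?r,w ,Uv")
        (match_operand:SF 1 "general_operand"      " ?r,w ,Uv,w"))]
  "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP"
  "@
   fmsr%?\t%0, %1
   fmrs%?\t%0, %1
   flds%?\t%0, %1
   fsts%?\t%1, %0")
```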
The definition of REGISTER_MOVE_COST already says the latter sequence is
cheaper. My guess is that once the registers have been allocated it is too
late for that cost to make any difference.
Tested with cross to arm-none-eabi.
Applied to mainline and branches/csl/arm-4_1.
Paul
Mainline:
2006-03-29 Paul Brook <paul@codesourcery.com>
* config/arm/vfp.md (movsf_vfp): Disparage w<->r alternatives.
(movdf_vfp): Ditto.
csl-arm:
2006-03-29 Paul Brook <paul@codesourcery.com>
* gcc/config/arm/vfp.md (movsf_vfp): Disparage w<->r alternatives.
(thumb2_movsf_vfp, movdf_vfp, thumb2_movdf_vfp): Ditto.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.vfp_load_core
Type: text/x-diff
Size: 2421 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20060329/3ae271ec/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.vfp_load_core_head
Type: text/x-diff
Size: 1415 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20060329/3ae271ec/attachment-0001.bin>