Improve VFP code generation
Paul Brook
paul@codesourcery.com
Wed Mar 29 19:15:00 GMT 2006
Under some circumstances, particularly when accessing arrays, gcc generates
poor code for VFP memory accesses. It will typically generate:

    ldrd  r6, [r3, r5]
    fmdrr d7, r6, r7

when it would be much better to use:

    add  r6, r3, r5
    fldd d7, [r6]

The second sequence uses fewer registers and transfers less data over the
coprocessor interface.
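For illustration (the function and its name are my own, not taken from the
patch), the kind of C source that exposes this is a simple indexed load from
an array of doubles; compiled for a VFP target, the element must end up in a
d register, which is where the load/transfer sequence above comes from:

```c
/* Hypothetical example of source triggering the pattern discussed:
   an indexed 64-bit load whose result lives in a VFP d register.  */
double get_element(const double *a, unsigned i)
{
    /* Before the patch, this could compile to ldrd + fmdrr;
       afterwards the allocator prefers add + fldd.  */
    return a[i];
}
```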
The solution is to disparage core registers in the floating point move
patterns. This is sufficient to make the register allocator pick the second
code sequence above.
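In GCC machine descriptions, an alternative is "disparaged" by prefixing its
constraint with '?', which adds a small cost when the register allocator
weighs that alternative. A simplified, hypothetical sketch of the idea for
the SFmode pattern (not the actual patch; the real constraint lists in
vfp.md have more alternatives):

```
;; Hypothetical, simplified sketch: the '?' on the core-register ('r')
;; alternatives makes the allocator prefer keeping SFmode values in
;; VFP ('w') registers rather than transferring them through 'r'.
(define_insn "*movsf_vfp"
  [(set (match_operand:SF 0 "nonimmediate_operand" "=w,?r,w ,Uv")
        (match_operand:SF 1 "general_operand"      " ?r,w ,Uv,w"))]
  "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP"
  "@
   fmsr%?\t%0, %1
   fmrs%?\t%0, %1
   flds%?\t%0, %1
   fsts%?\t%1, %0")
```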
The definition of REGISTER_MOVE_COST already says the latter sequence is
cheaper. My guess is that once the registers have been allocated it is too
late for that cost to make any difference.
Tested with cross to arm-none-eabi.
Applied to mainline and branches/csl/arm-4_1.
Paul
Mainline:
2006-03-29 Paul Brook <paul@codesourcery.com>
* config/arm/vfp.md (movsf_vfp): Disparage w<->r alternatives.
(movdf_vfp): Ditto.
csl-arm:
2006-03-29 Paul Brook <paul@codesourcery.com>
* gcc/config/arm/vfp.md (movsf_vfp): Disparage w<->r alternatives.
(thumb2_movsf_vfp, movdf_vfp, thumb2_movdf_vfp): Ditto.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.vfp_load_core
Type: text/x-diff
Size: 2421 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20060329/3ae271ec/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: patch.vfp_load_core_head
Type: text/x-diff
Size: 1415 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20060329/3ae271ec/attachment-0001.bin>