[PATCH, SPU] generated better code for loads and stores
Trevor_Smigiel@playstation.sony.com
Fri Aug 29 00:22:00 GMT 2008
This patch generates better code for loads and stores on SPU.
The SPU can only do 16-byte, aligned loads and stores. Loading something
smaller, or with a smaller alignment, requires a load and a rotate. Storing
something smaller requires a load, an insert, and a store.
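To make the two sequences concrete, here is a minimal sketch in portable C
that models memory as aligned 16-byte quadwords. Everything in it (the quad
type, mem, load_quad, store_quad, rotate_left_bytes, load_u32, store_u32) is
a hypothetical name made up for illustration, not code from the patch. It
assumes big-endian byte order, a 32-bit value that is naturally aligned so it
never straddles two quadwords, and a preferred slot in bytes 0..3.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model: one 16-byte quadword register. */
typedef struct { uint8_t b[16]; } quad;

/* Toy memory; only load_quad/store_quad may touch it. */
static uint8_t mem[64];

/* Aligned 16-byte load: the low 4 address bits are ignored. */
static quad load_quad(uint32_t addr)
{
    quad q;
    memcpy(q.b, &mem[addr & ~15u], 16);
    return q;
}

/* Aligned 16-byte store: the low 4 address bits are ignored. */
static void store_quad(uint32_t addr, quad q)
{
    memcpy(&mem[addr & ~15u], q.b, 16);
}

/* Rotate the quadword left by n bytes. */
static quad rotate_left_bytes(quad q, unsigned n)
{
    quad r;
    for (unsigned i = 0; i < 16; i++)
        r.b[i] = q.b[(i + n) & 15];
    return r;
}

/* Small load = quadword load + rotate into bytes 0..3. */
uint32_t load_u32(uint32_t addr)
{
    assert(addr % 4 == 0 && addr < sizeof mem);  /* assumption: aligned */
    quad q = rotate_left_bytes(load_quad(addr), addr & 15);
    return (uint32_t)q.b[0] << 24 | (uint32_t)q.b[1] << 16
         | (uint32_t)q.b[2] << 8  | (uint32_t)q.b[3];
}

/* Small store = quadword load + insert + quadword store. */
void store_u32(uint32_t addr, uint32_t v)
{
    assert(addr % 4 == 0 && addr < sizeof mem);  /* assumption: aligned */
    quad q = load_quad(addr);          /* load the enclosing 16 bytes */
    unsigned off = addr & 15;
    q.b[off]     = (uint8_t)(v >> 24); /* insert the new bytes */
    q.b[off + 1] = (uint8_t)(v >> 16);
    q.b[off + 2] = (uint8_t)(v >> 8);
    q.b[off + 3] = (uint8_t)v;
    store_quad(addr, q);               /* write all 16 bytes back */
}
```

Note that store_u32 rewrites the 12 neighbouring bytes with the values it
just loaded, which is why an early-expanded store looks to the compiler like
an access to the whole quadword.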
Currently, there are two obvious ways to generate rtl for loads and
stores: generate the multiple instructions at expand time, or split
them at some later phase. When we expand early, we lose alias
information (because that 16-byte load could contain anything), and in
general do worse optimization on memory. When we split late, the
compiler has no opportunity to combine loads/stores of the same 16
bytes.
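The cost of splitting late can be illustrated with a toy model that counts
quadword loads. This is only a sketch: quad, mem, quad_loads, load_quad,
extract_u32, sum_split_late and sum_split_shared are hypothetical names, and
the run-time check in sum_split_shared merely stands in for what a
common-subexpression pass would decide at compile time. It assumes two
naturally aligned 32-bit values and big-endian byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t b[16]; } quad;

static uint8_t mem[32];      /* toy memory */
static unsigned quad_loads;  /* number of 16-byte loads issued */

/* Aligned 16-byte load; counts every call. */
static quad load_quad(uint32_t addr)
{
    quad q;
    memcpy(q.b, &mem[addr & ~15u], 16);
    quad_loads++;
    return q;
}

/* Stand-in for the rotate: pick the word at byte offset off. */
static uint32_t extract_u32(quad q, unsigned off)
{
    assert(off % 4 == 0 && off < 16);  /* assumption: aligned */
    return (uint32_t)q.b[off] << 24 | (uint32_t)q.b[off + 1] << 16
         | (uint32_t)q.b[off + 2] << 8 | (uint32_t)q.b[off + 3];
}

/* Split late: each small load expands on its own, so two values in
   the same quadword still cost two quadword loads. */
uint32_t sum_split_late(uint32_t a, uint32_t b)
{
    return extract_u32(load_quad(a), a & 15)
         + extract_u32(load_quad(b), b & 15);
}

/* Split before a CSE-like step: the second quadword load is seen to
   be redundant when both addresses fall in the same 16 bytes. */
uint32_t sum_split_shared(uint32_t a, uint32_t b)
{
    quad qa = load_quad(a);
    quad qb = ((a ^ b) & ~15u) == 0 ? qa : load_quad(b);
    return extract_u32(qa, a & 15) + extract_u32(qb, b & 15);
}
```

For two fields at offsets 0 and 4 of the same quadword, the first function
issues two 16-byte loads and the second issues one; both return the same sum.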
This patch introduces an additional split pass, split0, right before the
CSE2 pass. Before this pass, loads and stores are modeled as a single
rtl instruction, and can be optimized well. This pass splits them into
multiple instructions, allowing CSE2 and combine to optimize the 16-byte
loads and stores. The pass is only enabled when a target defines
SPLIT_BEFORE_CSE2.
The test case is an example which is improved by the earlier split pass.
This patch also makes other small improvements to the code generated for
loads and stores on SPU.
Ok for mainline? In particular, the new split pass.
Trevor
2008-08-27 Trevor Smigiel <Trevor_Smigiel@playstation.sony.com>
Improve code generated for loads and stores on SPU.
* doc/tm.texi (SPLIT_BEFORE_CSE2): Document.
* tree-pass.h (pass_split_before_cse2): Declare.
* final.c (rest_of_clean_state): Initialize split0_completed.
* recog.c (split0_completed): Define.
(gate_handle_split_before_cse2, rest_of_handle_split_before_cse2):
New functions.
(pass_split_before_cse2): New pass.
* rtl.h (split0_completed): Declare.
* passes.c (init_optimization_passes): Add pass_split_before_cse2
before pass_cse2.
* config/spu/spu-protos.h (spu_legitimate_address): Add
for_split argument.
(aligned_mem_p, spu_valid_move): Remove prototypes.
(spu_split_load, spu_split_store): Change return type to int.
* config/spu/predicates.md (spu_mem_operand): Remove.
(spu_dest_operand): Add.
* config/spu/spu-builtins.md (spu_lqd, spu_lqx, spu_lqa,
spu_lqr, spu_stqd, spu_stqx, spu_stqa, spu_stqr): Remove AND
operation.
* config/spu/spu.c (regno_aligned_for_load): Remove.
(reg_aligned_for_addr, address_needs_split): New functions.
(spu_legitimate_address, spu_expand_mov, spu_split_load,
spu_split_store): Update.
(spu_init_expanders): Pregenerate a couple of pseudo-registers.
* config/spu/spu.h (REG_ALIGN, SPLIT_BEFORE_CSE2): Define.
(GO_IF_LEGITIMATE_ADDRESS): Update for spu_legitimate_address.
* config/spu/spu.md ("_mov<mode>", "_movdi", "_movti"): Update
predicates.
("load", "store"): Change to define_split.
testsuite/
* testsuite/gcc.target/spu/split0-1.c: Add test.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: split0.patch
Type: text/x-diff
Size: 46770 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20080829/20fe618f/attachment.bin>