This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |
Other format: | [Raw text] |
Hi Christophe, On 15/11/17 15:31, Christophe Lyon wrote:
Hi Kyrill, On 8 November 2017 at 19:34, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:On 06/06/17 14:17, James Greenhalgh wrote:On Tue, Jun 06, 2017 at 09:40:44AM +0100, Kyrill Tkachov wrote:Hi all, On top of the previous vec_merge simplifications [1] we can add this pattern to perform a store of a vec_concat of two 64-bit values in distinct registers as an STP. This avoids constructing such a vector explicitly in a register and storing it as a Q register. This way for the code in the testcase we can generate: construct_lane_1: ldp d1, d0, [x0] fmov d3, 1.0e+0 fmov d2, 2.0e+0 fadd d4, d1, d3 fadd d5, d0, d2 stp d4, d5, [x1, 32] ret construct_lane_2: ldp x2, x0, [x0] add x3, x2, 1 add x4, x0, 2 stp x3, x4, [x1, 32] ret instead of the current: construct_lane_1: ldp d0, d1, [x0] fmov d3, 1.0e+0 fmov d2, 2.0e+0 fadd d0, d0, d3 fadd d1, d1, d2 dup v0.2d, v0.d[0] ins v0.d[1], v1.d[0] str q0, [x1, 32] ret construct_lane_2: ldp x2, x3, [x0] add x0, x2, 1 add x2, x3, 2 dup v0.2d, x0 ins v0.d[1], x2 str q0, [x1, 32] ret Bootstrapped and tested on aarch64-none-linux-gnu. Ok for GCC 8?OK.Thanks, I've committed this and the other patches in this series after rebasing and rebootstrapping and testing on aarch64-none-linux-gnu. The only conflict from updating the patch was that I had to use the store_16 attribute rather than the old store2 for the new define_insn. This is what I've committed with r254551. Sorry for the delay in committing.I've noticed that the new tests fail when testing with -mabi=ilp32: FAIL: gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-not ins\t FAIL: gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-times stp\td[0-9]+, d[0-9]+ 1 (found 0 times) FAIL: gcc.target/aarch64/store_v2vec_lanes.c scan-assembler-times stp\tx[0-9]+, x[0-9]+ 1 (found 0 times) Sorry if this has been reported before.
Thank you for reporting this, I was not aware of it.My patch does indeed fail to generate the optimised sequence for -mabi=ilp32.
During combine it fails to match: Failed to match this instruction: (set (mem:V2DF (plus:DI (reg/v/f:DI 79 [ z ])(const_int 32 [0x20])) [1 MEM[(v2df *)z_8(D) + 32B]+0 S16 A128])
(vec_concat:V2DF (reg:DF 81 [ y0 ]) (reg:DF 84 [ y1 ]))) but without the -mabi=ilp32 it does successfully match the equivalent (set (mem:V2DF (plus:DI (reg:DI 1 x1 [ z ])(const_int 32 [0x20])) [1 MEM[(v2df *)z_8(D) + 32B]+0 S16 A128])
(vec_concat:V2DF (reg:DF 81 [ y0 ]) (reg:DF 84 [ y1 ]))) The only difference is the index register being the hard reg x1.There's probably some subtlety in aarch64_classify_address that I'll need to dig into.
In any case, can you please open a bug report for this so we can track it?To be clear, the failure is just suboptimal codegen for the -mabi=ilp32 case, not a wrong-code or ICE
(though it should still be fixed, of course). Thanks again, Kyrill
ChristopheKyrillThanks, James2017-06-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> * config/aarch64/aarch64-simd.md (store_pair_lanes<mode>): New pattern. * config/aarch64/constraints.md (Uml): New constraint. * config/aarch64/predicates.md (aarch64_mem_pair_lanes_operand): New predicate. 2017-06-06 Kyrylo Tkachov <kyrylo.tkachov@arm.com> * gcc.target/aarch64/store_v2vec_lanes.c: New test.
Index Nav: | [Date Index] [Subject Index] [Author Index] [Thread Index] | |
---|---|---|
Message Nav: | [Date Prev] [Date Next] | [Thread Prev] [Thread Next] |