This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH][ARM] Fix PR 55426
- From: Kyrill Tkachov <kyrylo dot tkachov at arm dot com>
- To: Kyrill Tkachov <kyrylo dot tkachov at arm dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Richard Earnshaw <Richard dot Earnshaw at arm dot com>, Ramana Radhakrishnan <Ramana dot Radhakrishnan at arm dot com>
- Date: Tue, 11 Feb 2014 15:03:23 +0000
- Subject: Re: [PATCH][ARM] Fix PR 55426
- References: <52F9EF80 dot 706 at arm dot com>
On 11/02/14 09:38, Kyrill Tkachov wrote:
Hi all,
In this PR the 128-bit load-duplicate intrinsics in neon.exp ICE on big-endian
with an unrecognisable insn error:
neon-vld1_dupQ.c:24:1: error: unrecognizable insn:
(insn 94 93 31 (set (subreg:DI (reg:V2DI 95 d16 [orig:137 D.14400 ] [137]) 0)
        (subreg:DI (reg:V2DI 95 d16 [orig:137 D.14400 ] [137]) 8))
The problem seems to be that the neon_vld1_dupv2di splitter generates subregs
after reload with gen_lowpart and gen_highpart. Since that splitter always
matches after reload, we already know the hard register numbers, so we can just
manipulate those directly to extract the two doubleword parts of a quadword reg.
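To illustrate the hard-register arithmetic described above, here is a minimal C sketch. It is not the patch itself: the helper names are hypothetical, and the numbering (VFP registers starting at hard reg 63, each D register spanning two hard reg numbers) is an assumption inferred from the dump, where hard reg 95 is d16.

```c
#include <assert.h>

/* Assumed numbering, inferred from the dump where hard reg 95 is d16:
   VFP registers start at 63 and each D register spans two hard regs.  */
#define ASSUMED_FIRST_VFP_REGNUM 63
#define ASSUMED_REGS_PER_DREG 2

/* Hypothetical pair holding the hard reg numbers of the two doubleword
   halves of a quadword register.  Which half is "low" or "high" in the
   lane sense depends on endianness, so they are only called first and
   second here.  */
struct dword_pair
{
  int first_regno;
  int second_regno;
};

/* Map a hard reg number to the index N of the D register dN.  */
static int
dreg_index (int hard_regno)
{
  assert (hard_regno >= ASSUMED_FIRST_VFP_REGNUM);
  assert ((hard_regno - ASSUMED_FIRST_VFP_REGNUM) % ASSUMED_REGS_PER_DREG == 0);
  return (hard_regno - ASSUMED_FIRST_VFP_REGNUM) / ASSUMED_REGS_PER_DREG;
}

/* Split a quadword hard register into its two doubleword hard registers
   using plain register-number arithmetic, with no subregs involved.  */
static struct dword_pair
split_quad_regno (int quad_regno)
{
  struct dword_pair p;
  p.first_regno = quad_regno;
  p.second_regno = quad_regno + ASSUMED_REGS_PER_DREG;
  return p;
}
```

Under these assumptions, splitting the quadword register at hard reg 95 from the dump gives hard regs 95 and 97, that is d16 and d17.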
While we're at it, we might as well use a more general move instruction when the
alignment is natural to potentially take advantage of more complex addressing
modes. We're allowed to do that because the vld1Q_dup*64 intrinsics describe a
behaviour and do not guarantee that a particular instruction will be used.
Therefore the vld1Q_dup*64 tests are changed into run-time tests that check the
functionality. New *_misaligned tests are added, however, to make sure
that we still generate vld1.64 when the address is explicitly unaligned, since
vld1.64 is the only instruction that can handle that.
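The choice between the two load forms can be sketched in C as follows. This is only an illustration of the decision described above: the function and enum names are hypothetical and do not reproduce arm_mem_aligned_p or the constraint added by the patch, and "known alignment in bits" is an assumed input.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the two ways of loading the doubleword.  */
enum load_choice
{
  LOAD_GENERAL_MOVE,	/* General move, may use richer addressing modes.  */
  LOAD_VLD1_64		/* vld1.64, copes with an unaligned address.  */
};

/* Hypothetical alignment query: true when the alignment known for the
   memory reference is at least the size of the access.  */
static bool
is_naturally_aligned (unsigned known_align_bits, unsigned access_bits)
{
  assert (access_bits != 0);
  return known_align_bits >= access_bits;
}

/* Pick the load form for a 64-bit load-duplicate.  */
static enum load_choice
choose_dup_load (unsigned known_align_bits)
{
  return is_naturally_aligned (known_align_bits, 64)
	 ? LOAD_GENERAL_MOVE
	 : LOAD_VLD1_64;
}
```

With this sketch, a reference known to be 64-bit aligned takes the general move, while one known only to be byte aligned keeps vld1.64.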
Did an armeb-none-linux-gnueabihf build.
The vld1Q_dup*64* tests now pass on big and little endian.
arm-none-linux-gnueabihf bootstrap on Chromebook successful.
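As a rough picture of what a run-time check of the load-duplicate behaviour verifies, here is a portable C model. It is a sketch only: the real tests use the arm_neon.h intrinsics, whereas the type and function names below are hypothetical and the load is modelled with memcpy so that it is valid for any alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a 128-bit vector of two 64-bit lanes.  */
typedef struct
{
  int64_t lane[2];
} model_int64x2_t;

/* Model of a 64-bit load-duplicate: read one doubleword from ADDR and
   copy it into both lanes.  memcpy keeps the read valid at any
   alignment.  */
static model_int64x2_t
model_vld1q_dup_s64 (const void *addr)
{
  int64_t v;
  model_int64x2_t r;
  memcpy (&v, addr, sizeof v);
  r.lane[0] = v;
  r.lane[1] = v;
  return r;
}

/* Store a known value at BUF + OFFSET, load-duplicate it back and
   report whether both lanes hold that value.  A non-zero OFFSET that
   is not a multiple of 8 exercises the misaligned case.  */
static int
check_dup_at_offset (size_t offset)
{
  unsigned char buf[sizeof (int64_t) + 8];
  const int64_t expected = INT64_C (0x1122334455667788);
  model_int64x2_t r;

  assert (offset <= 8);
  memcpy (buf + offset, &expected, sizeof expected);
  r = model_vld1q_dup_s64 (buf + offset);
  return r.lane[0] == expected && r.lane[1] == expected;
}
```

The check depends only on the resulting lane values, not on which instruction performed the load, which is the property the run-time tests rely on.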
This is a regression since 4.7. I've tested this on trunk. Will test this on the
4.8 and 4.7 branches.
My apologies, I misread the bug report as saying that this appears on 4.7. I
couldn't reproduce it there, since the offending pattern didn't exist then.
I'm testing a 4.8 variant of this patch.
Kyrill
Ok for those branches if no regressions?
Thanks,
Kyrill
2014-02-11  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

	PR target/55426
	* config/arm/neon.md (neon_vld1_dupv2di): Do not generate
	low and high part subregs, use hard reg numbers.
	* config/arm/arm.c (arm_mem_aligned_p): New function.
	(arm_init_neon_builtins): Allow for memory operands
	in load operations.
	* config/arm/arm-protos.h (arm_mem_aligned_p): Declare
	extern.
	* config/arm/constraints.md (Uo): New constraint.

2014-02-11  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>

	PR target/55426
	* gcc.target/arm/neon/vld1Q_dupp64.c: Change to run-time test.
	* gcc.target/arm/neon/vld1Q_dups64.c: Likewise.
	* gcc.target/arm/neon/vld1Q_dupu64.c: Likewise.
	* gcc.target/arm/neon/vld1Q_dupp64_misaligned.c: New test.
	* gcc.target/arm/neon/vld1Q_dups64_misaligned.c: Likewise.
	* gcc.target/arm/neon/vld1Q_dupu64_misaligned.c: Likewise.