This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 1/3] [ARM] PR63870 Add qualifiers for NEON builtins
- From: Alan Lawrence <alan dot lawrence at arm dot com>
- To: Charles Baylis <charles dot baylis at linaro dot org>
- Cc: Kyrylo Tkachov <Kyrylo dot Tkachov at arm dot com>, Ramana Radhakrishnan <Ramana dot Radhakrishnan at arm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 19 Oct 2015 17:56:11 +0100
- Subject: Re: [PATCH 1/3] [ARM] PR63870 Add qualifiers for NEON builtins
On 14/10/15 23:02, Charles Baylis wrote:
> On 12 October 2015 at 11:58, Alan Lawrence <alan.lawrence@arm.com> wrote:
>>
>> Given we are making changes here to how this all works on bigendian, have
>> you tested armeb at all?
>
> I tested on big endian, and it passes, except....
Well, I asked because it seemed good to make sure that the changes/improvements
to how lane-swapping is done weren't breaking anything on armeb by the back
door, and so thank you, I'm happy with that as far as your patch is concerned ;).
> for a testsuite issue
> with the *_f16 tests, which fail because they are built without the
> fp16 options on big endian. This is because
> check_effective_target_arm_neon_fp16_ok_nocache gets an ICE when it
> attempts to compile the test program. I think those fp16 intrinsics
> are in your area, do you want to take a look? :)
Heh, yes, I see ;). So I've dug into this a bit, and the problem seems to be
that we don't define a movv4hf pattern, and hence we fall back to
emit_move_multi_word. This uses subregs, and in simplify_subreg_regno,
REG_CANNOT_CHANGE_MODE_P is true on big-endian (but false on little-endian),
so the subreg is rejected and we ICE.
That is, I *think* the right thing to do is just to add a movv4hf (and v8hf)
pattern, I'm testing this now....
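To make the chain of reasoning above concrete, here is a toy model of it in C. This is only a sketch: every name in it (struct target, cannot_change_mode, expand_vector_move, the enum values) is hypothetical and stands in for the real expander logic; none of it is GCC code, and it only mirrors the three outcomes described above.

```c
#include <stdbool.h>

/* Toy model only: all names here are hypothetical, not GCC internals. */
enum move_result {
    MOVE_VIA_PATTERN,       /* a dedicated mov pattern handles the move */
    MOVE_VIA_WORD_SUBREGS,  /* fallback: word-by-word move through subregs */
    MOVE_ICE                /* fallback attempted, but the subreg is rejected */
};

struct target {
    bool big_endian;        /* armeb vs. little-endian arm */
    bool has_mov_pattern;   /* stands in for a movv4hf pattern existing */
};

/* Stand-in for the mode-change restriction described in the mail:
   assumed true on big-endian and false on little-endian. */
static bool cannot_change_mode(const struct target *t)
{
    return t->big_endian;
}

static enum move_result expand_vector_move(const struct target *t)
{
    if (t->has_mov_pattern)
        return MOVE_VIA_PATTERN;        /* no subregs needed at all */

    /* No pattern: fall back to moving one word at a time via subregs. */
    if (cannot_change_mode(t))
        return MOVE_ICE;                /* subreg cannot be formed */

    return MOVE_VIA_WORD_SUBREGS;
}
```

In this model, a big-endian target without the pattern hits MOVE_ICE, a little-endian one gets away with the subreg fallback, and adding the pattern avoids the fallback on both, which is the motivation for adding movv4hf (and v8hf).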
Cheers, Alan