This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [Patch ARM 1/3] Neon intrinsics TLC : Replace intrinsics with GNU C implementations where possible.
- From: Ramana Radhakrishnan <ramrad01 at arm dot com>
- To: Julian Brown <julian at codesourcery dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Christophe Lyon <christophe dot lyon at linaro dot org>
- Date: Mon, 28 Apr 2014 14:01:36 +0100
- Subject: Re: [Patch ARM 1/3] Neon intrinsics TLC : Replace intrinsics with GNU C implementations where possible.
- Authentication-results: sourceware.org; auth=none
- References: <535E304D dot 7000800 at arm dot com> <535E30F1 dot 4020902 at arm dot com> <20140428124410 dot 15aa5246 at octopus>
On Mon, Apr 28, 2014 at 12:44 PM, Julian Brown <julian@codesourcery.com>
wrote:
> On Mon, 28 Apr 2014 11:44:01 +0100
> Ramana Radhakrishnan <ramrad01@arm.com> wrote:
>
>> I've special cased the ffast-math case for the _f32 intrinsics to
>> prevent the auto-vectorizer from coming along and vectorizing addv2sf
>> and addv4sf type operations which we don't want to happen by default.
>> Patch 1/3 causes apparent "regressions" in the rather ineffective
>> neon intrinsics tests that we currently carry soon hopefully to be
>> replaced by Christophe Lyon's rewrite that is being reviewed. On the
>> whole I deem this patch stack to be safe to go in if necessary. These
>> "regressions" are for -O0 with the vbic and vorn intrinsics which
>> don't now get combined and well, so be it.
>
> I think reimplementing these intrinsics in C is a mistake if we ever
> hope to make big-endian mode work properly, and "fixing" the generated
> header file by bypassing the generator makes it harder to accurately
> perform the sweeping changes that will probably be necessary to do that.#
> Recall e.g. the discussion around:
>
> http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00161.html
Well, it would help if the generator were written in a better language
than ML :) . While I don't mind the different language in the backend
once in a while the problem is that everytime anyone needs to make a
change to this file, we spend far more time relearning ML than actually
doing the change :(.
>
> Generally (though in this case it's merely an implementation detail)
> the NEON intrinsics and GCC's generic vector support cannot be expected
> to interwork properly (because of incompatible lane ordering). Of
> course we get away with it in little-endian mode though, and I guess
> the bridge has already been crossed by earlier patches.
Please note that I have been very careful about doing only those
operations that will not be afflicted by big endian. I am not touching
any of the lane-wise intrinsics or intrinsics that touch lane numbers.
It is the intrinsics that have explicit lane numbering that have the
issue and not the intrinsics I have touched. What's being done here is
similar to how these particular intrinsics have been dealt with in the
AArch64 backend and we don't see any issues with these intrinsics in the
big endian mode and I will not expect these intrinsics to be more broken
in big-endian than they are currently with this patch or these set of
patches.
What specifically are you worried about with Patch 1/3 with respect to
big endian in this case ? I agree that there may be issues with the
specific "lane" extraction and vector lane numbering extensions that GCC
has in big-endian mode vs Neon intrinsics but otherwise this change
should *not* cause any issues in that space.
What specifically are you worried about with this patch other than
losing the ability to auto-generate these intrinsics - the patch as is
doesn't do anything but touch all those that operate on the entire
vector and have no dependence at all on lane numbering ?
regards
Ramana