This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Patch ARM 1/3] Neon intrinsics TLC : Replace intrinsics with GNU C implementations where possible.



On Mon, Apr 28, 2014 at 12:44 PM, Julian Brown <julian@codesourcery.com> wrote:
> On Mon, 28 Apr 2014 11:44:01 +0100
> Ramana Radhakrishnan <ramrad01@arm.com> wrote:
>
>> I've special cased the ffast-math case for the _f32 intrinsics to
>> prevent the auto-vectorizer from coming along and vectorizing addv2sf
>> and addv4sf type operations which we don't want to happen by default.
>> Patch 1/3 causes apparent "regressions" in the rather ineffective
>> neon intrinsics tests that we currently carry soon hopefully to be
>> replaced by Christophe Lyon's rewrite that is being reviewed. On the
>> whole I deem this patch stack to be safe to go in if necessary. These
>> "regressions" are for -O0 with the vbic and vorn intrinsics which
>> don't now get combined and well, so be it.
>
> I think reimplementing these intrinsics in C is a mistake if we ever
> hope to make big-endian mode work properly, and "fixing" the generated
> header file by bypassing the generator makes it harder to accurately
> perform the sweeping changes that will probably be necessary to do that.#


> Recall e.g. the discussion around:

>
> http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00161.html

Well, it would help if the generator were written in a better language than ML :) . While I don't mind the different language in the backend once in a while the problem is that everytime anyone needs to make a change to this file, we spend far more time relearning ML than actually doing the change :(.

>
> Generally (though in this case it's merely an implementation detail)
> the NEON intrinsics and GCC's generic vector support cannot be expected
> to interwork properly (because of incompatible lane ordering). Of
> course we get away with it in little-endian mode though, and I guess
> the bridge has already been crossed by earlier patches.

Please note that I have been very careful about doing only those operations that will not be afflicted by big endian. I am not touching any of the lane-wise intrinsics or intrinsics that touch lane numbers. It is the intrinsics that have explicit lane numbering that have the issue and not the intrinsics I have touched. What's being done here is similar to how these particular intrinsics have been dealt with in the AArch64 backend and we don't see any issues with these intrinsics in the big endian mode and I will not expect these intrinsics to be more broken in big-endian than they are currently with this patch or these set of patches.

What specifically are you worried about with Patch 1/3 with respect to big endian in this case ? I agree that there may be issues with the specific "lane" extraction and vector lane numbering extensions that GCC has in big-endian mode vs Neon intrinsics but otherwise this change should *not* cause any issues in that space.

What specifically are you worried about with this patch other than losing the ability to auto-generate these intrinsics - the patch as is doesn't do anything but touch all those that operate on the entire vector and have no dependence at all on lane numbering ?

regards
Ramana


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]