This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Patch][ARM,AArch64] more poly64 intrinsics and tests


Hi Christophe,

On 06/02/17 08:31, Christophe Lyon wrote:
On 4 February 2017 at 22:54, James Greenhalgh <james.greenhalgh@arm.com> wrote:
On Wed, Jan 11, 2017 at 11:13:07AM +0100, Christophe Lyon wrote:
Ping?

James, I'm not sure whether your comment was a request for a new
version of my patch or just FYI?
Sorry that this was unclear. I was looking for a new version of the patch
covering this comment. Otherwise we just have debt to go fix it in future.

With the suggested change, the AArch64 parts of this patch are OK - adding
missing intrinsics is very safe (even in Stage 4).

Please post an updated patch, and give Richard and Marcus a reasonable
amount of tiume to object to taking the patch this late. (and you need an
AArch32 OK too).

Thanks,
James

Hi James,

Thanks for the clarification, here is an updated patch.

I had to make a few changes after rebasing, and I also took the opportunity to
rewrite the existing vtst_p8, vtst_p16, vtstq_p8 and vtstq_p16 without an
asm() statement.

As before, the aarch64 and aarch32 updates to arm_neon.h are independent,
but I found it simpler to group them, as they imply updates to the same test.

Tested as usual, cross-testing on several arm* and aarch64* configurations,
no regression.

OK?

Ok for arm.

Thanks,
Kyrill

Thanks,

Christophe


On 3 January 2017 at 16:47, Christophe Lyon <christophe.lyon@linaro.org> wrote:
Ping?


On 14 December 2016 at 23:09, Christophe Lyon
<christophe.lyon@linaro.org> wrote:
On 14 December 2016 at 17:55, James Greenhalgh <james.greenhalgh@arm.com> wrote:
On Mon, Dec 12, 2016 at 05:03:31PM +0100, Christophe Lyon wrote:
Hi,

After the recent update from Tamar, I noticed a few discrepancies
between ARM and AArch64 regarding a few poly64 intrinsics.

This patch:
- adds vtst_p64 and vtstq_p64 to AArch64's arm_neon.h
- adds vgetq_lane_p64, vset_lane_p64 and vsetq_lane_p64 to ARM's arm_neon.h
( vget_lane_p64 was already there)
- adds the corresponding tests, and moves the vget_lane_p64 ones out
of the #ifdef __aarch64__ zone.

Cross-tested on arm* and aarch64* targets.

OK?
The AArch64 parts of this look fine to me, but I do have one question on
your inline assembly implementation for vtstq_p64:

+__extension__ extern __inline uint64x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vtstq_p64 (poly64x2_t a, poly64x2_t b)
+{
+  uint64x2_t result;
+  __asm__ ("cmtst %0.2d, %1.2d, %2.2d"
+           : "=w"(result)
+           : "w"(a), "w"(b)
+           : /* No clobbers */);
+  return result;
+}
+
Why can this not be written as many of the other vtstq intrinsics are; e.g.:

    __extension__ extern __inline uint64x2_t
   __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
   vtstq_p64 (poly64x2_t __a, poly64x2_t __b)
   {
     return (uint64x2_t) ((((uint64x2_t) __a) & ((uint64x2_t) __b))
                           != __AARCH64_INT64_C (0));
   }

I don't know, I just followed the pattern used for vtstq_p8 and vtstq_p16
just above...


Thanks,
James

gcc/ChangeLog:

2016-12-12  Christophe Lyon  <christophe.lyon@linaro.org>

       * config/aarch64/arm_neon.h (vtst_p64): New.
       (vtstq_p64): New.
       * config/arm/arm_neon.h (vgetq_lane_p64): New.
       (vset_lane_p64): New.
       (vsetq_lane_p64): New.

gcc/testsuite/ChangeLog:

2016-12-12  Christophe Lyon  <christophe.lyon@linaro.org>

       * gcc.target/aarch64/advsimd-intrinsics/p64_p128.c
       (vget_lane_expected, vset_lane_expected, vtst_expected_poly64x1):
       New.
       (vmov_n_expected0, vmov_n_expected1, vmov_n_expected2)
       (expected_vld_st2_0, expected_vld_st2_1, expected_vld_st3_0)
       (expected_vld_st3_1, expected_vld_st3_2, expected_vld_st4_0)
       (expected_vld_st4_1, expected_vld_st4_2, expected_vld_st4_3)
       (vtst_expected_poly64x2): Move to aarch64-only section.
       (vget_lane_p64, vgetq_lane_p64, vset_lane_p64, vsetq_lane_p64)
       (vtst_p64, vtstq_p64): New tests.




Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]