[AARCH64][PATCH 2/3] Implementing vmulx_lane NEON intrinsic variants

Sun Nov 22 15:24:00 GMT 2015

On Mon, Nov 09, 2015 at 11:03:28AM +0000, Bilyan Borisov wrote:
> 
> 
> On 03/11/15 11:16, James Greenhalgh wrote:
> >On Fri, Oct 30, 2015 at 09:31:08AM +0000, Bilyan Borisov wrote:
> >>In this patch from the series, all vmulx_lane variants have been implemented as
> >>a vdup followed by a vmulx. Existing implementations of intrinsics were
> >>refactored to use this new approach.
> >>
> >>Several new nameless md patterns are added that will enable the combine pass to
> >>pick up the dup/fmulx combination and replace it with a proper fmulx[lane]
> >>instruction.
> >>
> >>In addition, test cases for all new intrinsics were added. Tested on targets
> >>aarch64-none-elf and aarch64_be-none-elf.
> >Hi,
> >
> >I have a small style comment below.
> >
> >>gcc/
> >>
> >>2015-XX-XX  Bilyan Borisov  <bilyan.borisov@arm.com>
> >>
> >>	* config/aarch64/arm_neon.h (vmulx_lane_f32): New.
> >>	(vmulx_lane_f64): New.
> >>	(vmulxq_lane_f32): Refactored & moved.
> >>	(vmulxq_lane_f64): Refactored & moved.
> >>	(vmulx_laneq_f32): New.
> >>	(vmulx_laneq_f64): New.
> >>	(vmulxq_laneq_f32): New.
> >>	(vmulxq_laneq_f64): New.
> >>	(vmulxs_lane_f32): New.
> >>	(vmulxs_laneq_f32): New.
> >>	(vmulxd_lane_f64): New.
> >>	(vmulxd_laneq_f64): New.
> >>	* config/aarch64/aarch64-simd.md (*aarch64_combine_dupfmulx1<mode>,
> >>	VDQSF): New pattern.
> >>	(*aarch64_combine_dupfmulx2<mode>, VDQF): New pattern.
> >>	(*aarch64_combine_dupfmulx3): New pattern.
> >>	(*aarch64_combine_vgetfmulx1<mode>, VDQF_DF): New pattern.
> >I'm not sure I like the use of 1,2,3 for this naming scheme. Elsewhere in
> >the file, this convention points to the number of operands a pattern
> >requires (for example add<mode>3).
> >
> >I think elsewhere in the file we use:
> >
> >
> >   "*aarch64_mul3_elt<mode>"
> >   "*aarch64_mul3_elt_<vswap_width_name><mode>"
> >   "*aarch64_mul3_elt_to_128df"
> >   "*aarch64_mul3_elt_to_64v2df"
> >
> >Is there a reason not to follow that pattern?
> >
> >Thanks,
> >James
> >
> Hi,
> 
> I've made the changes you've requested - the pattern names have been
> changed to follow better the naming conventions elsewhere in the
> file.

This is OK with a reformatting of some comments.

> +;; fmulxs_lane_f32, fmulxs_laneq_f32, fmulxd_lane_f64 ==  fmulx_lane_f64,
> +;; fmulxd_laneq_f64 == fmulx_laneq_f64

I'd rewrite this as so:

  ;; fmulxs_lane_f32, fmulxs_laneq_f32
  ;; fmulxd_lane_f64 ==  fmulx_lane_f64
  ;; fmulxd_laneq_f64 == fmulx_laneq_f64

The way you have it I was parsing it as all of {fmulxs_lane_f32,
fmulxs_laneq_f32, fmulxd_lane_f64} are the same as fmulx_lane_f64 - which
is not accurate.

Additionally, with all these comments I'd use the intrinsic name
(vmulx_lane_f32 rather than fmulx_lane_f32).

Sorry for the long wait for review.

I've committed it on your behalf as revision r230720.

Thanks,
James