This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [AARCH64][PATCH 2/3] Implementing vmulx_lane NEON intrinsic variants
- From: James Greenhalgh <james.greenhalgh@arm.com>
- To: Bilyan Borisov <bilyan.borisov@foss.arm.com>
- Cc: gcc-patches@gcc.gnu.org
- Date: Sun, 22 Nov 2015 15:17:40 +0000
- Subject: Re: [AARCH64][PATCH 2/3] Implementing vmulx_lane NEON intrinsic variants
- Authentication-results: sourceware.org; auth=none
- References: <563338DC.6060206@foss.arm.com> <20151103111636.GB18428@arm.com> <56407D80.10002@foss.arm.com>
On Mon, Nov 09, 2015 at 11:03:28AM +0000, Bilyan Borisov wrote:
>
>
> On 03/11/15 11:16, James Greenhalgh wrote:
> >On Fri, Oct 30, 2015 at 09:31:08AM +0000, Bilyan Borisov wrote:
> >>In this patch from the series, all vmulx_lane variants have been implemented as
> >>a vdup followed by a vmulx. Existing implementations of intrinsics were
> >>refactored to use this new approach.
> >>
> >>Several new nameless md patterns are added that will enable the combine pass to
> >>pick up the dup/fmulx combination and replace it with a proper fmulx[lane]
> >>instruction.
> >>
> >>In addition, test cases for all new intrinsics were added. Tested on targets
> >>aarch64-none-elf and aarch64_be-none-elf.
> >Hi,
> >
> >I have a small style comment below.
> >
> >>gcc/
> >>
> >>2015-XX-XX Bilyan Borisov <bilyan.borisov@arm.com>
> >>
> >> * config/aarch64/arm_neon.h (vmulx_lane_f32): New.
> >> (vmulx_lane_f64): New.
> >> (vmulxq_lane_f32): Refactored & moved.
> >> (vmulxq_lane_f64): Refactored & moved.
> >> (vmulx_laneq_f32): New.
> >> (vmulx_laneq_f64): New.
> >> (vmulxq_laneq_f32): New.
> >> (vmulxq_laneq_f64): New.
> >> (vmulxs_lane_f32): New.
> >> (vmulxs_laneq_f32): New.
> >> (vmulxd_lane_f64): New.
> >> (vmulxd_laneq_f64): New.
> >> * config/aarch64/aarch64-simd.md (*aarch64_combine_dupfmulx1<mode>,
> >> VDQSF): New pattern.
> >> (*aarch64_combine_dupfmulx2<mode>, VDQF): New pattern.
> >> (*aarch64_combine_dupfmulx3): New pattern.
> >> (*aarch64_combine_vgetfmulx1<mode>, VDQF_DF): New pattern.
> >I'm not sure I like the use of 1,2,3 for this naming scheme. Elsewhere in
> >the file, this convention points to the number of operands a pattern
> >requires (for example add<mode>3).
> >
> >I think elsewhere in the file we use:
> >
> >
> > "*aarch64_mul3_elt<mode>"
> > "*aarch64_mul3_elt_<vswap_width_name><mode>"
> > "*aarch64_mul3_elt_to_128df"
> > "*aarch64_mul3_elt_to_64v2df"
> >
> >Is there a reason not to follow that pattern?
> >
> >Thanks,
> >James
> >
> Hi,
>
> I've made the changes you requested - the pattern names have been
> changed to better follow the naming conventions used elsewhere in
> the file.
This is OK with a reformatting of some comments.
> +;; fmulxs_lane_f32, fmulxs_laneq_f32, fmulxd_lane_f64 == fmulx_lane_f64,
> +;; fmulxd_laneq_f64 == fmulx_laneq_f64
I'd rewrite this as so:
;; fmulxs_lane_f32, fmulxs_laneq_f32
;; fmulxd_lane_f64 == fmulx_lane_f64
;; fmulxd_laneq_f64 == fmulx_laneq_f64
The way you have it I was parsing it as all of {fmulxs_lane_f32,
fmulxs_laneq_f32, fmulxd_lane_f64} are the same as fmulx_lane_f64 - which
is not accurate.
Additionally, with all these comments I'd use the intrinsic name
(vmulx_lane_f32 rather than fmulx_lane_f32).
Sorry for the long wait for review.
I've committed it on your behalf as revision r230720.
Thanks,
James