[PATCH 1/7 V4] ifn/optabs: Support vector load/store with length

Richard Biener rguenther@suse.de
Tue Jun 23 06:47:08 GMT 2020


On Mon, 22 Jun 2020, Richard Sandiford wrote:

> "Kewen.Lin" <linkw@linux.ibm.com> writes:
> > @@ -5167,6 +5167,24 @@ mode @var{n}.
> >  
> >  This pattern is not allowed to @code{FAIL}.
> >  
> > +@cindex @code{lenload@var{m}} instruction pattern
> > +@item @samp{lenload@var{m}}
> > +Perform a vector load with length from memory operand 1 of mode @var{m}
> > +into register operand 0.  Length is provided in register operand 2 with
> > +appropriate mode which should afford the maximal required precision of
> > +any available lengths.
> 
> I think we need to say in more detail what “load with length” actually
> means.  How about:
> 
>   Load the number of bytes specified by operand 2 from memory operand 1
>   into register operand 0, setting the other bytes of operand 0 to
>   undefined values.  Operands 0 and 1 have mode @var{m}.  Operand 2 has
>   whichever integer mode the target prefers.

IMHO it should also say whether length may be >= the mode byte size
(should we say "units" instead of bytes everywhere?), and if so what
the behavior is.  That is, whether explicit masking is required
or if there's implicit masking or whether values >= the mode byte size
will load all bytes.

Does the number of bytes have to be a multiple of the vector component
size?  That is, is there any difference between lenloadv4si and
lenloadv8hi?

> > +@cindex @code{lenstore@var{m}} instruction pattern
> > +@item @samp{lenstore@var{m}}
> > +Perform a vector store with length from register operand 1 of mode @var{m}
> > +into memory operand 0.  Length is provided in register operand 2 with
> > +appropriate mode which should afford the maximal required precision of
> > +any available lengths.
> 
> Similarly here:
> 
>   Store the number of bytes specified by operand 2 from nonmemory operand 1
>   into memory operand 0, leaving the other bytes of operand 0 unchanged.
>   Operands 0 and 1 have mode @var{m}.  Operand 2 has whichever integer
>   mode the target prefers.
> 
> > @@ -2478,7 +2480,7 @@ expand_call_mem_ref (tree type, gcall *stmt, int index)
> >    return fold_build2 (MEM_REF, type, addr, build_int_cst (alias_ptr_type, 0));
> >  }
> >  
> > -/* Expand MASK_LOAD{,_LANES} call STMT using optab OPTAB.  */
> > +/* Expand MASK_LOAD{,_LANES} and LEN_LOAD call STMT using optab OPTAB.  */
> 
> s/and/or/.
> 
> >  
> >  static void
> >  expand_mask_load_optab_fn (internal_fn, gcall *stmt, convert_optab optab)
> 
> Think it would be worth generalising the name, e.g. to
> expand_partial_load_optab_fn, and adding a #define for
> expand_mask_load_optab_fn before the other two #defines.
> 
> Same comments for stores.
> 
> > diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
> > index 1d190d492ff..ed6561f296a 100644
> > --- a/gcc/internal-fn.def
> > +++ b/gcc/internal-fn.def
> > @@ -49,11 +49,13 @@ along with GCC; see the file COPYING3.  If not see
> >     - load_lanes: currently just vec_load_lanes
> >     - mask_load_lanes: currently just vec_mask_load_lanes
> >     - gather_load: used for {mask_,}gather_load
> > +   - len_load: currently just lenload
> >  
> >     - mask_store: currently just maskstore
> >     - store_lanes: currently just vec_store_lanes
> >     - mask_store_lanes: currently just vec_mask_store_lanes
> >     - scatter_store: used for {mask_,}scatter_store
> > +   - len_store: currently just lenstore
> >  
> >     - unary: a normal unary optab, such as vec_reverse_<mode>
> >     - binary: a normal binary optab, such as vec_interleave_lo_<mode>
> > @@ -127,6 +129,8 @@ DEF_INTERNAL_OPTAB_FN (GATHER_LOAD, ECF_PURE, gather_load, gather_load)
> >  DEF_INTERNAL_OPTAB_FN (MASK_GATHER_LOAD, ECF_PURE,
> >  		       mask_gather_load, gather_load)
> >  
> > +DEF_INTERNAL_OPTAB_FN (LEN_LOAD, ECF_PURE, lenload, len_load)
> > +
> >  DEF_INTERNAL_OPTAB_FN (SCATTER_STORE, 0, scatter_store, scatter_store)
> >  DEF_INTERNAL_OPTAB_FN (MASK_SCATTER_STORE, 0,
> >  		       mask_scatter_store, scatter_store)
> > @@ -136,6 +140,8 @@ DEF_INTERNAL_OPTAB_FN (STORE_LANES, ECF_CONST, vec_store_lanes, store_lanes)
> >  DEF_INTERNAL_OPTAB_FN (MASK_STORE_LANES, 0,
> >  		       vec_mask_store_lanes, mask_store_lanes)
> >  
> > +DEF_INTERNAL_OPTAB_FN (LEN_STORE, 0, lenstore, len_store)
> > +
> >  DEF_INTERNAL_OPTAB_FN (WHILE_ULT, ECF_CONST | ECF_NOTHROW, while_ult, while)
> >  DEF_INTERNAL_OPTAB_FN (CHECK_RAW_PTRS, ECF_CONST | ECF_NOTHROW,
> >  		       check_raw_ptrs, check_ptrs)
> > diff --git a/gcc/optabs.def b/gcc/optabs.def
> > index 0c64eb52a8d..9fe4ac1840d 100644
> > --- a/gcc/optabs.def
> > +++ b/gcc/optabs.def
> > @@ -435,3 +435,5 @@ OPTAB_D (check_war_ptrs_optab, "check_war_ptrs$a")
> >  OPTAB_DC (vec_duplicate_optab, "vec_duplicate$a", VEC_DUPLICATE)
> >  OPTAB_DC (vec_series_optab, "vec_series$a", VEC_SERIES)
> >  OPTAB_D (vec_shl_insert_optab, "vec_shl_insert_$a")
> > +OPTAB_D (lenload_optab, "lenload$a")
> > +OPTAB_D (lenstore_optab, "lenstore$a")
> 
> Sorry, I should have picked up on this last time, but I think we should
> be consistent about whether there's an underscore after “len” or not.
> I realise this is just replicating what happens for IFN_MASK_LOAD/
> “maskload” and IFN_MASK_STORE/“maskstore”, but it's something I kept
> tripping over when implementing those for SVE.
> 
> Personally I think it is easier to read with the underscore, so this
> would be “len_load_optab” and “len_load$a” (or “len_load_$a”,
> there's no real consistency on that).  Same for stores.
> 
> Thanks,
> Richard
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)


More information about the Gcc-patches mailing list