[Patch, ARM] Implement widening vector moves and mults.

Richard Earnshaw rearnsha@arm.com
Tue Aug 24 09:49:00 GMT 2010


On Thu, 2010-08-19 at 15:14 +0100, Tejas Belagod wrote:
> Hi,
> 
> Take 2 with the patch!
> 
> This patch implements support for vector widening signed and unsigned
> moves and multiplications, viz. the VMOVL.<sign><size> and
> VMULL.<sign><size> NEON instructions. This helps vectorize loops whose
> bodies contain widening moves or multiplications when compiled for NEON.
> The patch supports vectorizing both with and without
> -mvectorize-with-neon-quad.
> 
> Regression tested on trunk. OK for trunk?
> 
> --
> Tejas Belagod
> ARM.
> 
> gcc/testsuite
> 
> 2010-08-19 Tejas Belagod <tejas.belagod@arm.com>
> 
> 	* lib/target-supports.exp (check_effective_target_vect_unpack):
> 	Set vect_unpack supported flag to true for neon.
> 
> gcc/
> 
> 2010-08-19 Tejas Belagod <tejas.belagod@arm.com>
> 
> 	* config/arm/iterators.md (VU, SE, V_widen_l): New. 
> 	(V_unpack, US): New.
> 	* config/arm/neon.md (vec_unpack<US>_hi_<mode>): Expansion for
> 	vmovl.
> 	(vec_unpack<US>_lo_<mode>): Likewise.
> 	(neon_vec_unpack<US>_hi_<mode>): Instruction pattern for vmovl.
> 	(neon_vec_unpack<US>_lo_<mode>): Likewise.
> 	(vec_widen_<US>mult_lo_<mode>): Expansion for vmull.
> 	(vec_widen_<US>mult_hi_<mode>): Likewise.
> 	(neon_vec_<US>mult_lo_<mode>): Instruction pattern for vmull.
> 	(neon_vec_<US>mult_hi_<mode>): Likewise.
> 	(neon_unpack<US>_<mode>): Widening move intermediate step for
> 	vectorizing without -mvectorize-with-neon-quad.
> 	(neon_vec_<US>mult_<mode>): Widening multiply intermediate step
> 	for vectorizing without -mvectorize-with-neon-quad.
> 	* config/arm/predicates.md (vect_par_constant_high): Check for
> 	high-half lanes of a vector.
> 	(vect_par_constant_low): Check for low-half lanes of a vector.

+;; Assembler mnemonics for signedness of widening operations
This comment needs a full stop at the end of the sentence.


+;; Predicates for parallel expanders based on mode.
+(define_special_predicate "vect_par_constant_high" 
+  (match_code "parallel")
+{
[...]
+
+  for (i = 0; i < count; i++)
+   {
+     rtx elt = XVECEXP (op, 0, i);
+     int val = INTVAL (elt);

It's unlikely that this will ever fault, as the uses of this predicate
are fairly limited, but good coding practice says that you shouldn't
assume that.  So you need to confirm that elt is a const_int before
extracting its value (and if it is not, the predicate should fail to
match).  The same applies to vect_par_constant_low.
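
As a rough illustration of the check being asked for, here is a minimal
standalone sketch.  It does not use GCC's real rtx types: struct rtx_sketch,
the SK_* codes and par_selects_lanes are hypothetical stand-ins, and the
consecutive-lane comparison is only an assumption about what the elided
part of the predicate does.  The point is the ordering: test the element's
code first (the analogue of checking for a const_int), and only then read
its value (the analogue of INTVAL).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's rtx machinery, just enough to run.  */
enum rtx_code_sketch { SK_CONST_INT, SK_REG };

struct rtx_sketch
{
  enum rtx_code_sketch code;
  long value;			/* Meaningful only for SK_CONST_INT.  */
};

/* Return true if ELTS[0..COUNT-1] are all constant integers selecting
   consecutive lanes starting at FIRST_LANE.  The lane test is an
   assumption made for illustration; the code check is the point.  */
static bool
par_selects_lanes (const struct rtx_sketch *elts, int count, long first_lane)
{
  assert (elts != NULL || count == 0);

  for (int i = 0; i < count; i++)
    {
      /* Fail the match instead of reading a value that is not there.  */
      if (elts[i].code != SK_CONST_INT)
	return false;

      /* Only now is it safe to look at the value.  */
      if (elts[i].value != first_lane + i)
	return false;
    }

  return true;
}
```

With this shape, an element that is not a constant simply makes the
predicate fail to match, rather than being read as if it were one.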

Otherwise, this is OK.

R.


