[PATCH 2/3] Refactor widen_plus as internal_fn

Andre Vieira (lists) andre.simoesdiasvieira@arm.com
Mon May 15 11:53:52 GMT 2023



On 15/05/2023 12:01, Richard Biener wrote:
> On Mon, 15 May 2023, Richard Sandiford wrote:
> 
>> Richard Biener <rguenther@suse.de> writes:
>>> On Fri, 12 May 2023, Richard Sandiford wrote:
>>>
>>>> Richard Biener <rguenther@suse.de> writes:
>>>>> On Fri, 12 May 2023, Andre Vieira (lists) wrote:
>>>>>
>>>>>> I have dealt with, I think..., most of your comments. There's quite a few
>>>>>> changes, I think it's all a bit simpler now. I made some other changes to the
>>>>>> costing in tree-inline.cc and gimple-range-op.cc in which I try to preserve
>>>>>> the same behaviour as we had with the tree codes before. Also added some extra
>>>>>> checks to tree-cfg.cc that made sense to me.
>>>>>>
>>>>>> I am still regression testing the gimple-range-op change, as that was a last
>>>>>> minute change, but the rest survived a bootstrap and regression test on
>>>>>> aarch64-unknown-linux-gnu.
>>>>>>
>>>>>> cover letter:
>>>>>>
>>>>>> This patch replaces the existing tree_code widen_plus and widen_minus
>>>>>> patterns with internal_fn versions.
>>>>>>
>>>>>> DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
>>>>>> are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN respectively
>>>>>> except they provide convenience wrappers for defining conversions that require
>>>>>> a hi/lo split.  Each definition for <NAME> will require optabs for _hi and _lo
>>>>>> and each of those will also require a signed and unsigned version in the case
>>>>>> of widening. The hi/lo pair is necessary because the widening and narrowing
>>>>>> operations take n narrow elements as inputs and return n/2 wide elements as
>>>>>> outputs. The 'lo' operation operates on the first n/2 elements of input. The
>>>>>> 'hi' operation operates on the second n/2 elements of input. Defining an
>>>>>> internal_fn along with hi/lo variations allows a single internal function to
>>>>>> be returned from a vect_recog function that will later be expanded to hi/lo.
>>>>>>
>>>>>>
>>>>>>   For example:
>>>>>>   IFN_VEC_WIDEN_PLUS -> IFN_VEC_WIDEN_PLUS_HI, IFN_VEC_WIDEN_PLUS_LO
>>>>>> for aarch64: IFN_VEC_WIDEN_PLUS_HI   -> vec_widen_<su>add_hi_<mode> ->
>>>>>> (u/s)addl2
>>>>>>                         IFN_VEC_WIDEN_PLUS_LO  -> vec_widen_<su>add_lo_<mode>
>>>>>> -> (u/s)addl
>>>>>>
>>>>>> This gives the same functionality as the previous WIDEN_PLUS/WIDEN_MINUS tree
>>>>>> codes which are expanded into VEC_WIDEN_PLUS_LO, VEC_WIDEN_PLUS_HI.
>>>>>
>>>>> What I still don't understand is how we are so narrowly focused on
>>>>> HI/LO?  We need a combined scalar IFN for pattern selection (not
>>>>> sure why that's now called _HILO, I expected no suffix).  Then there's
>>>>> three possibilities the target can implement this:
>>>>>
>>>>>   1) with a widen_[su]add<mode> instruction - I _think_ that's what
>>>>>      RISCV is going to offer since it is a target where vector modes
>>>>>      have "padding" (aka you cannot subreg a V2SI to get V4HI).  Instead
>>>>>      RVV can do a V4HI to V4SI widening and widening add/subtract
>>>>>      using vwadd[u] and vwsub[u] (the HI->SI widening is actually
>>>>>      done with a widening add of zero - eh).
>>>>>      IIRC GCN is the same here.
>>>>
>>>> SVE currently does this too, but the addition and widening are
>>>> separate operations.  E.g. in principle there's no reason why
>>>> you can't sign-extend one operand, zero-extend the other, and
>>>> then add the result together.  Or you could extend them from
>>>> different sizes (QI and HI).  All of those are supported
>>>> (if the costing allows them).
>>>
>>> I see.  So why does the target then expose widen_[su]add<mode> at all?
>>
>> It shouldn't (need to) do that.  I don't think we should have an optab
>> for the unsplit operation.
>>
>> At least on SVE, we really want the extensions to be fused with loads
>> (where possible) rather than with arithmetic.
>>
>> We can still do the widening arithmetic in one go.  It's just that
>> fusing with the loads works for the mixed-sign and mixed-size cases,
>> and can handle more than just doubling the element size.
>>
>>>> If the target has operations to do combined extending and adding (or
>>>> whatever), then at the moment we rely on combine to generate them.
>>>>
>>>> So I think this case is separate from Andre's work.  The addition
>>>> itself is just an ordinary addition, and any widening happens by
>>>> vectorising a CONVERT/NOP_EXPR.
>>>>
>>>>>   2) with a widen_[su]add{_lo,_hi}<mode> combo - that's what the tree
>>>>>      codes currently support (exclusively)
>>>>>   3) similar, but widen_[su]add{_even,_odd}<mode>
>>>>>
>>>>> that said, things like decomposes_to_hilo_fn_p look to paint us into
>>>>> a 2) corner without good reason.
>>>>
>>>> I suppose one question is: how much of the patch is really specific
>>>> to HI/LO, and how much is just grouping two halves together?
>>>
>>> Yep, that I don't know for sure.
>>>
>>>>   The nice
>>>> thing about the internal-fn grouping macros is that, if (3) is
>>>> implemented in future, the structure will strongly encourage even/odd
>>>> pairs to be supported for all operations that support hi/lo.  That is,
>>>> I would expect the grouping macros to be extended to define even/odd
>>>> ifns alongside hi/lo ones, rather than adding separate definitions
>>>> for even/odd functions.
>>>>
>>>> If so, at least from the internal-fn.* side of things, I think the question
>>>> is whether it's OK to stick with hilo names for now, or whether we should
>>>> use more forward-looking names.
>>>
>>> I think for parts that are independent we could use a more
>>> forward-looking name.  Maybe _halves?
>>
>> Using _halves for the ifn macros sounds good to me FWIW.
>>
>>> But I'm also not sure
>>> how much of that is really needed (it seems to be tied around
>>> optimizing optabs space?)
>>
>> Not sure what you mean by "this".  Optabs space shouldn't be a problem
>> though.  The optab encoding gives us a full int to play with, and it
>> could easily go up to 64 bits if necessary/convenient.
>>
>> At least on the internal-fn.* side, the aim is really just to establish
>> a regular structure, so that we don't have arbitrary differences between
>> different widening operations, or too much cut-&-paste.
> 
> Hmm, I'm looking at the need for the std::map and
> internal_fn_hilo_keys_array and internal_fn_hilo_values_array.
> The vectorizer pieces contain
> 
> +  if (code.is_fn_code ())
> +     {
> +      internal_fn ifn = as_internal_fn ((combined_fn) code);
> +      gcc_assert (decomposes_to_hilo_fn_p (ifn));
> +
> +      internal_fn lo, hi;
> +      lookup_hilo_internal_fn (ifn, &lo, &hi);
> +      *code1 = as_combined_fn (lo);
> +      *code2 = as_combined_fn (hi);
> +      optab1 = lookup_hilo_ifn_optab (lo, !TYPE_UNSIGNED (vectype));
> +      optab2 = lookup_hilo_ifn_optab (hi, !TYPE_UNSIGNED (vectype));
> 
> so that tries to automatically associate the scalar widening IFN
> with the set(s) of IFN pairs we can split to.  But then this
> list should be static and there's no need to create a std::map?
> Maybe gencfn-macros.cc can be enhanced to output these static
> cases?  Or the vectorizer could (as it did previously) simply
> open-code the handled cases (I guess since we deal with two
> cases only now I'd prefer that).
> 
> Thanks,
> Richard.
> 
> 
>> Thanks,
>> Richard
>>
> 
The patch I uploaded last no longer has std::map nor 
internal_fn_hilo_keys_array and internal_fn_hilo_values_array. (I've 
attached it again)
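
For reference, the lookup in the attached patch relies only on the layout
produced by the .def macro, which emits <NAME>_HILO, <NAME>_LO and <NAME>_HI
as consecutive enumerators.  A minimal standalone sketch of that idea follows.
It uses a toy enum rather than the real internal_fn, and every toy_* / TOY_*
name is made up for illustration:

```cpp
#include <cassert>

// Toy stand-in for internal-fn.def: each widening entry expands to three
// consecutive enumerators, in the order <NAME>_HILO, <NAME>_LO, <NAME>_HI.
#define TOY_WIDENING_FNS(X) \
  X (VEC_WIDEN_PLUS)        \
  X (VEC_WIDEN_MINUS)

enum toy_ifn
{
#define DEF_TOY(NAME) TOY_##NAME##_HILO, TOY_##NAME##_LO, TOY_##NAME##_HI,
  TOY_WIDENING_FNS (DEF_TOY)
#undef DEF_TOY
  TOY_LAST
};

// True only for the combined (not yet split) functions.
inline bool
toy_decomposes_to_hilo_p (toy_ifn fn)
{
  switch (fn)
    {
#define DEF_TOY(NAME) case TOY_##NAME##_HILO: return true;
      TOY_WIDENING_FNS (DEF_TOY)
#undef DEF_TOY
    default:
      return false;
    }
}

// Derive the LO/HI pair purely from the enumerator layout; no table needed.
inline void
toy_lookup_hilo (toy_ifn fn, toy_ifn *lo, toy_ifn *hi)
{
  assert (toy_decomposes_to_hilo_p (fn));
  *lo = toy_ifn (fn + 1);
  *hi = toy_ifn (fn + 2);
}
```

The trade-off is that the +1/+2 arithmetic silently depends on the order in
which the grouping macro defines the three functions, so the two have to be
kept in sync.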

I'm not sure I understand the _halves, do you mean that for the case 
where I had _hilo or _HILO before we rename that to _halves/_HALVES such 
that it later represents both _hi/_lo separation and _even/_odd?
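
To make the two groupings concrete, here is a rough standalone sketch of what
each "half" of an unsigned widening add would compute.  This is plain C++, not
GCC code: widen_add_half and half_kind are names invented for this example,
"lo"/"hi" follow the cover letter's description (first and second N/2
elements), and target endianness is ignored:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using narrow_vec = std::vector<std::uint8_t>;
using wide_vec = std::vector<std::uint16_t>;

enum class half_kind { lo, hi, even, odd };

// Widening add of one "half" of two N-element inputs, giving N/2 wide sums.
inline wide_vec
widen_add_half (const narrow_vec &a, const narrow_vec &b, half_kind kind)
{
  assert (a.size () == b.size () && a.size () % 2 == 0);
  const std::size_t n = a.size ();
  wide_vec out;
  out.reserve (n / 2);
  for (std::size_t i = 0; i < n / 2; ++i)
    {
      std::size_t src = 0;
      switch (kind)
        {
        case half_kind::lo:   src = i; break;          // first N/2 elements
        case half_kind::hi:   src = i + n / 2; break;  // second N/2 elements
        case half_kind::even: src = 2 * i; break;      // elements 0, 2, 4, ...
        case half_kind::odd:  src = 2 * i + 1; break;  // elements 1, 3, 5, ...
        }
      // Extend both inputs before adding so the sum cannot wrap at 8 bits.
      std::uint16_t sum = std::uint16_t (a[src]) + std::uint16_t (b[src]);
      out.push_back (sum);
    }
  return out;
}
```

Either pair (lo+hi or even+odd) covers all N input lanes exactly once; they
differ only in which lanes land in which result vector.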

And am I correct to assume we are just giving up on the idea of having an 
INTERNAL_OPTAB_FN for 1)?

Kind regards,
Andre
-------------- next part --------------
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index bfc98a8d943467b33390defab9682f44efab5907..ffbbecb9409e1c2835d658c2a8855cd0e955c0f2 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -4626,7 +4626,7 @@
   [(set_attr "type" "neon_<ADDSUB:optab>_long")]
 )
 
-(define_expand "vec_widen_<su>addl_lo_<mode>"
+(define_expand "vec_widen_<su>add_lo_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4638,7 +4638,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>addl_hi_<mode>"
+(define_expand "vec_widen_<su>add_hi_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4650,7 +4650,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>subl_lo_<mode>"
+(define_expand "vec_widen_<su>sub_lo_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
@@ -4662,7 +4662,7 @@
   DONE;
 })
 
-(define_expand "vec_widen_<su>subl_hi_<mode>"
+(define_expand "vec_widen_<su>sub_hi_<mode>"
   [(match_operand:<VWIDE> 0 "register_operand")
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 1 "register_operand"))
    (ANY_EXTEND:<VWIDE> (match_operand:VQW 2 "register_operand"))]
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 8b2882da4fe7da07d22b4e5384d049ba7d3907bf..0fd7e6cce8bbd4ecb8027b702722adcf6c32eb55 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1811,6 +1811,10 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex VEC_RSHIFT_EXPR
 @tindex VEC_WIDEN_MULT_HI_EXPR
 @tindex VEC_WIDEN_MULT_LO_EXPR
+@tindex IFN_VEC_WIDEN_PLUS_HI
+@tindex IFN_VEC_WIDEN_PLUS_LO
+@tindex IFN_VEC_WIDEN_MINUS_HI
+@tindex IFN_VEC_WIDEN_MINUS_LO
 @tindex VEC_WIDEN_PLUS_HI_EXPR
 @tindex VEC_WIDEN_PLUS_LO_EXPR
 @tindex VEC_WIDEN_MINUS_HI_EXPR
@@ -1861,6 +1865,33 @@ vector of @code{N/2} products. In the case of @code{VEC_WIDEN_MULT_LO_EXPR} the
 low @code{N/2} elements of the two vector are multiplied to produce the
 vector of @code{N/2} products.
 
+@item IFN_VEC_WIDEN_PLUS_HI
+@itemx IFN_VEC_WIDEN_PLUS_LO
+These internal functions represent widening vector addition of the high and low
+parts of the two input vectors, respectively.  Their operands are vectors that
+contain the same number of elements (@code{N}) of the same integral type. The
+result is a vector that contains half as many elements, of an integral type
+whose size is twice as wide.  In the case of @code{IFN_VEC_WIDEN_PLUS_HI} the
+high @code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} sums.  In the case of @code{IFN_VEC_WIDEN_PLUS_LO} the low
+@code{N/2} elements of the two vectors are added to produce the vector of
+@code{N/2} sums.
+
+@item IFN_VEC_WIDEN_MINUS_HI
+@itemx IFN_VEC_WIDEN_MINUS_LO
+These internal functions represent widening vector subtraction of the high and
+low parts of the two input vectors, respectively.  Their operands are vectors
+that contain the same number of elements (@code{N}) of the same integral type.
+The high/low elements of the second vector are subtracted from the high/low
+elements of the first. The result is a vector that contains half as many
+elements, of an integral type whose size is twice as wide.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_HI} the high @code{N/2} elements of the second
+vector are subtracted from the high @code{N/2} of the first to produce the
+vector of @code{N/2} differences.  In the case of
+@code{IFN_VEC_WIDEN_MINUS_LO} the low @code{N/2} elements of the second
+vector are subtracted from the low @code{N/2} of the first to produce the
+vector of @code{N/2} differences.
+
 @item VEC_WIDEN_PLUS_HI_EXPR
 @itemx VEC_WIDEN_PLUS_LO_EXPR
 These nodes represent widening vector addition of the high and low parts of
diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index 594bd3043f0e944299ddfff219f757ef15a3dd61..66636d82df27626e7911efd0cb8526921b39633f 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -1187,6 +1187,7 @@ gimple_range_op_handler::maybe_non_standard ()
 {
   range_operator *signed_op = ptr_op_widen_mult_signed;
   range_operator *unsigned_op = ptr_op_widen_mult_unsigned;
+  bool signed1, signed2, signed_ret;
   if (gimple_code (m_stmt) == GIMPLE_ASSIGN)
     switch (gimple_assign_rhs_code (m_stmt))
       {
@@ -1202,32 +1203,55 @@ gimple_range_op_handler::maybe_non_standard ()
 	  m_op1 = gimple_assign_rhs1 (m_stmt);
 	  m_op2 = gimple_assign_rhs2 (m_stmt);
 	  tree ret = gimple_assign_lhs (m_stmt);
-	  bool signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
-	  bool signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
-	  bool signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
-
-	  /* Normally these operands should all have the same sign, but
-	     some passes and violate this by taking mismatched sign args.  At
-	     the moment the only one that's possible is mismatch inputs and
-	     unsigned output.  Once ranger supports signs for the operands we
-	     can properly fix it,  for now only accept the case we can do
-	     correctly.  */
-	  if ((signed1 ^ signed2) && signed_ret)
-	    return;
-
-	  m_valid = true;
-	  if (signed2 && !signed1)
-	    std::swap (m_op1, m_op2);
-
-	  if (signed1 || signed2)
-	    m_int = signed_op;
-	  else
-	    m_int = unsigned_op;
+	  signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
+	  signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
+	  signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
 	  break;
 	}
 	default:
-	  break;
+	  return;
       }
+  else if (gimple_code (m_stmt) == GIMPLE_CALL
+      && gimple_call_internal_p (m_stmt)
+      && gimple_get_lhs (m_stmt) != NULL_TREE)
+    switch (gimple_call_internal_fn (m_stmt))
+      {
+      case IFN_VEC_WIDEN_PLUS_LO:
+      case IFN_VEC_WIDEN_PLUS_HI:
+	  {
+	    signed_op = ptr_op_widen_plus_signed;
+	    unsigned_op = ptr_op_widen_plus_unsigned;
+	    m_valid = false;
+	    m_op1 = gimple_call_arg (m_stmt, 0);
+	    m_op2 = gimple_call_arg (m_stmt, 1);
+	    tree ret = gimple_get_lhs (m_stmt);
+	    signed1 = TYPE_SIGN (TREE_TYPE (m_op1)) == SIGNED;
+	    signed2 = TYPE_SIGN (TREE_TYPE (m_op2)) == SIGNED;
+	    signed_ret = TYPE_SIGN (TREE_TYPE (ret)) == SIGNED;
+	    break;
+	  }
+      default:
+	return;
+      }
+  else
+    return;
+
+    /* Normally these operands should all have the same sign, but some passes
+       violate this by taking mismatched sign args.  At the moment the only
+       one that's possible is mismatched inputs and unsigned output.  Once
+       ranger supports signs for the operands we can properly fix it; for now
+       only accept the case we can do correctly.  */
+    if ((signed1 ^ signed2) && signed_ret)
+      return;
+
+    m_valid = true;
+    if (signed2 && !signed1)
+      std::swap (m_op1, m_op2);
+
+    if (signed1 || signed2)
+      m_int = signed_op;
+    else
+      m_int = unsigned_op;
 }
 
 // Set up a gimple_range_op_handler for any built in function which can be
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 5c9da73ea11f8060b18dcf513599c9694fa4f2ad..1acea5ae33046b70de247b1688aea874d9956abc 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -90,6 +90,19 @@ lookup_internal_fn (const char *name)
   return entry ? *entry : IFN_LAST;
 }
 
+/*  Given an internal_fn IFN that is a HILO function, return its corresponding
+    LO and HI internal_fns.  */
+
+extern void
+lookup_hilo_internal_fn (internal_fn ifn, internal_fn *lo, internal_fn *hi)
+{
+  gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+  *lo = internal_fn (ifn + 1);
+  *hi = internal_fn (ifn + 2);
+}
+
+
 /* Fnspec of each internal function, indexed by function number.  */
 const_tree internal_fn_fnspec_array[IFN_LAST + 1];
 
@@ -137,7 +150,16 @@ const direct_internal_fn_info direct_internal_fn_array[IFN_LAST + 1] = {
 #define DEF_INTERNAL_OPTAB_FN(CODE, FLAGS, OPTAB, TYPE) TYPE##_direct,
 #define DEF_INTERNAL_SIGNED_OPTAB_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \
 				     UNSIGNED_OPTAB, TYPE) TYPE##_direct,
+#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(CODE, FLAGS, SELECTOR, SIGNED_OPTAB, \
+					    UNSIGNED_OPTAB, TYPE)		  \
+TYPE##_direct, TYPE##_direct, TYPE##_direct,
+#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(CODE, FLAGS, OPTAB, TYPE)	\
+TYPE##_direct, TYPE##_direct, TYPE##_direct,
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
   not_direct
 };
 
@@ -3852,7 +3874,7 @@ multi_vector_optab_supported_p (convert_optab optab, tree_pair types,
 
 /* Return the optab used by internal function FN.  */
 
-static optab
+optab
 direct_internal_fn_optab (internal_fn fn, tree_pair types)
 {
   switch (fn)
@@ -3971,6 +3993,9 @@ commutative_binary_fn_p (internal_fn fn)
     case IFN_UBSAN_CHECK_MUL:
     case IFN_ADD_OVERFLOW:
     case IFN_MUL_OVERFLOW:
+    case IFN_VEC_WIDEN_PLUS_HILO:
+    case IFN_VEC_WIDEN_PLUS_LO:
+    case IFN_VEC_WIDEN_PLUS_HI:
       return true;
 
     default:
@@ -4044,6 +4069,88 @@ first_commutative_argument (internal_fn fn)
     }
 }
 
+/* Return true if this CODE describes an internal_fn that returns a vector with
+   elements twice as wide as the element size of the input vectors.  */
+
+bool
+widening_fn_p (code_helper code)
+{
+  if (!code.is_fn_code ())
+    return false;
+
+  if (!internal_fn_p ((combined_fn) code))
+    return false;
+
+  internal_fn fn = as_internal_fn ((combined_fn) code);
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, F, S, SO, UO, T) \
+    case IFN_##NAME##_HILO:\
+    case IFN_##NAME##_HI: \
+    case IFN_##NAME##_LO: \
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+
+    default:
+      return false;
+    }
+}
+
+/* Return true if this CODE describes an internal_fn that returns a vector with
+   elements half as wide as the element size of the input vectors.  */
+
+bool
+narrowing_fn_p (code_helper code)
+{
+  if (!code.is_fn_code ())
+    return false;
+
+  if (!internal_fn_p ((combined_fn) code))
+    return false;
+
+  internal_fn fn = as_internal_fn ((combined_fn) code);
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, F, O, T) \
+    case IFN_##NAME##_HILO:\
+    case IFN_##NAME##_HI: \
+    case IFN_##NAME##_LO: \
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+
+    default:
+      return false;
+    }
+}
+
+/* Return true if FN decomposes to _hi and _lo IFN.  */
+
+bool
+decomposes_to_hilo_fn_p (internal_fn fn)
+{
+  switch (fn)
+    {
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, F, S, SO, UO, T) \
+    case IFN_##NAME##_HILO:\
+      return true;
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+    #define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, F, O, T) \
+    case IFN_##NAME##_HILO:\
+      return true;
+    #include "internal-fn.def"
+    #undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+    #undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+
+    default:
+      return false;
+    }
+}
+
 /* Return true if IFN_SET_EDOM is supported.  */
 
 bool
@@ -4071,7 +4178,33 @@ set_edom_supported_p (void)
     optab which_optab = direct_internal_fn_optab (fn, types);		\
     expand_##TYPE##_optab_fn (fn, stmt, which_optab);			\
   }
+#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(CODE, FLAGS, SELECTOR,	    \
+					    SIGNED_OPTAB, UNSIGNED_OPTAB,   \
+					    TYPE)			    \
+  static void								    \
+  expand_##CODE##_HILO (internal_fn fn ATTRIBUTE_UNUSED,		    \
+			gcall *stmt ATTRIBUTE_UNUSED)			    \
+  {									    \
+    gcc_unreachable ();							    \
+  }									    \
+  DEF_INTERNAL_SIGNED_OPTAB_FN(CODE##_HI, FLAGS, SELECTOR, SIGNED_OPTAB,    \
+			       UNSIGNED_OPTAB, TYPE)			    \
+  DEF_INTERNAL_SIGNED_OPTAB_FN(CODE##_LO, FLAGS, SELECTOR, SIGNED_OPTAB,    \
+			       UNSIGNED_OPTAB, TYPE)
+#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(CODE, FLAGS, OPTAB, TYPE)	\
+  static void								\
+  expand_##CODE##_HILO (internal_fn fn ATTRIBUTE_UNUSED,		\
+			gcall *stmt ATTRIBUTE_UNUSED)			\
+  {									\
+    gcc_unreachable ();							\
+  }									\
+  DEF_INTERNAL_OPTAB_FN(CODE##_LO, FLAGS, OPTAB, TYPE)			\
+  DEF_INTERNAL_OPTAB_FN(CODE##_HI, FLAGS, OPTAB, TYPE)
 #include "internal-fn.def"
+#undef DEF_INTERNAL_OPTAB_FN
+#undef DEF_INTERNAL_SIGNED_OPTAB_FN
+#undef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#undef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
 
 /* Routines to expand each internal function, indexed by function number.
    Each routine has the prototype:
@@ -4080,6 +4213,7 @@ set_edom_supported_p (void)
 
    where STMT is the statement that performs the call. */
 static void (*const internal_fn_expanders[]) (internal_fn, gcall *) = {
+
 #define DEF_INTERNAL_FN(CODE, FLAGS, FNSPEC) expand_##CODE,
 #include "internal-fn.def"
   0
diff --git a/gcc/internal-fn.def b/gcc/internal-fn.def
index 7fe742c2ae713e7152ab05cfdfba86e4e0aa3456..012dd323b86dd7cfcc5c13d3a2bb2a453937155d 100644
--- a/gcc/internal-fn.def
+++ b/gcc/internal-fn.def
@@ -85,6 +85,13 @@ along with GCC; see the file COPYING3.  If not see
    says that the function extends the C-level BUILT_IN_<NAME>{,L,LL,IMAX}
    group of functions to any integral mode (including vector modes).
 
+   DEF_INTERNAL_OPTAB_WIDENING_HILO_FN and DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+   are like DEF_INTERNAL_SIGNED_OPTAB_FN and DEF_INTERNAL_OPTAB_FN
+   respectively, except that they provide convenience wrappers for defining
+   conversions that require a hi/lo split.  Each definition for <NAME>
+   requires _hi and _lo optabs, each signed and unsigned when widening.
+
+
    Each entry must have a corresponding expander of the form:
 
      void expand_NAME (gimple_call stmt)
@@ -123,6 +130,20 @@ along with GCC; see the file COPYING3.  If not see
   DEF_INTERNAL_OPTAB_FN (NAME, FLAGS, OPTAB, TYPE)
 #endif
 
+#ifndef DEF_INTERNAL_OPTAB_WIDENING_HILO_FN
+#define DEF_INTERNAL_OPTAB_WIDENING_HILO_FN(NAME, FLAGS, SELECTOR, SOPTAB, UOPTAB, TYPE) \
+  DEF_INTERNAL_FN (NAME##_HILO, FLAGS | ECF_LEAF, NULL) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _LO, FLAGS, SELECTOR, SOPTAB##_lo, UOPTAB##_lo, TYPE) \
+  DEF_INTERNAL_SIGNED_OPTAB_FN (NAME ## _HI, FLAGS, SELECTOR, SOPTAB##_hi, UOPTAB##_hi, TYPE)
+#endif
+
+#ifndef DEF_INTERNAL_OPTAB_NARROWING_HILO_FN
+#define DEF_INTERNAL_OPTAB_NARROWING_HILO_FN(NAME, FLAGS, OPTAB, TYPE) \
+  DEF_INTERNAL_FN (NAME##_HILO, FLAGS | ECF_LEAF, NULL) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _LO, FLAGS, OPTAB##_lo, TYPE) \
+  DEF_INTERNAL_OPTAB_FN (NAME ## _HI, FLAGS, OPTAB##_hi, TYPE)
+#endif
+
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD, ECF_PURE, maskload, mask_load)
 DEF_INTERNAL_OPTAB_FN (LOAD_LANES, ECF_CONST, vec_load_lanes, load_lanes)
 DEF_INTERNAL_OPTAB_FN (MASK_LOAD_LANES, ECF_PURE,
@@ -315,6 +336,16 @@ DEF_INTERNAL_OPTAB_FN (COMPLEX_ADD_ROT270, ECF_CONST, cadd270, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL, ECF_CONST, cmul, binary)
 DEF_INTERNAL_OPTAB_FN (COMPLEX_MUL_CONJ, ECF_CONST, cmul_conj, binary)
 DEF_INTERNAL_OPTAB_FN (VEC_ADDSUB, ECF_CONST, vec_addsub, binary)
+DEF_INTERNAL_OPTAB_WIDENING_HILO_FN (VEC_WIDEN_PLUS,
+				     ECF_CONST | ECF_NOTHROW,
+				     first,
+				     vec_widen_sadd, vec_widen_uadd,
+				     binary)
+DEF_INTERNAL_OPTAB_WIDENING_HILO_FN (VEC_WIDEN_MINUS,
+				     ECF_CONST | ECF_NOTHROW,
+				     first,
+				     vec_widen_ssub, vec_widen_usub,
+				     binary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMADDSUB, ECF_CONST, vec_fmaddsub, ternary)
 DEF_INTERNAL_OPTAB_FN (VEC_FMSUBADD, ECF_CONST, vec_fmsubadd, ternary)
 
diff --git a/gcc/internal-fn.h b/gcc/internal-fn.h
index 08922ed4254898f5fffca3f33973e96ed9ce772f..8ba07d6d1338e75bc5a451d9e403112a608f3ea2 100644
--- a/gcc/internal-fn.h
+++ b/gcc/internal-fn.h
@@ -20,6 +20,10 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_INTERNAL_FN_H
 #define GCC_INTERNAL_FN_H
 
+#include "insn-codes.h"
+#include "insn-opinit.h"
+
+
 /* INTEGER_CST values for IFN_UNIQUE function arg-0.
 
    UNSPEC: Undifferentiated UNIQUE.
@@ -112,6 +116,8 @@ internal_fn_name (enum internal_fn fn)
 }
 
 extern internal_fn lookup_internal_fn (const char *);
+extern void lookup_hilo_internal_fn (internal_fn, internal_fn *, internal_fn *);
+extern optab direct_internal_fn_optab (internal_fn, tree_pair);
 
 /* Return the ECF_* flags for function FN.  */
 
@@ -210,6 +216,9 @@ extern bool commutative_binary_fn_p (internal_fn);
 extern bool commutative_ternary_fn_p (internal_fn);
 extern int first_commutative_argument (internal_fn);
 extern bool associative_binary_fn_p (internal_fn);
+extern bool widening_fn_p (code_helper);
+extern bool narrowing_fn_p (code_helper);
+extern bool decomposes_to_hilo_fn_p (internal_fn);
 
 extern bool set_edom_supported_p (void);
 
diff --git a/gcc/optabs.cc b/gcc/optabs.cc
index c8e39c82d57a7d726e7da33d247b80f32ec9236c..5a08d91e550b2d92e9572211f811fdba99a33a38 100644
--- a/gcc/optabs.cc
+++ b/gcc/optabs.cc
@@ -1314,7 +1314,15 @@ commutative_optab_p (optab binoptab)
 	  || binoptab == smul_widen_optab
 	  || binoptab == umul_widen_optab
 	  || binoptab == smul_highpart_optab
-	  || binoptab == umul_highpart_optab);
+	  || binoptab == umul_highpart_optab
+	  || binoptab == vec_widen_saddl_hi_optab
+	  || binoptab == vec_widen_saddl_lo_optab
+	  || binoptab == vec_widen_uaddl_hi_optab
+	  || binoptab == vec_widen_uaddl_lo_optab
+	  || binoptab == vec_widen_sadd_hi_optab
+	  || binoptab == vec_widen_sadd_lo_optab
+	  || binoptab == vec_widen_uadd_hi_optab
+	  || binoptab == vec_widen_uadd_lo_optab);
 }
 
 /* X is to be used in mode MODE as operand OPN to BINOPTAB.  If we're
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 695f5911b300c9ca5737de9be809fa01aabe5e01..16d121722c8c5723d9b164f5a2c616dc7ec143de 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -410,6 +410,10 @@ OPTAB_D (vec_widen_ssubl_hi_optab, "vec_widen_ssubl_hi_$a")
 OPTAB_D (vec_widen_ssubl_lo_optab, "vec_widen_ssubl_lo_$a")
 OPTAB_D (vec_widen_saddl_hi_optab, "vec_widen_saddl_hi_$a")
 OPTAB_D (vec_widen_saddl_lo_optab, "vec_widen_saddl_lo_$a")
+OPTAB_D (vec_widen_ssub_hi_optab, "vec_widen_ssub_hi_$a")
+OPTAB_D (vec_widen_ssub_lo_optab, "vec_widen_ssub_lo_$a")
+OPTAB_D (vec_widen_sadd_hi_optab, "vec_widen_sadd_hi_$a")
+OPTAB_D (vec_widen_sadd_lo_optab, "vec_widen_sadd_lo_$a")
 OPTAB_D (vec_widen_sshiftl_hi_optab, "vec_widen_sshiftl_hi_$a")
 OPTAB_D (vec_widen_sshiftl_lo_optab, "vec_widen_sshiftl_lo_$a")
 OPTAB_D (vec_widen_umult_even_optab, "vec_widen_umult_even_$a")
@@ -422,6 +426,10 @@ OPTAB_D (vec_widen_usubl_hi_optab, "vec_widen_usubl_hi_$a")
 OPTAB_D (vec_widen_usubl_lo_optab, "vec_widen_usubl_lo_$a")
 OPTAB_D (vec_widen_uaddl_hi_optab, "vec_widen_uaddl_hi_$a")
 OPTAB_D (vec_widen_uaddl_lo_optab, "vec_widen_uaddl_lo_$a")
+OPTAB_D (vec_widen_usub_hi_optab, "vec_widen_usub_hi_$a")
+OPTAB_D (vec_widen_usub_lo_optab, "vec_widen_usub_lo_$a")
+OPTAB_D (vec_widen_uadd_hi_optab, "vec_widen_uadd_hi_$a")
+OPTAB_D (vec_widen_uadd_lo_optab, "vec_widen_uadd_lo_$a")
 OPTAB_D (vec_addsub_optab, "vec_addsub$a3")
 OPTAB_D (vec_fmaddsub_optab, "vec_fmaddsub$a4")
 OPTAB_D (vec_fmsubadd_optab, "vec_fmsubadd$a4")
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
index 220bd9352a4c7acd2e3713e441d74898d3e92b30..7037673d32bd780e1c9b58a51e58e2bac3b30b7e 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-add.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_PLUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tuaddl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tuaddl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tsaddl\t} 1} } */
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
index a2bed63affbd091977df95a126da1f5b8c1d41d2..83bc1edb6105f47114b665e24a13e6194b2179a2 100644
--- a/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
+++ b/gcc/testsuite/gcc.target/aarch64/vect-widen-sub.c
@@ -1,5 +1,5 @@
 /* { dg-do run } */
-/* { dg-options "-O3 -save-temps" } */
+/* { dg-options "-O3 -save-temps -fdump-tree-vect-all" } */
 #include <stdint.h>
 #include <string.h>
 
@@ -86,6 +86,8 @@ main()
     return 0;
 }
 
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_LO" "vect"   } } */
+/* { dg-final { scan-tree-dump "add new stmt.*VEC_WIDEN_MINUS_HI" "vect"   } } */
 /* { dg-final { scan-assembler-times {\tusubl\t} 1} } */
 /* { dg-final { scan-assembler-times {\tusubl2\t} 1} } */
 /* { dg-final { scan-assembler-times {\tssubl\t} 1} } */
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 0aeebb67fac864db284985f4a6f0653af281d62b..28464ad9e3a7ea25557ffebcdbdbc1340f9e0d8b 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -65,6 +65,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "asan.h"
 #include "profile.h"
 #include "sreal.h"
+#include "internal-fn.h"
 
 /* This file contains functions for building the Control Flow Graph (CFG)
    for a function tree.  */
@@ -3411,6 +3412,52 @@ verify_gimple_call (gcall *stmt)
 	  debug_generic_stmt (fn);
 	  return true;
 	}
+      internal_fn ifn = gimple_call_internal_fn (stmt);
+      if (ifn == IFN_LAST)
+	{
+	  error ("gimple call has an invalid IFN");
+	  debug_generic_stmt (fn);
+	  return true;
+	}
+      else if (decomposes_to_hilo_fn_p (ifn))
+	{
+	  /* Non-decomposed HILO stmts should not appear in the IL; they are
+	     merely an internal representation used by the auto-vectorizer
+	     pass and should have been expanded to their _LO/_HI variants.  */
+	  error ("gimple call has a non-decomposed HILO IFN");
+	  debug_generic_stmt (fn);
+	  return true;
+	}
+      else if (ifn == IFN_VEC_WIDEN_PLUS_LO
+	       || ifn == IFN_VEC_WIDEN_PLUS_HI
+	       || ifn == IFN_VEC_WIDEN_MINUS_LO
+	       || ifn == IFN_VEC_WIDEN_MINUS_HI)
+	{
+	  tree rhs1_type = TREE_TYPE (gimple_call_arg (stmt, 0));
+	  tree rhs2_type = TREE_TYPE (gimple_call_arg (stmt, 1));
+	  tree lhs_type = TREE_TYPE (gimple_get_lhs (stmt));
+	  if (TREE_CODE (lhs_type) == VECTOR_TYPE)
+	    {
+	      if (TREE_CODE (rhs1_type) != VECTOR_TYPE
+		  || TREE_CODE (rhs2_type) != VECTOR_TYPE)
+		{
+		  error ("invalid non-vector operands in vector IFN call");
+		  debug_generic_stmt (fn);
+		  return true;
+		}
+	      lhs_type = TREE_TYPE (lhs_type);
+	      rhs1_type = TREE_TYPE (rhs1_type);
+	      rhs2_type = TREE_TYPE (rhs2_type);
+	    }
+	  if (POINTER_TYPE_P (lhs_type)
+	      || POINTER_TYPE_P (rhs1_type)
+	      || POINTER_TYPE_P (rhs2_type))
+	    {
+	      error ("invalid (pointer) operands in vector IFN call");
+	      debug_generic_stmt (fn);
+	      return true;
+	    }
+	}
     }
   else
     {
diff --git a/gcc/tree-inline.cc b/gcc/tree-inline.cc
index 63a19f8d1d89c6bd5d8e55a299cbffaa324b4b84..d74d8db2173b1ab117250fea89de5212d5e354ec 100644
--- a/gcc/tree-inline.cc
+++ b/gcc/tree-inline.cc
@@ -4433,7 +4433,20 @@ estimate_num_insns (gimple *stmt, eni_weights *weights)
 	tree decl;
 
 	if (gimple_call_internal_p (stmt))
-	  return 0;
+	  {
+	    internal_fn fn = gimple_call_internal_fn (stmt);
+	    switch (fn)
+	      {
+	      case IFN_VEC_WIDEN_PLUS_HI:
+	      case IFN_VEC_WIDEN_PLUS_LO:
+	      case IFN_VEC_WIDEN_MINUS_HI:
+	      case IFN_VEC_WIDEN_MINUS_LO:
+		return 1;
+
+	      default:
+		return 0;
+	      }
+	  }
 	else if ((decl = gimple_call_fndecl (stmt))
 		 && fndecl_built_in_p (decl))
 	  {
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 1778af0242898e3dc73d94d22a5b8505628a53b5..93cebc72beb4f65249a69b2665dfeb8a0991c1d1 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -562,21 +562,30 @@ vect_joust_widened_type (tree type, tree new_type, tree *common_type)
 
 static unsigned int
 vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
-		      tree_code widened_code, bool shift_p,
+		      code_helper widened_code, bool shift_p,
 		      unsigned int max_nops,
 		      vect_unpromoted_value *unprom, tree *common_type,
 		      enum optab_subtype *subtype = NULL)
 {
   /* Check for an integer operation with the right code.  */
-  gassign *assign = dyn_cast <gassign *> (stmt_info->stmt);
-  if (!assign)
+  gimple *stmt = stmt_info->stmt;
+  if (!(is_gimple_assign (stmt) || is_gimple_call (stmt)))
+    return 0;
+
+  code_helper rhs_code;
+  if (is_gimple_assign (stmt))
+    rhs_code = gimple_assign_rhs_code (stmt);
+  else if (is_gimple_call (stmt))
+    rhs_code = gimple_call_combined_fn (stmt);
+  else
     return 0;
 
-  tree_code rhs_code = gimple_assign_rhs_code (assign);
-  if (rhs_code != code && rhs_code != widened_code)
+  if (rhs_code != code
+      && rhs_code != widened_code)
     return 0;
 
-  tree type = TREE_TYPE (gimple_assign_lhs (assign));
+  tree lhs = gimple_get_lhs (stmt);
+  tree type = TREE_TYPE (lhs);
   if (!INTEGRAL_TYPE_P (type))
     return 0;
 
@@ -589,7 +598,7 @@ vect_widened_op_tree (vec_info *vinfo, stmt_vec_info stmt_info, tree_code code,
     {
       vect_unpromoted_value *this_unprom = &unprom[next_op];
       unsigned int nops = 1;
-      tree op = gimple_op (assign, i + 1);
+      tree op = gimple_arg (stmt, i);
       if (i == 1 && TREE_CODE (op) == INTEGER_CST)
 	{
 	  /* We already have a common type from earlier operands.
@@ -1343,7 +1352,8 @@ vect_recog_sad_pattern (vec_info *vinfo,
   /* FORNOW.  Can continue analyzing the def-use chain when this stmt in a phi
      inside the loop (in case we are analyzing an outer-loop).  */
   vect_unpromoted_value unprom[2];
-  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR, WIDEN_MINUS_EXPR,
+  if (!vect_widened_op_tree (vinfo, diff_stmt_vinfo, MINUS_EXPR,
+			     IFN_VEC_WIDEN_MINUS_HILO,
 			     false, 2, unprom, &half_type))
     return NULL;
 
@@ -1395,14 +1405,16 @@ static gimple *
 vect_recog_widen_op_pattern (vec_info *vinfo,
 			     stmt_vec_info last_stmt_info, tree *type_out,
 			     tree_code orig_code, code_helper wide_code,
-			     bool shift_p, const char *name)
+			     bool shift_p, const char *name,
+			     optab_subtype *subtype = NULL)
 {
   gimple *last_stmt = last_stmt_info->stmt;
 
   vect_unpromoted_value unprom[2];
   tree half_type;
   if (!vect_widened_op_tree (vinfo, last_stmt_info, orig_code, orig_code,
-			     shift_p, 2, unprom, &half_type))
+			     shift_p, 2, unprom, &half_type, subtype))
+
     return NULL;
 
   /* Pattern detected.  */
@@ -1468,6 +1480,20 @@ vect_recog_widen_op_pattern (vec_info *vinfo,
 			      type, pattern_stmt, vecctype);
 }
 
+static gimple *
+vect_recog_widen_op_pattern (vec_info *vinfo,
+			     stmt_vec_info last_stmt_info, tree *type_out,
+			     tree_code orig_code, internal_fn wide_ifn,
+			     bool shift_p, const char *name,
+			     optab_subtype *subtype = NULL)
+{
+  combined_fn ifn = as_combined_fn (wide_ifn);
+  return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
+				      orig_code, ifn, shift_p, name,
+				      subtype);
+}
+
+
 /* Try to detect multiplication on widened inputs, converting MULT_EXPR
    to WIDEN_MULT_EXPR.  See vect_recog_widen_op_pattern for details.  */
 
@@ -1481,26 +1507,30 @@ vect_recog_widen_mult_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 }
 
 /* Try to detect addition on widened inputs, converting PLUS_EXPR
-   to WIDEN_PLUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_PLUS_HILO.  See vect_recog_widen_op_pattern for details.  */
 
 static gimple *
 vect_recog_widen_plus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      PLUS_EXPR, WIDEN_PLUS_EXPR, false,
-				      "vect_recog_widen_plus_pattern");
+				      PLUS_EXPR, IFN_VEC_WIDEN_PLUS_HILO,
+				      false, "vect_recog_widen_plus_pattern",
+				      &subtype);
 }
 
 /* Try to detect subtraction on widened inputs, converting MINUS_EXPR
-   to WIDEN_MINUS_EXPR.  See vect_recog_widen_op_pattern for details.  */
+   to IFN_VEC_WIDEN_MINUS_HILO.  See vect_recog_widen_op_pattern for details.  */
 static gimple *
 vect_recog_widen_minus_pattern (vec_info *vinfo, stmt_vec_info last_stmt_info,
 			       tree *type_out)
 {
+  optab_subtype subtype;
   return vect_recog_widen_op_pattern (vinfo, last_stmt_info, type_out,
-				      MINUS_EXPR, WIDEN_MINUS_EXPR, false,
-				      "vect_recog_widen_minus_pattern");
+				      MINUS_EXPR, IFN_VEC_WIDEN_MINUS_HILO,
+				      false, "vect_recog_widen_minus_pattern",
+				      &subtype);
 }
 
 /* Function vect_recog_ctz_ffs_pattern
@@ -3078,7 +3108,7 @@ vect_recog_average_pattern (vec_info *vinfo,
   vect_unpromoted_value unprom[3];
   tree new_type;
   unsigned int nops = vect_widened_op_tree (vinfo, plus_stmt_info, PLUS_EXPR,
-					    WIDEN_PLUS_EXPR, false, 3,
+					    IFN_VEC_WIDEN_PLUS_HILO, false, 3,
 					    unprom, &new_type);
   if (nops == 0)
     return NULL;
@@ -6469,6 +6499,7 @@ static vect_recog_func vect_vect_recog_func_ptrs[] = {
   { vect_recog_mask_conversion_pattern, "mask_conversion" },
   { vect_recog_widen_plus_pattern, "widen_plus" },
   { vect_recog_widen_minus_pattern, "widen_minus" },
+  /* These must come after the double widening ones.  */
 };
 
 const unsigned int NUM_PATTERNS = ARRAY_SIZE (vect_vect_recog_func_ptrs);
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index d152ae9ab10b361b88c0f839d6951c43b954750a..24c811ebe01fb8b003100dea494cf64fea72a975 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -5038,7 +5038,9 @@ vectorizable_conversion (vec_info *vinfo,
   bool widen_arith = (code == WIDEN_PLUS_EXPR
 		 || code == WIDEN_MINUS_EXPR
 		 || code == WIDEN_MULT_EXPR
-		 || code == WIDEN_LSHIFT_EXPR);
+		 || code == WIDEN_LSHIFT_EXPR
+		 || code == IFN_VEC_WIDEN_PLUS_HILO
+		 || code == IFN_VEC_WIDEN_MINUS_HILO);
 
   if (!widen_arith
       && !CONVERT_EXPR_CODE_P (code)
@@ -5088,7 +5090,9 @@ vectorizable_conversion (vec_info *vinfo,
       gcc_assert (code == WIDEN_MULT_EXPR
 		  || code == WIDEN_LSHIFT_EXPR
 		  || code == WIDEN_PLUS_EXPR
-		  || code == WIDEN_MINUS_EXPR);
+		  || code == WIDEN_MINUS_EXPR
+		  || code == IFN_VEC_WIDEN_PLUS_HILO
+		  || code == IFN_VEC_WIDEN_MINUS_HILO);
 
 
       op1 = is_gimple_assign (stmt) ? gimple_assign_rhs2 (stmt) :
@@ -12478,10 +12482,43 @@ supportable_widening_operation (vec_info *vinfo,
       optab1 = vec_unpacks_sbool_lo_optab;
       optab2 = vec_unpacks_sbool_hi_optab;
     }
-  else
+
+  if (code.is_fn_code ())
+    {
+      internal_fn ifn = as_internal_fn ((combined_fn) code);
+      gcc_assert (decomposes_to_hilo_fn_p (ifn));
+
+      internal_fn lo, hi;
+      lookup_hilo_internal_fn (ifn, &lo, &hi);
+      *code1 = as_combined_fn (lo);
+      *code2 = as_combined_fn (hi);
+      optab1 = direct_internal_fn_optab (lo, {vectype, vectype});
+      optab2 = direct_internal_fn_optab (hi, {vectype, vectype});
+    }
+  else if (code.is_tree_code ())
     {
-      optab1 = optab_for_tree_code (c1, vectype, optab_default);
-      optab2 = optab_for_tree_code (c2, vectype, optab_default);
+      if (code == FIX_TRUNC_EXPR)
+	{
+	  /* The signedness is determined from output operand.  */
+	  optab1 = optab_for_tree_code (c1, vectype_out, optab_default);
+	  optab2 = optab_for_tree_code (c2, vectype_out, optab_default);
+	}
+      else if (CONVERT_EXPR_CODE_P ((tree_code) code.safe_as_tree_code ())
+	       && VECTOR_BOOLEAN_TYPE_P (wide_vectype)
+	       && VECTOR_BOOLEAN_TYPE_P (vectype)
+	       && TYPE_MODE (wide_vectype) == TYPE_MODE (vectype)
+	       && SCALAR_INT_MODE_P (TYPE_MODE (vectype)))
+	{
+	  /* If the input and result modes are the same, a different optab
+	     is needed where we pass in the number of units in vectype.  */
+	  optab1 = vec_unpacks_sbool_lo_optab;
+	  optab2 = vec_unpacks_sbool_hi_optab;
+	}
+      else
+	{
+	  optab1 = optab_for_tree_code (c1, vectype, optab_default);
+	  optab2 = optab_for_tree_code (c2, vectype, optab_default);
+	}
     }
 
   if (!optab1 || !optab2)
diff --git a/gcc/tree.def b/gcc/tree.def
index 90ceeec0b512bfa5f983359c0af03cc71de32007..b37b0b35927b92a6536e5c2d9805ffce8319a240 100644
--- a/gcc/tree.def
+++ b/gcc/tree.def
@@ -1374,15 +1374,16 @@ DEFTREECODE (DOT_PROD_EXPR, "dot_prod_expr", tcc_expression, 3)
 DEFTREECODE (WIDEN_SUM_EXPR, "widen_sum_expr", tcc_binary, 2)
 
 /* Widening sad (sum of absolute differences).
-   The first two arguments are of type t1 which should be integer.
-   The third argument and the result are of type t2, such that t2 is at least
-   twice the size of t1.  Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
+   The first two arguments are of type t1 which should be a vector of integers.
+   The third argument and the result are of type t2, such that the size of
+   the elements of t2 is at least twice the size of the elements of t1.
+   Like DOT_PROD_EXPR, SAD_EXPR (arg1,arg2,arg3) is
    equivalent to:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = PLUS_EXPR (tmp2, arg3)
   or:
-       tmp = WIDEN_MINUS_EXPR (arg1, arg2)
+       tmp = IFN_VEC_WIDEN_MINUS_EXPR (arg1, arg2)
        tmp2 = ABS_EXPR (tmp)
        arg3 = WIDEN_SUM_EXPR (tmp2, arg3)
  */
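
For readers following the SAD_EXPR comment above, here is a rough scalar model of the equivalence it describes, combined with the lo/hi split from the cover letter ('lo' handles the first n/2 input elements, 'hi' the second n/2). It is only a sketch under stated assumptions: none of the names below (narrow_vec, wide_half, widen_minus_lo, widen_minus_hi, sad_model, check_example) exist in GCC, the 8 x uint8_t to 4 x int16_t shape is picked purely for illustration, and the accumulation is collapsed into a single scalar rather than kept per lane.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical scalar model; these names are illustrative, not GCC's.
constexpr std::size_t N = 8;                          // narrow elements per input
using narrow_vec = std::array<std::uint8_t, N>;       // type t1: N narrow elements
using wide_half = std::array<std::int16_t, N / 2>;    // N/2 elements, twice as wide

// 'lo' variant: widening subtract on the first N/2 input elements.
static wide_half widen_minus_lo(const narrow_vec &a, const narrow_vec &b) {
  wide_half r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    r[i] = static_cast<std::int16_t>(int(a[i]) - int(b[i]));
  return r;
}

// 'hi' variant: widening subtract on the second N/2 input elements.
static wide_half widen_minus_hi(const narrow_vec &a, const narrow_vec &b) {
  wide_half r{};
  for (std::size_t i = 0; i < N / 2; ++i)
    r[i] = static_cast<std::int16_t>(int(a[i + N / 2]) - int(b[i + N / 2]));
  return r;
}

// tmp = widen_minus (arg1, arg2); tmp2 = abs (tmp); arg3 = tmp2 + arg3,
// applied to both halves and folded into one scalar accumulator.
static std::int32_t sad_model(const narrow_vec &a, const narrow_vec &b,
                              std::int32_t acc) {
  for (std::int16_t d : widen_minus_lo(a, b))
    acc += std::abs(int(d));
  for (std::int16_t d : widen_minus_hi(a, b))
    acc += std::abs(int(d));
  return acc;
}

// Small usage example with hand-computed values.
static void check_example() {
  narrow_vec a = {10, 0, 5, 255, 1, 2, 3, 4};
  narrow_vec b = {0, 10, 5, 0, 4, 3, 2, 1};
  assert(widen_minus_lo(a, b)[1] == -10);  // 0 - 10 needs the wider type
  assert(widen_minus_hi(a, b)[0] == -3);   // 1 - 4, first element of the hi half
  assert(sad_model(a, b, 7) == 290);       // 7 + (10+10+0+255) + (3+1+1+3)
}
```

The point of the model is only that the difference has to be computed in the wider type before taking the absolute value. Subtracting in uint8_t would wrap for the 0 - 10 case.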

