This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

[07/13] Make vec_perm_indices use new vector encoding


This patch changes vec_perm_indices from a plain vec<> to a class
that stores a canonicalised permutation, using the same encoding
as for VECTOR_CSTs.  This means that vec_perm_indices now carries
information about the number of vectors being permuted (currently
always 1 or 2) and the number of elements in each input vector.

A new vec_perm_builder class is used to actually build up the vector,
like tree_vector_builder does for trees.  vec_perm_indices is the
completed representation, a bit like VECTOR_CST is for trees.

The patch just does a mechanical conversion of the code to
vec_perm_builder: a later patch uses explicit encodings where possible.

The point of all this is that it makes the representation suitable
for variable-length vectors.  It's no longer necessary for the
underlying vec<>s to store every element explicitly.

In int-vector-builder.h, "using the same encoding as tree and rtx constants"
describes the endpoint -- adding the rtx encoding comes later.


2017-12-09  Richard Sandiford  <richard.sandiford@linaro.org>

gcc/
	* int-vector-builder.h: New file.
	* vec-perm-indices.h: Include int-vector-builder.h.
	(vec_perm_indices): Redefine as an int_vector_builder.
	(auto_vec_perm_indices): Delete.
	(vec_perm_builder): Redefine as a stand-alone class.
	(vec_perm_indices::vec_perm_indices): New function.
	(vec_perm_indices::clamp): Likewise.
	* vec-perm-indices.c: Include fold-const.h and tree-vector-builder.h.
	(vec_perm_indices::new_vector): New function.
	(vec_perm_indices::new_expanded_vector): Update for new
	vec_perm_indices class.
	(vec_perm_indices::rotate_inputs): New function.
	(vec_perm_indices::all_in_range_p): Operate directly on the
	encoded form, without computing elided elements.
	(tree_to_vec_perm_builder): Operate directly on the VECTOR_CST
	encoding.  Update for new vec_perm_indices class.
	* optabs.c (expand_vec_perm_const): Create a vec_perm_indices for
	the given vec_perm_builder.
	(expand_vec_perm_var): Update vec_perm_builder constructor.
	(expand_mult_highpart): Use vec_perm_builder instead of
	auto_vec_perm_indices.
	* optabs-query.c (can_mult_highpart_p): Use vec_perm_builder and
	vec_perm_indices instead of auto_vec_perm_indices.  Use a single
	or double series encoding as appropriate.
	* fold-const.c (fold_ternary_loc): Use vec_perm_builder and
	vec_perm_indices instead of auto_vec_perm_indices.
	* tree-ssa-forwprop.c (simplify_vector_constructor): Likewise.
	* tree-vect-data-refs.c (vect_grouped_store_supported): Likewise.
	(vect_permute_store_chain): Likewise.
	(vect_grouped_load_supported): Likewise.
	(vect_permute_load_chain): Likewise.
	(vect_shift_permute_load_chain): Likewise.
	* tree-vect-slp.c (vect_build_slp_tree_1): Likewise.
	(vect_transform_slp_perm_load): Likewise.
	(vect_schedule_slp_instance): Likewise.
	* tree-vect-stmts.c (perm_mask_for_reverse): Likewise.
	(vectorizable_mask_load_store): Likewise.
	(vectorizable_bswap): Likewise.
	(vectorizable_store): Likewise.
	(vectorizable_load): Likewise.
	* tree-vect-generic.c (lower_vec_perm): Use vec_perm_builder and
	vec_perm_indices instead of auto_vec_perm_indices.  Use
	tree_to_vec_perm_builder to read the vector from a tree.
	* tree-vect-loop.c (calc_vec_perm_mask_for_shift): Take a
	vec_perm_builder instead of a vec_perm_indices.
	(have_whole_vector_shift): Use vec_perm_builder and
	vec_perm_indices instead of auto_vec_perm_indices.  Leave the
	truncation to calc_vec_perm_mask_for_shift.
	(vect_create_epilog_for_reduction): Likewise.
	* config/aarch64/aarch64.c (expand_vec_perm_d::perm): Change
	from auto_vec_perm_indices to vec_perm_indices.
	(aarch64_expand_vec_perm_const_1): Use rotate_inputs on d.perm
	instead of changing individual elements.
	(aarch64_vectorize_vec_perm_const): Use new_vector to install
	the vector in d.perm.
	* config/arm/arm.c (expand_vec_perm_d::perm): Change
	from auto_vec_perm_indices to vec_perm_indices.
	(arm_expand_vec_perm_const_1): Use rotate_inputs on d.perm
	instead of changing individual elements.
	(arm_vectorize_vec_perm_const): Use new_vector to install
	the vector in d.perm.
	* config/powerpcspe/powerpcspe.c (rs6000_expand_extract_even):
	Update vec_perm_builder constructor.
	(rs6000_expand_interleave): Likewise.
	* config/rs6000/rs6000.c (rs6000_expand_extract_even): Likewise.
	(rs6000_expand_interleave): Likewise.

Index: gcc/int-vector-builder.h
===================================================================
--- /dev/null	2017-12-09 13:59:56.352713187 +0000
+++ gcc/int-vector-builder.h	2017-12-09 22:48:47.545825268 +0000
@@ -0,0 +1,90 @@
+/* A class for building vector integer constants.
+   Copyright (C) 2017 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_INT_VECTOR_BUILDER_H
+#define GCC_INT_VECTOR_BUILDER_H 1
+
+#include "vector-builder.h"
+
+/* This class is used to build vectors of integer type T using the same
+   encoding as tree and rtx constants.  See vector_builder for more
+   details.  */
+template<typename T>
+class int_vector_builder : public vector_builder<T, int_vector_builder<T> >
+{
+  typedef vector_builder<T, int_vector_builder> parent;
+  friend class vector_builder<T, int_vector_builder>;
+
+public:
+  int_vector_builder () {}
+  int_vector_builder (unsigned int, unsigned int, unsigned int);
+
+  using parent::new_vector;
+
+private:
+  bool equal_p (T, T) const;
+  bool allow_steps_p () const { return true; }
+  bool integral_p (T) const { return true; }
+  T step (T, T) const;
+  T apply_step (T, unsigned int, T) const;
+  bool can_elide_p (T) const { return true; }
+  void note_representative (T *, T) {}
+};
+
+/* Create a new builder for a vector with FULL_NELTS elements.
+   Initially encode the value as NPATTERNS interleaved patterns with
+   NELTS_PER_PATTERN elements each.  */
+
+template<typename T>
+inline
+int_vector_builder<T>::int_vector_builder (unsigned int full_nelts,
+					   unsigned int npatterns,
+					   unsigned int nelts_per_pattern)
+{
+  new_vector (full_nelts, npatterns, nelts_per_pattern);
+}
+
+/* Return true if elements ELT1 and ELT2 are equal.  */
+
+template<typename T>
+inline bool
+int_vector_builder<T>::equal_p (T elt1, T elt2) const
+{
+  return elt1 == elt2;
+}
+
+/* Return the value of element ELT2 minus the value of element ELT1.  */
+
+template<typename T>
+inline T
+int_vector_builder<T>::step (T elt1, T elt2) const
+{
+  return elt2 - elt1;
+}
+
+/* Return a vector element with the value BASE + FACTOR * STEP.  */
+
+template<typename T>
+inline T
+int_vector_builder<T>::apply_step (T base, unsigned int factor, T step) const
+{
+  return base + factor * step;
+}
+
+#endif
Index: gcc/vec-perm-indices.h
===================================================================
--- gcc/vec-perm-indices.h	2017-12-09 22:47:27.885318101 +0000
+++ gcc/vec-perm-indices.h	2017-12-09 22:48:47.548825399 +0000
@@ -20,30 +20,102 @@ Software Foundation; either version 3, o
 #ifndef GCC_VEC_PERN_INDICES_H
 #define GCC_VEC_PERN_INDICES_H 1
 
+#include "int-vector-builder.h"
+
+/* A vector_builder for building constant permutation vectors.
+   The elements do not need to be clamped to a particular range
+   of input elements.  */
+typedef int_vector_builder<HOST_WIDE_INT> vec_perm_builder;
+
 /* This class represents a constant permutation vector, such as that used
-   as the final operand to a VEC_PERM_EXPR.  */
-class vec_perm_indices : public auto_vec<unsigned short, 32>
+   as the final operand to a VEC_PERM_EXPR.  The vector is canonicalized
+   for a particular number of input vectors and for a particular number
+   of elements per input.  The class copes with cases in which the
+   input and output vectors have different numbers of elements.  */
+class vec_perm_indices
 {
-  typedef unsigned short element_type;
-  typedef auto_vec<element_type, 32> parent_type;
+  typedef HOST_WIDE_INT element_type;
 
 public:
-  vec_perm_indices () {}
-  vec_perm_indices (unsigned int nunits) : parent_type (nunits) {}
+  vec_perm_indices ();
+  vec_perm_indices (const vec_perm_builder &, unsigned int, unsigned int);
 
+  void new_vector (const vec_perm_builder &, unsigned int, unsigned int);
   void new_expanded_vector (const vec_perm_indices &, unsigned int);
+  void rotate_inputs (int delta);
+
+  /* Return the underlying vector encoding.  */
+  const vec_perm_builder &encoding () const { return m_encoding; }
+
+  /* Return the number of output elements.  This is called length ()
+     so that we present a more vec-like interface.  */
+  unsigned int length () const { return m_encoding.full_nelts (); }
+
+  /* Return the number of input vectors being permuted.  */
+  unsigned int ninputs () const { return m_ninputs; }
 
+  /* Return the number of elements in each input vector.  */
+  unsigned int nelts_per_input () const { return m_nelts_per_input; }
+
+  /* Return the total number of input elements.  */
+  unsigned int input_nelts () const { return m_ninputs * m_nelts_per_input; }
+
+  element_type clamp (element_type) const;
+  element_type operator[] (unsigned int i) const;
   bool all_in_range_p (element_type, element_type) const;
 
 private:
   vec_perm_indices (const vec_perm_indices &);
-};
 
-/* Temporary.  */
-typedef vec_perm_indices vec_perm_builder;
-typedef vec_perm_indices auto_vec_perm_indices;
+  vec_perm_builder m_encoding;
+  unsigned int m_ninputs;
+  unsigned int m_nelts_per_input;
+};
 
 bool tree_to_vec_perm_builder (vec_perm_builder *, tree);
 rtx vec_perm_indices_to_rtx (machine_mode, const vec_perm_indices &);
 
+inline
+vec_perm_indices::vec_perm_indices ()
+  : m_ninputs (0),
+    m_nelts_per_input (0)
+{
+}
+
+/* Construct a permutation vector that selects between NINPUTS vector
+   inputs that have NELTS_PER_INPUT elements each.  Take the elements of
+   the new vector from ELEMENTS, clamping each one to be in range.  */
+
+inline
+vec_perm_indices::vec_perm_indices (const vec_perm_builder &elements,
+				    unsigned int ninputs,
+				    unsigned int nelts_per_input)
+{
+  new_vector (elements, ninputs, nelts_per_input);
+}
+
+/* Return the canonical value for permutation vector element ELT,
+   taking into account the current number of input elements.  */
+
+inline vec_perm_indices::element_type
+vec_perm_indices::clamp (element_type elt) const
+{
+  element_type limit = input_nelts ();
+  elt %= limit;
+  /* Treat negative elements as counting from the end.  This only matters
+     if the vector size is not a power of 2.  */
+  if (elt < 0)
+    elt += limit;
+  return elt;
+}
+
+/* Return the value of vector element I, which might or might not be
+   explicitly encoded.  */
+
+inline vec_perm_indices::element_type
+vec_perm_indices::operator[] (unsigned int i) const
+{
+  return clamp (m_encoding.elt (i));
+}
+
 #endif
Index: gcc/vec-perm-indices.c
===================================================================
--- gcc/vec-perm-indices.c	2017-12-09 22:47:27.885318101 +0000
+++ gcc/vec-perm-indices.c	2017-12-09 22:48:47.548825399 +0000
@@ -22,11 +22,33 @@ Software Foundation; either version 3, o
 #include "coretypes.h"
 #include "vec-perm-indices.h"
 #include "tree.h"
+#include "fold-const.h"
+#include "tree-vector-builder.h"
 #include "backend.h"
 #include "rtl.h"
 #include "memmodel.h"
 #include "emit-rtl.h"
 
+/* Switch to a new permutation vector that selects between NINPUTS vector
+   inputs that have NELTS_PER_INPUT elements each.  Take the elements of the
+   new permutation vector from ELEMENTS, clamping each one to be in range.  */
+
+void
+vec_perm_indices::new_vector (const vec_perm_builder &elements,
+			      unsigned int ninputs,
+			      unsigned int nelts_per_input)
+{
+  m_ninputs = ninputs;
+  m_nelts_per_input = nelts_per_input;
+  /* Expand the encoding and clamp each element.  E.g. { 0, 2, 4, ... }
+     might wrap halfway if there is only one vector input.  */
+  unsigned int full_nelts = elements.full_nelts ();
+  m_encoding.new_vector (full_nelts, full_nelts, 1);
+  for (unsigned int i = 0; i < full_nelts; ++i)
+    m_encoding.quick_push (clamp (elements.elt (i)));
+  m_encoding.finalize ();
+}
+
 /* Switch to a new permutation vector that selects the same input elements
    as ORIG, but with each element split into FACTOR pieces.  For example,
    if ORIG is { 1, 2, 0, 3 } and FACTOR is 2, the new permutation is
@@ -36,14 +58,31 @@ Software Foundation; either version 3, o
 vec_perm_indices::new_expanded_vector (const vec_perm_indices &orig,
 				       unsigned int factor)
 {
-  truncate (0);
-  reserve (orig.length () * factor);
-  for (unsigned int i = 0; i < orig.length (); ++i)
+  m_ninputs = orig.m_ninputs;
+  m_nelts_per_input = orig.m_nelts_per_input * factor;
+  m_encoding.new_vector (orig.m_encoding.full_nelts () * factor,
+			 orig.m_encoding.npatterns () * factor,
+			 orig.m_encoding.nelts_per_pattern ());
+  unsigned int encoded_nelts = orig.m_encoding.encoded_nelts ();
+  for (unsigned int i = 0; i < encoded_nelts; ++i)
     {
-      element_type base = orig[i] * factor;
+      element_type base = orig.m_encoding[i] * factor;
       for (unsigned int j = 0; j < factor; ++j)
-	quick_push (base + j);
+	m_encoding.quick_push (base + j);
     }
+  m_encoding.finalize ();
+}
+
+/* Rotate the inputs of the permutation right by DELTA inputs.  This changes
+   the values of the permutation vector but it doesn't change the way that
+   the elements are encoded.  */
+
+void
+vec_perm_indices::rotate_inputs (int delta)
+{
+  element_type element_delta = delta * m_nelts_per_input;
+  for (unsigned int i = 0; i < m_encoding.length (); ++i)
+    m_encoding[i] = clamp (m_encoding[i] + element_delta);
 }
 
 /* Return true if all elements of the permutation vector are in the range
@@ -52,9 +91,44 @@ vec_perm_indices::new_expanded_vector (c
 bool
 vec_perm_indices::all_in_range_p (element_type start, element_type size) const
 {
-  for (unsigned int i = 0; i < length (); ++i)
-    if ((*this)[i] < start || ((*this)[i] - start) >= size)
+  /* Check the first two elements of each pattern.  */
+  unsigned int npatterns = m_encoding.npatterns ();
+  unsigned int nelts_per_pattern = m_encoding.nelts_per_pattern ();
+  unsigned int base_nelts = npatterns * MIN (nelts_per_pattern, 2);
+  for (unsigned int i = 0; i < base_nelts; ++i)
+    if (m_encoding[i] < start || (m_encoding[i] - start) >= size)
       return false;
+
+  /* For stepped encodings, check the full range of the series.  */
+  if (nelts_per_pattern == 3)
+    {
+      element_type limit = input_nelts ();
+
+      /* The number of elements in each pattern beyond the first two
+	 that we checked above.  */
+      unsigned int step_nelts = (m_encoding.full_nelts () / npatterns) - 2;
+      for (unsigned int i = 0; i < npatterns; ++i)
+	{
+	  /* BASE1 has been checked but BASE2 hasn't.   */
+	  element_type base1 = m_encoding[i + npatterns];
+	  element_type base2 = m_encoding[i + base_nelts];
+
+	  /* The step to add to get from BASE1 to each subsequent value.  */
+	  element_type step = clamp (base2 - base1);
+
+	  /* STEP has no inherent sign, so a value near LIMIT can
+	     act as a negative step.  The series is in range if it
+	     is in range according to one of the two interpretations.
+
+	     Since we're dealing with clamped values, ELEMENT_TYPE is
+	     wide enough for overflow not to be a problem.  */
+	  element_type headroom_down = base1 - start;
+	  element_type headroom_up = size - headroom_down - 1;
+	  if (headroom_up < step * step_nelts
+	      && headroom_down < (limit - step) * step_nelts)
+	    return false;
+	}
+    }
   return true;
 }
 
@@ -65,15 +139,16 @@ vec_perm_indices::all_in_range_p (elemen
 bool
 tree_to_vec_perm_builder (vec_perm_builder *builder, tree cst)
 {
-  unsigned int nelts = TYPE_VECTOR_SUBPARTS (TREE_TYPE (cst));
-  for (unsigned int i = 0; i < nelts; ++i)
-    if (!tree_fits_shwi_p (vector_cst_elt (cst, i)))
+  unsigned int encoded_nelts = vector_cst_encoded_nelts (cst);
+  for (unsigned int i = 0; i < encoded_nelts; ++i)
+    if (!tree_fits_shwi_p (VECTOR_CST_ENCODED_ELT (cst, i)))
       return false;
 
-  builder->reserve (nelts);
-  for (unsigned int i = 0; i < nelts; ++i)
-    builder->quick_push (tree_to_shwi (vector_cst_elt (cst, i))
-			 & (2 * nelts - 1));
+  builder->new_vector (TYPE_VECTOR_SUBPARTS (TREE_TYPE (cst)),
+		       VECTOR_CST_NPATTERNS (cst),
+		       VECTOR_CST_NELTS_PER_PATTERN (cst));
+  for (unsigned int i = 0; i < encoded_nelts; ++i)
+    builder->quick_push (tree_to_shwi (VECTOR_CST_ENCODED_ELT (cst, i)));
   return true;
 }
 
Index: gcc/optabs.c
===================================================================
--- gcc/optabs.c	2017-12-09 22:47:27.881318099 +0000
+++ gcc/optabs.c	2017-12-09 22:48:47.546825312 +0000
@@ -5456,6 +5456,11 @@ expand_vec_perm_const (machine_mode mode
   rtx_insn *last = get_last_insn ();
 
   bool single_arg_p = rtx_equal_p (v0, v1);
+  /* Always specify two input vectors here and leave the target to handle
+     cases in which the inputs are equal.  Not all backends can cope with
+     the single-input representation when testing for a double-input
+     target instruction.  */
+  vec_perm_indices indices (sel, 2, GET_MODE_NUNITS (mode));
 
   /* See if this can be handled with a vec_shr.  We only do this if the
      second vector is all zeroes.  */
@@ -5468,7 +5473,7 @@ expand_vec_perm_const (machine_mode mode
       && (shift_code != CODE_FOR_nothing
 	  || shift_code_qi != CODE_FOR_nothing))
     {
-      rtx shift_amt = shift_amt_for_vec_perm_mask (mode, sel);
+      rtx shift_amt = shift_amt_for_vec_perm_mask (mode, indices);
       if (shift_amt)
 	{
 	  struct expand_operand ops[3];
@@ -5500,7 +5505,7 @@ expand_vec_perm_const (machine_mode mode
       else
 	v1 = force_reg (mode, v1);
 
-      if (targetm.vectorize.vec_perm_const (mode, target, v0, v1, sel))
+      if (targetm.vectorize.vec_perm_const (mode, target, v0, v1, indices))
 	return target;
     }
 
@@ -5509,7 +5514,7 @@ expand_vec_perm_const (machine_mode mode
   rtx target_qi = NULL_RTX, v0_qi = NULL_RTX, v1_qi = NULL_RTX;
   if (qimode != VOIDmode)
     {
-      qimode_indices.new_expanded_vector (sel, GET_MODE_UNIT_SIZE (mode));
+      qimode_indices.new_expanded_vector (indices, GET_MODE_UNIT_SIZE (mode));
       target_qi = gen_reg_rtx (qimode);
       v0_qi = gen_lowpart (qimode, v0);
       v1_qi = gen_lowpart (qimode, v1);
@@ -5536,7 +5541,7 @@ expand_vec_perm_const (machine_mode mode
      REQUIRED_SEL_MODE is OK.  */
   if (sel_mode != required_sel_mode)
     {
-      if (!selector_fits_mode_p (required_sel_mode, sel))
+      if (!selector_fits_mode_p (required_sel_mode, indices))
 	{
 	  delete_insns_since (last);
 	  return NULL_RTX;
@@ -5547,7 +5552,7 @@ expand_vec_perm_const (machine_mode mode
   insn_code icode = direct_optab_handler (vec_perm_optab, mode);
   if (icode != CODE_FOR_nothing)
     {
-      rtx sel_rtx = vec_perm_indices_to_rtx (sel_mode, sel);
+      rtx sel_rtx = vec_perm_indices_to_rtx (sel_mode, indices);
       rtx tmp = expand_vec_perm_1 (icode, target, v0, v1, sel_rtx);
       if (tmp)
 	return tmp;
@@ -5621,7 +5626,7 @@ expand_vec_perm_var (machine_mode mode,
   gcc_assert (sel != NULL);
 
   /* Broadcast the low byte each element into each of its bytes.  */
-  vec_perm_builder const_sel (w);
+  vec_perm_builder const_sel (w, w, 1);
   for (i = 0; i < w; ++i)
     {
       int this_e = i / u * u;
@@ -5848,7 +5853,7 @@ expand_mult_highpart (machine_mode mode,
   expand_insn (optab_handler (tab2, mode), 3, eops);
   m2 = gen_lowpart (mode, eops[0].value);
 
-  auto_vec_perm_indices sel (nunits);
+  vec_perm_builder sel (nunits, nunits, 1);
   if (method == 2)
     {
       for (i = 0; i < nunits; ++i)
Index: gcc/optabs-query.c
===================================================================
--- gcc/optabs-query.c	2017-12-09 22:47:27.881318099 +0000
+++ gcc/optabs-query.c	2017-12-09 22:48:47.545825268 +0000
@@ -501,12 +501,13 @@ can_mult_highpart_p (machine_mode mode,
       op = uns_p ? vec_widen_umult_odd_optab : vec_widen_smult_odd_optab;
       if (optab_handler (op, mode) != CODE_FOR_nothing)
 	{
-	  auto_vec_perm_indices sel (nunits);
+	  vec_perm_builder sel (nunits, nunits, 1);
 	  for (i = 0; i < nunits; ++i)
 	    sel.quick_push (!BYTES_BIG_ENDIAN
 			    + (i & ~1)
 			    + ((i & 1) ? nunits : 0));
-	  if (can_vec_perm_const_p (mode, sel))
+	  vec_perm_indices indices (sel, 2, nunits);
+	  if (can_vec_perm_const_p (mode, indices))
 	    return 2;
 	}
     }
@@ -517,10 +518,11 @@ can_mult_highpart_p (machine_mode mode,
       op = uns_p ? vec_widen_umult_lo_optab : vec_widen_smult_lo_optab;
       if (optab_handler (op, mode) != CODE_FOR_nothing)
 	{
-	  auto_vec_perm_indices sel (nunits);
+	  vec_perm_builder sel (nunits, nunits, 1);
 	  for (i = 0; i < nunits; ++i)
 	    sel.quick_push (2 * i + (BYTES_BIG_ENDIAN ? 0 : 1));
-	  if (can_vec_perm_const_p (mode, sel))
+	  vec_perm_indices indices (sel, 2, nunits);
+	  if (can_vec_perm_const_p (mode, indices))
 	    return 3;
 	}
     }
Index: gcc/fold-const.c
===================================================================
--- gcc/fold-const.c	2017-12-09 22:47:27.881318099 +0000
+++ gcc/fold-const.c	2017-12-09 22:48:47.545825268 +0000
@@ -11217,7 +11217,7 @@ fold_ternary_loc (location_t loc, enum t
 	    {
 	      unsigned int nelts = VECTOR_CST_NELTS (arg0), i;
 	      gcc_assert (nelts == TYPE_VECTOR_SUBPARTS (type));
-	      auto_vec_perm_indices sel (nelts);
+	      vec_perm_builder sel (nelts, nelts, 1);
 	      for (i = 0; i < nelts; i++)
 		{
 		  tree val = VECTOR_CST_ELT (arg0, i);
@@ -11228,7 +11228,8 @@ fold_ternary_loc (location_t loc, enum t
 		  else /* Currently unreachable.  */
 		    return NULL_TREE;
 		}
-	      tree t = fold_vec_perm (type, arg1, arg2, sel);
+	      tree t = fold_vec_perm (type, arg1, arg2,
+				      vec_perm_indices (sel, 2, nelts));
 	      if (t != NULL_TREE)
 		return t;
 	    }
@@ -11558,8 +11559,8 @@ fold_ternary_loc (location_t loc, enum t
 	  mask2 = 2 * nelts - 1;
 	  mask = single_arg ? (nelts - 1) : mask2;
 	  gcc_assert (nelts == TYPE_VECTOR_SUBPARTS (type));
-	  auto_vec_perm_indices sel (nelts);
-	  auto_vec_perm_indices sel2 (nelts);
+	  vec_perm_builder sel (nelts, nelts, 1);
+	  vec_perm_builder sel2 (nelts, nelts, 1);
 	  for (i = 0; i < nelts; i++)
 	    {
 	      tree val = VECTOR_CST_ELT (arg2, i);
@@ -11604,12 +11605,13 @@ fold_ternary_loc (location_t loc, enum t
 	      need_mask_canon = true;
 	    }
 
+	  vec_perm_indices indices (sel, 2, nelts);
 	  if ((TREE_CODE (op0) == VECTOR_CST
 	       || TREE_CODE (op0) == CONSTRUCTOR)
 	      && (TREE_CODE (op1) == VECTOR_CST
 		  || TREE_CODE (op1) == CONSTRUCTOR))
 	    {
-	      tree t = fold_vec_perm (type, op0, op1, sel);
+	      tree t = fold_vec_perm (type, op0, op1, indices);
 	      if (t != NULL_TREE)
 		return t;
 	    }
@@ -11621,11 +11623,14 @@ fold_ternary_loc (location_t loc, enum t
 	     argument permutation while still allowing an equivalent
 	     2-argument version.  */
 	  if (need_mask_canon && arg2 == op2
-	      && !can_vec_perm_const_p (TYPE_MODE (type), sel, false)
-	      && can_vec_perm_const_p (TYPE_MODE (type), sel2, false))
+	      && !can_vec_perm_const_p (TYPE_MODE (type), indices, false)
+	      && can_vec_perm_const_p (TYPE_MODE (type),
+				       vec_perm_indices (sel2, 2, nelts),
+				       false))
 	    {
 	      need_mask_canon = need_mask_canon2;
-	      sel = sel2;
+	      sel.truncate (0);
+	      sel.splice (sel2);
 	    }
 
 	  if (need_mask_canon && arg2 == op2)
Index: gcc/tree-ssa-forwprop.c
===================================================================
--- gcc/tree-ssa-forwprop.c	2017-12-09 22:47:27.883318100 +0000
+++ gcc/tree-ssa-forwprop.c	2017-12-09 22:48:47.546825312 +0000
@@ -2019,7 +2019,7 @@ simplify_vector_constructor (gimple_stmt
   elem_type = TREE_TYPE (type);
   elem_size = TREE_INT_CST_LOW (TYPE_SIZE (elem_type));
 
-  auto_vec_perm_indices sel (nelts);
+  vec_perm_builder sel (nelts, nelts, 1);
   orig = NULL;
   conv_code = ERROR_MARK;
   maybe_ident = true;
@@ -2109,7 +2109,8 @@ simplify_vector_constructor (gimple_stmt
     {
       tree mask_type;
 
-      if (!can_vec_perm_const_p (TYPE_MODE (type), sel))
+      vec_perm_indices indices (sel, 1, nelts);
+      if (!can_vec_perm_const_p (TYPE_MODE (type), indices))
 	return false;
       mask_type
 	= build_vector_type (build_nonstandard_integer_type (elem_size, 1),
Index: gcc/tree-vect-data-refs.c
===================================================================
--- gcc/tree-vect-data-refs.c	2017-12-09 22:47:27.883318100 +0000
+++ gcc/tree-vect-data-refs.c	2017-12-09 22:48:47.546825312 +0000
@@ -4566,7 +4566,7 @@ vect_grouped_store_supported (tree vecty
   if (VECTOR_MODE_P (mode))
     {
       unsigned int i, nelt = GET_MODE_NUNITS (mode);
-      auto_vec_perm_indices sel (nelt);
+      vec_perm_builder sel (nelt, nelt, 1);
       sel.quick_grow (nelt);
 
       if (count == 3)
@@ -4574,6 +4574,7 @@ vect_grouped_store_supported (tree vecty
 	  unsigned int j0 = 0, j1 = 0, j2 = 0;
 	  unsigned int i, j;
 
+	  vec_perm_indices indices;
 	  for (j = 0; j < 3; j++)
 	    {
 	      int nelt0 = ((3 - j) * nelt) % 3;
@@ -4588,7 +4589,8 @@ vect_grouped_store_supported (tree vecty
 		  if (3 * i + nelt2 < nelt)
 		    sel[3 * i + nelt2] = 0;
 		}
-	      if (!can_vec_perm_const_p (mode, sel))
+	      indices.new_vector (sel, 2, nelt);
+	      if (!can_vec_perm_const_p (mode, indices))
 		{
 		  if (dump_enabled_p ())
 		    dump_printf (MSG_MISSED_OPTIMIZATION,
@@ -4605,7 +4607,8 @@ vect_grouped_store_supported (tree vecty
 		  if (3 * i + nelt2 < nelt)
 		    sel[3 * i + nelt2] = nelt + j2++;
 		}
-	      if (!can_vec_perm_const_p (mode, sel))
+	      indices.new_vector (sel, 2, nelt);
+	      if (!can_vec_perm_const_p (mode, indices))
 		{
 		  if (dump_enabled_p ())
 		    dump_printf (MSG_MISSED_OPTIMIZATION,
@@ -4625,11 +4628,13 @@ vect_grouped_store_supported (tree vecty
 	      sel[i * 2] = i;
 	      sel[i * 2 + 1] = i + nelt;
 	    }
-	  if (can_vec_perm_const_p (mode, sel))
+	  vec_perm_indices indices (sel, 2, nelt);
+	  if (can_vec_perm_const_p (mode, indices))
 	    {
 	      for (i = 0; i < nelt; i++)
 		sel[i] += nelt / 2;
-	      if (can_vec_perm_const_p (mode, sel))
+	      indices.new_vector (sel, 2, nelt);
+	      if (can_vec_perm_const_p (mode, indices))
 		return true;
 	    }
 	}
@@ -4731,7 +4736,7 @@ vect_permute_store_chain (vec<tree> dr_c
   unsigned int i, n, log_length = exact_log2 (length);
   unsigned int j, nelt = TYPE_VECTOR_SUBPARTS (vectype);
 
-  auto_vec_perm_indices sel (nelt);
+  vec_perm_builder sel (nelt, nelt, 1);
   sel.quick_grow (nelt);
 
   result_chain->quick_grow (length);
@@ -4742,6 +4747,7 @@ vect_permute_store_chain (vec<tree> dr_c
     {
       unsigned int j0 = 0, j1 = 0, j2 = 0;
 
+      vec_perm_indices indices;
       for (j = 0; j < 3; j++)
         {
 	  int nelt0 = ((3 - j) * nelt) % 3;
@@ -4757,7 +4763,8 @@ vect_permute_store_chain (vec<tree> dr_c
 	      if (3 * i + nelt2 < nelt)
 		sel[3 * i + nelt2] = 0;
 	    }
-	  perm3_mask_low = vect_gen_perm_mask_checked (vectype, sel);
+	  indices.new_vector (sel, 2, nelt);
+	  perm3_mask_low = vect_gen_perm_mask_checked (vectype, indices);
 
 	  for (i = 0; i < nelt; i++)
 	    {
@@ -4768,7 +4775,8 @@ vect_permute_store_chain (vec<tree> dr_c
 	      if (3 * i + nelt2 < nelt)
 		sel[3 * i + nelt2] = nelt + j2++;
 	    }
-	  perm3_mask_high = vect_gen_perm_mask_checked (vectype, sel);
+	  indices.new_vector (sel, 2, nelt);
+	  perm3_mask_high = vect_gen_perm_mask_checked (vectype, indices);
 
 	  vect1 = dr_chain[0];
 	  vect2 = dr_chain[1];
@@ -4805,11 +4813,13 @@ vect_permute_store_chain (vec<tree> dr_c
 	  sel[i * 2] = i;
 	  sel[i * 2 + 1] = i + nelt;
 	}
-	perm_mask_high = vect_gen_perm_mask_checked (vectype, sel);
+	vec_perm_indices indices (sel, 2, nelt);
+	perm_mask_high = vect_gen_perm_mask_checked (vectype, indices);
 
 	for (i = 0; i < nelt; i++)
 	  sel[i] += nelt / 2;
-	perm_mask_low = vect_gen_perm_mask_checked (vectype, sel);
+	indices.new_vector (sel, 2, nelt);
+	perm_mask_low = vect_gen_perm_mask_checked (vectype, indices);
 
 	for (i = 0, n = log_length; i < n; i++)
 	  {
@@ -5154,11 +5164,12 @@ vect_grouped_load_supported (tree vectyp
   if (VECTOR_MODE_P (mode))
     {
       unsigned int i, j, nelt = GET_MODE_NUNITS (mode);
-      auto_vec_perm_indices sel (nelt);
+      vec_perm_builder sel (nelt, nelt, 1);
       sel.quick_grow (nelt);
 
       if (count == 3)
 	{
+	  vec_perm_indices indices;
 	  unsigned int k;
 	  for (k = 0; k < 3; k++)
 	    {
@@ -5167,7 +5178,8 @@ vect_grouped_load_supported (tree vectyp
 		  sel[i] = 3 * i + k;
 		else
 		  sel[i] = 0;
-	      if (!can_vec_perm_const_p (mode, sel))
+	      indices.new_vector (sel, 2, nelt);
+	      if (!can_vec_perm_const_p (mode, indices))
 		{
 		  if (dump_enabled_p ())
 		    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -5180,7 +5192,8 @@ vect_grouped_load_supported (tree vectyp
 		  sel[i] = i;
 		else
 		  sel[i] = nelt + ((nelt + k) % 3) + 3 * (j++);
-	      if (!can_vec_perm_const_p (mode, sel))
+	      indices.new_vector (sel, 2, nelt);
+	      if (!can_vec_perm_const_p (mode, indices))
 		{
 		  if (dump_enabled_p ())
 		    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -5195,13 +5208,16 @@ vect_grouped_load_supported (tree vectyp
 	{
 	  /* If length is not equal to 3 then only power of 2 is supported.  */
 	  gcc_assert (pow2p_hwi (count));
+
 	  for (i = 0; i < nelt; i++)
 	    sel[i] = i * 2;
-	  if (can_vec_perm_const_p (mode, sel))
+	  vec_perm_indices indices (sel, 2, nelt);
+	  if (can_vec_perm_const_p (mode, indices))
 	    {
 	      for (i = 0; i < nelt; i++)
 		sel[i] = i * 2 + 1;
-	      if (can_vec_perm_const_p (mode, sel))
+	      indices.new_vector (sel, 2, nelt);
+	      if (can_vec_perm_const_p (mode, indices))
 		return true;
 	    }
         }
@@ -5316,7 +5332,7 @@ vect_permute_load_chain (vec<tree> dr_ch
   unsigned int i, j, log_length = exact_log2 (length);
   unsigned nelt = TYPE_VECTOR_SUBPARTS (vectype);
 
-  auto_vec_perm_indices sel (nelt);
+  vec_perm_builder sel (nelt, nelt, 1);
   sel.quick_grow (nelt);
 
   result_chain->quick_grow (length);
@@ -5327,6 +5343,7 @@ vect_permute_load_chain (vec<tree> dr_ch
     {
       unsigned int k;
 
+      vec_perm_indices indices;
       for (k = 0; k < 3; k++)
 	{
 	  for (i = 0; i < nelt; i++)
@@ -5334,15 +5351,16 @@ vect_permute_load_chain (vec<tree> dr_ch
 	      sel[i] = 3 * i + k;
 	    else
 	      sel[i] = 0;
-	  perm3_mask_low = vect_gen_perm_mask_checked (vectype, sel);
+	  indices.new_vector (sel, 2, nelt);
+	  perm3_mask_low = vect_gen_perm_mask_checked (vectype, indices);
 
 	  for (i = 0, j = 0; i < nelt; i++)
 	    if (3 * i + k < 2 * nelt)
 	      sel[i] = i;
 	    else
 	      sel[i] = nelt + ((nelt + k) % 3) + 3 * (j++);
-
-	  perm3_mask_high = vect_gen_perm_mask_checked (vectype, sel);
+	  indices.new_vector (sel, 2, nelt);
+	  perm3_mask_high = vect_gen_perm_mask_checked (vectype, indices);
 
 	  first_vect = dr_chain[0];
 	  second_vect = dr_chain[1];
@@ -5374,11 +5392,13 @@ vect_permute_load_chain (vec<tree> dr_ch
 
       for (i = 0; i < nelt; ++i)
 	sel[i] = i * 2;
-      perm_mask_even = vect_gen_perm_mask_checked (vectype, sel);
+      vec_perm_indices indices (sel, 2, nelt);
+      perm_mask_even = vect_gen_perm_mask_checked (vectype, indices);
 
       for (i = 0; i < nelt; ++i)
 	sel[i] = i * 2 + 1;
-      perm_mask_odd = vect_gen_perm_mask_checked (vectype, sel);
+      indices.new_vector (sel, 2, nelt);
+      perm_mask_odd = vect_gen_perm_mask_checked (vectype, indices);
 
       for (i = 0; i < log_length; i++)
 	{
@@ -5514,7 +5534,7 @@ vect_shift_permute_load_chain (vec<tree>
   stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
 
-  auto_vec_perm_indices sel (nelt);
+  vec_perm_builder sel (nelt, nelt, 1);
   sel.quick_grow (nelt);
 
   result_chain->quick_grow (length);
@@ -5528,7 +5548,8 @@ vect_shift_permute_load_chain (vec<tree>
 	sel[i] = i * 2;
       for (i = 0; i < nelt / 2; ++i)
 	sel[nelt / 2 + i] = i * 2 + 1;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      vec_perm_indices indices (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
 	{
 	  if (dump_enabled_p ())
 	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -5536,13 +5557,14 @@ vect_shift_permute_load_chain (vec<tree>
 			      supported by target\n");
 	  return false;
 	}
-      perm2_mask1 = vect_gen_perm_mask_checked (vectype, sel);
+      perm2_mask1 = vect_gen_perm_mask_checked (vectype, indices);
 
       for (i = 0; i < nelt / 2; ++i)
 	sel[i] = i * 2 + 1;
       for (i = 0; i < nelt / 2; ++i)
 	sel[nelt / 2 + i] = i * 2;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
 	{
 	  if (dump_enabled_p ())
 	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -5550,20 +5572,21 @@ vect_shift_permute_load_chain (vec<tree>
 			      supported by target\n");
 	  return false;
 	}
-      perm2_mask2 = vect_gen_perm_mask_checked (vectype, sel);
+      perm2_mask2 = vect_gen_perm_mask_checked (vectype, indices);
 
       /* Generating permutation constant to shift all elements.
 	 For vector length 8 it is {4 5 6 7 8 9 10 11}.  */
       for (i = 0; i < nelt; i++)
 	sel[i] = nelt / 2 + i;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
 	{
 	  if (dump_enabled_p ())
 	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 			     "shift permutation is not supported by target\n");
 	  return false;
 	}
-      shift1_mask = vect_gen_perm_mask_checked (vectype, sel);
+      shift1_mask = vect_gen_perm_mask_checked (vectype, indices);
 
       /* Generating permutation constant to select vector from 2.
 	 For vector length 8 it is {0 1 2 3 12 13 14 15}.  */
@@ -5571,14 +5594,15 @@ vect_shift_permute_load_chain (vec<tree>
 	sel[i] = i;
       for (i = nelt / 2; i < nelt; i++)
 	sel[i] = nelt + i;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
 	{
 	  if (dump_enabled_p ())
 	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 			     "select is not supported by target\n");
 	  return false;
 	}
-      select_mask = vect_gen_perm_mask_checked (vectype, sel);
+      select_mask = vect_gen_perm_mask_checked (vectype, indices);
 
       for (i = 0; i < log_length; i++)
 	{
@@ -5634,7 +5658,8 @@ vect_shift_permute_load_chain (vec<tree>
 	  sel[i] = 3 * k + (l % 3);
 	  k++;
 	}
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      vec_perm_indices indices (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
 	{
 	  if (dump_enabled_p ())
 	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -5642,59 +5667,63 @@ vect_shift_permute_load_chain (vec<tree>
 			      supported by target\n");
 	  return false;
 	}
-      perm3_mask = vect_gen_perm_mask_checked (vectype, sel);
+      perm3_mask = vect_gen_perm_mask_checked (vectype, indices);
 
       /* Generating permutation constant to shift all elements.
 	 For vector length 8 it is {6 7 8 9 10 11 12 13}.  */
       for (i = 0; i < nelt; i++)
 	sel[i] = 2 * (nelt / 3) + (nelt % 3) + i;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
 	{
 	  if (dump_enabled_p ())
 	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 			     "shift permutation is not supported by target\n");
 	  return false;
 	}
-      shift1_mask = vect_gen_perm_mask_checked (vectype, sel);
+      shift1_mask = vect_gen_perm_mask_checked (vectype, indices);
 
       /* Generating permutation constant to shift all elements.
 	 For vector length 8 it is {5 6 7 8 9 10 11 12}.  */
       for (i = 0; i < nelt; i++)
 	sel[i] = 2 * (nelt / 3) + 1 + i;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
 	{
 	  if (dump_enabled_p ())
 	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 			     "shift permutation is not supported by target\n");
 	  return false;
 	}
-      shift2_mask = vect_gen_perm_mask_checked (vectype, sel);
+      shift2_mask = vect_gen_perm_mask_checked (vectype, indices);
 
       /* Generating permutation constant to shift all elements.
 	 For vector length 8 it is {3 4 5 6 7 8 9 10}.  */
       for (i = 0; i < nelt; i++)
 	sel[i] = (nelt / 3) + (nelt % 3) / 2 + i;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
 	{
 	  if (dump_enabled_p ())
 	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 			     "shift permutation is not supported by target\n");
 	  return false;
 	}
-      shift3_mask = vect_gen_perm_mask_checked (vectype, sel);
+      shift3_mask = vect_gen_perm_mask_checked (vectype, indices);
 
       /* Generating permutation constant to shift all elements.
 	 For vector length 8 it is {5 6 7 8 9 10 11 12}.  */
       for (i = 0; i < nelt; i++)
 	sel[i] = 2 * (nelt / 3) + (nelt % 3) / 2 + i;
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
 	{
 	  if (dump_enabled_p ())
 	    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
 			     "shift permutation is not supported by target\n");
 	  return false;
 	}
-      shift4_mask = vect_gen_perm_mask_checked (vectype, sel);
+      shift4_mask = vect_gen_perm_mask_checked (vectype, indices);
 
       for (k = 0; k < 3; k++)
 	{
Index: gcc/tree-vect-slp.c
===================================================================
--- gcc/tree-vect-slp.c	2017-12-09 22:47:27.884318101 +0000
+++ gcc/tree-vect-slp.c	2017-12-09 22:48:47.547825355 +0000
@@ -894,7 +894,7 @@ vect_build_slp_tree_1 (vec_info *vinfo,
       && TREE_CODE_CLASS (alt_stmt_code) != tcc_reference)
     {
       unsigned int count = TYPE_VECTOR_SUBPARTS (vectype);
-      auto_vec_perm_indices sel (count);
+      vec_perm_builder sel (count, count, 1);
       for (i = 0; i < count; ++i)
 	{
 	  unsigned int elt = i;
@@ -902,7 +902,8 @@ vect_build_slp_tree_1 (vec_info *vinfo,
 	    elt += count;
 	  sel.quick_push (elt);
 	}
-      if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+      vec_perm_indices indices (sel, 2, count);
+      if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
 	{
 	  for (i = 0; i < group_size; ++i)
 	    if (gimple_assign_rhs_code (stmts[i]) == alt_stmt_code)
@@ -3570,8 +3571,9 @@ vect_transform_slp_perm_load (slp_tree n
     (int_mode_for_mode (TYPE_MODE (TREE_TYPE (vectype))).require (), 1);
   mask_type = get_vectype_for_scalar_type (mask_element_type);
   nunits = TYPE_VECTOR_SUBPARTS (vectype);
-  auto_vec_perm_indices mask (nunits);
+  vec_perm_builder mask (nunits, nunits, 1);
   mask.quick_grow (nunits);
+  vec_perm_indices indices;
 
   /* Initialize the vect stmts of NODE to properly insert the generated
      stmts later.  */
@@ -3644,10 +3646,10 @@ vect_transform_slp_perm_load (slp_tree n
 	    noop_p = false;
 	  mask[index++] = mask_element;
 
-	  if (index == nunits)
+	  if (index == nunits && !noop_p)
 	    {
-	      if (! noop_p
-		  && ! can_vec_perm_const_p (mode, mask))
+	      indices.new_vector (mask, 2, nunits);
+	      if (!can_vec_perm_const_p (mode, indices))
 		{
 		  if (dump_enabled_p ())
 		    {
@@ -3655,16 +3657,19 @@ vect_transform_slp_perm_load (slp_tree n
 				       vect_location, 
 				       "unsupported vect permute { ");
 		      for (i = 0; i < nunits; ++i)
-			dump_printf (MSG_MISSED_OPTIMIZATION, "%d ", mask[i]);
+			dump_printf (MSG_MISSED_OPTIMIZATION,
+				     HOST_WIDE_INT_PRINT_DEC " ", mask[i]);
 		      dump_printf (MSG_MISSED_OPTIMIZATION, "}\n");
 		    }
 		  gcc_assert (analyze_only);
 		  return false;
 		}
 
-	      if (! noop_p)
-		++*n_perms;
+	      ++*n_perms;
+	    }
 
+	  if (index == nunits)
+	    {
 	      if (!analyze_only)
 		{
 		  tree mask_vec = NULL_TREE;
@@ -3797,7 +3802,7 @@ vect_schedule_slp_instance (slp_tree nod
       enum tree_code code0 = gimple_assign_rhs_code (stmt);
       enum tree_code ocode = ERROR_MARK;
       gimple *ostmt;
-      auto_vec_perm_indices mask (group_size);
+      vec_perm_builder mask (group_size, group_size, 1);
       FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, ostmt)
 	if (gimple_assign_rhs_code (ostmt) != code0)
 	  {
Index: gcc/tree-vect-stmts.c
===================================================================
--- gcc/tree-vect-stmts.c	2017-12-09 22:47:27.885318101 +0000
+++ gcc/tree-vect-stmts.c	2017-12-09 22:48:47.548825399 +0000
@@ -1717,13 +1717,14 @@ perm_mask_for_reverse (tree vectype)
 
   nunits = TYPE_VECTOR_SUBPARTS (vectype);
 
-  auto_vec_perm_indices sel (nunits);
+  vec_perm_builder sel (nunits, nunits, 1);
   for (i = 0; i < nunits; ++i)
     sel.quick_push (nunits - 1 - i);
 
-  if (!can_vec_perm_const_p (TYPE_MODE (vectype), sel))
+  vec_perm_indices indices (sel, 1, nunits);
+  if (!can_vec_perm_const_p (TYPE_MODE (vectype), indices))
     return NULL_TREE;
-  return vect_gen_perm_mask_checked (vectype, sel);
+  return vect_gen_perm_mask_checked (vectype, indices);
 }
 
 /* A subroutine of get_load_store_type, with a subset of the same
@@ -2185,27 +2186,32 @@ vectorizable_mask_load_store (gimple *st
 	{
 	  modifier = WIDEN;
 
-	  auto_vec_perm_indices sel (gather_off_nunits);
+	  vec_perm_builder sel (gather_off_nunits, gather_off_nunits, 1);
 	  for (i = 0; i < gather_off_nunits; ++i)
 	    sel.quick_push (i | nunits);
 
-	  perm_mask = vect_gen_perm_mask_checked (gs_info.offset_vectype, sel);
+	  vec_perm_indices indices (sel, 1, gather_off_nunits);
+	  perm_mask = vect_gen_perm_mask_checked (gs_info.offset_vectype,
+						  indices);
 	}
       else if (nunits == gather_off_nunits * 2)
 	{
 	  modifier = NARROW;
 
-	  auto_vec_perm_indices sel (nunits);
+	  vec_perm_builder sel (nunits, nunits, 1);
 	  sel.quick_grow (nunits);
 	  for (i = 0; i < nunits; ++i)
 	    sel[i] = i < gather_off_nunits
 		     ? i : i + nunits - gather_off_nunits;
+	  vec_perm_indices indices (sel, 2, nunits);
+	  perm_mask = vect_gen_perm_mask_checked (vectype, indices);
 
-	  perm_mask = vect_gen_perm_mask_checked (vectype, sel);
 	  ncopies *= 2;
+
 	  for (i = 0; i < nunits; ++i)
 	    sel[i] = i | gather_off_nunits;
-	  mask_perm_mask = vect_gen_perm_mask_checked (masktype, sel);
+	  indices.new_vector (sel, 2, gather_off_nunits);
+	  mask_perm_mask = vect_gen_perm_mask_checked (masktype, indices);
 	}
       else
 	gcc_unreachable ();
@@ -2498,12 +2504,13 @@ vectorizable_bswap (gimple *stmt, gimple
   unsigned int num_bytes = TYPE_VECTOR_SUBPARTS (char_vectype);
   unsigned word_bytes = num_bytes / nunits;
 
-  auto_vec_perm_indices elts (num_bytes);
+  vec_perm_builder elts (num_bytes, num_bytes, 1);
   for (unsigned i = 0; i < nunits; ++i)
     for (unsigned j = 0; j < word_bytes; ++j)
       elts.quick_push ((i + 1) * word_bytes - j - 1);
 
-  if (!can_vec_perm_const_p (TYPE_MODE (char_vectype), elts))
+  vec_perm_indices indices (elts, 1, num_bytes);
+  if (!can_vec_perm_const_p (TYPE_MODE (char_vectype), indices))
     return false;
 
   if (! vec_stmt)
@@ -5809,22 +5816,25 @@ vectorizable_store (gimple *stmt, gimple
 	{
 	  modifier = WIDEN;
 
-	  auto_vec_perm_indices sel (scatter_off_nunits);
+	  vec_perm_builder sel (scatter_off_nunits, scatter_off_nunits, 1);
 	  for (i = 0; i < (unsigned int) scatter_off_nunits; ++i)
 	    sel.quick_push (i | nunits);
 
-	  perm_mask = vect_gen_perm_mask_checked (gs_info.offset_vectype, sel);
+	  vec_perm_indices indices (sel, 1, scatter_off_nunits);
+	  perm_mask = vect_gen_perm_mask_checked (gs_info.offset_vectype,
+						  indices);
 	  gcc_assert (perm_mask != NULL_TREE);
 	}
       else if (nunits == (unsigned int) scatter_off_nunits * 2)
 	{
 	  modifier = NARROW;
 
-	  auto_vec_perm_indices sel (nunits);
+	  vec_perm_builder sel (nunits, nunits, 1);
 	  for (i = 0; i < (unsigned int) nunits; ++i)
 	    sel.quick_push (i | scatter_off_nunits);
 
-	  perm_mask = vect_gen_perm_mask_checked (vectype, sel);
+	  vec_perm_indices indices (sel, 2, nunits);
+	  perm_mask = vect_gen_perm_mask_checked (vectype, indices);
 	  gcc_assert (perm_mask != NULL_TREE);
 	  ncopies *= 2;
 	}
@@ -6845,22 +6855,25 @@ vectorizable_load (gimple *stmt, gimple_
 	{
 	  modifier = WIDEN;
 
-	  auto_vec_perm_indices sel (gather_off_nunits);
+	  vec_perm_builder sel (gather_off_nunits, gather_off_nunits, 1);
 	  for (i = 0; i < gather_off_nunits; ++i)
 	    sel.quick_push (i | nunits);
 
-	  perm_mask = vect_gen_perm_mask_checked (gs_info.offset_vectype, sel);
+	  vec_perm_indices indices (sel, 1, gather_off_nunits);
+	  perm_mask = vect_gen_perm_mask_checked (gs_info.offset_vectype,
+						  indices);
 	}
       else if (nunits == gather_off_nunits * 2)
 	{
 	  modifier = NARROW;
 
-	  auto_vec_perm_indices sel (nunits);
+	  vec_perm_builder sel (nunits, nunits, 1);
 	  for (i = 0; i < nunits; ++i)
 	    sel.quick_push (i < gather_off_nunits
 			    ? i : i + nunits - gather_off_nunits);
 
-	  perm_mask = vect_gen_perm_mask_checked (vectype, sel);
+	  vec_perm_indices indices (sel, 2, nunits);
+	  perm_mask = vect_gen_perm_mask_checked (vectype, indices);
 	  ncopies *= 2;
 	}
       else
Index: gcc/tree-vect-generic.c
===================================================================
--- gcc/tree-vect-generic.c	2017-12-09 22:47:27.883318100 +0000
+++ gcc/tree-vect-generic.c	2017-12-09 22:48:47.547825355 +0000
@@ -1299,15 +1299,13 @@ lower_vec_perm (gimple_stmt_iterator *gs
 	mask = gimple_assign_rhs1 (def_stmt);
     }
 
-  if (TREE_CODE (mask) == VECTOR_CST)
-    {
-      auto_vec_perm_indices sel_int (elements);
-
-      for (i = 0; i < elements; ++i)
-	sel_int.quick_push (TREE_INT_CST_LOW (VECTOR_CST_ELT (mask, i))
-			    & (2 * elements - 1));
+  vec_perm_builder sel_int;
 
-      if (can_vec_perm_const_p (TYPE_MODE (vect_type), sel_int))
+  if (TREE_CODE (mask) == VECTOR_CST
+      && tree_to_vec_perm_builder (&sel_int, mask))
+    {
+      vec_perm_indices indices (sel_int, 2, elements);
+      if (can_vec_perm_const_p (TYPE_MODE (vect_type), indices))
 	{
 	  gimple_assign_set_rhs3 (stmt, mask);
 	  update_stmt (stmt);
@@ -1319,14 +1317,14 @@ lower_vec_perm (gimple_stmt_iterator *gs
 	  != CODE_FOR_nothing
 	  && TREE_CODE (vec1) == VECTOR_CST
 	  && initializer_zerop (vec1)
-	  && sel_int[0]
-	  && sel_int[0] < elements)
+	  && indices[0]
+	  && indices[0] < elements)
 	{
 	  for (i = 1; i < elements; ++i)
 	    {
-	      unsigned int expected = i + sel_int[0];
+	      unsigned int expected = i + indices[0];
 	      /* Indices into the second vector are all equivalent.  */
-	      if (MIN (elements, (unsigned) sel_int[i])
+	      if (MIN (elements, (unsigned) indices[i])
 		  != MIN (elements, expected))
  		break;
 	    }
Index: gcc/tree-vect-loop.c
===================================================================
--- gcc/tree-vect-loop.c	2017-12-09 22:47:27.884318101 +0000
+++ gcc/tree-vect-loop.c	2017-12-09 22:48:47.547825355 +0000
@@ -3714,12 +3714,11 @@ vect_estimate_min_profitable_iters (loop
    vector elements (not bits) for a vector with NELT elements.  */
 static void
 calc_vec_perm_mask_for_shift (unsigned int offset, unsigned int nelt,
-			      vec_perm_indices *sel)
+			      vec_perm_builder *sel)
 {
-  unsigned int i;
-
-  for (i = 0; i < nelt; i++)
-    sel->quick_push ((i + offset) & (2 * nelt - 1));
+  sel->new_vector (nelt, nelt, 1);
+  for (unsigned int i = 0; i < nelt; i++)
+    sel->quick_push (i + offset);
 }
 
 /* Checks whether the target supports whole-vector shifts for vectors of mode
@@ -3732,13 +3731,13 @@ have_whole_vector_shift (machine_mode mo
     return true;
 
   unsigned int i, nelt = GET_MODE_NUNITS (mode);
-  auto_vec_perm_indices sel (nelt);
-
+  vec_perm_builder sel;
+  vec_perm_indices indices;
   for (i = nelt/2; i >= 1; i/=2)
     {
-      sel.truncate (0);
       calc_vec_perm_mask_for_shift (i, nelt, &sel);
-      if (!can_vec_perm_const_p (mode, sel, false))
+      indices.new_vector (sel, 2, nelt);
+      if (!can_vec_perm_const_p (mode, indices, false))
 	return false;
     }
   return true;
@@ -5028,7 +5027,8 @@ vect_create_epilog_for_reduction (vec<tr
       if (reduce_with_shift && !slp_reduc)
         {
           int nelements = vec_size_in_bits / element_bitsize;
-          auto_vec_perm_indices sel (nelements);
+	  vec_perm_builder sel;
+	  vec_perm_indices indices;
 
           int elt_offset;
 
@@ -5052,9 +5052,9 @@ vect_create_epilog_for_reduction (vec<tr
                elt_offset >= 1;
                elt_offset /= 2)
             {
-	      sel.truncate (0);
 	      calc_vec_perm_mask_for_shift (elt_offset, nelements, &sel);
-	      tree mask = vect_gen_perm_mask_any (vectype, sel);
+	      indices.new_vector (sel, 2, nelements);
+	      tree mask = vect_gen_perm_mask_any (vectype, indices);
 	      epilog_stmt = gimple_build_assign (vec_dest, VEC_PERM_EXPR,
 						 new_temp, zero_vec, mask);
               new_name = make_ssa_name (vec_dest, epilog_stmt);
Index: gcc/config/aarch64/aarch64.c
===================================================================
--- gcc/config/aarch64/aarch64.c	2017-12-09 22:47:27.856318084 +0000
+++ gcc/config/aarch64/aarch64.c	2017-12-09 22:48:47.535824832 +0000
@@ -13208,7 +13208,7 @@ #define MAX_VECT_LEN 16
 struct expand_vec_perm_d
 {
   rtx target, op0, op1;
-  auto_vec_perm_indices perm;
+  vec_perm_indices perm;
   machine_mode vmode;
   bool one_vector_p;
   bool testing_p;
@@ -13598,10 +13598,7 @@ aarch64_expand_vec_perm_const_1 (struct
   unsigned int nelt = d->perm.length ();
   if (d->perm[0] >= nelt)
     {
-      gcc_assert (nelt == (nelt & -nelt));
-      for (unsigned int i = 0; i < nelt; ++i)
-	d->perm[i] ^= nelt; /* Keep the same index, but in the other vector.  */
-
+      d->perm.rotate_inputs (1);
       std::swap (d->op0, d->op1);
     }
 
@@ -13641,12 +13638,10 @@ aarch64_vectorize_vec_perm_const (machin
 
   /* Calculate whether all elements are in one vector.  */
   unsigned int nelt = sel.length ();
-  d.perm.reserve (nelt);
   for (i = which = 0; i < nelt; ++i)
     {
       unsigned int ei = sel[i] & (2 * nelt - 1);
       which |= (ei < nelt ? 1 : 2);
-      d.perm.quick_push (ei);
     }
 
   switch (which)
@@ -13665,8 +13660,6 @@ aarch64_vectorize_vec_perm_const (machin
 	 input vector.  */
       /* Fall Through.  */
     case 2:
-      for (i = 0; i < nelt; ++i)
-	d.perm[i] &= nelt - 1;
       d.op0 = op1;
       d.one_vector_p = true;
       break;
@@ -13677,6 +13670,8 @@ aarch64_vectorize_vec_perm_const (machin
       break;
     }
 
+  d.perm.new_vector (sel.encoding (), d.one_vector_p ? 1 : 2, nelt);
+
   if (!d.testing_p)
     return aarch64_expand_vec_perm_const_1 (&d);
 
Index: gcc/config/arm/arm.c
===================================================================
--- gcc/config/arm/arm.c	2017-12-09 22:47:27.858318085 +0000
+++ gcc/config/arm/arm.c	2017-12-09 22:48:47.538824963 +0000
@@ -28852,7 +28852,7 @@ #define MAX_VECT_LEN 16
 struct expand_vec_perm_d
 {
   rtx target, op0, op1;
-  auto_vec_perm_indices perm;
+  vec_perm_indices perm;
   machine_mode vmode;
   bool one_vector_p;
   bool testing_p;
@@ -29360,9 +29360,7 @@ arm_expand_vec_perm_const_1 (struct expa
   unsigned int nelt = d->perm.length ();
   if (d->perm[0] >= nelt)
     {
-      for (unsigned int i = 0; i < nelt; ++i)
-	d->perm[i] = (d->perm[i] + nelt) & (2 * nelt - 1);
-
+      d->perm.rotate_inputs (1);
       std::swap (d->op0, d->op1);
     }
 
@@ -29402,12 +29400,10 @@ arm_vectorize_vec_perm_const (machine_mo
   d.testing_p = !target;
 
   nelt = GET_MODE_NUNITS (d.vmode);
-  d.perm.reserve (nelt);
   for (i = which = 0; i < nelt; ++i)
     {
       int ei = sel[i] & (2 * nelt - 1);
       which |= (ei < nelt ? 1 : 2);
-      d.perm.quick_push (ei);
     }
 
   switch (which)
@@ -29426,8 +29422,6 @@ arm_vectorize_vec_perm_const (machine_mo
 	 input vector.  */
       /* FALLTHRU */
     case 2:
-      for (i = 0; i < nelt; ++i)
-        d.perm[i] &= nelt - 1;
       d.op0 = op1;
       d.one_vector_p = true;
       break;
@@ -29438,6 +29432,8 @@ arm_vectorize_vec_perm_const (machine_mo
       break;
     }
 
+  d.perm.new_vector (sel.encoding (), d.one_vector_p ? 1 : 2, nelt);
+
   if (d.testing_p)
     return arm_expand_vec_perm_const_1 (&d);
 
Index: gcc/config/powerpcspe/powerpcspe.c
===================================================================
--- gcc/config/powerpcspe/powerpcspe.c	2017-12-09 22:47:27.871318093 +0000
+++ gcc/config/powerpcspe/powerpcspe.c	2017-12-09 22:48:47.541825094 +0000
@@ -38780,7 +38780,7 @@ rs6000_expand_extract_even (rtx target,
 {
   machine_mode vmode = GET_MODE (target);
   unsigned i, nelt = GET_MODE_NUNITS (vmode);
-  vec_perm_builder perm (nelt);
+  vec_perm_builder perm (nelt, nelt, 1);
 
   for (i = 0; i < nelt; i++)
     perm.quick_push (i * 2);
@@ -38795,7 +38795,7 @@ rs6000_expand_interleave (rtx target, rt
 {
   machine_mode vmode = GET_MODE (target);
   unsigned i, high, nelt = GET_MODE_NUNITS (vmode);
-  vec_perm_builder perm (nelt);
+  vec_perm_builder perm (nelt, nelt, 1);
 
   high = (highp ? 0 : nelt / 2);
   for (i = 0; i < nelt / 2; i++)
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	2017-12-09 22:47:27.874318095 +0000
+++ gcc/config/rs6000/rs6000.c	2017-12-09 22:48:47.544825224 +0000
@@ -36017,7 +36017,7 @@ rs6000_expand_extract_even (rtx target,
 {
   machine_mode vmode = GET_MODE (target);
   unsigned i, nelt = GET_MODE_NUNITS (vmode);
-  vec_perm_builder perm (nelt);
+  vec_perm_builder perm (nelt, nelt, 1);
 
   for (i = 0; i < nelt; i++)
     perm.quick_push (i * 2);
@@ -36032,7 +36032,7 @@ rs6000_expand_interleave (rtx target, rt
 {
   machine_mode vmode = GET_MODE (target);
   unsigned i, high, nelt = GET_MODE_NUNITS (vmode);
-  vec_perm_builder perm (nelt);
+  vec_perm_builder perm (nelt, nelt, 1);
 
   high = (highp ? 0 : nelt / 2);
   for (i = 0; i < nelt / 2; i++)


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]