This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC, PR 80689] Copy small aggregates element-wise


> I think the limit should be on the number of generated copies and not
> the overall size of the structure...  If the struct were composed of
> 32 individual chars we wouldn't want to emit 32 loads and 32 stores...
> 
> I wonder how rep; movb; interacts with store to load forwarding?  Is
> that maybe optimized well on some archs?  movb should always
> forward and wasn't the setup cost for small N reasonable on modern
> CPUs?

rep mov is win over loop for blocks over 128bytes on core, for blocks in rage
24-128 on zen.  This is w/o store/load forwarding, but I doubt those provide
a cheap way around.

> 
> It probably depends on the width of the entries in the store buffer,
> if they appear in-order and the alignment of the stores (if they are larger than
> 8 bytes they are surely aligned).  IIRC CPUs had smaller store buffer
> entries than cache line size.
> 
> Given that load bandwith is usually higher than store bandwith it
> might make sense to do the store combining in our copying sequence,
> like for the 8 byte entry case use sth like
> 
>   movq 0(%eax), %xmm0
>   movhps 8(%eax), %xmm0 // or vpinsert
>   mov[au]ps %xmm0, 0%(ebx)
> ...
> 
> thus do two loads per store and perform the stores in wider
> mode?

This may be somewhat faster indeed.  I am not sure if store to load
forwarding will work for the later half when read again by halves.
It would not happen on older CPUs :)

Honza
> 
> As said a general concern was you not copying padding.  If you
> put this into an even more common place you surely will break
> stuff, no?
> 
> Richard.
> 
> >
> > Martin
> >
> >
> >>
> >> Richard.
> >>
> >> > Martin
> >> >
> >> >
> >> > 2017-10-12  Martin Jambor  <mjambor@suse.cz>
> >> >
> >> >         PR target/80689
> >> >         * tree-sra.h: New file.
> >> >         * ipa-prop.h: Moved declaration of build_ref_for_offset to
> >> >         tree-sra.h.
> >> >         * expr.c: Include params.h and tree-sra.h.
> >> >         (emit_move_elementwise): New function.
> >> >         (store_expr_with_bounds): Optionally use it.
> >> >         * ipa-cp.c: Include tree-sra.h.
> >> >         * params.def (PARAM_MAX_SIZE_FOR_ELEMENTWISE_COPY): New.
> >> >         * config/i386/i386.c (ix86_option_override_internal): Set
> >> >         PARAM_MAX_SIZE_FOR_ELEMENTWISE_COPY to 35.
> >> >         * tree-sra.c: Include tree-sra.h.
> >> >         (scalarizable_type_p): Renamed to
> >> >         simple_mix_of_records_and_arrays_p, made public, renamed the
> >> >         second parameter to allow_char_arrays.
> >> >         (extract_min_max_idx_from_array): New function.
> >> >         (completely_scalarize): Moved bits of the function to
> >> >         extract_min_max_idx_from_array.
> >> >
> >> >         testsuite/
> >> >         * gcc.target/i386/pr80689-1.c: New test.
> >> > ---
> >> >  gcc/config/i386/i386.c                    |   4 ++
> >> >  gcc/expr.c                                | 103 ++++++++++++++++++++++++++++--
> >> >  gcc/ipa-cp.c                              |   1 +
> >> >  gcc/ipa-prop.h                            |   4 --
> >> >  gcc/params.def                            |   6 ++
> >> >  gcc/testsuite/gcc.target/i386/pr80689-1.c |  38 +++++++++++
> >> >  gcc/tree-sra.c                            |  86 +++++++++++++++----------
> >> >  gcc/tree-sra.h                            |  33 ++++++++++
> >> >  8 files changed, 233 insertions(+), 42 deletions(-)
> >> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr80689-1.c
> >> >  create mode 100644 gcc/tree-sra.h
> >> >
> >> > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> >> > index 1ee8351c21f..87f602e7ead 100644
> >> > --- a/gcc/config/i386/i386.c
> >> > +++ b/gcc/config/i386/i386.c
> >> > @@ -6511,6 +6511,10 @@ ix86_option_override_internal (bool main_args_p,
> >> >                          ix86_tune_cost->l2_cache_size,
> >> >                          opts->x_param_values,
> >> >                          opts_set->x_param_values);
> >> > +  maybe_set_param_value (PARAM_MAX_SIZE_FOR_ELEMENTWISE_COPY,
> >> > +                        35,
> >> > +                        opts->x_param_values,
> >> > +                        opts_set->x_param_values);
> >> >
> >> >    /* Enable sw prefetching at -O3 for CPUS that prefetching is helpful.  */
> >> >    if (opts->x_flag_prefetch_loop_arrays < 0
> >> > diff --git a/gcc/expr.c b/gcc/expr.c
> >> > index 134ee731c29..dff24e7f166 100644
> >> > --- a/gcc/expr.c
> >> > +++ b/gcc/expr.c
> >> > @@ -61,7 +61,8 @@ along with GCC; see the file COPYING3.  If not see
> >> >  #include "tree-chkp.h"
> >> >  #include "rtl-chkp.h"
> >> >  #include "ccmp.h"
> >> > -
> >> > +#include "params.h"
> >> > +#include "tree-sra.h"
> >> >
> >> >  /* If this is nonzero, we do not bother generating VOLATILE
> >> >     around volatile memory references, and we are willing to
> >> > @@ -5340,6 +5341,80 @@ emit_storent_insn (rtx to, rtx from)
> >> >    return maybe_expand_insn (code, 2, ops);
> >> >  }
> >> >
> >> > +/* Generate code for copying data of type TYPE at SOURCE plus OFFSET to TARGET
> >> > +   plus OFFSET, but do so element-wise and/or field-wise for each record and
> >> > +   array within TYPE.  TYPE must either be a register type or an aggregate
> >> > +   complying with scalarizable_type_p.
> >> > +
> >> > +   If CALL_PARAM_P is nonzero, this is a store into a call param on the
> >> > +   stack, and block moves may need to be treated specially.  */
> >> > +
> >> > +static void
> >> > +emit_move_elementwise (tree type, rtx target, rtx source, HOST_WIDE_INT offset,
> >> > +                      int call_param_p)
> >> > +{
> >> > +  switch (TREE_CODE (type))
> >> > +    {
> >> > +    case RECORD_TYPE:
> >> > +      for (tree fld = TYPE_FIELDS (type); fld; fld = DECL_CHAIN (fld))
> >> > +       if (TREE_CODE (fld) == FIELD_DECL)
> >> > +         {
> >> > +           HOST_WIDE_INT fld_offset = offset + int_bit_position (fld);
> >> > +           tree ft = TREE_TYPE (fld);
> >> > +           emit_move_elementwise (ft, target, source, fld_offset,
> >> > +                                  call_param_p);
> >> > +         }
> >> > +      break;
> >> > +
> >> > +    case ARRAY_TYPE:
> >> > +      {
> >> > +       tree elem_type = TREE_TYPE (type);
> >> > +       HOST_WIDE_INT el_size = tree_to_shwi (TYPE_SIZE (elem_type));
> >> > +       gcc_assert (el_size > 0);
> >> > +
> >> > +       offset_int idx, max;
> >> > +       /* Skip (some) zero-length arrays; others have MAXIDX == MINIDX - 1.  */
> >> > +       if (extract_min_max_idx_from_array (type, &idx, &max))
> >> > +         {
> >> > +           HOST_WIDE_INT el_offset = offset;
> >> > +           for (; idx <= max; ++idx)
> >> > +             {
> >> > +               emit_move_elementwise (elem_type, target, source, el_offset,
> >> > +                                      call_param_p);
> >> > +               el_offset += el_size;
> >> > +             }
> >> > +         }
> >> > +      }
> >> > +      break;
> >> > +    default:
> >> > +      machine_mode mode = TYPE_MODE (type);
> >> > +
> >> > +      rtx ntgt = adjust_address (target, mode, offset / BITS_PER_UNIT);
> >> > +      rtx nsrc = adjust_address (source, mode, offset / BITS_PER_UNIT);
> >> > +
> >> > +      /* TODO: Figure out whether the following is actually necessary.  */
> >> > +      if (target == ntgt)
> >> > +       ntgt = copy_rtx (target);
> >> > +      if (source == nsrc)
> >> > +       nsrc = copy_rtx (source);
> >> > +
> >> > +      gcc_assert (mode != VOIDmode);
> >> > +      if (mode != BLKmode)
> >> > +       emit_move_insn (ntgt, nsrc);
> >> > +      else
> >> > +       {
> >> > +         /* For example vector gimple registers can end up here.  */
> >> > +         rtx size = expand_expr (TYPE_SIZE_UNIT (type), NULL_RTX,
> >> > +                                 TYPE_MODE (sizetype), EXPAND_NORMAL);
> >> > +         emit_block_move (ntgt, nsrc, size,
> >> > +                          (call_param_p
> >> > +                           ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL));
> >> > +       }
> >> > +      break;
> >> > +    }
> >> > +  return;
> >> > +}
> >> > +
> >> >  /* Generate code for computing expression EXP,
> >> >     and storing the value into TARGET.
> >> >
> >> > @@ -5713,9 +5788,29 @@ store_expr_with_bounds (tree exp, rtx target, int call_param_p,
> >> >         emit_group_store (target, temp, TREE_TYPE (exp),
> >> >                           int_size_in_bytes (TREE_TYPE (exp)));
> >> >        else if (GET_MODE (temp) == BLKmode)
> >> > -       emit_block_move (target, temp, expr_size (exp),
> >> > -                        (call_param_p
> >> > -                         ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL));
> >> > +       {
> >> > +         /* Copying smallish BLKmode structures with emit_block_move and thus
> >> > +            by-pieces can result in store-to-load stalls.  So copy some simple
> >> > +            small aggregates element or field-wise.  */
> >> > +         if (GET_MODE (target) == BLKmode
> >> > +             && AGGREGATE_TYPE_P (TREE_TYPE (exp))
> >> > +             && !TREE_ADDRESSABLE (TREE_TYPE (exp))
> >> > +             && tree_fits_shwi_p (TYPE_SIZE (TREE_TYPE (exp)))
> >> > +             && (tree_to_shwi (TYPE_SIZE (TREE_TYPE (exp)))
> >> > +                 <= (PARAM_VALUE (PARAM_MAX_SIZE_FOR_ELEMENTWISE_COPY)
> >> > +                     * BITS_PER_UNIT))
> >> > +             && simple_mix_of_records_and_arrays_p (TREE_TYPE (exp), false))
> >> > +           {
> >> > +             /* FIXME:  Can this happen?  What would it mean?  */
> >> > +             gcc_assert (!reverse);
> >> > +             emit_move_elementwise (TREE_TYPE (exp), target, temp, 0,
> >> > +                                    call_param_p);
> >> > +           }
> >> > +         else
> >> > +           emit_block_move (target, temp, expr_size (exp),
> >> > +                            (call_param_p
> >> > +                             ? BLOCK_OP_CALL_PARM : BLOCK_OP_NORMAL));
> >> > +       }
> >> >        /* If we emit a nontemporal store, there is nothing else to do.  */
> >> >        else if (nontemporal && emit_storent_insn (target, temp))
> >> >         ;
> >> > diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> >> > index 6b3d8d7364c..7d6019bbd30 100644
> >> > --- a/gcc/ipa-cp.c
> >> > +++ b/gcc/ipa-cp.c
> >> > @@ -124,6 +124,7 @@ along with GCC; see the file COPYING3.  If not see
> >> >  #include "tree-ssa-ccp.h"
> >> >  #include "stringpool.h"
> >> >  #include "attribs.h"
> >> > +#include "tree-sra.h"
> >> >
> >> >  template <typename valtype> class ipcp_value;
> >> >
> >> > diff --git a/gcc/ipa-prop.h b/gcc/ipa-prop.h
> >> > index fa5bed49ee0..2313cc884ed 100644
> >> > --- a/gcc/ipa-prop.h
> >> > +++ b/gcc/ipa-prop.h
> >> > @@ -877,10 +877,6 @@ ipa_parm_adjustment *ipa_get_adjustment_candidate (tree **, bool *,
> >> >  void ipa_release_body_info (struct ipa_func_body_info *);
> >> >  tree ipa_get_callee_param_type (struct cgraph_edge *e, int i);
> >> >
> >> > -/* From tree-sra.c:  */
> >> > -tree build_ref_for_offset (location_t, tree, HOST_WIDE_INT, bool, tree,
> >> > -                          gimple_stmt_iterator *, bool);
> >> > -
> >> >  /* In ipa-cp.c  */
> >> >  void ipa_cp_c_finalize (void);
> >> >
> >> > diff --git a/gcc/params.def b/gcc/params.def
> >> > index e55afc28053..5e19f1414a0 100644
> >> > --- a/gcc/params.def
> >> > +++ b/gcc/params.def
> >> > @@ -1294,6 +1294,12 @@ DEFPARAM (PARAM_VECT_EPILOGUES_NOMASK,
> >> >           "Enable loop epilogue vectorization using smaller vector size.",
> >> >           0, 0, 1)
> >> >
> >> > +DEFPARAM (PARAM_MAX_SIZE_FOR_ELEMENTWISE_COPY,
> >> > +         "max-size-for-elementwise-copy",
> >> > +         "Maximum size in bytes of a structure or array to by considered for "
> >> > +         "copying by its individual fields or elements",
> >> > +         0, 0, 512)
> >> > +
> >> >  /*
> >> >
> >> >  Local variables:
> >> > diff --git a/gcc/testsuite/gcc.target/i386/pr80689-1.c b/gcc/testsuite/gcc.target/i386/pr80689-1.c
> >> > new file mode 100644
> >> > index 00000000000..4156d4fba45
> >> > --- /dev/null
> >> > +++ b/gcc/testsuite/gcc.target/i386/pr80689-1.c
> >> > @@ -0,0 +1,38 @@
> >> > +/* { dg-do compile } */
> >> > +/* { dg-options "-O2" } */
> >> > +
> >> > +typedef struct st1
> >> > +{
> >> > +        long unsigned int a,b;
> >> > +        long int c,d;
> >> > +}R;
> >> > +
> >> > +typedef struct st2
> >> > +{
> >> > +        int  t;
> >> > +        R  reg;
> >> > +}N;
> >> > +
> >> > +void Set (const R *region,  N *n_info );
> >> > +
> >> > +void test(N  *n_obj ,const long unsigned int a, const long unsigned int b,  const long int c,const long int d)
> >> > +{
> >> > +        R reg;
> >> > +
> >> > +        reg.a=a;
> >> > +        reg.b=b;
> >> > +        reg.c=c;
> >> > +        reg.d=d;
> >> > +        Set (&reg, n_obj);
> >> > +
> >> > +}
> >> > +
> >> > +void Set (const R *reg,  N *n_obj )
> >> > +{
> >> > +        n_obj->reg=(*reg);
> >> > +}
> >> > +
> >> > +
> >> > +/* { dg-final { scan-assembler-not "%(x|y|z)mm\[0-9\]+" } } */
> >> > +/* { dg-final { scan-assembler-not "movdqu" } } */
> >> > +/* { dg-final { scan-assembler-not "movups" } } */
> >> > diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> >> > index bac593951e7..ade97964205 100644
> >> > --- a/gcc/tree-sra.c
> >> > +++ b/gcc/tree-sra.c
> >> > @@ -104,6 +104,7 @@ along with GCC; see the file COPYING3.  If not see
> >> >  #include "ipa-fnsummary.h"
> >> >  #include "ipa-utils.h"
> >> >  #include "builtins.h"
> >> > +#include "tree-sra.h"
> >> >
> >> >  /* Enumeration of all aggregate reductions we can do.  */
> >> >  enum sra_mode { SRA_MODE_EARLY_IPA,   /* early call regularization */
> >> > @@ -952,14 +953,14 @@ create_access (tree expr, gimple *stmt, bool write)
> >> >  }
> >> >
> >> >
> >> > -/* Return true iff TYPE is scalarizable - i.e. a RECORD_TYPE or fixed-length
> >> > -   ARRAY_TYPE with fields that are either of gimple register types (excluding
> >> > -   bit-fields) or (recursively) scalarizable types.  CONST_DECL must be true if
> >> > -   we are considering a decl from constant pool.  If it is false, char arrays
> >> > -   will be refused.  */
> >> > +/* Return true if TYPE consists of RECORD_TYPE or fixed-length ARRAY_TYPE with
> >> > +   fields/elements that are not bit-fields and are either register types or
> >> > +   recursively comply with simple_mix_of_records_and_arrays_p.  Furthermore, if
> >> > +   ALLOW_CHAR_ARRAYS is false, the function will return false also if TYPE
> >> > +   contains an array of elements that only have one byte.  */
> >> >
> >> > -static bool
> >> > -scalarizable_type_p (tree type, bool const_decl)
> >> > +bool
> >> > +simple_mix_of_records_and_arrays_p (tree type, bool allow_char_arrays)
> >> >  {
> >> >    gcc_assert (!is_gimple_reg_type (type));
> >> >    if (type_contains_placeholder_p (type))
> >> > @@ -977,7 +978,7 @@ scalarizable_type_p (tree type, bool const_decl)
> >> >             return false;
> >> >
> >> >           if (!is_gimple_reg_type (ft)
> >> > -             && !scalarizable_type_p (ft, const_decl))
> >> > +             && !simple_mix_of_records_and_arrays_p (ft, allow_char_arrays))
> >> >             return false;
> >> >         }
> >> >
> >> > @@ -986,7 +987,7 @@ scalarizable_type_p (tree type, bool const_decl)
> >> >    case ARRAY_TYPE:
> >> >      {
> >> >        HOST_WIDE_INT min_elem_size;
> >> > -      if (const_decl)
> >> > +      if (allow_char_arrays)
> >> >         min_elem_size = 0;
> >> >        else
> >> >         min_elem_size = BITS_PER_UNIT;
> >> > @@ -1008,7 +1009,7 @@ scalarizable_type_p (tree type, bool const_decl)
> >> >
> >> >        tree elem = TREE_TYPE (type);
> >> >        if (!is_gimple_reg_type (elem)
> >> > -         && !scalarizable_type_p (elem, const_decl))
> >> > +         && !simple_mix_of_records_and_arrays_p (elem, allow_char_arrays))
> >> >         return false;
> >> >        return true;
> >> >      }
> >> > @@ -1017,10 +1018,38 @@ scalarizable_type_p (tree type, bool const_decl)
> >> >    }
> >> >  }
> >> >
> >> > -static void scalarize_elem (tree, HOST_WIDE_INT, HOST_WIDE_INT, bool, tree, tree);
> >> > +static void scalarize_elem (tree, HOST_WIDE_INT, HOST_WIDE_INT, bool, tree,
> >> > +                           tree);
> >> > +
> >> > +/* For a given array TYPE, return false if its domain does not have any maximum
> >> > +   value.  Otherwise calculate MIN and MAX indices of the first and the last
> >> > +   element.  */
> >> > +
> >> > +bool
> >> > +extract_min_max_idx_from_array (tree type, offset_int *min, offset_int *max)
> >> > +{
> >> > +  tree domain = TYPE_DOMAIN (type);
> >> > +  tree minidx = TYPE_MIN_VALUE (domain);
> >> > +  gcc_assert (TREE_CODE (minidx) == INTEGER_CST);
> >> > +  tree maxidx = TYPE_MAX_VALUE (domain);
> >> > +  if (!maxidx)
> >> > +    return false;
> >> > +  gcc_assert (TREE_CODE (maxidx) == INTEGER_CST);
> >> > +
> >> > +  /* MINIDX and MAXIDX are inclusive, and must be interpreted in
> >> > +     DOMAIN (e.g. signed int, whereas min/max may be size_int).  */
> >> > +  *min = wi::to_offset (minidx);
> >> > +  *max = wi::to_offset (maxidx);
> >> > +  if (!TYPE_UNSIGNED (domain))
> >> > +    {
> >> > +      *min = wi::sext (*min, TYPE_PRECISION (domain));
> >> > +      *max = wi::sext (*max, TYPE_PRECISION (domain));
> >> > +    }
> >> > +  return true;
> >> > +}
> >> >
> >> >  /* Create total_scalarization accesses for all scalar fields of a member
> >> > -   of type DECL_TYPE conforming to scalarizable_type_p.  BASE
> >> > +   of type DECL_TYPE conforming to simple_mix_of_records_and_arrays_p.  BASE
> >> >     must be the top-most VAR_DECL representing the variable; within that,
> >> >     OFFSET locates the member and REF must be the memory reference expression for
> >> >     the member.  */
> >> > @@ -1047,27 +1076,14 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref)
> >> >        {
> >> >         tree elemtype = TREE_TYPE (decl_type);
> >> >         tree elem_size = TYPE_SIZE (elemtype);
> >> > -       gcc_assert (elem_size && tree_fits_shwi_p (elem_size));
> >> >         HOST_WIDE_INT el_size = tree_to_shwi (elem_size);
> >> >         gcc_assert (el_size > 0);
> >> >
> >> > -       tree minidx = TYPE_MIN_VALUE (TYPE_DOMAIN (decl_type));
> >> > -       gcc_assert (TREE_CODE (minidx) == INTEGER_CST);
> >> > -       tree maxidx = TYPE_MAX_VALUE (TYPE_DOMAIN (decl_type));
> >> > +       offset_int idx, max;
> >> >         /* Skip (some) zero-length arrays; others have MAXIDX == MINIDX - 1.  */
> >> > -       if (maxidx)
> >> > +       if (extract_min_max_idx_from_array (decl_type, &idx, &max))
> >> >           {
> >> > -           gcc_assert (TREE_CODE (maxidx) == INTEGER_CST);
> >> >             tree domain = TYPE_DOMAIN (decl_type);
> >> > -           /* MINIDX and MAXIDX are inclusive, and must be interpreted in
> >> > -              DOMAIN (e.g. signed int, whereas min/max may be size_int).  */
> >> > -           offset_int idx = wi::to_offset (minidx);
> >> > -           offset_int max = wi::to_offset (maxidx);
> >> > -           if (!TYPE_UNSIGNED (domain))
> >> > -             {
> >> > -               idx = wi::sext (idx, TYPE_PRECISION (domain));
> >> > -               max = wi::sext (max, TYPE_PRECISION (domain));
> >> > -             }
> >> >             for (int el_off = offset; idx <= max; ++idx)
> >> >               {
> >> >                 tree nref = build4 (ARRAY_REF, elemtype,
> >> > @@ -1088,10 +1104,10 @@ completely_scalarize (tree base, tree decl_type, HOST_WIDE_INT offset, tree ref)
> >> >  }
> >> >
> >> >  /* Create total_scalarization accesses for a member of type TYPE, which must
> >> > -   satisfy either is_gimple_reg_type or scalarizable_type_p.  BASE must be the
> >> > -   top-most VAR_DECL representing the variable; within that, POS and SIZE locate
> >> > -   the member, REVERSE gives its torage order. and REF must be the reference
> >> > -   expression for it.  */
> >> > +   satisfy either is_gimple_reg_type or simple_mix_of_records_and_arrays_p.
> >> > +   BASE must be the top-most VAR_DECL representing the variable; within that,
> >> > +   POS and SIZE locate the member, REVERSE gives its torage order. and REF must
> >> > +   be the reference expression for it.  */
> >> >
> >> >  static void
> >> >  scalarize_elem (tree base, HOST_WIDE_INT pos, HOST_WIDE_INT size, bool reverse,
> >> > @@ -1111,7 +1127,8 @@ scalarize_elem (tree base, HOST_WIDE_INT pos, HOST_WIDE_INT size, bool reverse,
> >> >  }
> >> >
> >> >  /* Create a total_scalarization access for VAR as a whole.  VAR must be of a
> >> > -   RECORD_TYPE or ARRAY_TYPE conforming to scalarizable_type_p.  */
> >> > +   RECORD_TYPE or ARRAY_TYPE conforming to
> >> > +   simple_mix_of_records_and_arrays_p.  */
> >> >
> >> >  static void
> >> >  create_total_scalarization_access (tree var)
> >> > @@ -2803,8 +2820,9 @@ analyze_all_variable_accesses (void)
> >> >        {
> >> >         tree var = candidate (i);
> >> >
> >> > -       if (VAR_P (var) && scalarizable_type_p (TREE_TYPE (var),
> >> > -                                               constant_decl_p (var)))
> >> > +       if (VAR_P (var)
> >> > +           && simple_mix_of_records_and_arrays_p (TREE_TYPE (var),
> >> > +                                                  constant_decl_p (var)))
> >> >           {
> >> >             if (tree_to_uhwi (TYPE_SIZE (TREE_TYPE (var)))
> >> >                 <= max_scalarization_size)
> >> > diff --git a/gcc/tree-sra.h b/gcc/tree-sra.h
> >> > new file mode 100644
> >> > index 00000000000..dc901385994
> >> > --- /dev/null
> >> > +++ b/gcc/tree-sra.h
> >> > @@ -0,0 +1,33 @@
> >> > +/* tree-sra.h - Run-time parameters.
> >> > +   Copyright (C) 2017 Free Software Foundation, Inc.
> >> > +
> >> > +This file is part of GCC.
> >> > +
> >> > +GCC is free software; you can redistribute it and/or modify it under
> >> > +the terms of the GNU General Public License as published by the Free
> >> > +Software Foundation; either version 3, or (at your option) any later
> >> > +version.
> >> > +
> >> > +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> >> > +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> >> > +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> >> > +for more details.
> >> > +
> >> > +You should have received a copy of the GNU General Public License
> >> > +along with GCC; see the file COPYING3.  If not see
> >> > +<http://www.gnu.org/licenses/>.  */
> >> > +
> >> > +#ifndef TREE_SRA_H
> >> > +#define TREE_SRA_H
> >> > +
> >> > +
> >> > +bool simple_mix_of_records_and_arrays_p (tree type, bool allow_char_arrays);
> >> > +bool extract_min_max_idx_from_array (tree type, offset_int *idx,
> >> > +                                    offset_int *max);
> >> > +tree build_ref_for_offset (location_t loc, tree base, HOST_WIDE_INT offset,
> >> > +                          bool reverse, tree exp_type,
> >> > +                          gimple_stmt_iterator *gsi, bool insert_after);
> >> > +
> >> > +
> >> > +
> >> > +#endif /* TREE_SRA_H */
> >> > --
> >> > 2.14.1
> >> >


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]