
RE: [PATCH] Add a new option "-fmerge-bitfields" (patch / doc inside)


Hello,
This is a new version of the patch, in which the reported issue is fixed.
The patch is also rebased to revision 216452, and some minor code clean-up is done.

--------------------------------------------------------------------------------------------------------------------------

Lowering is now applied only to bit-field copy sequences that are merged.
The data structure representing bit-field copy sequences is renamed and reduced in size.
The optimization is turned on by default at -O2 and higher.
Some comments are fixed.

Benchmarking was performed on WebKit for Android.
Code size reduction was observed in several files; the best examples are:

core/rendering/style/StyleMultiColData (632->520 bytes)
core/platform/graphics/FontDescription (1715->1475 bytes)
core/rendering/style/FillLayer (5069->4513 bytes)
core/rendering/style/StyleRareInheritedData (5618->5346 bytes)
core/css/CSSSelectorList (4047->3887 bytes)
core/platform/animation/CSSAnimationData (3844->3440 bytes)
core/css/resolver/FontBuilder (13818->13350 bytes)
core/platform/graphics/Font (16447->15975 bytes)


Example:

One of the motivating examples for this work was the copy constructor of a class that contains bit-fields.

C++ code:
class A
{
public:
        A(const A &x);
        unsigned a : 1;
        unsigned b : 2;
        unsigned c : 4;
};

A::A(const A&x)
{
        a = x.a;
        b = x.b;
        c = x.c;
}

GIMPLE code without optimization:

  <bb 2>:
  _3 = x_2(D)->a;
  this_4(D)->a = _3;
  _6 = x_2(D)->b;
  this_4(D)->b = _6;
  _8 = x_2(D)->c;
  this_4(D)->c = _8;
  return;

Optimized GIMPLE code:
  <bb 2>:
  _10 = x_2(D)->D.1867;
  _11 = BIT_FIELD_REF <_10, 7, 0>;
  _12 = this_4(D)->D.1867;
  _13 = _12 & 128;
  _14 = (unsigned char) _11;
  _15 = _13 | _14;
  this_4(D)->D.1867 = _15;
  return;
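
To make the constants concrete: BIT_FIELD_REF <_10, 7, 0> extracts the 7 bits (1 + 2 + 4) starting at bit offset 0, and 128 (0x80) masks off everything except the remaining padding bit of the one-byte representative. A rough C sketch of the merged copy follows; the exact bit layout (fields in bits 0-6, padding in bit 7) is target-dependent and is assumed here for illustration only:

/* Illustrative only; d and s stand in for 'this' and '&x'.  */
static void
copy_merged (unsigned char *d, const unsigned char *s)
{
  *d = (*d & 0x80)    /* preserve the padding bit (_13).       */
     | (*s & 0x7f);   /* copy the 7 merged field bits (_14).   */
}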

Generated MIPS32r2 assembly code without optimization:
        lw      $3,0($5)
        lbu     $2,0($4)
        andi    $3,$3,0x1
        andi    $2,$2,0xfe
        or      $2,$2,$3
        sb      $2,0($4)
        lw      $3,0($5)
        andi    $2,$2,0xf9
        andi    $3,$3,0x6
        or      $2,$2,$3
        sb      $2,0($4)
        lw      $3,0($5)
        andi    $2,$2,0x87
        andi    $3,$3,0x78
        or      $2,$2,$3
        j       $31
        sb      $2,0($4)

Optimized MIPS32r2 assembly code:
        lw      $3,0($5)
        lbu     $2,0($4)
        andi    $3,$3,0x7f
        andi    $2,$2,0x80
        or      $2,$3,$2
        j       $31
        sb      $2,0($4)


The algorithm works at the basic-block level and consists of the following three major steps:
1. Walk the basic block's statement list. For each statement pair that copies bit-field content from one memory location to another, record the statement pointers and other necessary data in a corresponding data structure.
2. Identify records that represent adjacent bit-field accesses and mark them as merged (a sketch of the adjacency test follows this list).
3. Lower the bit-field accesses, using the new combined field size for those that can be merged.
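
The adjacency test in step 2 condenses to the sketch below. The field names follow the bfcopy structure introduced in the patch; this standalone form is illustrative only, not the exact code:

/* Return true if CUR directly continues PREV, so the two copies can be
   merged into one wider access.  */
static bool
adjacent_copies_p (const struct bfcopy *prev, const struct bfcopy *cur)
{
  return expressions_equal_p (prev->src_addr, cur->src_addr)
         && expressions_equal_p (prev->dst_addr, cur->dst_addr)
         && prev->src_offset_words == cur->src_offset_words
         && prev->dst_offset_words == cur->dst_offset_words
         && prev->src_bit_offset + prev->src_bit_size == cur->src_bit_offset
         && prev->dst_bit_offset + prev->dst_bit_size == cur->dst_bit_offset
         && prev->bitfield_representative == cur->bitfield_representative;
}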


A new command-line option, "-fmerge-bitfields", is introduced.
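
The flag follows the usual GCC conventions, so it can be enabled or disabled explicitly (test.c is a hypothetical input file):

gcc -O2 test.c                        # enabled by default at -O2 with this patch
gcc -O1 -fmerge-bitfields test.c      # enable at lower optimization levels
gcc -O2 -fno-merge-bitfields test.c   # disable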


Tested - the patch passes the GCC regression tests for MIPS32r2.


ChangeLog -

gcc/ChangeLog:
2014-04-22 Zoran Jovanovic (zoran.jovanovic@imgtec.com)
  * common.opt (fmerge-bitfields): New option.
  * doc/invoke.texi: Add reference to "-fmerge-bitfields".
  * doc/invoke.texi: Add "-fmerge-bitfields" to the list of optimization
    flags turned on at -O2.
  * tree-sra.c (lower_bitfields): New function; entry point for
    -fmerge-bitfields.
  (part_of_union_p): New function.
  (bf_access_candidate_p): New function.
  (lower_bitfield_read): New function.
  (lower_bitfield_write): New function.
  (bitfield_stmt_bfcopy_pair::hash): New function.
  (bitfield_stmt_bfcopy_pair::equal): New function.
  (bitfield_stmt_bfcopy_pair::remove): New function.
  (create_and_insert_bfcopy): New function.
  (get_bit_offset): New function.
  (add_stmt_bfcopy_pair): New function.
  (cmp_bfcopies): New function.
  (get_merged_bit_field_size): New function.
  * dwarf2out.c (simple_type_size_in_bits): Move to tree.c.
  (field_byte_offset): Move declaration to tree.h and make it extern.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
  * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c.
  * tree-ssa-sccvn.h (expressions_equal_p): Move declaration to tree.h.
  * tree.c (expressions_equal_p): Move from tree-ssa-sccvn.c.
  (simple_type_size_in_bits): Move from dwarf2out.c.
  * tree.h (expressions_equal_p): Add declaration.
  (field_byte_offset): Add declaration.

Patch -


diff --git a/gcc/common.opt b/gcc/common.opt
index 5db5e1e..cec145c 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2270,6 +2270,10 @@ ftree-sra
 Common Report Var(flag_tree_sra) Optimization
 Perform scalar replacement of aggregates
 
+fmerge-bitfields
+Common Report Var(flag_tree_bitfield_merge) Optimization
+Merge loads and stores of consecutive bitfields
+
 ftree-ter
 Common Report Var(flag_tree_ter) Optimization
 Replace temporary expressions in the SSA->normal pass
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 23f272f..d4205e1 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -421,7 +421,7 @@ Objective-C and Objective-C++ Dialects}.
 -fsplit-ivs-in-unroller -fsplit-wide-types -fssa-phiopt -fstack-protector @gol
 -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
--ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
+-fmerge-bitfields -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
 -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
 -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
 -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
@@ -7152,6 +7152,7 @@ also turns on the following optimization flags:
 -fipa-sra @gol
 -fipa-icf @gol
 -fisolate-erroneous-paths-dereference @gol
+-fmerge-bitfields @gol
 -foptimize-sibling-calls @gol
 -foptimize-strlen @gol
 -fpartial-inlining @gol
@@ -8133,6 +8134,12 @@ pointer alignment information.
 This pass only operates on local scalar variables and is enabled by default
 at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
 
+@item -fmerge-bitfields
+@opindex fmerge-bitfields
+Combines several adjacent bit-field accesses that copy values
+from one memory location to another into one single bit-field access.
+This is enabled by default at @option{-O2} and higher.
+
 @item -ftree-ccp
 @opindex ftree-ccp
 Perform sparse conditional constant propagation (CCP) on trees.  This
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 8c65176..88c29b1 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -3205,8 +3205,6 @@ static HOST_WIDE_INT ceiling (HOST_WIDE_INT, unsigned int);
 static tree field_type (const_tree);
 static unsigned int simple_type_align_in_bits (const_tree);
 static unsigned int simple_decl_align_in_bits (const_tree);
-static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
-static HOST_WIDE_INT field_byte_offset (const_tree);
 static void add_AT_location_description	(dw_die_ref, enum dwarf_attribute,
 					 dw_loc_list_ref);
 static void add_data_member_location_attribute (dw_die_ref, tree);
@@ -10413,25 +10411,6 @@ is_base_type (tree type)
   return 0;
 }
 
-/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
-   node, return the size in bits for the type if it is a constant, or else
-   return the alignment for the type if the type's size is not constant, or
-   else return BITS_PER_WORD if the type actually turns out to be an
-   ERROR_MARK node.  */
-
-static inline unsigned HOST_WIDE_INT
-simple_type_size_in_bits (const_tree type)
-{
-  if (TREE_CODE (type) == ERROR_MARK)
-    return BITS_PER_WORD;
-  else if (TYPE_SIZE (type) == NULL_TREE)
-    return 0;
-  else if (tree_fits_uhwi_p (TYPE_SIZE (type)))
-    return tree_to_uhwi (TYPE_SIZE (type));
-  else
-    return TYPE_ALIGN (type);
-}
-
 /* Similarly, but return an offset_int instead of UHWI.  */
 
 static inline offset_int
@@ -14906,7 +14885,7 @@ round_up_to_align (const offset_int &t, unsigned int align)
    because the offset is actually variable.  (We can't handle the latter case
    just yet).  */
 
-static HOST_WIDE_INT
+HOST_WIDE_INT
 field_byte_offset (const_tree decl)
 {
   offset_int object_offset_in_bits;
diff --git a/gcc/opts.c b/gcc/opts.c
index 3054196..ef3007b 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -500,6 +500,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS, OPT_fipa_icf, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fuse_caller_save, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_fmerge_bitfields, NULL, 1 },
 
     /* -O3 optimizations.  */
     { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
new file mode 100644
index 0000000..638db58
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fmerge-bitfields -fdump-tree-esra" }  */
+
+struct S
+{
+  unsigned f1:7;
+  unsigned f2:9;
+  unsigned f3:3;
+  unsigned f4:5;
+  unsigned f5:1;
+  unsigned f6:2;
+};
+
+unsigned
+foo (struct S *p1, struct S *p2, int *ptr)
+{
+  p2->f1 = p1->f1;
+  p2->f2 = p1->f2;
+  p2->f3 = p1->f3;
+  *ptr = 7;
+  p2->f4 = p1->f4;
+  p2->f5 = p1->f5;
+  p2->f6 = p1->f6;
+  return 0;
+}
+
+/* Check if bit-field access is lowered.  */
+/* { dg-final { scan-tree-dump "BIT_FIELD_REF" "esra" } } */
+/* { dg-final { cleanup-tree-dump "esra" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
new file mode 100644
index 0000000..a7b906f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
@@ -0,0 +1,89 @@
+/* Check whether use of -fmerge-bitfields in the presence of
+   overlapping unions results in incorrect code.  */
+/* { dg-options "-O2 -fmerge-bitfields" }  */
+/* { dg-do run } */
+extern void abort (void);
+
+struct s1
+{
+  unsigned f1:4;
+  unsigned f2:4;
+  unsigned f3:4;
+};
+
+struct s2
+{
+  unsigned char c;
+  unsigned f1:4;
+  unsigned f2:4;
+  unsigned f3:4;
+};
+
+struct s3
+{
+  unsigned f1:3;
+  unsigned f2:3;
+  unsigned f3:3;
+};
+
+struct s4
+{
+  unsigned f0:3;
+  unsigned f1:3;
+  unsigned f2:3;
+  unsigned f3:3;
+};
+
+union un_1
+{
+  struct s1 a;
+  struct s2 b;
+};
+
+union un_2
+{
+  struct s3 a;
+  struct s4 b;
+};
+
+void f1 (union un_1 *p1, union un_1 *p2)
+{
+  p2->a.f3 = p1->b.f3;
+  p2->a.f2 = p1->b.f2;
+  p2->a.f1 = p1->b.f1;
+
+  if (p1->a.f1 != 3)
+    abort ();
+}
+
+void f2 (union un_2 *p1, union un_2 *p2)
+{
+  p2->b.f1 = p1->a.f1;
+  p2->b.f2 = p1->a.f2;
+  p2->b.f3 = p1->a.f3;
+
+  if (p2->b.f1 != 0 || p2->b.f2 != 0 || p2->b.f3 != 0)
+    abort ();
+}
+
+int main ()
+{
+  union un_1 u1;
+  union un_2 u2;
+
+  u1.b.f1 = 1;
+  u1.b.f2 = 2;
+  u1.b.f3 = 3;
+  u1.b.c = 0;
+
+  f1 (&u1, &u1);
+
+  u2.b.f0 = 0;
+  u2.b.f1 = 1;
+  u2.b.f2 = 2;
+  u2.b.f3 = 3;
+
+  f2 (&u2, &u2);
+
+  return 0;
+}
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index fb24114..60dd095 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2243,7 +2243,7 @@ analyze_access_subtree (struct access *root, struct access *parent,
       if (INTEGRAL_TYPE_P (root->type)
 	  && (TREE_CODE (root->type) != INTEGER_TYPE
 	      || TYPE_PRECISION (root->type) != root->size)
-	  /* But leave bitfield accesses alone.  */
+	  /* But leave bit-field accesses alone.  */
 	  && (TREE_CODE (root->expr) != COMPONENT_REF
 	      || !DECL_BIT_FIELD (TREE_OPERAND (root->expr, 1))))
 	{
@@ -3519,12 +3519,632 @@ perform_intra_sra (void)
   return ret;
 }
 
+/* Bit-field copy sequences.  */
+
+struct bfcopy
+{
+  bfcopy ():merged (false), modified (false), is_barrier (false), next (0),
+    head_copy (0)
+  {
+  }
+
+  gimple load_stmt;            /* Bit-field load statement.  */
+  gimple store_stmt;           /* Bit-field store statement.  */
+  unsigned src_offset_words;   /* Bit-field offset at src in words.  */
+  unsigned src_bit_offset;     /* Bit-field offset inside source word.  */
+  unsigned src_bit_size;       /* Size of bit-field in source word.  */
+  unsigned dst_offset_words;   /* Bit-field offset at dst in words.  */
+  unsigned dst_bit_offset;     /* Bit-field offset inside destination
+                                  word.  */
+  unsigned src_field_offset;   /* Source field offset.  */
+  unsigned dst_bit_size;       /* Size of bit-field in destination word.  */
+  tree src_addr;               /* Address of source memory access.  */
+  tree dst_addr;               /* Address of destination memory access.  */
+  unsigned merged:1;           /* 1 if copy is merged with another one.  */
+  unsigned modified:1;         /* 1 if bit-field size is modified.  */
+  unsigned is_barrier:1;       /* 1 if copy is barrier (call or mem
+                                  access).  */
+  struct bfcopy *next;         /* Copy with which this one is merged.  */
+  tree bitfield_representative;        /* Bit field representative of original
+                                  declaration.  */
+  struct bfcopy *head_copy;    /* Head of copies list where this one is
+                                  merged.  */
+};
+
+typedef struct bfcopy *bfcopy_p;
+
+/* Returns true if the given COMPONENT_REF is part of a union.  */
+
+static bool part_of_union_p (tree component)
+{
+  tree tmp = component;
+  bool res = false;
+  while (TREE_CODE (tmp) == COMPONENT_REF)
+    {
+      if (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE)
+       {
+         res = true;
+         break;
+       }
+      tmp = TREE_OPERAND (tmp, 0);
+    }
+  if (tmp && (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE))
+    res = true;
+  return res;
+}
+
+/* Return TRUE if REF is a bit-field access.  If OFF is non-NULL, *OFF will
+   contain the offset of the bit-field within its representative, in bits.  */
+
+static bool
+bf_access_candidate_p (tree ref, unsigned HOST_WIDE_INT * off)
+{
+  if (TREE_CODE (ref) != COMPONENT_REF)
+    return false;
+  if (part_of_union_p (ref))
+    return false;
+  tree field = TREE_OPERAND (ref, 1);
+  if (!DECL_BIT_FIELD_TYPE (field))
+    return false;
+
+  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
+  if (!rep)
+    return false;
+  /* Do not lower if representative is bigger than one word.  */
+  if (TREE_CODE (TREE_TYPE (rep)) == ARRAY_TYPE)
+    return false;
+
+  if (!off)
+    return true;
+
+  if (tree_fits_uhwi_p (DECL_FIELD_OFFSET (field))
+      && tree_fits_uhwi_p (DECL_FIELD_OFFSET (rep)))
+    *off = (tree_to_uhwi (DECL_FIELD_OFFSET (field))
+           - tree_to_uhwi (DECL_FIELD_OFFSET (rep))) * BITS_PER_UNIT;
+  else
+    *off = 0;
+  *off += (tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field))
+          - tree_to_uhwi (DECL_FIELD_BIT_OFFSET (rep)));
+
+  return true;
+}
+
+
+/* Lower the bit-field read at *GSI.  */
+
+static void
+lower_bitfield_read (gimple_stmt_iterator * gsi, unsigned HOST_WIDE_INT off,
+                    tree size, tree type)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  tree ref = gimple_assign_rhs1 (stmt);
+  tree field = TREE_OPERAND (ref, 1);
+  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
+  tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
+  gimple load = gimple_build_assign (loadres,
+                                    build3 (COMPONENT_REF, TREE_TYPE (rep),
+                                            TREE_OPERAND (ref, 0), rep,
+                                            NULL_TREE));
+  gimple_set_vuse (load, gimple_vuse (stmt));
+  gsi_insert_before (gsi, load, GSI_SAME_STMT);
+  if (!type)
+    type = TREE_TYPE (ref);
+  gimple_assign_set_rhs1 (stmt,
+                         build3 (BIT_FIELD_REF, type,
+                                 loadres, size, bitsize_int (off)));
+  update_stmt (stmt);
+}
+
+/* Lower the bit-field write at *GSI.  */
+
+static void
+lower_bitfield_write (gimple_stmt_iterator * gsi, unsigned HOST_WIDE_INT off,
+                     tree size)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  tree ref = gimple_assign_lhs (stmt);
+  tree field = TREE_OPERAND (ref, 1);
+  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
+  tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
+  gimple load = gimple_build_assign (loadres,
+                                    build3 (COMPONENT_REF, TREE_TYPE (rep),
+                                            unshare_expr
+                                            (TREE_OPERAND (ref, 0)),
+                                            rep,
+                                            NULL_TREE));
+  gimple_set_vuse (load, gimple_vuse (stmt));
+  gsi_insert_before (gsi, load, GSI_SAME_STMT);
+  /* Mask out bits.  */
+  tree masked = make_ssa_name (TREE_TYPE (rep), NULL);
+  tree mask = double_int_to_tree (TREE_TYPE (rep),
+                                 ~double_int::mask
+                                 (TREE_INT_CST_LOW (size)).lshift (off));
+  gimple tems = gimple_build_assign_with_ops (BIT_AND_EXPR,
+                                             masked, loadres, mask);
+  gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+  /* Zero-extend the value to representative size.  */
+  tree tem2;
+  if (!TYPE_UNSIGNED (TREE_TYPE (field)))
+    {
+      tem2 = make_ssa_name (unsigned_type_for (TREE_TYPE (field)), NULL);
+      tems = gimple_build_assign_with_ops (NOP_EXPR, tem2,
+                                          gimple_assign_rhs1 (stmt),
+                                          NULL_TREE);
+      gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+    }
+  else
+    tem2 = gimple_assign_rhs1 (stmt);
+  tree tem = make_ssa_name (TREE_TYPE (rep), NULL);
+  tems = gimple_build_assign_with_ops (NOP_EXPR, tem, tem2, NULL_TREE);
+  gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+  /* Shift the value into place.  */
+  if (off != 0)
+    {
+      tem2 = make_ssa_name (TREE_TYPE (rep), NULL);
+      tems = gimple_build_assign_with_ops (LSHIFT_EXPR, tem2, tem,
+                                          size_int (off));
+      gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+    }
+  else
+    tem2 = tem;
+  /* Merge the masked load result with the new value.  */
+  tree modres = make_ssa_name (TREE_TYPE (rep), NULL);
+  gimple mod = gimple_build_assign_with_ops (BIT_IOR_EXPR, modres,
+                                            masked, tem2);
+  gsi_insert_before (gsi, mod, GSI_SAME_STMT);
+  /* Finally adjust the store.  */
+  gimple_assign_set_rhs1 (stmt, modres);
+  gimple_assign_set_lhs (stmt,
+                        build3 (COMPONENT_REF, TREE_TYPE (rep),
+                                TREE_OPERAND (ref, 0), rep, NULL_TREE));
+  update_stmt (stmt);
+}
+
+/* Connects a statement with the bit-field copy sequence to which that
+   statement belongs.  */
+
+struct bitfield_stmt_bfcopy_pair
+{
+  gimple stmt;
+  bfcopy *copy;
+    bitfield_stmt_bfcopy_pair (gimple s, bfcopy * c):stmt (s), copy (c)
+  {
+  };
+  /* hash_table support.  */
+  typedef bitfield_stmt_bfcopy_pair value_type;
+  typedef bitfield_stmt_bfcopy_pair compare_type;
+  static inline hashval_t hash (const bitfield_stmt_bfcopy_pair * const);
+  static inline int equal (const bitfield_stmt_bfcopy_pair * const,
+                          const bitfield_stmt_bfcopy_pair * const);
+  static inline void remove (bitfield_stmt_bfcopy_pair *);
+};
+
+hashval_t
+bitfield_stmt_bfcopy_pair::hash (const bitfield_stmt_bfcopy_pair * const a)
+{
+  return hashval_t (gimple_uid (a->stmt));
+}
+
+int
+bitfield_stmt_bfcopy_pair::equal (const bitfield_stmt_bfcopy_pair * const a,
+                                 const bitfield_stmt_bfcopy_pair * const b)
+{
+  return a->stmt == b->stmt;
+}
+
+void
+bitfield_stmt_bfcopy_pair::remove (bitfield_stmt_bfcopy_pair * a)
+{
+  delete a;
+}
+
+/* Create a new bfcopy structure and add it to the given BITFIELD_COPIES
+   vector.  */
+
+static bfcopy_p
+create_and_insert_bfcopy (vec <bfcopy_p>*bitfield_copies)
+{
+  bfcopy_p copy = new bfcopy;
+  bitfield_copies->safe_push (copy);
+  return copy;
+}
+
+/* Get offset of the field in bits from the start of the record.  */
+
+static inline HOST_WIDE_INT
+get_bit_offset (tree decl)
+{
+  tree type = DECL_BIT_FIELD_TYPE (decl);
+  HOST_WIDE_INT bitpos_int;
+
+  /* Must be a field and a bit-field.  */
+  gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
+  /* Bit position and decl size should be integer constants that can be
+     represented in a single HOST_WIDE_INT.  */
+  if (!tree_fits_uhwi_p (bit_position (decl))
+      || !tree_fits_uhwi_p (DECL_SIZE (decl)))
+    return -1;
+
+  bitpos_int = int_bit_position (decl);
+  return bitpos_int;
+}
+
+/* Creates a new bitfield_stmt_bfcopy_pair structure and adds it to the
+   given htab.  */
+
+static bool
+add_stmt_bfcopy_pair (hash_table < bitfield_stmt_bfcopy_pair > &bf_stmnt_cpy,
+                     bfcopy * copy, gimple stmt)
+{
+  bitfield_stmt_bfcopy_pair p (stmt, copy);
+  bitfield_stmt_bfcopy_pair **slot = bf_stmnt_cpy.find_slot (&p, INSERT);
+  if (!*slot)
+    {
+      *slot = new bitfield_stmt_bfcopy_pair (stmt, copy);
+      return true;
+    }
+  return false;
+}
+
+/* Passed to qsort.  Compares two bfcopy records.  */
+
+static int
+cmp_bfcopies (const void *p1, const void *p2)
+{
+  const bfcopy_p *ap1 = (const bfcopy_p*) p1;
+  const bfcopy_p *ap2 = (const bfcopy_p*) p2;
+  const bfcopy_p a1 = *ap1;
+  const bfcopy_p a2 = *ap2;
+
+  if (DECL_UID (a1->bitfield_representative) -
+      DECL_UID (a2->bitfield_representative))
+    return DECL_UID (a1->bitfield_representative) -
+      DECL_UID (a2->bitfield_representative);
+
+  if (!expressions_equal_p (a1->src_addr, a2->src_addr))
+    return a1->src_addr - a2->src_addr;
+
+  if (!expressions_equal_p (a1->dst_addr, a2->dst_addr))
+    return a1->dst_addr - a2->dst_addr;
+
+  if (a1->src_offset_words - a2->src_offset_words)
+    return a1->src_offset_words - a2->src_offset_words;
+
+  return a1->src_bit_offset - a2->src_bit_offset;
+}
+
+/* Returns the size of the combined bit-fields.  The size cannot be larger
+   than the size of the largest directly accessible memory unit.  */
+
+static int
+get_merged_bit_field_size (bfcopy * copy)
+{
+  bfcopy *tmp_copy = copy;
+  int size = 0;
+
+  while (tmp_copy)
+    {
+      size += tmp_copy->src_bit_size;
+      tmp_copy = tmp_copy->next;
+    }
+  return size;
+}
+
+/* Lower bit-field access to accesses to its
+   DECL_BIT_FIELD_REPRESENTATIVE.  */
+
+static void
+lower_bitfields (void)
+{
+  basic_block bb;
+  hash_table<bitfield_stmt_bfcopy_pair> *bf_stmnt_cpy;
+  vec <bfcopy_p>bitfield_copies;
+  bfcopy_p copy;
+  bf_stmnt_cpy = new hash_table<bitfield_stmt_bfcopy_pair> (1);
+
+  FOR_EACH_BB_FN (bb, cfun)
+  {
+    bf_stmnt_cpy->empty ();
+    tree prev_representative = NULL_TREE;
+    bitfield_copies.create (0);
+
+    /* There are two passes, the first one identifies interesting
+       bit-field accesses and the second one actually lowers them.  */
+    for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+        !gsi_end_p (gsi); gsi_next (&gsi))
+      {
+       gimple stmt = gsi_stmt (gsi);
+
+       if (!gimple_assign_single_p (stmt) || gimple_has_volatile_ops (stmt))
+         continue;
+
+       tree ref = gimple_assign_rhs1 (stmt);
+       if (bf_access_candidate_p (ref, NULL))
+         {
+           gimple use_stmt;
+           use_operand_p use;
+           tree op0 = TREE_OPERAND (ref, 0);
+           tree op1 = TREE_OPERAND (ref, 1);
+
+           if (TREE_CODE (DECL_CONTEXT (op1)) == UNION_TYPE
+               || TREE_CODE (DECL_CONTEXT (op1)) == QUAL_UNION_TYPE)
+             continue;
+
+           if (TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME
+               && single_imm_use (gimple_assign_lhs (stmt), &use,
+                                  &use_stmt) && is_gimple_assign (use_stmt))
+             {
+               tree uses_stmt_lhs = gimple_assign_lhs (use_stmt);
+
+               if (bf_access_candidate_p (uses_stmt_lhs, NULL))
+                 {
+                   tree use_op0 = TREE_OPERAND (uses_stmt_lhs, 0);
+                   tree use_op1 = TREE_OPERAND (uses_stmt_lhs, 1);
+                   tree use_repr = DECL_BIT_FIELD_REPRESENTATIVE (use_op1);
+                   if (prev_representative
+                       && (prev_representative != use_repr))
+                     {
+                       /* If the previous bfcopy has a different
+                          representative then a barrier is needed
+                          between it and the new access.  */
+                       copy = create_and_insert_bfcopy (&bitfield_copies);
+                       copy->is_barrier = true;
+                     }
+                   prev_representative = use_repr;
+                   /* Create new bit-field bfcopy structure.  */
+                   copy = create_and_insert_bfcopy (&bitfield_copies);
+                   /* Collect bfcopy data - load instruction.  */
+                   copy->src_bit_size = tree_to_uhwi (DECL_SIZE (op1));
+                   copy->src_bit_offset = get_bit_offset (op1);
+                   copy->src_offset_words =
+                     field_byte_offset (op1) / UNITS_PER_WORD;
+                   copy->src_field_offset =
+                     tree_to_uhwi (DECL_FIELD_OFFSET (op1));
+                   copy->src_addr = op0;
+                   copy->load_stmt = gsi_stmt (gsi);
+                   /* Collect bfcopy data - store instruction.  */
+                   copy->dst_bit_size = tree_to_uhwi (DECL_SIZE (use_op1));
+                   copy->dst_bit_offset = get_bit_offset (use_op1);
+                   copy->dst_offset_words =
+                     field_byte_offset (use_op1) / UNITS_PER_WORD;
+                   copy->dst_addr = use_op0;
+                   copy->store_stmt = use_stmt;
+                   add_stmt_bfcopy_pair (*bf_stmnt_cpy, copy, stmt);
+                   add_stmt_bfcopy_pair (*bf_stmnt_cpy, copy, use_stmt);
+                   copy->bitfield_representative = use_repr;
+                 }
+             }
+         }
+
+       /* Insert a barrier for merging if the statement is a function call
+          or a memory access.  */
+       bitfield_stmt_bfcopy_pair csdata (stmt, NULL);
+       if (!bf_stmnt_cpy->find (&csdata)
+           && ((gimple_code (stmt) == GIMPLE_CALL)
+               || (gimple_has_mem_ops (stmt))))
+         {
+           /* Create new bfcopy access structure.  */
+           copy = create_and_insert_bfcopy (&bitfield_copies);
+           /* Mark it as barrier.  */
+           copy->is_barrier = true;
+         }
+      }
+
+    /* If there are fewer than two sequences, go to the next basic block.  */
+    if (bitfield_copies.length () <= 1)
+      {
+       bitfield_copies.release ();
+       continue;
+      }
+    vec <bfcopy_p>bitfield_copies_merge = vNULL;
+    /* Try to merge different bfcopy sequences.  */
+    for (int ix = 0; bitfield_copies.iterate (ix, &copy); ix++)
+      {
+       bfcopy_p head_copy;
+       bfcopy_p mrg_copy;
+       bfcopy_p prev_copy;
+
+       if (!bitfield_copies_merge.exists ())
+         bitfield_copies_merge.create (0);
+
+       if (!copy->is_barrier)
+         bitfield_copies_merge.safe_push (copy);
+
+       if (!copy->is_barrier
+           && !(copy == bitfield_copies.last ()
+                && !bitfield_copies_merge.is_empty ()))
+         continue;
+
+       bitfield_copies_merge.qsort (cmp_bfcopies);
+
+       head_copy = prev_copy = NULL;
+       int iy;
+       for (iy = 0; bitfield_copies_merge.iterate (iy, &mrg_copy); iy++)
+         {
+           if (head_copy
+               && expressions_equal_p (head_copy->src_addr,
+                                       mrg_copy->src_addr)
+               && expressions_equal_p (head_copy->dst_addr,
+                                       mrg_copy->dst_addr)
+               && prev_copy->src_offset_words
+               == mrg_copy->src_offset_words
+               && prev_copy->dst_offset_words
+               == mrg_copy->dst_offset_words
+               && prev_copy->src_bit_offset + prev_copy->src_bit_size
+               == mrg_copy->src_bit_offset
+               && prev_copy->dst_bit_offset + prev_copy->dst_bit_size
+               == mrg_copy->dst_bit_offset
+               && prev_copy->bitfield_representative
+               == mrg_copy->bitfield_representative)
+             {
+               /* Merge conditions are satisfied - merge copies.  */
+               mrg_copy->merged = true;
+               prev_copy->next = mrg_copy;
+               head_copy->modified = true;
+               prev_copy = mrg_copy;
+               mrg_copy->head_copy = head_copy;
+             }
+           else
+             head_copy = prev_copy = mrg_copy;
+         }
+       bitfield_copies_merge.release ();
+       bitfield_copies_merge = vNULL;
+      }
+
+    tree reaching_vuse = NULL_TREE;
+
+    for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
+      {
+       gimple stmt = gsi_stmt (gsi);
+
+       /* Fix vuse info if necessary.  */
+       if (gimple_has_mem_ops (stmt) && reaching_vuse != NULL_TREE)
+         {
+           gimple_set_vuse (stmt, reaching_vuse);
+           update_stmt (stmt);
+         }
+
+       if (!gimple_assign_single_p (stmt) || gimple_has_volatile_ops (stmt))
+         {
+           gsi_next (&gsi);
+           if (gimple_vdef (stmt))
+             reaching_vuse = gimple_vdef (stmt);
+           continue;
+         }
+
+       tree ref;
+       unsigned HOST_WIDE_INT off;
+       tree size;
+       bool deleted = false;
+
+       /* Lower a bit-field read.  */
+       ref = gimple_assign_rhs1 (stmt);
+       if (bf_access_candidate_p (ref, &off))
+         {
+           bfcopy *cc = NULL;
+           bitfield_stmt_bfcopy_pair st_cpy (stmt, NULL);
+           bitfield_stmt_bfcopy_pair *p_st_cpy;
+           p_st_cpy = bf_stmnt_cpy->find (&st_cpy);
+           unsigned HOST_WIDE_INT mrg_off;
+           if (p_st_cpy)
+             cc = p_st_cpy->copy;
+
+           if (cc && (cc->merged || cc->modified))
+             size =
+               build_int_cst (unsigned_type_node,
+                              get_merged_bit_field_size (cc->head_copy ?
+                                                         cc->head_copy :
+                                                         cc));
+           else
+             size = DECL_SIZE (TREE_OPERAND (ref, 1));
+           if (cc && cc->merged
+               && bf_access_candidate_p (gimple_assign_rhs1
+                                         (cc->head_copy->load_stmt),
+                                         &mrg_off))
+             off = mrg_off;
+           if (cc && cc->merged)
+             {
+               tree head_rhs = gimple_assign_rhs1 (cc->head_copy->load_stmt);
+               switch (TREE_CODE (head_rhs))
+                 {
+                 case COMPONENT_REF:
+                   if (bf_access_candidate_p (head_rhs, &mrg_off))
+                     off = mrg_off;
+                   break;
+                 case BIT_FIELD_REF:
+                   off = tree_to_uhwi (TREE_OPERAND (head_rhs, 2));
+                   break;
+                 default:
+                   break;
+                 }
+             }
+
+           if (cc && (cc->modified))
+             {
+               tree tmp_ssa;
+               tree itype = make_node (INTEGER_TYPE);
+               TYPE_PRECISION (itype) = TREE_INT_CST_LOW (size);
+               fixup_unsigned_type (itype);
+               lower_bitfield_read (&gsi, off, size, itype);
+               tmp_ssa =
+                 make_ssa_name (create_tmp_var (itype, NULL), cc->load_stmt);
+               gimple_assign_set_lhs (cc->load_stmt, tmp_ssa);
+               update_stmt (cc->load_stmt);
+               gimple_assign_set_rhs1 (cc->store_stmt, tmp_ssa);
+               update_stmt (cc->store_stmt);
+             }
+           else if (cc && cc->merged)
+             {
+               gsi_remove (&gsi, true);
+               deleted = true;
+             }
+         }
+       /* Lower a bit-field write.  */
+       ref = gimple_assign_lhs (stmt);
+       if (bf_access_candidate_p (ref, &off))
+         {
+           bfcopy *cc = NULL;
+           bitfield_stmt_bfcopy_pair st_cpy (stmt, NULL);
+           bitfield_stmt_bfcopy_pair *p_st_cpy;
+           unsigned HOST_WIDE_INT mrg_off;
+           p_st_cpy = bf_stmnt_cpy->find (&st_cpy);
+           if (p_st_cpy)
+             cc = p_st_cpy->copy;
+
+           if (cc && (cc->merged || cc->modified))
+             size =
+               build_int_cst (unsigned_type_node,
+                              get_merged_bit_field_size (cc->head_copy ?
+                                                         cc->head_copy :
+                                                         cc));
+           else
+             size = DECL_SIZE (TREE_OPERAND (ref, 1));
+           if (cc && cc->merged
+               &&
+               bf_access_candidate_p (gimple_assign_lhs
+                                      (cc->head_copy->store_stmt), &mrg_off))
+             off = mrg_off;
+
+           if (cc && (cc->modified) && !(cc && cc->merged))
+             lower_bitfield_write (&gsi, off, size);
+           else if (cc && cc->merged)
+             {
+               if (gimple_vdef (stmt))
+                 {
+                   imm_use_iterator imm_iter;
+                   gimple use_stmt;
+                   use_operand_p use_p;
+                   FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter,
+                                          gimple_vdef (stmt))
+                   {
+                     FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
+                       SET_USE (use_p, reaching_vuse);
+                     update_stmt (use_stmt);
+                   }
+                 }
+
+               gsi_remove (&gsi, true);
+               deleted = true;
+             }
+         }
+       if (gimple_vdef (stmt) && !deleted)
+         reaching_vuse = gimple_vdef (stmt);
+       if (!deleted)
+         gsi_next (&gsi);
+      }
+  }
+  delete bf_stmnt_cpy;
+}
+
 /* Perform early intraprocedural SRA.  */
 static unsigned int
 early_intra_sra (void)
 {
   sra_mode = SRA_MODE_EARLY_INTRA;
-  return perform_intra_sra ();
+  unsigned int res = perform_intra_sra ();
+  if (flag_tree_bitfield_merge)
+    lower_bitfields ();
+  return res;
 }
 
 /* Perform "late" intraprocedural SRA.  */
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index 44656ea..cdb3850 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -4410,29 +4410,6 @@ get_next_value_id (void)
   return next_value_id++;
 }
 
-
-/* Compare two expressions E1 and E2 and return true if they are equal.  */
-
-bool
-expressions_equal_p (tree e1, tree e2)
-{
-  /* The obvious case.  */
-  if (e1 == e2)
-    return true;
-
-  /* If only one of them is null, they cannot be equal.  */
-  if (!e1 || !e2)
-    return false;
-
-  /* Now perform the actual comparison.  */
-  if (TREE_CODE (e1) == TREE_CODE (e2)
-      && operand_equal_p (e1, e2, OEP_PURE_SAME))
-    return true;
-
-  return false;
-}
-
-
 /* Return true if the nary operation NARY may trap.  This is a copy
    of stmt_could_throw_1_p adjusted to the SCCVN IL.  */
 
diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
index ad99604..d5963b6 100644
--- a/gcc/tree-ssa-sccvn.h
+++ b/gcc/tree-ssa-sccvn.h
@@ -21,10 +21,6 @@
 #ifndef TREE_SSA_SCCVN_H
 #define TREE_SSA_SCCVN_H
 
-/* In tree-ssa-sccvn.c  */
-bool expressions_equal_p (tree, tree);
-
-
 /* TOP of the VN lattice.  */
 extern tree VN_TOP;
 
diff --git a/gcc/tree.c b/gcc/tree.c
index 365e89c..8df9812 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -12286,4 +12286,44 @@ get_base_address (tree t)
   return t;
 }
 
+/* Compare two expressions E1 and E2 and return true if they are equal.  */
+
+bool
+expressions_equal_p (const_tree e1, const_tree e2)
+{
+  /* The obvious case.  */
+  if (e1 == e2)
+    return true;
+
+  /* If only one of them is null, they cannot be equal.  */
+  if (!e1 || !e2)
+    return false;
+
+  /* Now perform the actual comparison.  */
+  if (TREE_CODE (e1) == TREE_CODE (e2)
+      && operand_equal_p (e1, e2, OEP_PURE_SAME))
+    return true;
+
+  return false;
+}
+
+/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
+   node, return the size in bits for the type if it is a constant, or else
+   return the alignment for the type if the type's size is not constant, or
+   else return BITS_PER_WORD if the type actually turns out to be an
+   ERROR_MARK node.  */
+
+unsigned HOST_WIDE_INT
+simple_type_size_in_bits (const_tree type)
+{
+  if (TREE_CODE (type) == ERROR_MARK)
+    return BITS_PER_WORD;
+  else if (TYPE_SIZE (type) == NULL_TREE)
+    return 0;
+  else if (tree_fits_uhwi_p (TYPE_SIZE (type)))
+    return tree_to_uhwi (TYPE_SIZE (type));
+  else
+    return TYPE_ALIGN (type);
+}
+
 #include "gt-tree.h"
diff --git a/gcc/tree.h b/gcc/tree.h
index 45f127f..b903089 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -4068,6 +4068,7 @@ extern tree substitute_placeholder_in_expr (tree, tree);
   ((EXP) == 0 || TREE_CONSTANT (EXP) ? (EXP)	\
    : substitute_placeholder_in_expr (EXP, OBJ))
 
+extern unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree type);
 
 /* stabilize_reference (EXP) returns a reference equivalent to EXP
    but it can be used multiple times
@@ -4184,6 +4185,11 @@ inlined_function_outer_scope_p (const_tree block)
        (TREE = function_args_iter_cond (&(ITER))) != NULL_TREE;		\
        function_args_iter_next (&(ITER)))
 
+
+/* In dwarf2out.c.  */
+HOST_WIDE_INT
+field_byte_offset (const_tree decl);
+
 /* In tree.c */
 extern unsigned crc32_string (unsigned, const char *);
 extern unsigned crc32_byte (unsigned, char);
@@ -4340,6 +4346,7 @@ extern tree obj_type_ref_class (tree ref);
 extern bool types_same_for_odr (const_tree type1, const_tree type2);
 extern bool contains_bitfld_component_ref_p (const_tree);
 extern bool type_in_anonymous_namespace_p (const_tree);
+extern bool expressions_equal_p (const_tree e1, const_tree e2);
 extern bool block_may_fallthru (const_tree);
 extern void using_eh_for_cleanups (void);
 extern bool using_eh_for_cleanups_p (void);


Regards,
Zoran



> From: Andrew Pinski [pinskia@gmail.com]
> Sent: Thursday, October 16, 2014 4:09 AM
> To: Zoran Jovanovic
> Cc: gcc-patches@gcc.gnu.org; Richard Biener; Bernhard Reutner-Fischer
> Subject: Re: [PATCH] Add a new option "-fmerge-bitfields" (patch / doc inside)
> 
> On Tue, Apr 22, 2014 at 3:28 AM, Zoran Jovanovic
> <Zoran.Jovanovic@imgtec.com> wrote:
> > Hello,
> > Updated doc/invoke.texi by stating that new option is enabled by default at -O2 and higher.
> > Also, -fmerge-bitfields added to the list  of optimization flags enabled by default  at -O2 and higher.
> 
> 
> With this patch applied gcc from SPEC 2006 ICEs on aarch64-linux-gnu.
> Here is a reduced testcase:
> typedef struct rtx_def *rtx;
> struct rtx_def
> {
>   unsigned int unchanging : 1;
>   unsigned int volatil : 1;
>   unsigned int in_struct : 1;
>   unsigned integrated : 1;
>   unsigned frame_related : 1;
> };
> ix86_set_move_mem_attrs_1 (rtx x, rtx dstref)
> {
>     ((x)->volatil) = ((dstref)->volatil);
>  ((x)->in_struct) = ((dstref)->in_struct);
>  ((x)->frame_related) = ((dstref)->frame_related);
>  ((x)->unchanging) = ((dstref)->unchanging);
> }
> --- CUT ---
> 
> Thanks,
> Andrew Pinski
> 
