This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



RE: [PATCH] Add a new option "-fmerge-bitfields" (patch / doc inside)


Hello,
Updated doc/invoke.texi to state that the new option is enabled by default at -O2 and higher.
Also added -fmerge-bitfields to the list of optimization flags enabled by default at -O2 and higher.

Regards,
Zoran Jovanovic

--------------------------------------------------------------------------------------------------------------------------

Lowering is now applied only to bit-field copy sequences that are merged.
The data structure representing bit-field copy sequences has been renamed and reduced in size.
The optimization is turned on by default at -O2 and higher.
Some comments have been fixed.

Benchmarking was performed on WebKit for Android.
Code size reductions were observed in several files; the best examples are:

core/rendering/style/StyleMultiColData (632->520 bytes)
core/platform/graphics/FontDescription (1715->1475 bytes)
core/rendering/style/FillLayer (5069->4513 bytes)
core/rendering/style/StyleRareInheritedData (5618->5346 bytes)
core/css/CSSSelectorList (4047->3887 bytes)
core/platform/animation/CSSAnimationData (3844->3440 bytes)
core/css/resolver/FontBuilder (13818->13350 bytes)
core/platform/graphics/Font (16447->15975 bytes)


Example:

One of the motivating examples for this work was the copy constructor of a class that contains bit-fields.

C++ code:
class A
{
public:
        A(const A &x);
        unsigned a : 1;
        unsigned b : 2;
        unsigned c : 4;
};

A::A(const A&x)
{
        a = x.a;
        b = x.b;
        c = x.c;
}

GIMPLE code without optimization:

  <bb 2>:
  _3 = x_2(D)->a;
  this_4(D)->a = _3;
  _6 = x_2(D)->b;
  this_4(D)->b = _6;
  _8 = x_2(D)->c;
  this_4(D)->c = _8;
  return;

Optimized GIMPLE code:
  <bb 2>:
  _10 = x_2(D)->D.1867;
  _11 = BIT_FIELD_REF <_10, 7, 0>;
  _12 = this_4(D)->D.1867;
  _13 = _12 & 128;
  _14 = (unsigned char) _11;
  _15 = _13 | _14;
  this_4(D)->D.1867 = _15;
  return;
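For readers who prefer plain C to GIMPLE, the merged copy above can be sketched as the following function (an illustrative sketch, not part of the patch; the masks 0x7f and 0x80 correspond to the seven copied bits of fields a, b, and c and the remaining padding bit of the byte-sized bit-field representative):

```c
#include <assert.h>
#include <stdint.h>

/* Copy the low 7 bits (fields a, b, and c) from *src to *dst in a
   single read-modify-write, preserving the destination's top bit.
   This mirrors the BIT_FIELD_REF <_10, 7, 0> / "& 128" sequence in
   the optimized GIMPLE above.  */
static void
merged_copy (uint8_t *dst, const uint8_t *src)
{
  uint8_t loaded = *src;          /* _10 = x_2(D)->D.1867            */
  uint8_t bits = loaded & 0x7f;   /* _11 = BIT_FIELD_REF <_10, 7, 0> */
  uint8_t keep = *dst & 0x80;     /* _13 = _12 & 128                 */
  *dst = keep | bits;             /* this_4(D)->D.1867 = _13 | _14   */
}
```

A single load, mask, or, and store replaces the three separate read-modify-write cycles of the unoptimized code.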

Generated MIPS32r2 assembly code without optimization:
        lw      $3,0($5)
        lbu     $2,0($4)
        andi    $3,$3,0x1
        andi    $2,$2,0xfe
        or      $2,$2,$3
        sb      $2,0($4)
        lw      $3,0($5)
        andi    $2,$2,0xf9
        andi    $3,$3,0x6
        or      $2,$2,$3
        sb      $2,0($4)
        lw      $3,0($5)
        andi    $2,$2,0x87
        andi    $3,$3,0x78
        or      $2,$2,$3
        j       $31
        sb      $2,0($4)

Optimized MIPS32r2 assembly code:
        lw      $3,0($5)
        lbu     $2,0($4)
        andi    $3,$3,0x7f
        andi    $2,$2,0x80
        or      $2,$3,$2
        j       $31
        sb      $2,0($4)


The algorithm works at the basic-block level and consists of the following three major steps:
1. Walk the basic block's statement list. For each statement pair that implements a copy of bit-field content from one memory location to another, record the statement pointers and other necessary data in the corresponding data structure.
2. Identify records that represent adjacent bit-field accesses and mark them as merged.
3. Lower the bit-field accesses, using the new field size for those that can be merged.
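The adjacency test in step 2 can be sketched in plain C (an illustrative sketch; the field names mirror struct bfcopy from the patch below, but this is not the patch's exact code):

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified record of one bit-field copy, mirroring struct bfcopy.  */
struct copy_rec
{
  unsigned src_bit_offset, src_bit_size;
  unsigned dst_bit_offset, dst_bit_size;
  int representative_uid;   /* Stands in for bitfield_representative.  */
};

/* Two recorded copies can be merged when they access the same
   representative word and the second bit-field starts exactly where
   the first one ends, on both the source and the destination side.  */
static bool
adjacent_p (const struct copy_rec *prev, const struct copy_rec *next)
{
  return prev->representative_uid == next->representative_uid
	 && prev->src_bit_offset + prev->src_bit_size == next->src_bit_offset
	 && prev->dst_bit_offset + prev->dst_bit_size == next->dst_bit_offset;
}
```

In the motivating example, the copies of a (1 bit at offset 0), b (2 bits at offset 1), and c (4 bits at offset 3) satisfy this test pairwise, so all three collapse into one 7-bit access.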


A new command-line option, "-fmerge-bitfields", is introduced.


Tested: passed the GCC regression tests for MIPS32r2.


Changelog -

gcc/ChangeLog:
2014-04-22 Zoran Jovanovic (zoran.jovanovic@imgtec.com)
  * common.opt (fmerge-bitfields): New option.
  * doc/invoke.texi: Document -fmerge-bitfields and add it to the list
    of optimization flags turned on at -O2.
  * opts.c (default_options_table): Turn on -fmerge-bitfields at -O2.
  * tree-sra.c (lower_bitfields): New function.
  (early_intra_sra): Call lower_bitfields when -fmerge-bitfields is
  enabled.
  (part_of_union_p): New function.
  (bf_access_candidate_p): New function.
  (lower_bitfield_read): New function.
  (lower_bitfield_write): New function.
  (bitfield_stmt_bfcopy_pair::hash): New function.
  (bitfield_stmt_bfcopy_pair::equal): New function.
  (bitfield_stmt_bfcopy_pair::remove): New function.
  (create_and_insert_bfcopy): New function.
  (get_bit_offset): New function.
  (add_stmt_bfcopy_pair): New function.
  (cmp_bfcopies): New function.
  (get_merged_bit_field_size): New function.
  * dwarf2out.c (simple_type_size_in_bits): Move to tree.c.
  (field_byte_offset): Move declaration to tree.h and make it extern.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg1.c: New test.
  * testsuite/gcc.dg/tree-ssa/bitfldmrg2.c: New test.
  * tree-ssa-sccvn.c (expressions_equal_p): Move to tree.c.
  * tree-ssa-sccvn.h (expressions_equal_p): Move declaration to tree.h.
  * tree.c (expressions_equal_p): Move from tree-ssa-sccvn.c.
  (simple_type_size_in_bits): Move from dwarf2out.c.
  * tree.h (expressions_equal_p): Add declaration.
  (field_byte_offset): Add declaration.

Patch -

diff --git a/gcc/common.opt b/gcc/common.opt
index da275e5..52c7f58 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2203,6 +2203,10 @@ ftree-sra
 Common Report Var(flag_tree_sra) Optimization
 Perform scalar replacement of aggregates
 
+fmerge-bitfields
+Common Report Var(flag_tree_bitfield_merge) Optimization
+Merge loads and stores of consecutive bitfields
+
 ftree-ter
 Common Report Var(flag_tree_ter) Optimization
 Replace temporary expressions in the SSA->normal pass
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 8004da8..6942174 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -411,7 +411,7 @@ Objective-C and Objective-C++ Dialects}.
 -fsplit-ivs-in-unroller -fsplit-wide-types -fstack-protector @gol
 -fstack-protector-all -fstack-protector-strong -fstrict-aliasing @gol
 -fstrict-overflow -fthread-jumps -ftracer -ftree-bit-ccp @gol
--ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
+-fmerge-bitfields -ftree-builtin-call-dce -ftree-ccp -ftree-ch @gol
 -ftree-coalesce-inline-vars -ftree-coalesce-vars -ftree-copy-prop @gol
 -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse @gol
 -ftree-forwprop -ftree-fre -ftree-loop-if-convert @gol
@@ -6860,6 +6860,7 @@ also turns on the following optimization flags:
 -findirect-inlining @gol
 -fipa-sra @gol
 -fisolate-erroneous-paths-dereference @gol
+-fmerge-bitfields @gol
 -foptimize-sibling-calls @gol
 -fpartial-inlining @gol
 -fpeephole2 @gol
@@ -7789,6 +7790,12 @@ pointer alignment information.
 This pass only operates on local scalar variables and is enabled by default
 at @option{-O} and higher.  It requires that @option{-ftree-ccp} is enabled.
 
+@item -fmerge-bitfields
+@opindex fmerge-bitfields
+Combines several adjacent bit-field accesses that copy values
+from one memory location to another into one single bit-field access.
+This is enabled by default at @option{-O2} and higher.
+
 @item -ftree-ccp
 @opindex ftree-ccp
 Perform sparse conditional constant propagation (CCP) on trees.  This
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 8caf940..71a3e6b 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -3119,8 +3119,6 @@ static HOST_WIDE_INT ceiling (HOST_WIDE_INT, unsigned int);
 static tree field_type (const_tree);
 static unsigned int simple_type_align_in_bits (const_tree);
 static unsigned int simple_decl_align_in_bits (const_tree);
-static unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree);
-static HOST_WIDE_INT field_byte_offset (const_tree);
 static void add_AT_location_description	(dw_die_ref, enum dwarf_attribute,
 					 dw_loc_list_ref);
 static void add_data_member_location_attribute (dw_die_ref, tree);
@@ -10279,25 +10277,6 @@ is_base_type (tree type)
   return 0;
 }
 
-/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
-   node, return the size in bits for the type if it is a constant, or else
-   return the alignment for the type if the type's size is not constant, or
-   else return BITS_PER_WORD if the type actually turns out to be an
-   ERROR_MARK node.  */
-
-static inline unsigned HOST_WIDE_INT
-simple_type_size_in_bits (const_tree type)
-{
-  if (TREE_CODE (type) == ERROR_MARK)
-    return BITS_PER_WORD;
-  else if (TYPE_SIZE (type) == NULL_TREE)
-    return 0;
-  else if (tree_fits_uhwi_p (TYPE_SIZE (type)))
-    return tree_to_uhwi (TYPE_SIZE (type));
-  else
-    return TYPE_ALIGN (type);
-}
-
 /* Similarly, but return a double_int instead of UHWI.  */
 
 static inline double_int
@@ -14672,7 +14651,7 @@ round_up_to_align (double_int t, unsigned int align)
    because the offset is actually variable.  (We can't handle the latter case
    just yet).  */
 
-static HOST_WIDE_INT
+HOST_WIDE_INT
 field_byte_offset (const_tree decl)
 {
   double_int object_offset_in_bits;
diff --git a/gcc/opts.c b/gcc/opts.c
index 1873b96..b2692b0 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -497,6 +497,7 @@ static const struct default_options default_options_table[] =
     { OPT_LEVELS_2_PLUS_SPEED_ONLY, OPT_foptimize_strlen, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
     { OPT_LEVELS_2_PLUS, OPT_fisolate_erroneous_paths_dereference, NULL, 1 },
+    { OPT_LEVELS_2_PLUS, OPT_fmerge_bitfields, NULL, 1 },
 
     /* -O3 optimizations.  */
     { OPT_LEVELS_3_PLUS, OPT_ftree_loop_distribute_patterns, NULL, 1 },
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
new file mode 100644
index 0000000..638db58
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg1.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fmerge-bitfields -fdump-tree-esra" }  */
+
+struct S
+{
+  unsigned f1:7;
+  unsigned f2:9;
+  unsigned f3:3;
+  unsigned f4:5;
+  unsigned f5:1;
+  unsigned f6:2;
+};
+
+unsigned
+foo (struct S *p1, struct S *p2, int *ptr)
+{
+  p2->f1 = p1->f1;
+  p2->f2 = p1->f2;
+  p2->f3 = p1->f3;
+  *ptr = 7;
+  p2->f4 = p1->f4;
+  p2->f5 = p1->f5;
+  p2->f6 = p1->f6;
+  return 0;
+}
+
+/* Check if bit-field access is lowered.  */
+/* { dg-final { scan-tree-dump "BIT_FIELD_REF" "esra" } } */
+/* { dg-final { cleanup-tree-dump "esra" } } */
+
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
new file mode 100644
index 0000000..a7b906f
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/bitfldmrg2.c
@@ -0,0 +1,89 @@
+/* Check whether use of -fmerge-bitfields in the presence of
+   overlapping unions results in incorrect code.  */
+/* { dg-options "-O2 -fmerge-bitfields" }  */
+/* { dg-do run } */
+extern void abort (void);
+
+struct s1
+{
+  unsigned f1:4;
+  unsigned f2:4;
+  unsigned f3:4;
+};
+
+struct s2
+{
+  unsigned char c;
+  unsigned f1:4;
+  unsigned f2:4;
+  unsigned f3:4;
+};
+
+struct s3
+{
+  unsigned f1:3;
+  unsigned f2:3;
+  unsigned f3:3;
+};
+
+struct s4
+{
+  unsigned f0:3;
+  unsigned f1:3;
+  unsigned f2:3;
+  unsigned f3:3;
+};
+
+union un_1
+{
+  struct s1 a;
+  struct s2 b;
+};
+
+union un_2
+{
+  struct s3 a;
+  struct s4 b;
+};
+
+void f1 (union un_1 *p1, union un_1 *p2)
+{
+  p2->a.f3 = p1->b.f3;
+  p2->a.f2 = p1->b.f2;
+  p2->a.f1 = p1->b.f1;
+
+  if (p1->a.f1 != 3)
+    abort ();
+}
+
+void f2 (union un_2 *p1, union un_2 *p2)
+{
+  p2->b.f1 = p1->a.f1;
+  p2->b.f2 = p1->a.f2;
+  p2->b.f3 = p1->a.f3;
+
+  if (p2->b.f1 != 0 || p2->b.f2 != 0 || p2->b.f3 != 0)
+    abort ();
+}
+
+int main ()
+{
+  union un_1 u1;
+  union un_2 u2;
+
+  u1.b.f1 = 1;
+  u1.b.f2 = 2;
+  u1.b.f3 = 3;
+  u1.b.c = 0;
+
+  f1 (&u1, &u1);
+
+  u2.b.f0 = 0;
+  u2.b.f1 = 1;
+  u2.b.f2 = 2;
+  u2.b.f3 = 3;
+
+  f2 (&u2, &u2);
+
+  return 0;
+}
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 49bbee3..b4f7f26 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2266,7 +2266,7 @@ analyze_access_subtree (struct access *root, struct access *parent,
       if (INTEGRAL_TYPE_P (root->type)
 	  && (TREE_CODE (root->type) != INTEGER_TYPE
 	      || TYPE_PRECISION (root->type) != root->size)
-	  /* But leave bitfield accesses alone.  */
+	  /* But leave bit-field accesses alone.  */
 	  && (TREE_CODE (root->expr) != COMPONENT_REF
 	      || !DECL_BIT_FIELD (TREE_OPERAND (root->expr, 1))))
 	{
@@ -3517,12 +3517,629 @@ perform_intra_sra (void)
   return ret;
 }
 
+/* Bit-field copy sequences.  */
+
+struct bfcopy
+{
+  bfcopy ():merged (false), modified (false), is_barrier (false), next (0),
+    head_copy (0)
+  {
+  }
+
+  gimple load_stmt;		/* Bit-field load statement.  */
+  gimple store_stmt;		/* Bit-field store statement.  */
+  unsigned src_offset_words;	/* Bit-field offset at src in words.  */
+  unsigned src_bit_offset;	/* Bit-field offset inside source word.  */
+  unsigned src_bit_size;	/* Size of bit-field in source word.  */
+  unsigned dst_offset_words;	/* Bit-field offset at dst in words.  */
+  unsigned dst_bit_offset;	/* Bit-field offset inside destination
+				   word.  */
+  unsigned src_field_offset;	/* Source field offset.  */
+  unsigned dst_bit_size;	/* Size of bit-field in destination word.  */
+  tree src_addr;		/* Address of source memory access.  */
+  tree dst_addr;		/* Address of destination memory access.  */
+  unsigned merged:1;		/* 1 if copy is merged with another one.  */
+  unsigned modified:1;		/* 1 if bit-field size is modified.  */
+  unsigned is_barrier:1;	/* 1 if copy is barrier (call or mem
+				   access).  */
+  struct bfcopy *next;		/* Copy with which this one is merged.  */
+  tree bitfield_representative;	/* Bit field representative of original
+				   declaration.  */
+  struct bfcopy *head_copy;	/* Head of copies list where this one is
+				   merged.  */
+};
+
+/* Returns true if the given COMPONENT_REF is part of a union.  */
+
+static bool part_of_union_p (tree component)
+{
+  tree tmp = component;
+  bool res = false;
+  while (TREE_CODE (tmp) == COMPONENT_REF)
+    {
+      if (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE)
+       {
+         res = true;
+         break;
+       }
+      tmp = TREE_OPERAND (tmp, 0);
+    }
+  if (tmp && (TREE_CODE (TREE_TYPE (tmp)) == UNION_TYPE))
+    res = true;
+  return res;
+}
+
+/* Return TRUE if REF is a bit-field access.  If *OFF is not NULL it will
+   contain offset of the bit-field within the representative in bits.  */
+
+static bool
+bf_access_candidate_p (tree ref, unsigned HOST_WIDE_INT * off)
+{
+  if (TREE_CODE (ref) != COMPONENT_REF)
+    return false;
+  if (part_of_union_p (ref))
+    return false;
+  tree field = TREE_OPERAND (ref, 1);
+  if (!DECL_BIT_FIELD_TYPE (field))
+    return false;
+
+  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
+  if (!rep)
+    return false;
+  /* Do not lower if representative is bigger than one word.  */
+  if (TREE_CODE (TREE_TYPE (rep)) == ARRAY_TYPE)
+    return false;
+
+  if (!off)
+    return true;
+
+  if (tree_fits_uhwi_p (DECL_FIELD_OFFSET (field))
+      && tree_fits_uhwi_p (DECL_FIELD_OFFSET (rep)))
+    *off = (tree_to_uhwi (DECL_FIELD_OFFSET (field))
+	    - tree_to_uhwi (DECL_FIELD_OFFSET (rep))) * BITS_PER_UNIT;
+  else
+    *off = 0;
+  *off += (tree_to_uhwi (DECL_FIELD_BIT_OFFSET (field))
+	   - tree_to_uhwi (DECL_FIELD_BIT_OFFSET (rep)));
+
+  return true;
+}
+
+
+/* Lower the bit-field read at *GSI.  */
+
+static void
+lower_bitfield_read (gimple_stmt_iterator * gsi, unsigned HOST_WIDE_INT off,
+		     tree size, tree type)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  tree ref = gimple_assign_rhs1 (stmt);
+  tree field = TREE_OPERAND (ref, 1);
+  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
+  tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
+  gimple load = gimple_build_assign (loadres,
+				     build3 (COMPONENT_REF, TREE_TYPE (rep),
+					     TREE_OPERAND (ref, 0), rep,
+					     NULL_TREE));
+  gimple_set_vuse (load, gimple_vuse (stmt));
+  gsi_insert_before (gsi, load, GSI_SAME_STMT);
+  if (!type)
+    type = TREE_TYPE (ref);
+  gimple_assign_set_rhs1 (stmt,
+			  build3 (BIT_FIELD_REF, type,
+				  loadres, size, bitsize_int (off)));
+  update_stmt (stmt);
+}
+
+/* Lower the bit-field write at *GSI.  */
+
+static void
+lower_bitfield_write (gimple_stmt_iterator * gsi, unsigned HOST_WIDE_INT off,
+		      tree size)
+{
+  gimple stmt = gsi_stmt (*gsi);
+  tree ref = gimple_assign_lhs (stmt);
+  tree field = TREE_OPERAND (ref, 1);
+  tree rep = DECL_BIT_FIELD_REPRESENTATIVE (field);
+  tree loadres = make_ssa_name (TREE_TYPE (rep), NULL);
+  gimple load = gimple_build_assign (loadres,
+				     build3 (COMPONENT_REF, TREE_TYPE (rep),
+					     unshare_expr
+					     (TREE_OPERAND (ref, 0)),
+					     rep,
+					     NULL_TREE));
+  gimple_set_vuse (load, gimple_vuse (stmt));
+  gsi_insert_before (gsi, load, GSI_SAME_STMT);
+  /* Mask out bits.  */
+  tree masked = make_ssa_name (TREE_TYPE (rep), NULL);
+  tree mask = double_int_to_tree (TREE_TYPE (rep),
+				  ~double_int::mask
+				  (TREE_INT_CST_LOW (size)).lshift (off));
+  gimple tems = gimple_build_assign_with_ops (BIT_AND_EXPR,
+					      masked, loadres, mask);
+  gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+  /* Zero-extend the value to representative size.  */
+  tree tem2;
+  if (!TYPE_UNSIGNED (TREE_TYPE (field)))
+    {
+      tem2 = make_ssa_name (unsigned_type_for (TREE_TYPE (field)), NULL);
+      tems = gimple_build_assign_with_ops (NOP_EXPR, tem2,
+					   gimple_assign_rhs1 (stmt),
+					   NULL_TREE);
+      gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+    }
+  else
+    tem2 = gimple_assign_rhs1 (stmt);
+  tree tem = make_ssa_name (TREE_TYPE (rep), NULL);
+  tems = gimple_build_assign_with_ops (NOP_EXPR, tem, tem2, NULL_TREE);
+  gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+  /* Shift the value into place.  */
+  if (off != 0)
+    {
+      tem2 = make_ssa_name (TREE_TYPE (rep), NULL);
+      tems = gimple_build_assign_with_ops (LSHIFT_EXPR, tem2, tem,
+					   size_int (off));
+      gsi_insert_before (gsi, tems, GSI_SAME_STMT);
+    }
+  else
+    tem2 = tem;
+  /* Merge masked loaded value and value.  */
+  tree modres = make_ssa_name (TREE_TYPE (rep), NULL);
+  gimple mod = gimple_build_assign_with_ops (BIT_IOR_EXPR, modres,
+					     masked, tem2);
+  gsi_insert_before (gsi, mod, GSI_SAME_STMT);
+  /* Finally adjust the store.  */
+  gimple_assign_set_rhs1 (stmt, modres);
+  gimple_assign_set_lhs (stmt,
+			 build3 (COMPONENT_REF, TREE_TYPE (rep),
+				 TREE_OPERAND (ref, 0), rep, NULL_TREE));
+  update_stmt (stmt);
+}
+
+/* Connects a statement with the bit-field copy sequence to which that
+   statement belongs.  */
+
+struct bitfield_stmt_bfcopy_pair
+{
+  gimple stmt;
+  bfcopy *copy;
+    bitfield_stmt_bfcopy_pair (gimple s, bfcopy * c):stmt (s), copy (c)
+  {
+  };
+  /* hash_table support.  */
+  typedef bitfield_stmt_bfcopy_pair value_type;
+  typedef bitfield_stmt_bfcopy_pair compare_type;
+  static inline hashval_t hash (const bitfield_stmt_bfcopy_pair * const);
+  static inline int equal (const bitfield_stmt_bfcopy_pair * const,
+			   const bitfield_stmt_bfcopy_pair * const);
+  static inline void remove (bitfield_stmt_bfcopy_pair *);
+};
+
+hashval_t
+bitfield_stmt_bfcopy_pair::hash (const bitfield_stmt_bfcopy_pair * const a)
+{
+  return hashval_t (gimple_uid (a->stmt));
+}
+
+int
+bitfield_stmt_bfcopy_pair::equal (const bitfield_stmt_bfcopy_pair * const a,
+				  const bitfield_stmt_bfcopy_pair * const b)
+{
+  return a->stmt == b->stmt;
+}
+
+void
+bitfield_stmt_bfcopy_pair::remove (bitfield_stmt_bfcopy_pair * a)
+{
+  delete a;
+}
+
+/* Create new bfcopy structure and add it to given bitfield_copies
+   htab.  */
+
+static struct bfcopy *
+create_and_insert_bfcopy (vec < struct bfcopy *>*bitfield_copies)
+{
+  bfcopy *copy = new bfcopy;
+  bitfield_copies->safe_push (copy);
+  return copy;
+}
+
+/* Get offset of the field in bits from the start of the record.  */
+
+static inline HOST_WIDE_INT
+get_bit_offset (tree decl)
+{
+  tree type = DECL_BIT_FIELD_TYPE (decl);
+  HOST_WIDE_INT bitpos_int;
+
+  /* Must be a field and a bit-field.  */
+  gcc_assert (type && TREE_CODE (decl) == FIELD_DECL);
+  /* Bit position and decl size should be integer constants that can be
+     represented in a single HOST_WIDE_INT.  */
+  if (!tree_fits_uhwi_p (bit_position (decl))
+      || !tree_fits_uhwi_p (DECL_SIZE (decl)))
+    return -1;
+
+  bitpos_int = int_bit_position (decl);
+  return bitpos_int;
+}
+
+/* Creates new bitfield_stmt_bfcopy_pair structure and adds it to given
+   htab.  */
+
+static bool
+add_stmt_bfcopy_pair (hash_table < bitfield_stmt_bfcopy_pair > &bf_stmnt_cpy,
+		      bfcopy * copy, gimple stmt)
+{
+  bitfield_stmt_bfcopy_pair p (stmt, copy);
+  bitfield_stmt_bfcopy_pair **slot = bf_stmnt_cpy.find_slot (&p, INSERT);
+  if (!*slot)
+    {
+      *slot = new bitfield_stmt_bfcopy_pair (stmt, copy);
+      return true;
+    }
+  return false;
+}
+
+/* Passed to qsort.  Compares two bfcopy records.  */
+
+static int
+cmp_bfcopies (const void *p1, const void *p2)
+{
+  const struct bfcopy *a1 = *((const struct bfcopy **) p1);
+  const struct bfcopy *a2 = *((const struct bfcopy **) p2);
+
+  if (DECL_UID (a1->bitfield_representative) -
+      DECL_UID (a2->bitfield_representative))
+    return DECL_UID (a1->bitfield_representative) -
+      DECL_UID (a2->bitfield_representative);
+
+  if (!expressions_equal_p (a1->src_addr, a2->src_addr))
+    return a1->src_addr - a2->src_addr;
+
+  if (!expressions_equal_p (a1->dst_addr, a2->dst_addr))
+    return a1->dst_addr - a2->dst_addr;
+
+  if (a1->src_offset_words - a2->src_offset_words)
+    return a1->src_offset_words - a2->src_offset_words;
+
+  return a1->src_bit_offset - a2->src_bit_offset;
+}
+
+/* Returns size of combined bitfields.  Size cannot be larger than size
+   of largest directly accessible memory unit.  */
+
+static int
+get_merged_bit_field_size (bfcopy * copy)
+{
+  bfcopy *tmp_copy = copy;
+  int size = 0;
+
+  while (tmp_copy)
+    {
+      size += tmp_copy->src_bit_size;
+      tmp_copy = tmp_copy->next;
+    }
+  return size;
+}
+
+/* Lower bit-field access to accesses to its
+   DECL_BIT_FIELD_REPRESENTATIVE.  */
+
+static void
+lower_bitfields (void)
+{
+  basic_block bb;
+  hash_table < bitfield_stmt_bfcopy_pair > bf_stmnt_cpy;
+  vec < struct bfcopy *>bitfield_copies;
+  struct bfcopy *copy;
+  bf_stmnt_cpy.create (1);
+
+  FOR_EACH_BB_FN (bb, cfun)
+  {
+    bf_stmnt_cpy.empty ();
+    tree prev_representative = NULL_TREE;
+    bitfield_copies.create (0);
+
+    /* There are two passes, the first one identifies interesting
+       bit-field accesses and the second one actually lowers them.  */
+    for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	 !gsi_end_p (gsi); gsi_next (&gsi))
+      {
+	gimple stmt = gsi_stmt (gsi);
+
+	if (!gimple_assign_single_p (stmt) || gimple_has_volatile_ops (stmt))
+	  continue;
+
+	tree ref = gimple_assign_rhs1 (stmt);
+	if (bf_access_candidate_p (ref, NULL))
+	  {
+	    gimple use_stmt;
+	    use_operand_p use;
+	    tree op0 = TREE_OPERAND (ref, 0);
+	    tree op1 = TREE_OPERAND (ref, 1);
+
+	    if (TREE_CODE (DECL_CONTEXT (op1)) == UNION_TYPE
+		|| TREE_CODE (DECL_CONTEXT (op1)) == QUAL_UNION_TYPE)
+	      continue;
+
+	    if (TREE_CODE (gimple_assign_lhs (stmt)) == SSA_NAME
+		&& single_imm_use (gimple_assign_lhs (stmt), &use,
+				   &use_stmt) && is_gimple_assign (use_stmt))
+	      {
+		tree uses_stmt_lhs = gimple_assign_lhs (use_stmt);
+
+		if (bf_access_candidate_p (uses_stmt_lhs, NULL))
+		  {
+		    tree use_op0 = TREE_OPERAND (uses_stmt_lhs, 0);
+		    tree use_op1 = TREE_OPERAND (uses_stmt_lhs, 1);
+		    tree use_repr = DECL_BIT_FIELD_REPRESENTATIVE (use_op1);
+		    if (prev_representative
+			&& (prev_representative != use_repr))
+		      {
+			/* If previous bfcopy has different
+			   representative then barrier is needed
+			   between it and new access.  */
+			copy = create_and_insert_bfcopy (&bitfield_copies);
+			copy->is_barrier = true;
+		      }
+		    prev_representative = use_repr;
+		    /* Create new bit-field bfcopy structure.  */
+		    copy = create_and_insert_bfcopy (&bitfield_copies);
+		    /* Collect bfcopy data - load instruction.  */
+		    copy->src_bit_size = tree_to_uhwi (DECL_SIZE (op1));
+		    copy->src_bit_offset = get_bit_offset (op1);
+		    copy->src_offset_words =
+		      field_byte_offset (op1) / UNITS_PER_WORD;
+		    copy->src_field_offset =
+		      tree_to_uhwi (DECL_FIELD_OFFSET (op1));
+		    copy->src_addr = op0;
+		    copy->load_stmt = gsi_stmt (gsi);
+		    /* Collect bfcopy data - store instruction.  */
+		    copy->dst_bit_size = tree_to_uhwi (DECL_SIZE (use_op1));
+		    copy->dst_bit_offset = get_bit_offset (use_op1);
+		    copy->dst_offset_words =
+		      field_byte_offset (use_op1) / UNITS_PER_WORD;
+		    copy->dst_addr = use_op0;
+		    copy->store_stmt = use_stmt;
+		    add_stmt_bfcopy_pair (bf_stmnt_cpy, copy, stmt);
+		    add_stmt_bfcopy_pair (bf_stmnt_cpy, copy, use_stmt);
+		    copy->bitfield_representative = use_repr;
+		  }
+	      }
+	  }
+
+	/* Insert barrier for merging if statement is function call or memory
+	   access.  */
+	bitfield_stmt_bfcopy_pair csdata (stmt, NULL);
+	if (!bf_stmnt_cpy.find (&csdata)
+	    && ((gimple_code (stmt) == GIMPLE_CALL)
+		|| (gimple_has_mem_ops (stmt))))
+	  {
+	    /* Create new bfcopy access structure.  */
+	    copy = create_and_insert_bfcopy (&bitfield_copies);
+	    /* Mark it as barrier.  */
+	    copy->is_barrier = true;
+	  }
+      }
+
+    /* If there are not at least two sequences, go to the next basic block.  */
+    if (bitfield_copies.length () <= 1)
+      {
+	bitfield_copies.release ();
+	continue;
+      }
+    vec < struct bfcopy *>bitfield_copies_merge = vNULL;
+    /* Try to merge different bfcopy sequences.  */
+    for (int ix = 0; bitfield_copies.iterate (ix, &copy); ix++)
+      {
+	struct bfcopy *head_copy;
+	struct bfcopy *mrg_copy;
+	struct bfcopy *prev_copy;
+
+	if (!bitfield_copies_merge.exists ())
+	  bitfield_copies_merge.create (0);
+
+	if (!copy->is_barrier)
+	  bitfield_copies_merge.safe_push (copy);
+
+	if (!copy->is_barrier
+	    && !(copy == bitfield_copies.last ()
+		 && !bitfield_copies_merge.is_empty ()))
+	  continue;
+
+	bitfield_copies_merge.qsort (cmp_bfcopies);
+
+	head_copy = prev_copy = NULL;
+	int iy;
+	for (iy = 0; bitfield_copies_merge.iterate (iy, &mrg_copy); iy++)
+	  {
+	    if (head_copy
+		&& expressions_equal_p (head_copy->src_addr,
+					mrg_copy->src_addr)
+		&& expressions_equal_p (head_copy->dst_addr,
+					mrg_copy->dst_addr)
+		&& prev_copy->src_offset_words
+		== mrg_copy->src_offset_words
+		&& prev_copy->dst_offset_words
+		== mrg_copy->dst_offset_words
+		&& prev_copy->src_bit_offset + prev_copy->src_bit_size
+		== mrg_copy->src_bit_offset
+		&& prev_copy->dst_bit_offset + prev_copy->dst_bit_size
+		== mrg_copy->dst_bit_offset
+		&& prev_copy->bitfield_representative
+		== mrg_copy->bitfield_representative)
+	      {
+		/* Merge conditions are satisfied - merge copies.  */
+		mrg_copy->merged = true;
+		prev_copy->next = mrg_copy;
+		head_copy->modified = true;
+		prev_copy = mrg_copy;
+		mrg_copy->head_copy = head_copy;
+	      }
+	    else
+	      head_copy = prev_copy = mrg_copy;
+	  }
+	bitfield_copies_merge.release ();
+	bitfield_copies_merge = vNULL;
+      }
+
+    tree reaching_vuse = NULL_TREE;
+
+    for (gimple_stmt_iterator gsi = gsi_start_bb (bb); !gsi_end_p (gsi);)
+      {
+	gimple stmt = gsi_stmt (gsi);
+
+	/* Fix vuse info if necessary.  */
+	if (gimple_has_mem_ops (stmt) && reaching_vuse != NULL_TREE)
+	  {
+	    gimple_set_vuse (stmt, reaching_vuse);
+	    update_stmt (stmt);
+	  }
+
+	if (!gimple_assign_single_p (stmt) || gimple_has_volatile_ops (stmt))
+	  {
+	    gsi_next (&gsi);
+	    if (gimple_vdef (stmt))
+	      reaching_vuse = gimple_vdef (stmt);
+	    continue;
+	  }
+
+	tree ref;
+	unsigned HOST_WIDE_INT off;
+	tree size;
+	bool deleted = false;
+
+	/* Lower a bit-field read.  */
+	ref = gimple_assign_rhs1 (stmt);
+	if (bf_access_candidate_p (ref, &off))
+	  {
+	    bfcopy *cc = NULL;
+	    bitfield_stmt_bfcopy_pair st_cpy (stmt, NULL);
+	    bitfield_stmt_bfcopy_pair *p_st_cpy;
+	    p_st_cpy = bf_stmnt_cpy.find (&st_cpy);
+	    unsigned HOST_WIDE_INT mrg_off;
+	    if (p_st_cpy)
+	      cc = p_st_cpy->copy;
+
+	    if (cc && (cc->merged || cc->modified))
+	      size =
+		build_int_cst (unsigned_type_node,
+			       get_merged_bit_field_size (cc->head_copy ?
+							  cc->head_copy :
+							  cc));
+	    else
+	      size = DECL_SIZE (TREE_OPERAND (ref, 1));
+	    if (cc && cc->merged
+		&& bf_access_candidate_p (gimple_assign_rhs1
+					  (cc->head_copy->load_stmt),
+					  &mrg_off))
+	      off = mrg_off;
+	    if (cc && cc->merged)
+	      {
+		tree head_rhs = gimple_assign_rhs1 (cc->head_copy->load_stmt);
+		switch (TREE_CODE (head_rhs))
+		  {
+		  case COMPONENT_REF:
+		    if (bf_access_candidate_p (head_rhs, &mrg_off))
+		      off = mrg_off;
+		    break;
+		  case BIT_FIELD_REF:
+		    off = tree_to_uhwi (TREE_OPERAND (head_rhs, 2));
+		    break;
+		  default:
+		    break;
+		  }
+	      }
+
+	    if (cc && (cc->modified))
+	      {
+		tree tmp_ssa;
+		tree itype = make_node (INTEGER_TYPE);
+		TYPE_PRECISION (itype) = TREE_INT_CST_LOW (size);
+		fixup_unsigned_type (itype);
+		lower_bitfield_read (&gsi, off, size, itype);
+		tmp_ssa =
+		  make_ssa_name (create_tmp_var (itype, NULL), cc->load_stmt);
+		gimple_assign_set_lhs (cc->load_stmt, tmp_ssa);
+		update_stmt (cc->load_stmt);
+		gimple_assign_set_rhs1 (cc->store_stmt, tmp_ssa);
+		update_stmt (cc->store_stmt);
+	      }
+	    else if (cc && cc->merged)
+	      {
+		gsi_remove (&gsi, true);
+		deleted = true;
+	      }
+	  }
+	/* Lower a bit-field write.  */
+	ref = gimple_assign_lhs (stmt);
+	if (bf_access_candidate_p (ref, &off))
+	  {
+	    bfcopy *cc = NULL;
+	    bitfield_stmt_bfcopy_pair st_cpy (stmt, NULL);
+	    bitfield_stmt_bfcopy_pair *p_st_cpy;
+	    unsigned HOST_WIDE_INT mrg_off;
+	    p_st_cpy = bf_stmnt_cpy.find (&st_cpy);
+	    if (p_st_cpy)
+	      cc = p_st_cpy->copy;
+
+	    if (cc && (cc->merged || cc->modified))
+	      size =
+		build_int_cst (unsigned_type_node,
+			       get_merged_bit_field_size (cc->head_copy ?
+							  cc->head_copy :
+							  cc));
+	    else
+	      size = DECL_SIZE (TREE_OPERAND (ref, 1));
+	    if (cc && cc->merged
+		&&
+		bf_access_candidate_p (gimple_assign_lhs
+				       (cc->head_copy->store_stmt), &mrg_off))
+	      off = mrg_off;
+
+	    if (cc && (cc->modified) && !(cc && cc->merged))
+	      lower_bitfield_write (&gsi, off, size);
+	    else if (cc && cc->merged)
+	      {
+		if (gimple_vdef (stmt))
+		  {
+		    imm_use_iterator imm_iter;
+		    gimple use_stmt;
+		    use_operand_p use_p;
+		    FOR_EACH_IMM_USE_STMT (use_stmt, imm_iter,
+					   gimple_vdef (stmt))
+		    {
+		      FOR_EACH_IMM_USE_ON_STMT (use_p, imm_iter)
+			SET_USE (use_p, reaching_vuse);
+		      update_stmt (stmt);
+		    }
+		  }
+
+		gsi_remove (&gsi, true);
+		deleted = true;
+	      }
+	  }
+	if (gimple_vdef (stmt) && !deleted)
+	  reaching_vuse = gimple_vdef (stmt);
+	if (!deleted)
+	  gsi_next (&gsi);
+      }
+  }
+
+  bf_stmnt_cpy.dispose ();
+}
+
 /* Perform early intraprocedural SRA.  */
 static unsigned int
 early_intra_sra (void)
 {
   sra_mode = SRA_MODE_EARLY_INTRA;
-  return perform_intra_sra ();
+  unsigned int res = perform_intra_sra ();
+  if (flag_tree_bitfield_merge)
+    lower_bitfields ();
+  return res;
 }
 
 /* Perform "late" intraprocedural SRA.  */
diff --git a/gcc/tree-ssa-sccvn.c b/gcc/tree-ssa-sccvn.c
index f7ec8b6..cde6ce6 100644
--- a/gcc/tree-ssa-sccvn.c
+++ b/gcc/tree-ssa-sccvn.c
@@ -4193,29 +4193,6 @@ get_next_value_id (void)
   return next_value_id++;
 }
 
-
-/* Compare two expressions E1 and E2 and return true if they are equal.  */
-
-bool
-expressions_equal_p (tree e1, tree e2)
-{
-  /* The obvious case.  */
-  if (e1 == e2)
-    return true;
-
-  /* If only one of them is null, they cannot be equal.  */
-  if (!e1 || !e2)
-    return false;
-
-  /* Now perform the actual comparison.  */
-  if (TREE_CODE (e1) == TREE_CODE (e2)
-      && operand_equal_p (e1, e2, OEP_PURE_SAME))
-    return true;
-
-  return false;
-}
-
-
 /* Return true if the nary operation NARY may trap.  This is a copy
    of stmt_could_throw_1_p adjusted to the SCCVN IL.  */
 
diff --git a/gcc/tree-ssa-sccvn.h b/gcc/tree-ssa-sccvn.h
index f52783a..0aa5537 100644
--- a/gcc/tree-ssa-sccvn.h
+++ b/gcc/tree-ssa-sccvn.h
@@ -21,10 +21,6 @@
 #ifndef TREE_SSA_SCCVN_H
 #define TREE_SSA_SCCVN_H
 
-/* In tree-ssa-sccvn.c  */
-bool expressions_equal_p (tree, tree);
-
-
 /* TOP of the VN lattice.  */
 extern tree VN_TOP;
 
diff --git a/gcc/tree.c b/gcc/tree.c
index 8b44ecc..0104c80 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -12358,4 +12358,44 @@ get_base_address (tree t)
   return t;
 }
 
+/* Compare two expressions E1 and E2 and return true if they are equal.  */
+
+bool
+expressions_equal_p (tree e1, tree e2)
+{
+  /* The obvious case.  */
+  if (e1 == e2)
+    return true;
+
+  /* If only one of them is null, they cannot be equal.  */
+  if (!e1 || !e2)
+    return false;
+
+  /* Now perform the actual comparison.  */
+  if (TREE_CODE (e1) == TREE_CODE (e2)
+      && operand_equal_p (e1, e2, OEP_PURE_SAME))
+    return true;
+
+  return false;
+}
+
+/* Given a pointer to a tree node, assumed to be some kind of a ..._TYPE
+   node, return the size in bits for the type if it is a constant, or else
+   return the alignment for the type if the type's size is not constant, or
+   else return BITS_PER_WORD if the type actually turns out to be an
+   ERROR_MARK node.  */
+
+unsigned HOST_WIDE_INT
+simple_type_size_in_bits (const_tree type)
+{
+  if (TREE_CODE (type) == ERROR_MARK)
+    return BITS_PER_WORD;
+  else if (TYPE_SIZE (type) == NULL_TREE)
+    return 0;
+  else if (tree_fits_uhwi_p (TYPE_SIZE (type)))
+    return tree_to_uhwi (TYPE_SIZE (type));
+  else
+    return TYPE_ALIGN (type);
+}
+
 #include "gt-tree.h"
diff --git a/gcc/tree.h b/gcc/tree.h
index 9ad0b6f..a855070 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -3993,6 +3993,7 @@ extern tree substitute_placeholder_in_expr (tree, tree);
   ((EXP) == 0 || TREE_CONSTANT (EXP) ? (EXP)	\
    : substitute_placeholder_in_expr (EXP, OBJ))
 
+extern unsigned HOST_WIDE_INT simple_type_size_in_bits (const_tree type);
 
 /* stabilize_reference (EXP) returns a reference equivalent to EXP
    but it can be used multiple times
@@ -4109,6 +4110,11 @@ inlined_function_outer_scope_p (const_tree block)
        (TREE = function_args_iter_cond (&(ITER))) != NULL_TREE;		\
        function_args_iter_next (&(ITER)))
 
+
+/* In dwarf2out.c.  */
+HOST_WIDE_INT
+field_byte_offset (const_tree decl);
+
 /* In tree.c */
 extern unsigned crc32_string (unsigned, const char *);
 extern unsigned crc32_byte (unsigned, char);
@@ -4253,6 +4259,7 @@ extern tree obj_type_ref_class (tree ref);
 extern bool types_same_for_odr (tree type1, tree type2);
 extern bool contains_bitfld_component_ref_p (const_tree);
 extern bool type_in_anonymous_namespace_p (tree);
+extern bool expressions_equal_p (tree e1, tree e2);
 extern bool block_may_fallthru (const_tree);
 extern void using_eh_for_cleanups (void);
 extern bool using_eh_for_cleanups_p (void);



Regards,
Zoran Jovanovic


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]