


[PATCH] GCSE after reload


The gcse-after-reload pass runs after register allocation and reload;
its purpose is to remove redundant loads introduced by spilling and by
other optimizations that run after gcse.

The pass looks for inter-block redundancies, but only those caused by
the immediate predecessor blocks.  Since it runs after register
allocation and reload, a redundancy is removed only if register usage
allows it (sketched below).
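To make this concrete, here is a sketch of mine (not from the patch;
the register numbers and stack offset are made up) of the situation the
pass targets, in simplified RTL:

    BB1:  (set (reg 3) (mem (plus sp 16)))  ;; value now live in reg 3
    BB2:  ...                               ;; [sp+16] not available here
    BB3:  (set (reg 5) (mem (plus sp 16)))  ;; BB3 succeeds BB1 and BB2

The load in BB3 is redundant along the BB1->BB3 edge and can become the
copy (set (reg 5) (reg 3)) on that edge, but only if reg 3 still holds
the value at the end of BB1 and neither reg 3 nor reg 5 is touched on
the edge; these are exactly the register-usage checks the pass performs.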

In general this optimization considers every load in the given basic
block and looks for loads from, or stores to, the same memory location
in its immediate predecessors.  It also handles partial redundancies,
adding new loads on incoming edges when the profile feedback indicates
it is worthwhile; the sketch below shows the cost test.
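The profitability decision compares profile counts of the predecessor
edges against a fixed ratio (gcse_after_reload_PARTIAL_FRACTION in the
patch below).  A minimal C sketch of that test; the helper name and
standalone form are mine, not part of the patch:

    #include <stdbool.h>

    /* OK_COUNT: combined profile count of the edges where the load can
       become a register copy; NOT_OK_COUNT: combined count of the edges
       that would need a fresh copy of the load inserted.  Mirrors the
       check in eliminate_partially_redundant_loads; this function is
       illustrative only.  */
    static bool
    worth_partial_elimination (long ok_count, long not_ok_count)
    {
      /* Transform only when the edges getting the cheap register copy
         are at least three times hotter than the edges that must
         re-execute the load (3 == gcse_after_reload_PARTIAL_FRACTION).  */
      return ok_count >= 3 * not_ok_count;
    }

In the patch this appears as the early return when
ok_count < gcse_after_reload_PARTIAL_FRACTION * not_ok_count.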

Testing:
--------
Bootstrap on powerpc-apple-darwin6.4
Regression: g++, g77, libstdc++, libjava passed (no new failures).
            Several gcc tests were reported as unexpected failures, but
            they behaved as expected and identically to the vanilla
            compiler, so I consider these OK as well.

ChangeLog
---------
2003-10-13  Mostafa Hagog  <mustafa@il.ibm.com>

      * common.opt: Add description of the new -fgcse-after-reload flag.

      * flags.h: Declare the global variable flag_gcse_after_reload.

      * gcse.c (reg_used_on_edge, reg_set_between_after_reload_p,
      reg_used_between_after_reload_p, get_avail_load_store_reg,
      is_jump_table_basic_block, bb_has_well_behaved_predecessors,
      get_bb_avail_insn, hash_scan_set_after_reload,
      compute_hash_table_after_reload, eliminate_partially_redundant_loads,
      gcse_after_reload): New functions to implement gcse-after-reload.
      (gcse_after_reload_main): New function, the main entry point to
      gcse-after-reload.

      * rtl.h (gcse_after_reload_main): Declare the new function.

      * opts.c (common_handle_option): Handle the -fgcse-after-reload
      flag.
      (decode_options): Enable it at -O3.

      * toplev.c: Initialization of flag_gcse_after_reload.

      * doc/invoke.texi: Document the new -fgcse-after-reload flag.

Mostafa
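
For reference, once the patch is applied the pass can be exercised
directly (an illustrative invocation of mine; -dJ matches the dump
letter the patch assigns to the gcse2 dump in toplev.c):

    gcc -O2 -fgcse-after-reload -dJ test.c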

Index: common.opt
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/common.opt,v
retrieving revision 1.17
diff -b -c -p -r1.17 common.opt
*** common.opt    9 Oct 2003 09:08:32 -0000     1.17
--- common.opt    14 Oct 2003 13:59:33 -0000
*************** fgcse-sm
*** 362,367 ****
--- 362,373 ----
  Common
  Perform store motion after global common subexpression elimination

+ fgcse-after-reload
+ Common
+ Perform global common subexpression elimination after register allocation
+ has finished; the purpose of this optimization is to clean up redundant
+ loads introduced after GCSE has run.
+
  fgnu-linker
  Common
  Output GNU ld formatted global initializers
Index: flags.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/flags.h,v
retrieving revision 1.122
diff -b -c -p -r1.122 flags.h
*** flags.h 9 Oct 2003 09:08:32 -0000     1.122
--- flags.h 14 Oct 2003 13:59:33 -0000
*************** extern int flag_gcse_lm;
*** 675,680 ****
--- 675,685 ----

  extern int flag_gcse_sm;

+ /* Nonzero if we want to perform global redundancy elimination after
+    register allocation.  */
+
+ extern int flag_gcse_after_reload;
+
  /* Perform branch target register optimization before prologue / epilogue
     threading.  */

Index: gcse.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/gcse.c,v
retrieving revision 1.276
diff -b -c -p -r1.276 gcse.c
*** gcse.c  5 Oct 2003 19:50:54 -0000     1.276
--- gcse.c  14 Oct 2003 13:59:36 -0000
*************** is_too_expensive (const char *pass)
*** 7936,7941 ****
--- 7936,8587 ----
      }

    return false;
+ }
+
+ /* The following code implements gcse after reload; the purpose of this
+    pass is to clean up redundant loads generated by reload and by other
+    optimizations that come after gcse.  It searches for simple
+    inter-block redundancies and tries to eliminate them by adding moves
+    and loads in cold places.  */
+
+ /* This is the threshold ratio for deciding when to perform partial
+    redundancy elimination after reload.  */
+ #define gcse_after_reload_PARTIAL_FRACTION 3
+
+ /* The following structure hold the information about the occurrences of
+    the redundant instructions.  */
+ struct unoccr
+ {
+   struct unoccr *next;
+   edge pred;
+   rtx insn;
+ };
+
+ static bool reg_used_on_edge (rtx, edge);
+ static rtx reg_set_between_after_reload_p (rtx, rtx, rtx);
+ static rtx reg_used_between_after_reload_p (rtx, rtx, rtx);
+ static rtx get_avail_load_store_reg (rtx);
+ static bool is_jump_table_basic_block (basic_block);
+ static int bb_has_well_behaved_predecessors (basic_block);
+ static struct occr* get_bb_avail_insn (basic_block, struct occr *);
+ static void hash_scan_set_after_reload (rtx, rtx, struct hash_table *);
+ static void compute_hash_table_after_reload (struct hash_table *);
+ static rtx eliminate_partially_redundant_loads (basic_block,
+                                                 rtx,
+                                                 struct expr *);
+ static void gcse_after_reload (void);
+ static struct occr* get_bb_avail_insn (basic_block, struct occr *);
+ void gcse_after_reload_main (rtx, FILE *);
+
+
+ /* Check if register REG is used in any insn waiting to be inserted on
+    edge E.  Assumes no such insn can be a CALL_INSN; if that were
+    possible, we would have to call reg_used_between_p with PREV (insn),
+    NEXT (insn) instead of reg_overlap_mentioned_p.  */
+
+ static bool
+ reg_used_on_edge (rtx reg, edge e)
+ {
+   rtx insn;
+
+   for (insn = e->insns; insn; insn = NEXT_INSN (insn))
+     if (INSN_P (insn) && reg_overlap_mentioned_p (reg, PATTERN (insn)))
+       return true;
+
+   return false;
+ }
+
+ /* Return the insn that sets register REG or clobbers it in between
+    FROM_INSN and TO_INSN (exclusive of those two).
+    Just like reg_set_between but for hard registers and not pseudos.  */
+
+ static rtx
+ reg_set_between_after_reload_p (rtx reg, rtx from_insn, rtx to_insn)
+ {
+   rtx insn;
+   int regno;
+
+   if (GET_CODE (reg) != REG)
+     return to_insn;
+   regno = REGNO (reg);
+
+   /* We are called after register allocation.  */
+   if (regno >= FIRST_PSEUDO_REGISTER)
+     abort ();
+
+   if (from_insn == to_insn)
+     return 0;
+
+   for (insn = NEXT_INSN (from_insn);
+        insn != to_insn;
+        insn = NEXT_INSN (insn))
+     {
+       if (INSN_P (insn))
+         {
+           if (FIND_REG_INC_NOTE (insn, reg)
+               || (GET_CODE (insn) == CALL_INSN
+                   && call_used_regs[regno])
+               || find_reg_fusage (insn, CLOBBER, reg))
+             return insn;
+         }
+       if (set_of (reg, insn) != NULL_RTX)
+         return insn;
+     }
+   return 0;
+ }
+
+ /* Return the insn that uses register REG in between FROM_INSN and TO_INSN
+    (exclusive of those two).  Similar to reg_used_between but for hard
+    registers and not pseudos.  */
+
+ static rtx
+ reg_used_between_after_reload_p (rtx reg, rtx from_insn, rtx to_insn)
+ {
+   rtx insn;
+   int regno;
+
+   if (GET_CODE (reg) != REG)
+     return to_insn;
+   regno = REGNO (reg);
+
+   /* We are called after register allocation.  */
+   if (regno >= FIRST_PSEUDO_REGISTER)
+     abort ();
+   if (from_insn == to_insn)
+     return 0;
+
+   for (insn = NEXT_INSN (from_insn);
+        insn != to_insn;
+        insn = NEXT_INSN (insn))
+     if (INSN_P (insn)
+         && (reg_overlap_mentioned_p (reg, PATTERN (insn))
+             || (GET_CODE (insn) == CALL_INSN
+                 && call_used_regs[regno])
+             || find_reg_fusage (insn, USE, reg)
+             || find_reg_fusage (insn, CLOBBER, reg)))
+       return insn;
+   return 0;
+ }
+
+ /* Return the loaded/stored register of a load/store instruction.  */
+
+ static rtx
+ get_avail_load_store_reg (rtx insn)
+ {
+   if (GET_CODE (SET_DEST (PATTERN (insn))) == REG)  /* A load.  */
+     return SET_DEST (PATTERN (insn));
+   else if (GET_CODE (SET_SRC (PATTERN (insn))) == REG)  /* A store.  */
+     return SET_SRC (PATTERN (insn));
+   else
+     return NULL_RTX;  /* Can't happen.  */
+
+ /* Return true if BB ends in a jump table; we don't handle abnormal
+    edges or jump tables.  */
+
+ static bool
+ is_jump_table_basic_block (basic_block bb)
+ {
+   rtx insn = bb->end;
+
+   if (GET_CODE (insn) == JUMP_INSN
+       && (GET_CODE (PATTERN (insn)) == ADDR_VEC
+           || GET_CODE (PATTERN (insn)) == ADDR_DIFF_VEC))
+     return true;
+   return false;
+ }
+
+ /* Return nonzero if the predecessors of BB are "well behaved".  */
+
+ static int
+ bb_has_well_behaved_predecessors (basic_block bb)
+ {
+   edge pred;
+
+   if (!bb->pred)
+     return 0;
+   for (pred = bb->pred; pred != NULL; pred = pred->pred_next)
+     if ((pred->flags & EDGE_COMPLEX)
+         || is_jump_table_basic_block (pred->src))
+       return 0;
+   return 1;
+ }
+
+
+ /* Search the list of occurrences OCCR for an occurrence of the
+    expression in BB.  */
+
+ static struct occr*
+ get_bb_avail_insn (basic_block bb, struct occr *occr)
+ {
+   for (; occr != NULL; occr = occr->next)
+     if (BLOCK_FOR_INSN (occr->insn)->index == bb->index)
+       return occr;
+   return NULL;
+ }
+
+ /* Perform a GCSE pass after reload, trying to eliminate redundant loads
+    created by the reload pass.  We look for a load that is fully or
+    partially redundant with loads/stores in predecessor BBs, and try
+    adding loads on incoming edges to make it fully redundant.  We also
+    check whether it is worth adding those loads in order to be able to
+    delete the redundant load.
+
+    Algorithm:
+    1. Build an available-expressions hash table:
+       For each load/store instruction, if the loaded/stored memory didn't
+       change until the end of the basic block, add this memory expression
+       to the hash table.
+    2. Perform redundancy elimination:
+       For each load instruction, perform partial redundancy elimination:
+       check whether it is worth adding loads to make the load fully
+       redundant.  If so, add the loads and register copies and delete
+       the original load.
+
+    Future enhancement:
+       If the loaded register is used/defined between the load and some
+       store, look for some other free register between the load and all
+       its stores, and replace the load with a copy from this register
+       to the loaded register.  */
+
+
+ /* This handles the case where several stores feed a partially redundant
+    load.  It checks whether the redundancy elimination is possible and
+    whether it is worth it.  */
+ static rtx
+ eliminate_partially_redundant_loads (basic_block bb, rtx insn,
+                                      struct expr *expr)
+ {
+   edge pred;
+   rtx avail_insn = NULL_RTX;
+   rtx avail_reg;
+   rtx dest = SET_DEST (PATTERN (insn));
+   struct occr *a_occr;
+   struct unoccr *occr, *avail_occrs = NULL;
+   struct unoccr *unoccr, *unavail_occrs = NULL;
+   int npred_ok = 0, npred_not_ok = 0;
+   gcov_type ok_count = 0;
+   gcov_type not_ok_count = 0;
+   basic_block pred_bb;
+
+   /* Check that the loaded register is neither used nor killed from the
+      beginning of the block up to the load.  */
+   if (reg_used_between_after_reload_p (dest, PREV_INSN (bb->head), insn))
+     return NULL_RTX;
+
+   /* Check the potential for replacing the load with a copy from each
+      predecessor.  */
+   for (pred = bb->pred; pred ; pred = pred->pred_next)
+     {
+       avail_insn = NULL_RTX;
+       pred_bb = pred->src;
+       for (a_occr = get_bb_avail_insn (pred_bb, expr->avail_occr); a_occr;
+            a_occr = get_bb_avail_insn (pred_bb, a_occr->next))
+         {
+           avail_insn = a_occr->insn;
+           if (! (avail_reg = get_avail_load_store_reg (avail_insn)))
+             abort ();
+
+           /* Make sure we can generate a move from register avail_reg
+              to dest.  */
+           extract_insn (gen_move_insn (copy_rtx (dest),
+                                        copy_rtx (avail_reg)));
+           if (! constrain_operands (1)
+               || reg_killed_on_edge (avail_reg, pred)
+               || reg_used_on_edge (dest, pred))
+             {
+               avail_insn = NULL;
+               continue;
+             }
+           if (! reg_set_between_after_reload_p (avail_reg, avail_insn,
+                                                 NEXT_INSN (pred_bb->end)))
+             /* avail_insn remains non-null.  */
+             break;
+           else
+             avail_insn = NULL;
+         }
+       if (avail_insn != NULL_RTX)
+         {
+           npred_ok++;
+           ok_count += pred->count;
+           occr = (struct unoccr *) gmalloc (sizeof (struct unoccr));
+           occr->insn = avail_insn;
+           occr->pred = pred;
+           occr->next = avail_occrs;
+           avail_occrs = occr;
+         }
+       else
+         {
+           npred_not_ok++;
+           not_ok_count += pred->count;
+           unoccr = (struct unoccr *) gmalloc (sizeof (struct unoccr));
+           unoccr->insn = NULL_RTX;
+           unoccr->pred = pred;
+           unoccr->next = unavail_occrs;
+           unavail_occrs = unoccr;
+         }
+     }
+
+   if (npred_ok == 0)    /* No load can be replaced by a copy.  */
+     return NULL_RTX;
+
+   /* Check whether it is worth applying the partial redundancy
+      elimination.  */
+   if (ok_count < gcse_after_reload_PARTIAL_FRACTION * not_ok_count)
+     return NULL_RTX;
+
+   /* Generate moves to the loaded register from where
+      the memory is available.  */
+   for (occr = avail_occrs; occr; occr = occr->next)
+     {
+       avail_insn = occr->insn;
+       pred = occr->pred;
+       /* Set avail_reg to be the register having the value of the
+          memory.  */
+       avail_reg = get_avail_load_store_reg (avail_insn);
+       if (! avail_reg)
+         abort ();  /* Can't happen.  */
+
+       insert_insn_on_edge (gen_move_insn (copy_rtx (dest),
+                                           copy_rtx (avail_reg)),
+                            pred);
+
+       if (gcse_file)
+         fprintf (gcse_file,
+                  "GCSE AFTER reload generating move from %d to %d "
+                  "on edge from %d to %d\n",
+                  REGNO (avail_reg),
+                  REGNO (dest),
+                  pred->src->index,
+                  pred->dest->index);
+     }
+
+   /* Regenerate loads where the memory is unavailable.  */
+   for (unoccr = unavail_occrs; unoccr; unoccr = unoccr->next)
+     {
+       pred = unoccr->pred;
+       insert_insn_on_edge (copy_insn (PATTERN (insn)), pred);
+
+       if (gcse_file)
+         fprintf (gcse_file,
+                  "GCSE AFTER reload: generating on edge from %d to %d "
+                  "a copy of load:\n",
+                  pred->src->index,
+                  pred->dest->index);
+     }
+
+   /* If the load is available in this block, mark its occurrence for
+      later deletion; otherwise return it so the caller deletes it.  */
+   for (a_occr = get_bb_avail_insn (bb, expr->avail_occr);
+        a_occr && (a_occr->insn != insn);
+        a_occr = get_bb_avail_insn (bb, a_occr->next));
+
+   if (!a_occr)
+     return insn;
+
+   a_occr->deleted_p = 1;
+   return NULL_RTX;
+ }
+
+
+
+ /* Perform the redundancy elimination described above.  */
+ static void
+ gcse_after_reload (void)
+ {
+   unsigned int i;
+   rtx insn;
+   basic_block bb;
+   struct expr *expr;
+   struct occr *occr;
+
+   /* Note we start at block 1.  */
+
+   if (ENTRY_BLOCK_PTR->next_bb == EXIT_BLOCK_PTR)
+     return;
+
+   /* Go over the expression hash table and reset occr->deleted_p.  */
+   for (i = 0; i < expr_hash_table.size; i++)
+     {
+       for (expr = expr_hash_table.table[i];
+            expr != NULL;
+            expr = expr->next_same_hash)
+         for (occr = expr->avail_occr; occr; occr = occr->next)
+           occr->deleted_p = 0;
+     }
+
+   FOR_BB_BETWEEN (bb,
+                   ENTRY_BLOCK_PTR->next_bb->next_bb,
+                   EXIT_BLOCK_PTR,
+                   next_bb)
+     {
+       rtx prev_delete = NULL_RTX;
+       if (! bb_has_well_behaved_predecessors (bb))
+         continue;
+
+       /* Do not try this optimization on cold basic blocks.  */
+       if (probably_cold_bb_p (bb))
+         continue;
+
+       reset_opr_set_tables ();
+
+       for (insn = bb->head;
+            insn != NULL
+            && insn != NEXT_INSN (bb->end);
+            insn = NEXT_INSN (insn))
+         {
+           if (prev_delete)
+             delete_insn (prev_delete);
+           prev_delete = NULL_RTX;
+
+           /* Is it a load - of the form (set (reg) (mem))?  */
+           if (GET_CODE (insn) == INSN
+               && GET_CODE (PATTERN (insn)) == SET
+               && GET_CODE (SET_DEST (PATTERN (insn))) == REG
+               && GET_CODE (SET_SRC (PATTERN(insn))) == MEM)
+             {
+               rtx pat = PATTERN (insn);
+               rtx src = SET_SRC (pat);
+               struct expr *expr;
+
+               if (general_operand (src, GET_MODE (src))
+                   /* Is the expression recorded?  */
+                   && (expr = lookup_expr (src, &expr_hash_table)) != NULL
+                   /* Are the operands unchanged since the start of the
+                      block?  */
+                   && oprs_not_set_p (src, insn)
+                   && ! MEM_VOLATILE_P (src)
+                   && GET_MODE (src) != BLKmode
+                   && !(flag_non_call_exceptions && may_trap_p (src))
+                   && !side_effects_p (src))
+                 {
+                   /* We now have a load (insn) and an available memory
+                      expression at its BB start (expr).  Try to remove
+                      the load if it is redundant.  */
+                   prev_delete
+                     = eliminate_partially_redundant_loads (bb, insn, expr);
+                 }
+             }
+
+           /* Keep track of everything modified by this insn.  */
+           if (INSN_P (insn))
+             mark_oprs_set (insn);
+         }
+
+       if (prev_delete)
+         delete_insn (prev_delete);
+     }
+
+   commit_edge_insertions ();
+
+   /* Go over the expression hash table and delete insns with
+      occr->deleted_p set.  */
+   for (i = 0; i < expr_hash_table.size; i++)
+     {
+       for (expr = expr_hash_table.table[i];
+            expr != NULL;
+            expr = expr->next_same_hash)
+         for (occr = expr->avail_occr; occr; occr = occr->next)
+           if (occr->deleted_p)
+             delete_insn (occr->insn);
+     }
+ }
+
+ /* Scan pattern PAT of INSN and add an entry to the hash TABLE.
+    After reload we are interested in loads/stores only.  */
+ static void
+ hash_scan_set_after_reload (rtx pat, rtx insn, struct hash_table *table)
+ {
+   rtx src = SET_SRC (pat);
+   rtx dest = SET_DEST (pat);
+
+   if (GET_CODE (src) != MEM && GET_CODE (dest) != MEM)
+     return;
+
+   if (GET_CODE (dest) == REG)
+     {
+       if (/* Don't GCSE something if we can't do a reg/reg copy.  */
+           can_copy_p (GET_MODE (dest))
+           /* GCSE commonly inserts instruction after the insn.  We can't
+              do that easily for EH_REGION notes so disable GCSE on these
+              for now.  */
+           && !find_reg_note (insn, REG_EH_REGION, NULL_RTX)
+           /* Is SET_SRC something we want to gcse?  */
+           && general_operand (src, GET_MODE (src))
+           /* Don't CSE a nop.  */
+           && ! set_noop_p (pat)
+           && ! JUMP_P (insn))
+         {
+           /* An expression is not available if its operands are
+              subsequently modified, including this insn.  */
+           if (oprs_available_p (src, insn))
+             insert_expr_in_table (src, GET_MODE (dest), insn, 0, 1, table);
+         }
+     }
+   else if (GET_CODE (src) == REG)
+     {
+       /* A store of a register to memory.  */
+       if (/* Don't GCSE something if we can't do a reg/reg copy.  */
+           can_copy_p (GET_MODE (src))
+           /* GCSE commonly inserts instruction after the insn.  We can't
+              do that easily for EH_REGION notes so disable GCSE on these
+              for now.  */
+           && !find_reg_note (insn, REG_EH_REGION, NULL_RTX)
+           /* Is SET_DEST something we want to gcse?  */
+           && general_operand (dest, GET_MODE (dest))
+           /* Don't CSE a nop.  */
+           && ! set_noop_p (pat)
+           && ! JUMP_P (insn)
+           && !(flag_float_store && FLOAT_MODE_P (GET_MODE (dest)))
+           /* Check if the memory expression is killed after insn.  */
+           && !load_killed_in_block_p (BLOCK_FOR_INSN (insn),
+                                       INSN_CUID (insn) + 1,
+                                       dest,
+                                       1)
+           && oprs_unchanged_p (XEXP (dest, 0), insn, 1))
+         {
+           insert_expr_in_table (dest, GET_MODE (dest), insn, 0, 1, table);
+         }
+     }
+ }
+
+
+ /* Create hash table of memory expressions available at end of basic
+    blocks.  */
+ static void
+ compute_hash_table_after_reload (struct hash_table *table)
+ {
+   unsigned int i;
+
+   table->set_p = 0;
+
+   /* Initialize count of number of entries in hash table.  */
+   table->n_elems = 0;
+   memset ((char *) table->table, 0,
+           table->size * sizeof (struct expr *));
+
+   /* While we compute the hash table we also compute a bit array of which
+      registers are set in which blocks. */
+   sbitmap_vector_zero (reg_set_in_block, last_basic_block);
+
+   /* Re-cache any INSN_LIST nodes we have allocated.  */
+   clear_modify_mem_tables ();
+
+   /* Some working arrays used to track first and last set in each
+      block.  */
+   reg_avail_info = (struct reg_avail_info *)
+     gmalloc (max_gcse_regno * sizeof (struct reg_avail_info));
+
+   for (i = 0; i < max_gcse_regno; ++i)
+     reg_avail_info[i].last_bb = NULL;
+
+   FOR_EACH_BB (current_bb)
+     {
+       rtx insn;
+       unsigned int regno;
+
+       /* First pass over the instructions records information used to
+          determine when registers and memory are first and last set.  */
+         for (insn = current_bb->head;
+              insn && insn != NEXT_INSN (current_bb->end);
+              insn = NEXT_INSN (insn))
+           {
+             if (! INSN_P (insn))
+               continue;
+
+             if (GET_CODE (insn) == CALL_INSN)
+               {
+                 bool clobbers_all = false;
+
+ #ifdef NON_SAVING_SETJMP
+                 if (NON_SAVING_SETJMP
+                     && find_reg_note (insn, REG_SETJMP, NULL_RTX))
+                   clobbers_all = true;
+ #endif
+
+                 for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
+                    if (clobbers_all
+                        || TEST_HARD_REG_BIT (regs_invalidated_by_call,
+                                              regno))
+                      record_last_reg_set_info (insn, regno);
+
+                 mark_call (insn);
+               }
+
+             note_stores (PATTERN (insn), record_last_set_info, insn);
+
+             if (GET_CODE (PATTERN (insn)) == SET)
+               {
+                 rtx src, dest;
+
+                 src = SET_SRC (PATTERN (insn));
+                 dest = SET_DEST (PATTERN (insn));
+                 if (GET_CODE (src) == MEM && auto_inc_p (XEXP (src, 0)))
+                   {
+                     regno = REGNO (XEXP (XEXP (src, 0), 0));
+                     record_last_reg_set_info (insn, regno);
+                   }
+                 if (GET_CODE (dest) == MEM && auto_inc_p (XEXP (dest, 0)))
+                   {
+                     regno = REGNO (XEXP (XEXP (dest, 0), 0));
+                     record_last_reg_set_info (insn, regno);
+                   }
+               }
+           }
+
+         /* The next pass builds the hash table.  */
+         for (insn = current_bb->head;
+              insn && insn != NEXT_INSN (current_bb->end);
+              insn = NEXT_INSN (insn))
+           if (INSN_P (insn) && GET_CODE (PATTERN (insn)) == SET)
+             if (! find_reg_note (insn, REG_LIBCALL, NULL_RTX))
+               hash_scan_set_after_reload (PATTERN (insn), insn, table);
+     }
+
+   free (reg_avail_info);
+   reg_avail_info = NULL;
+ }
+
+
+
+ /* Main entry point of the GCSE-after-reload pass: clean up redundant
+    loads due to spilling.  */
+ void
+ gcse_after_reload_main (rtx f, FILE* file)
+ {
+   gcse_subst_count = 0;
+   gcse_create_count = 0;
+
+   gcse_file = file;
+
+   gcc_obstack_init (&gcse_obstack);
+   bytes_used = 0;
+
+   /* We need alias.  */
+   init_alias_analysis ();
+
+   max_gcse_regno = max_reg_num ();
+
+   alloc_reg_set_mem (max_gcse_regno);
+   alloc_gcse_mem (f);
+   alloc_hash_table (max_cuid, &expr_hash_table, 0);
+   compute_hash_table_after_reload (&expr_hash_table);
+
+   if (gcse_file)
+     dump_hash_table (gcse_file, "Expression", &expr_hash_table);
+
+   if (expr_hash_table.n_elems > 0)
+     gcse_after_reload ();
+
+   free_hash_table (&expr_hash_table);
+
+   free_gcse_mem ();
+   free_reg_set_mem ();
+
+   /* We are finished with alias.  */
+   end_alias_analysis ();
+
+   obstack_free (&gcse_obstack, NULL);
  }

  #include "gt-gcse.h"
Index: opts.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/opts.c,v
retrieving revision 1.39
diff -b -c -p -r1.39 opts.c
*** opts.c  9 Oct 2003 09:08:32 -0000     1.39
--- opts.c  14 Oct 2003 13:59:36 -0000
*************** decode_options (unsigned int argc, const
*** 564,569 ****
--- 564,570 ----
        flag_inline_functions = 1;
        flag_rename_registers = 1;
        flag_unswitch_loops = 1;
+       flag_gcse_after_reload = 1;
      }

    if (optimize < 2 || optimize_size)
*************** common_handle_option (size_t scode, cons
*** 1016,1021 ****
--- 1017,1026 ----

      case OPT_fgcse_sm:
        flag_gcse_sm = value;
+       break;
+
+     case OPT_fgcse_after_reload:
+       flag_gcse_after_reload = value;
        break;

      case OPT_fgnu_linker:
Index: rtl.h
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/rtl.h,v
retrieving revision 1.435
diff -b -c -p -r1.435 rtl.h
*** rtl.h   12 Sep 2003 15:07:50 -0000    1.435
--- rtl.h   14 Oct 2003 13:59:37 -0000
*************** extern rtx fis_get_condition (rtx);
*** 2129,2134 ****
--- 2129,2135 ----
  #ifdef BUFSIZ
  extern int gcse_main (rtx, FILE *);
  extern int bypass_jumps (FILE *);
+ extern void gcse_after_reload_main (rtx, FILE *);
  #endif

  /* In global.c */
Index: toplev.c
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.831
diff -b -c -p -r1.831 toplev.c
*** toplev.c      9 Oct 2003 09:08:33 -0000     1.831
--- toplev.c      14 Oct 2003 13:59:39 -0000
*************** enum dump_file_index
*** 276,281 ****
--- 276,282 ----
    DFI_lreg,
    DFI_greg,
    DFI_postreload,
+   DFI_gcse2,
    DFI_flow2,
    DFI_peephole2,
    DFI_rnreg,
*************** enum dump_file_index
*** 295,301 ****
     Remaining -d letters:

      "            m   q         "
!     "         JK   O Q    V  YZ"
  */

  static struct dump_file_info dump_file[DFI_MAX] =
--- 296,302 ----
     Remaining -d letters:

      "            m   q         "
!     "          K   O Q    V  YZ"
  */

  static struct dump_file_info dump_file[DFI_MAX] =
*************** static struct dump_file_info dump_file[D
*** 329,334 ****
--- 330,336 ----
    { "lreg",     'l', 1, 0, 0 },
    { "greg",     'g', 1, 0, 0 },
    { "postreload", 'o', 1, 0, 0 },
+   { "gcse2",    'J', 0, 0, 0 },
    { "flow2",    'w', 1, 0, 0 },
    { "peephole2", 'z', 1, 0, 0 },
    { "rnreg",    'n', 1, 0, 0 },
*************** int flag_gcse_lm = 1;
*** 691,696 ****
--- 693,701 ----

  int flag_gcse_sm = 1;

+ /* Nonzero means perform global CSE after register allocation.  */
+ int flag_gcse_after_reload = 0;
+
  /* Perform target register optimization before prologue / epilogue
     threading.  */

*************** rest_of_compilation (tree decl)
*** 3415,3420 ****
--- 3420,3441 ----
      }

    close_dump_file (DFI_postreload, print_rtl_with_bb, insns);
+
+   if (optimize > 0 && flag_gcse_after_reload)
+     {
+       open_dump_file (DFI_gcse2, decl);
+
+       gcse_after_reload_main (insns, rtl_dump_file);
+       rebuild_jump_labels (insns);
+       delete_trivially_dead_insns (insns, max_reg_num ());
+       close_dump_file (DFI_gcse2, print_rtl_with_bb, insns);
+
+       ggc_collect ();
+
+ #ifdef ENABLE_CHECKING
+       verify_flow_info ();
+ #endif
+     }

    /* Re-create the death notes which were deleted during reload.  */
    timevar_push (TV_FLOW2);
Index: doc/invoke.texi
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/doc/invoke.texi,v
retrieving revision 1.344
diff -b -c -p -r1.344 invoke.texi
*** doc/invoke.texi     9 Oct 2003 09:08:37 -0000     1.344
--- doc/invoke.texi     14 Oct 2003 13:59:48 -0000
*************** in the following sections.
*** 270,277 ****
  -fdelayed-branch  -fdelete-null-pointer-checks @gol
  -fexpensive-optimizations  -ffast-math  -ffloat-store @gol
  -fforce-addr  -fforce-mem  -ffunction-sections @gol
! -fgcse  -fgcse-lm  -fgcse-sm  -floop-optimize  -fcrossjumping @gol
! -fif-conversion  -fif-conversion2 @gol
  -finline-functions  -finline-limit=@var{n}  -fkeep-inline-functions @gol
  -fkeep-static-consts  -fmerge-constants  -fmerge-all-constants @gol
  -fmove-all-movables  -fnew-ra  -fno-branch-count-reg @gol
--- 270,277 ----
  -fdelayed-branch  -fdelete-null-pointer-checks @gol
  -fexpensive-optimizations  -ffast-math  -ffloat-store @gol
  -fforce-addr  -fforce-mem  -ffunction-sections @gol
! -fgcse  -fgcse-lm  -fgcse-sm  -fgcse-after-reload -floop-optimize @gol
! -fcrossjumping -fif-conversion  -fif-conversion2 @gol
  -finline-functions  -finline-limit=@var{n}  -fkeep-inline-functions @gol
  -fkeep-static-consts  -fmerge-constants  -fmerge-all-constants @gol
  -fmove-all-movables  -fnew-ra  -fno-branch-count-reg @gol
*************** invoking @option{-O2} on programs that u
*** 3697,3703 ****
  @opindex O3
  Optimize yet more.  @option{-O3} turns on all optimizations specified by
  @option{-O2} and also turns on the @option{-finline-functions},
! @option{-funit-at-a-time} and @option{-frename-registers} options.

  @item -O0
  @opindex O0
--- 3697,3704 ----
  @opindex O3
  Optimize yet more.  @option{-O3} turns on all optimizations specified by
  @option{-O2} and also turns on the @option{-finline-functions},
! @option{-funit-at-a-time}, @option{-fgcse-after-reload} and
! @option{-frename-registers} options.

  @item -O0
  @opindex O0
*************** When used in conjunction with @option{-f
*** 4001,4006 ****
--- 4002,4016 ----
  can be changed to a load before the loop and a store after the loop.

  Enabled by default when gcse is enabled.
+
+ @item -fgcse-after-reload
+ @opindex fgcse-after-reload
+ When @option{-fgcse-after-reload} is enabled, a global common
+ subexpression elimination pass is run after register allocation.  The
+ purpose of this pass is to remove redundant loads due to spills and
+ other optimizations that run after the gcse pass.
+
+ Enabled at @option{-O3}.

  @item -floop-optimize
  @opindex floop-optimize

(See attached file: gcse_after_reload.patch)


