This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] GCSE redundant load after store elimination


Hi Mostafa,
http://gcc.gnu.org/ml/gcc-patches/2003-09/msg00514.html
http://gcc.gnu.org/ml/gcc-patches/2003-10/msg01198.html


There were a number of nits in your gcse-las patch:

Firstly, when resubmitting an updated patch, please submit an updated
ChangeLog entry, especially when the new patch touches more files
than the original, in this case doc/invoke.texi.

> 2003-09-09  Mostafa Hagog  <mustafa@il.ibm.com>
>
>     * common.opt : Add description of the new -fgcse-las flag.

No space before the colon delimiter in ChangeLog entries.


>     (pre_insert_copies): added the call to update_ld_motion_stores when
>     one or more copies were inserted.

Each change description should be a full sentence, and therefore start
with a captial letter and end in a period, i.e. "Added ...".


> /* Nonzero if we want to perfrom redundant load-after-store elimination
in gcse.  */

Lines longer than 78 characters need to be wrapped, there's especially
no excuse for long comment lines.  Typo "perfrom" should be "perform".


>   /* In case of store we want to consider the memory value as avaiable
in the REG
>      stored in that memory. This makes it possible to remove redundant
loads from
>      due to stores to the same location.  */

Likewise.


> else if (flag_gcse_las && (GET_CODE(src) == REG) && GET_CODE(dest) == MEM)

The GNU coding conventions require spaces before open parentheses in
macro and function invocations, e.g. "GET_CODE (src)", and there's no
need for the inconsistent parenthesis around the second clause and not
the third.   If there's a possibility for confusion in such complex
conditionals, each clause should be written on a separate line.


> /*Do not do this for constant/copy propagation.  */

Space before the start of a comment and the comment itself.


> && !find_reg_note (insn, REG_EH_REGION, NULL_RTX)
...
> && ! set_noop_p (pat)

Whether there's a space after "!" or not is optional, but the rule must
be applied consistently within a single complex conditional (and
preferably within a single function).


> /* stores are never anticipatable.  */

Comments should ideally be full sentences, and start with a capital
letter and end with a period followed by two spaces.


> int avail_p = (oprs_available_p (dest, insn)  && ! JUMP_P (insn));

Long lines of code should also be wrapped, in this case by placing the
"&&" operator and its following operanded (suitably indented on the
next line).  Not there should also only be one space before the "&&".


> insert_expr_in_table (dest, GET_MODE (dest), insn, antic_p, avail_p, table);

Similarly, for long lines involving function calls, the parameter list
should be wrapped following a comma.

>  /* moved to pre_insert_copies
>  update_ld_motion_stores (expr);  */


Deleted code should be removed completely, not just comment out or
surround by #if 0.


> -fgcse  -fgcse-lm  -fgcse-sm  -fgcse-las -floop-optimize  -fcrossjumping @gol

In the GCC command line documentation each option is conventionally
separated by exactly two spaces, and if adding a new option creates a
long line, the line should be wrapped (preserving the @gol at the end
of each line).

> @item -fgcse-las
> @opindex fgcse-las
> When @option{-fgcse-las} is enabled, The global common subexpression
elimination pass eliminates redundant loads that come after stores to the
same memory location (both partial and full redundacies).

You should also avoid overly long lines in the .texi documentation files.
And in the section quotated above there's no need to capitalize the "The"
following a comma.



Now the good news.  I notice that you're not listed in the GCC MAINTAINERS
file as having CVS write access, so as a one-time only offer, I've made
all of the changes required above and committed the revised patch to CVS
mainline.  To make sure I didn't screw anything up I bootstrapped and
regression tested the revised patch on i686-pc-linux-gnu, all languages
except treelang with no new testsuite failures.  I also confirmed that
the patch does indeed eliminate some redundant loads from the example you
posted with your original patch (on x86 with -O2).

Thanks,


For the record, the patch as committed is given below:


2003-10-17  Mostafa Hagog  <mustafa@il.ibm.com>

	* common.opt: Add description of the new -fgcse-las flag.
	* flags.h (flag_gcse_las): Declaration of global flag_gcse_las.
	* gcse.c (hash_scan_set): Handle the case of store expression and
	insert the memory expression to the hash table, this way we make it
	possible to discover redundant loads after stores and remove them.
	(pre_insert_copy_insn): moved the call to update_ld_motion_stores,
	to pre_insert_copies, it is not the correct place to call it after
	adding stores to be in the available expression hash table.
	(pre_insert_copies): Added the call to update_ld_motion_stores when
	one or more copies were inserted.
	* opts.c (common_handle_option): Handle the -fgcse-las flag.
	* toplev.c (flag_gcse_las): Initialization of flag_gcse_las.

	* doc/invoke.tex: Document new -fgcse-las flag.


Index: common.opt
===================================================================
RCS file: /cvs/gcc/gcc/gcc/common.opt,v
retrieving revision 1.18
diff -c -3 -p -r1.18 common.opt
*** common.opt	11 Oct 2003 22:57:45 -0000	1.18
--- common.opt	17 Oct 2003 12:55:08 -0000
*************** fgcse-sm
*** 362,367 ****
--- 362,371 ----
  Common
  Perform store motion after global common subexpression elimination

+ fgcse-las
+ Common
+ Perform redundant load after store elimination in global common subexpression elimination
+
  fgnu-linker
  Common
  Output GNU ld formatted global initializers
Index: flags.h
===================================================================
RCS file: /cvs/gcc/gcc/gcc/flags.h,v
retrieving revision 1.123
diff -c -3 -p -r1.123 flags.h
*** flags.h	11 Oct 2003 22:57:45 -0000	1.123
--- flags.h	17 Oct 2003 12:55:08 -0000
*************** extern int flag_gcse_lm;
*** 675,680 ****
--- 675,685 ----

  extern int flag_gcse_sm;

+ /* Nonzero if we want to perform redundant load-after-store elimination
+    in gcse.  */
+
+ extern int flag_gcse_las;
+
  /* Perform branch target register optimization before prologue / epilogue
     threading.  */

Index: gcse.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/gcse.c,v
retrieving revision 1.276
diff -c -3 -p -r1.276 gcse.c
*** gcse.c	5 Oct 2003 19:50:54 -0000	1.276
--- gcse.c	17 Oct 2003 12:55:10 -0000
*************** hash_scan_set (rtx pat, rtx insn, struct
*** 2205,2210 ****
--- 2205,2253 ----
  		       && oprs_available_p (pat, tmp))))
  	insert_set_in_table (pat, insn, table);
      }
+   /* In case of store we want to consider the memory value as avaiable in
+      the REG stored in that memory. This makes it possible to remove
+      redundant loads from due to stores to the same location.  */
+   else if (flag_gcse_las && GET_CODE (src) == REG && GET_CODE (dest) == MEM)
+       {
+         unsigned int regno = REGNO (src);
+
+         /* Do not do this for constant/copy propagation.  */
+         if (! table->set_p
+             /* Only record sets of pseudo-regs in the hash table.  */
+ 	    && regno >= FIRST_PSEUDO_REGISTER
+ 	   /* Don't GCSE something if we can't do a reg/reg copy.  */
+ 	   && can_copy_p (GET_MODE (src))
+ 	   /* GCSE commonly inserts instruction after the insn.  We can't
+ 	      do that easily for EH_REGION notes so disable GCSE on these
+ 	      for now.  */
+ 	   && ! find_reg_note (insn, REG_EH_REGION, NULL_RTX)
+ 	   /* Is SET_DEST something we want to gcse?  */
+ 	   && want_to_gcse_p (dest)
+ 	   /* Don't CSE a nop.  */
+ 	   && ! set_noop_p (pat)
+ 	   /* Don't GCSE if it has attached REG_EQUIV note.
+ 	      At this point this only function parameters should have
+ 	      REG_EQUIV notes and if the argument slot is used somewhere
+ 	      explicitly, it means address of parameter has been taken,
+ 	      so we should not extend the lifetime of the pseudo.  */
+ 	   && ((note = find_reg_note (insn, REG_EQUIV, NULL_RTX)) == 0
+ 	       || GET_CODE (XEXP (note, 0)) != MEM))
+              {
+                /* Stores are never anticipatable.  */
+                int antic_p = 0;
+ 	       /* An expression is not available if its operands are
+ 	          subsequently modified, including this insn.  It's also not
+ 	          available if this is a branch, because we can't insert
+ 	          a set after the branch.  */
+                int avail_p = oprs_available_p (dest, insn)
+ 			     && ! JUMP_P (insn);
+
+ 	       /* Record the memory expression (DEST) in the hash table.  */
+ 	       insert_expr_in_table (dest, GET_MODE (dest), insn,
+ 				     antic_p, avail_p, table);
+              }
+       }
  }

  static void
*************** pre_edge_insert (struct edge_list *edge_
*** 5360,5366 ****
       reaching_reg <- expr
       old_reg      <- reaching_reg
     because this way copy propagation can discover additional PRE
!    opportunities.  But if this fails, we try the old way.  */

  static void
  pre_insert_copy_insn (struct expr *expr, rtx insn)
--- 5403,5415 ----
       reaching_reg <- expr
       old_reg      <- reaching_reg
     because this way copy propagation can discover additional PRE
!    opportunities.  But if this fails, we try the old way.
!    When "expr" is a store, i.e.
!    given "MEM <- old_reg", instead of adding after it
!      reaching_reg <- old_reg
!    it's better to add it before as follows:
!      reaching_reg <- old_reg
!      MEM          <- reaching_reg.  */

  static void
  pre_insert_copy_insn (struct expr *expr, rtx insn)
*************** pre_insert_copy_insn (struct expr *expr,
*** 5395,5416 ****
    else
      abort ();

!   old_reg = SET_DEST (set);
!
!   /* Check if we can modify the set destination in the original insn.  */
!   if (validate_change (insn, &SET_DEST (set), reg, 0))
      {
!       new_insn = gen_move_insn (old_reg, reg);
!       new_insn = emit_insn_after (new_insn, insn);

!       /* Keep register set table up to date.  */
!       replace_one_set (REGNO (old_reg), insn, new_insn);
!       record_one_set (regno, insn);
      }
!   else
      {
        new_insn = gen_move_insn (reg, old_reg);
!       new_insn = emit_insn_after (new_insn, insn);

        /* Keep register set table up to date.  */
        record_one_set (regno, new_insn);
--- 5444,5481 ----
    else
      abort ();

!   if (GET_CODE (SET_DEST (set)) == REG)
      {
!       old_reg = SET_DEST (set);
!       /* Check if we can modify the set destination in the original insn.  */
!       if (validate_change (insn, &SET_DEST (set), reg, 0))
!         {
!           new_insn = gen_move_insn (old_reg, reg);
!           new_insn = emit_insn_after (new_insn, insn);
!
!           /* Keep register set table up to date.  */
!           replace_one_set (REGNO (old_reg), insn, new_insn);
!           record_one_set (regno, insn);
!         }
!       else
!         {
!           new_insn = gen_move_insn (reg, old_reg);
!           new_insn = emit_insn_after (new_insn, insn);

!           /* Keep register set table up to date.  */
!           record_one_set (regno, new_insn);
!         }
      }
!   else /* This is possible only in case of a store to memory.  */
      {
+       old_reg = SET_SRC (set);
        new_insn = gen_move_insn (reg, old_reg);
!
!       /* Check if we can modify the set source in the original insn.  */
!       if (validate_change (insn, &SET_SRC (set), reg, 0))
!         new_insn = emit_insn_before (new_insn, insn);
!       else
!         new_insn = emit_insn_after (new_insn, insn);

        /* Keep register set table up to date.  */
        record_one_set (regno, new_insn);
*************** pre_insert_copy_insn (struct expr *expr,
*** 5423,5429 ****
  	     "PRE: bb %d, insn %d, copy expression %d in insn %d to reg %d\n",
  	      BLOCK_NUM (insn), INSN_UID (new_insn), indx,
  	      INSN_UID (insn), regno);
-   update_ld_motion_stores (expr);
  }

  /* Copy available expressions that reach the redundant expression
--- 5488,5493 ----
*************** pre_insert_copy_insn (struct expr *expr,
*** 5432,5438 ****
  static void
  pre_insert_copies (void)
  {
!   unsigned int i;
    struct expr *expr;
    struct occr *occr;
    struct occr *avail;
--- 5496,5502 ----
  static void
  pre_insert_copies (void)
  {
!   unsigned int i, added_copy;
    struct expr *expr;
    struct occr *occr;
    struct occr *avail;
*************** pre_insert_copies (void)
*** 5453,5458 ****
--- 5517,5525 ----
  	   expression wasn't deleted anywhere.  */
  	if (expr->reaching_reg == NULL)
  	  continue;
+
+ 	/* Set when we add a copy for that expression.  */
+ 	added_copy = 0;

  	for (occr = expr->antic_occr; occr != NULL; occr = occr->next)
  	  {
*************** pre_insert_copies (void)
*** 5477,5487 ****
--- 5544,5559 ----
  					       BLOCK_FOR_INSN (occr->insn)))
  		  continue;

+                 added_copy = 1;
+
  		/* Copy the result of avail to reaching_reg.  */
  		pre_insert_copy_insn (expr, insn);
  		avail->copied_p = 1;
  	      }
  	  }
+
+  	  if (added_copy)
+             update_ld_motion_stores (expr);
        }
  }

Index: opts.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/opts.c,v
retrieving revision 1.40
diff -c -3 -p -r1.40 opts.c
*** opts.c	11 Oct 2003 22:57:45 -0000	1.40
--- opts.c	17 Oct 2003 12:55:10 -0000
*************** common_handle_option (size_t scode, cons
*** 1019,1024 ****
--- 1019,1028 ----
        flag_gcse_sm = value;
        break;

+     case OPT_fgcse_las:
+       flag_gcse_las = value;
+       break;
+
      case OPT_fgnu_linker:
        flag_gnu_linker = value;
        break;
Index: toplev.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/toplev.c,v
retrieving revision 1.833
diff -c -3 -p -r1.833 toplev.c
*** toplev.c	15 Oct 2003 21:57:21 -0000	1.833
--- toplev.c	17 Oct 2003 12:55:11 -0000
*************** int flag_gcse_lm = 1;
*** 697,702 ****
--- 697,707 ----

  int flag_gcse_sm = 1;

+ /* Nonzero if we want to perfrom redundant load after store elimination
+    in gcse.  */
+
+ int flag_gcse_las = 1;
+
  /* Perform target register optimization before prologue / epilogue
     threading.  */

*************** static const lang_independent_options f_
*** 1075,1080 ****
--- 1080,1086 ----
    {"gcse", &flag_gcse, 1 },
    {"gcse-lm", &flag_gcse_lm, 1 },
    {"gcse-sm", &flag_gcse_sm, 1 },
+   {"gcse-las", &flag_gcse_las, 1 },
    {"branch-target-load-optimize", &flag_branch_target_load_optimize, 1 },
    {"branch-target-load-optimize2", &flag_branch_target_load_optimize2, 1 },
    {"loop-optimize", &flag_loop_optimize, 1 },
Index: doc/invoke.texi
===================================================================
RCS file: /cvs/gcc/gcc/gcc/doc/invoke.texi,v
retrieving revision 1.346
diff -c -3 -p -r1.346 invoke.texi
*** doc/invoke.texi	13 Oct 2003 17:01:00 -0000	1.346
--- doc/invoke.texi	17 Oct 2003 12:55:13 -0000
*************** in the following sections.
*** 270,277 ****
  -fdelayed-branch  -fdelete-null-pointer-checks @gol
  -fexpensive-optimizations  -ffast-math  -ffloat-store @gol
  -fforce-addr  -fforce-mem  -ffunction-sections @gol
! -fgcse  -fgcse-lm  -fgcse-sm  -floop-optimize  -fcrossjumping @gol
! -fif-conversion  -fif-conversion2 @gol
  -finline-functions  -finline-limit=@var{n}  -fkeep-inline-functions @gol
  -fkeep-static-consts  -fmerge-constants  -fmerge-all-constants @gol
  -fmove-all-movables  -fnew-ra  -fno-branch-count-reg @gol
--- 270,277 ----
  -fdelayed-branch  -fdelete-null-pointer-checks @gol
  -fexpensive-optimizations  -ffast-math  -ffloat-store @gol
  -fforce-addr  -fforce-mem  -ffunction-sections @gol
! -fgcse  -fgcse-lm  -fgcse-sm  -fgcse-las  -floop-optimize @gol
! -fcrossjumping  -fif-conversion  -fif-conversion2 @gol
  -finline-functions  -finline-limit=@var{n}  -fkeep-inline-functions @gol
  -fkeep-static-consts  -fmerge-constants  -fmerge-all-constants @gol
  -fmove-all-movables  -fnew-ra  -fno-branch-count-reg @gol
*************** also turns on the following optimization
*** 3677,3686 ****
  -fstrength-reduce @gol
  -fcse-follow-jumps  -fcse-skip-blocks @gol
  -frerun-cse-after-loop  -frerun-loop-opt @gol
! -fgcse   -fgcse-lm   -fgcse-sm @gol
  -fdelete-null-pointer-checks @gol
  -fexpensive-optimizations @gol
! -fregmove -@gol
  -fschedule-insns  -fschedule-insns2 @gol
  -fsched-interblock  -fsched-spec @gol
  -fcaller-saves @gol
--- 3677,3686 ----
  -fstrength-reduce @gol
  -fcse-follow-jumps  -fcse-skip-blocks @gol
  -frerun-cse-after-loop  -frerun-loop-opt @gol
! -fgcse  -fgcse-lm  -fgcse-sm  -fgcse-las @gol
  -fdelete-null-pointer-checks @gol
  -fexpensive-optimizations @gol
! -fregmove @gol
  -fschedule-insns  -fschedule-insns2 @gol
  -fsched-interblock  -fsched-spec @gol
  -fcaller-saves @gol
*************** Enabled by default when gcse is enabled.
*** 3996,4005 ****

  @item -fgcse-sm
  @opindex fgcse-sm
! When @option{-fgcse-sm} is enabled, A store motion pass is run after global common
! subexpression elimination.  This pass will attempt to move stores out of loops.
! When used in conjunction with @option{-fgcse-lm}, loops containing a load/store sequence
! can be changed to a load before the loop and a store after the loop.

  Enabled by default when gcse is enabled.

--- 3996,4014 ----

  @item -fgcse-sm
  @opindex fgcse-sm
! When @option{-fgcse-sm} is enabled, a store motion pass is run after
! global common subexpression elimination.  This pass will attempt to move
! stores out of loops.  When used in conjunction with @option{-fgcse-lm},
! loops containing a load/store sequence can be changed to a load before
! the loop and a store after the loop.
!
! Enabled by default when gcse is enabled.
!
! @item -fgcse-las
! @opindex fgcse-las
! When @option{-fgcse-las} is enabled, the global common subexpression
! elimination pass eliminates redundant loads that come after stores to the
! same memory location (both partial and full redundacies).

  Enabled by default when gcse is enabled.


Roger
--


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]