This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]

Re: Swap optimization pass



  In message <199801120319.WAA08871@jwlab.FEITH.COM>you write:
  > Here's an initial implementation of a swap optimization pass.  The
  > idea is to detect when a series of SET instructions is being used
  > in conjunction with a temporary register to swap two locations and
  > to replace those instructions with a swap instruction, thus eliminating
  > the need for the temporary register.
Cool.  Though my initial inclination is to put this kind of optimization
into regmove.  Interesting enough, it looks like you cribbed some code
from regmove to implement this optimization.

I'd also recommend that we not create another flag for this optimization;
it ought to be controlled by -fregmove (IMHO) and enabled by default
whenever -fregmove is enabled by default (-O2 I think).

This change is significant enough that we'd need a copyright assignment
(and possibly a disclaimer from your employer) before we could use any
of the code.  See our web page for suitable instructions and forms.
However, instead of sending the forms directly to RMS, send them to me
(I'll give you the address separately).  I'll review the assignment for
correctness, then forward to RMS.

Your code does not adhere to the GNU coding standards in many ways; I
recommend you read up on the GNU coding standards if you're planning on
submitting additional work in the future.  I think I've fixed most of
the formatting/convention problems.

Your code could use a few more comments :-)  And the variable names aren't
very descriptive.  Using more comments and better variable names would
greatly help the readability of your code at both a high and low level.
I've added some comments and renamed some of the variables.  I didn't finish,
but it's a start.

The idea is someone should be able to easily read your code, and relate the
high level algorithm to actual code using the comments and well choosen
variable names as guides.

On a more technical level, at least on a 586 the xchng instruction always
ran slower than a suitable series of moves on trivial testcases (using
either memory or register operands).

So I would worry that we need to either disable those patterns on some
variants of the x86, or we need some kind of feeling that the penalty
for using xchng is negated by the better code we get due to reduced
register pressure.  When I get back to SLC I'll perform some similar
tests on a 686.

It might also be worth benchmarking some trivial tests on various
68ks since I think they have similar instructions.


You should also try and handle other common modes (DI, SF, DF).  Or
rework the code so that you don't have to explicitly mention modes
in the tests.

The calls to copy_rtx look extremely bogus.  Can you comment on why
they're necessary?

Now, having said that, here's my hacked up version of your main
optimization code.  It's certainly OK to continue development of
this code as a separate pass, or as callable code from inside regmove,
but long term it should run in parallel to the regmove code if
possible.

I made a first pass at fixing the formatting, variable names, etc.  Though
in the process of discovering the structure I decided some of the new
variable names aren't exactly right, so they need additional work.

/* Simplify the swapping of locations to reduce register pressure.
   Contributed by John Wehle (john@feith.com).  */

/* This routine looks for cases where a sequence of set instructions
   involving a temporary register is effectively swapping two locations.
   It then attempts to simplify the set instructions by using a swap
   instruction thus freeing up the temporary register. */

regswap_optimize (f, nregs, regswap_dump_file)
     rtx f;
     int nregs;
     FILE *regswap_dump_file;
{
#if defined(HAVE_swapsi) || defined(HAVE_swaphi) || defined(HAVE_swapqi)
  rtx insn;

  for (insn = f; insn; insn = NEXT_INSN (insn))
    {
      enum insn_code code;
      rtx set, src1, temp1, src2, dest2, first, last, p;
      rtx store_to_src1_insn;
      rtx store_from_temp1_insn;


      int temp1_referenced;
      int temp2_referenced;
      rtx load_from_op2_insn;
      rtx move_pat;
      rtx reg_note;
      rtx swap_insn;
      rtx swap_pat;
      rtx temp1_note;
      rtx temp2;
      rtx temp2_note;

      /* Find an insn that sets a pseudo from either another REG
	 or a MEM.  Ignore insns which have side effects.  */
      set = single_set (insn);
      if (!set
	  || (GET_CODE (SET_SRC (set)) != REG
	      && GET_CODE (SET_SRC (set)) != MEM)
	  || side_effects_p (SET_SRC (set))
	  || GET_CODE (SET_DEST (set)) != REG
	  || GET_MODE (SET_SRC (set)) != GET_MODE (SET_DEST (set)))
	continue;

      first = insn;

      src1 = SET_SRC (set);
      temp1 = SET_DEST (set);

      code = CODE_FOR_nothing;

      switch (GET_MODE (src1))
	{
#if defined(HAVE_swapdi)
	case DImode:
	  code = CODE_FOR_swapdi;
	  break;
#endif
#if defined(HAVE_swapsi)
	case SImode:
	  code = CODE_FOR_swapsi;
	  break;
#endif
#if defined(HAVE_swaphi)
	case HImode:
	  code = CODE_FOR_swaphi;
	  break;
#endif
#if defined(HAVE_swapqi)
	case QImode:
	  code = CODE_FOR_swapqi;
	  break;
#endif
#if defined(HAVE_swapdf)
	case DFmode:
	  code = CODE_FOR_swapdf;
	  break;
#endif
#if defined(HAVE_swapsf)
	case SFmode:
	  code = CODE_FOR_swapsf;
	  break;
#endif
	default:
	  break;
	}

      /* If we don't have a swap insn for this mode, then ignore this
	 insn.  */
      if (code == CODE_FOR_nothing)
	continue;

      store_to_src1_insn = NULL;
      store_from_temp1_insn = NULL;
      dest2 = NULL;
      temp2 = NULL;
      temp1_referenced = 0;
      temp2_referenced = 0;

      /* Search forward from FIRST for a set using SRC1 as the destination
	 or TEMP1 as the source.

	 We must find both before reaching a terminating insn or we can
	 not optimize the possible swap sequence starting with FIRST.  */
      for (p = NEXT_INSN (first); p; p = NEXT_INSN (p))
	{
	  /* Stop if we hit a label, jump, or interesting note.  */
	  if (GET_CODE (p) == CODE_LABEL
	      || GET_CODE (p) == JUMP_INSN
	      || (GET_CODE (p) == NOTE
		  && (NOTE_LINE_NUMBER (p) == NOTE_INSN_LOOP_BEG
		      || NOTE_LINE_NUMBER (p) == NOTE_INSN_LOOP_END
		      || NOTE_LINE_NUMBER (p) == NOTE_INSN_EH_REGION_BEG
		      || NOTE_LINE_NUMBER (p) == NOTE_INSN_EH_REGION_END
		      || NOTE_LINE_NUMBER (p) == NOTE_INSN_SETJMP)))
	    break;
    
	  /* If it's not an insn, then ignore it.  */
	  if (GET_RTX_CLASS (GET_CODE (p)) != 'i')
	    continue;

	  set = single_set (p);
	  if (set)
	    {
	      /* See if this insn sets SRC1 from either another register
		 or memory location using the same mode.  */
	      if (!store_to_src1_insn
		  && (GET_CODE (SET_SRC (set)) == REG
		      || GET_CODE (SET_SRC (set)) == MEM)
		  && rtx_equal_p (src1, SET_DEST (set))
		  && GET_MODE (SET_SRC (set)) == GET_MODE (SET_DEST (set)))
		{
		  store_to_src1_insn = p;
		  temp2 = SET_SRC (set);

		  /* If we found both stores that we needed, then we can
		     get out of the loop now.  */
		  if (store_from_temp1_insn)
		    break;

		  /* Start another loop iteration.  */
		  continue;
		}

	      /* See if this insn sets either another register or a
		 memory location from temp1 using the same mode.  */
	      if (!store_from_temp1_insn
		  && rtx_equal_p (temp1, SET_SRC (set))
		  && (GET_CODE (SET_DEST (set)) == REG
		      || GET_CODE (SET_DEST (set)) == MEM)
		  && !side_effects_p (SET_DEST (set))
		  && GET_MODE (SET_SRC (set)) == GET_MODE (SET_DEST (set)))
		{
		  store_from_temp1_insn = p;
		  dest2 = SET_DEST (set);

		  /* If we found both stores that we needed, then we can
		     get out of the loop now.  */
		  if (store_to_src1_insn)
		    break;

		  /* Start another loop iteration.  */
		  continue;
		}
	    }

	  /* Make sure that SRC1 is not used between STORE_TO_SRC1_INSN
	     and the last instruction in the swap.  Simlarly for DEST2
	     starting after STORE_FROM_DEST1_INSN.  */
	  if (store_to_src1_insn
	      && reg_overlap_mentioned_p (src1, PATTERN (p)))
	    break;

	  if (store_from_temp1_insn
	      && reg_overlap_mentioned_p (dest2, PATTERN (p)))
	    break;

	  /* Make sure that DEST1 is not modified between FIRST and
	     STORE_FROM_DEST1_INSN.  */
	  if (!store_from_temp1_insn)
	    {
	      if (reg_set_p (temp1, p))
		break;

	      /* Also keep track whether temp1 is referenced.  */
	      if (reg_referenced_p (temp1, p))
		temp1_referenced = 1;
	    }
	}

      /* If we didn't find the two stores we needed, then start another
	 pass.  */
      if (!store_to_src1_insn || !store_from_temp1_insn)
	continue;

      last = p;

      /* Determine if temp1 is needed after the store.  */
      temp1_note = find_reg_note (store_from_temp1_insn, REG_DEAD, temp1);
      if (!temp1_note)
	temp1_referenced = 1;

      /* If the source for the store to SRC2 is not DEST2, then search
	 for a SET instruction which uses DEST2 as the source and temp2
	 as the destination.  */
      load_from_op2_insn = NULL;
      temp2_note = NULL;

      if (!rtx_equal_p (dest2, temp2))
	{
	  if (GET_CODE (temp2) != REG)
	    continue;

	  for (p = PREV_INSN (store_to_src1_insn);
	       p != first;
	       p = PREV_INSN (p))
	    {

	      /* Ignore non-INSNs.  */
	      if (GET_RTX_CLASS (GET_CODE (p)) != 'i')
		continue;

	      set = single_set (p);
	      if (set
		  && rtx_equal_p (dest2, SET_SRC (set))
		  && rtx_equal_p (temp2, SET_DEST (set)))
		{
		  load_from_op2_insn = p;
		  break;
		}

	      /* Make sure that TMP2 is not modified between
		 load_from_op2_insn ande STORE_TO_SRC1_INSN.  */
	      if (reg_set_p (temp2, p))
		break;

	      if (reg_referenced_p (temp2, p))
		temp2_referenced = 1;
	    }

	  /* Did we find a suitable insn?  If not then start another
	     iteration now.  */
	  if (!load_from_op2_insn)
	    continue;

	  /* See if TMP2 is needed after the store.  */

	  temp2_note = find_reg_note (store_to_src1_insn, REG_DEAD, temp2);
	  if (!temp2_note)
	    temp2_referenced = 1;
        }
      else
	{
	  temp2_referenced = 1;
	}

      /* Make sure that SRC1 and DEST2 are not modified except by the
	 swap itself.  Also make sure none of the register they use
	 (if either is a MEM) are modified.  */
      for (p = NEXT_INSN (first);
	   p != last;
	   p = NEXT_INSN (p))
	{
	  /* Ignore non-insns.  */
	  if (GET_RTX_CLASS (GET_CODE (p)) != 'i')
	    continue;

	  if (p == load_from_op2_insn
	      || p == store_to_src1_insn
	      || p == store_from_temp1_insn)
	    continue;

	  if (modified_in_p (src1, p)
	      || modified_in_p (dest2, p) )
	    break;
	}

      /* Something was modified at a bad time, so we can not optimize
	 this case.  */
      if (p != last)
	continue;

      /* If this isn't going to eliminate a register, then don't
	 bother trying to simplify.  */
      if (temp1_referenced && temp2_referenced)
	continue; 

      /* This looks very bogus.  The comment before this was
	 "Create copies of src1 and DEST2 so reload works.  */
      src1 = copy_rtx (src1);
      dest2 = copy_rtx (dest2);

      /* At this point we've got a case we can optimize, so optimize it.  */
      move_pat = NULL;
      swap_pat = NULL;

      if ((*insn_operand_predicate[code][0]) (src1, insn_operand_mode[code][0])
	  && (*insn_operand_predicate[code][1]) (dest2, insn_operand_mode[code][1]))
	swap_pat = GEN_FCN (code) (src1, dest2);
      else if ((*insn_operand_predicate[code][0]) (dest2, insn_operand_mode[code][0])
	       && (*insn_operand_predicate[code][1])(src1, insn_operand_mode[code][1]) )
	swap_pat = GEN_FCN (code) (dest2, src1);
      else if (temp1_note && !temp2_referenced && temp2_note
	       && (*insn_operand_predicate[code][0]) (temp1, insn_operand_mode[code][0])
	       && (*insn_operand_predicate[code][1]) (dest2, insn_operand_mode[code][1]))
	{
	  swap_pat = GEN_FCN (code) (temp1, dest2);
	  move_pat = gen_move_insn (src1, temp1);

	  /* Ugh.  We can't optimize this case.  */
	  if (!move_pat)
	    continue;
	  last = store_from_temp1_insn;
	  temp1_referenced = 1;
	}
      else if (temp1_note && !temp2_referenced && temp2_note
	       && (*insn_operand_predicate[code][0]) (dest2, insn_operand_mode[code][0])
	       && (*insn_operand_predicate[code][1]) (temp1,insn_operand_mode[code][1]))
	{
	  swap_pat = GEN_FCN (code) (dest2, temp1);
	  move_pat = gen_move_insn (src1, temp1);

	  /* Ugh.  We can't optimize this case.  */
	  if (!move_pat)
	    continue;
	  last = store_from_temp1_insn;
	  temp1_referenced = 1;
	}

      if (!swap_pat)
        continue;

      /* Remove the death note for DEST1 and TMP2 from
         STORE_TO_OP2_INSN and STORE_TO_OP1_INSN respectively.  */
      if (temp1_note)
	remove_note (store_from_temp1_insn, temp1_note);

      if (temp2_note)
 	remove_note (store_to_src1_insn, temp2_note);

      /* Remove any death note for DEST2 from store_to_src1_insn.  */
      if (GET_CODE (dest2) == REG
	  && (reg_note = find_reg_note (store_to_src1_insn, REG_DEAD, dest2)))
	remove_note (store_to_src1_insn, reg_note);

      /* Emit the new swap instruction.  */
      swap_insn = emit_insn_before (swap_pat, last);
      if (move_pat)
	{
	  rtx move_insn;

	  move_insn = emit_insn_after(move_pat, swap_insn);

	  /* Move the death note for DEST1 to MOVE_INSN.  */
	  XEXP (temp1_note, 1) = REG_NOTES (move_insn);
	  REG_NOTES (move_insn) = temp1_note;

	  /* Move any remaining death notes from STORE_TO_OP1_INSN to
	     MOVE_INSN.  */
	  while ((reg_note = find_reg_note (store_to_src1_insn,
					    REG_DEAD, NULL_RTX)))
	    {
	      remove_note (store_to_src1_insn, reg_note);
	      XEXP (reg_note, 1) = REG_NOTES (move_insn);
	      REG_NOTES (move_insn) = reg_note;
	    }
	}

      /* Move any remaining death notes from STORE_TO_OP1_INSN to SWAP_INSN.  */
      while ((reg_note = find_reg_note (store_to_src1_insn, REG_DEAD, NULL_RTX)))
	{
	  remove_note (store_to_src1_insn, reg_note);
	  XEXP (reg_note, 1) = REG_NOTES (swap_insn);
	  REG_NOTES (swap_insn) = reg_note;
	}

      /* Similarly for death notes from STORE_TO_OP2_INSN to SWAP_INSN.  */
      while ((reg_note = find_reg_note (store_from_temp1_insn, REG_DEAD, NULL_RTX)))
	{
	  remove_note (store_from_temp1_insn, reg_note);
	  XEXP (reg_note, 1) = REG_NOTES (swap_insn);
	  REG_NOTES (swap_insn) = reg_note;
	}

      /* Delete the old swap instructions.  */
      if (!temp1_referenced)
	{
	  PUT_CODE (first, NOTE);
	  NOTE_LINE_NUMBER (first) = NOTE_INSN_DELETED;
	  NOTE_SOURCE_FILE (first) = 0;
	  REG_N_SETS (REGNO (temp1))--;
	}

      if (load_from_op2_insn && !temp2_referenced)
	{
	  PUT_CODE (load_from_op2_insn, NOTE);
	  NOTE_LINE_NUMBER (load_from_op2_insn) = NOTE_INSN_DELETED;
	  NOTE_SOURCE_FILE (load_from_op2_insn) = 0;
	  REG_N_SETS (REGNO (temp2))--;
	}

      PUT_CODE (store_to_src1_insn, NOTE);
      NOTE_LINE_NUMBER (store_to_src1_insn) = NOTE_INSN_DELETED;
      NOTE_SOURCE_FILE (store_to_src1_insn) = 0;

      PUT_CODE (store_from_temp1_insn, NOTE);
      NOTE_LINE_NUMBER (store_from_temp1_insn) = NOTE_INSN_DELETED;
      NOTE_SOURCE_FILE (store_from_temp1_insn) = 0;
    }
#endif
}


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]