This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH] Improve ix86 machine reorg (PR target/39942, take 2, part 2)


On Thu, Apr 30, 2009 at 01:46:39PM +0200, Jakub Jelinek wrote:
> This patch kills some IMHO completely unnecessary paddings and
> decreases others, added for TARGET_FOUR_JUMP_LIMIT optimization
> during ix86 machine reorg.
...
> The other change is just taking into account the label .p2align directives
> we are going to emit (and also the "align" instructions we added previously
> in the pass).  If there is say
> .p2align 4,,15
> going to be emitted, we know we don't have to worry about any instructions
> before the label anymore, all jumps after it are in a different 16 byte
> page.  So we can pretend the label has minimal size 16.  For
> .p2align 4,,10
> we know either that anything after the label also is in a new 16 byte page,
> or nothing was added because >= 11 bytes would need to be skipped.  But
> in that case the current group could only contain at most 5 bytes.
> If we pretend the label has size 11, the algorithm will not consider
> anything but the last 5 bytes before it.  Similarly, for say
> .p2align 2
> we know that either there were only at most 12 bytes in the current 16 byte
> page before the alignment, or it aligned to a 16 byte boundary, so
> pretending the label has size 4 works as well.
> 
> What the patch doesn't solve (and I've mentioned in the PR) is that in many
> cases min_insn_size is too conservative, there are plenty of > 1 byte
> instructions when not counting displacement, where we could assume larger
> minimal size.
> 
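The label-size reasoning quoted above can be sketched as a small standalone C
fragment.  This is only an illustration, not code from the patch: the helper
names effective_max_skip, pretend_label_size and bytes_kept_before_label are
hypothetical, and the clamping simply mirrors the conditions used in the
i386.c hunk below (max_skip capped at 15; alignments of 8 bytes or less only
honoured when max_skip is the full (1 << align) - 1).

```c
/* Illustrative sketch only; the function names are hypothetical and not
   part of GCC.  ALIGN is the log2 alignment of a label, MAX_SKIP the
   maximum number of padding bytes the assembler may emit for it.  */

/* Clamp MAX_SKIP the same way the patch does: never more than 15, and
   ignored for small alignments unless the skip is unconditional.  */
static int
effective_max_skip (int align, int max_skip)
{
  if (max_skip > 15)
    max_skip = 15;
  if (align <= 0 || (align <= 3 && max_skip != (1 << align) - 1))
    max_skip = 0;
  return max_skip;
}

/* Size the label can be pretended to have: one more than the effective
   max_skip, or 0 if the alignment tells us nothing.  */
static int
pretend_label_size (int align, int max_skip)
{
  int skip = effective_max_skip (align, max_skip);
  return skip ? skip + 1 : 0;
}

/* Number of bytes before the label that can still share a 16 byte page
   with what follows it.  A result of 16 means the label imposes no
   restriction.  */
static int
bytes_kept_before_label (int align, int max_skip)
{
  return 16 - pretend_label_size (align, max_skip);
}
```

Under these assumptions, .p2align 4,,15 gives a pretended size of 16 (nothing
before the label matters), .p2align 4,,10 gives 11 (only the last 5 bytes
matter) and .p2align 2 gives 4 (at most 12 bytes matter), matching the three
cases described in the quoted text.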

Here is an updated patch for the rest; I've noted a couple of issues with
the patch from April.  I'm also attaching an awk script
(objdump -d cc1plus | ~/test4jmp.awk) which I've used for verification.
Both without and with the patch there are a few hits, meaning the algorithm
had and still has issues (especially for -fprofile-use), but at least
neither this patch nor the other patch I've sent a few minutes ago causes
significant regressions.
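
For readers who want to see what such a verification amounts to, here is a
rough C sketch of one plausible check.  It is not the attached awk script,
whose contents are not reproduced here.  The struct insn layout and the
function first_four_jump_hit are hypothetical, and the sketch assumes that
instructions are given in ascending address order and that a "hit" means four
jumps or calls lying entirely inside one aligned 16 byte page.

```c
#include <stddef.h>

/* Hypothetical record for one disassembled instruction.  */
struct insn
{
  unsigned long addr;   /* start address */
  unsigned len;         /* length in bytes */
  int is_jump;          /* nonzero for jumps and calls */
};

/* Return the index of the first instruction that is the fourth jump
   wholly contained in a single aligned 16 byte page, or -1 if there is
   no such instruction.  INSNS must be sorted by ascending address.  */
static long
first_four_jump_hit (const struct insn *insns, size_t n)
{
  unsigned long page = 0;
  int have_page = 0, njumps = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      unsigned long first, last;

      if (!insns[i].is_jump || insns[i].len == 0)
        continue;
      first = insns[i].addr / 16;
      last = (insns[i].addr + insns[i].len - 1) / 16;
      /* A jump straddling a page boundary is not wholly in any page.  */
      if (first != last)
        continue;
      /* Start counting afresh whenever we enter a new page.  */
      if (!have_page || first != page)
        {
          page = first;
          have_page = 1;
          njumps = 0;
        }
      if (++njumps == 4)
        return (long) i;
    }
  return -1;
}
```

With four 2 byte jumps at 0x10, 0x12, 0x14 and 0x16 this sketch reports the
fourth one; if padding moves the last jump to 0x20, it reports nothing.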

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2009-05-13  Jakub Jelinek  <jakub@redhat.com>

	PR target/39942
	* final.c (label_to_max_skip): New function.
	(label_to_alignment): Only use LABEL_TO_ALIGNMENT if
	CODE_LABEL_NUMBER <= max_labelno.
	* output.h (label_to_max_skip): New prototype.
	* config/i386/i386.c (ix86_avoid_jump_misspredicts): Renamed to...
	(ix86_avoid_jump_mispredicts): ... this.  Don't define if
	ASM_OUTPUT_MAX_SKIP_ALIGN isn't defined.  Update comment.
	Handle CODE_LABELs with >= 16 byte alignment or with
	max_skip == (1 << align) - 1.
	(ix86_reorg): Don't call ix86_avoid_jump_mispredicts if
	ASM_OUTPUT_MAX_SKIP_ALIGN isn't defined.

--- gcc/config/i386/i386.c.jj	2009-05-05 08:33:20.000000000 +0200
+++ gcc/config/i386/i386.c	2009-05-05 18:33:28.000000000 +0200
@@ -27193,6 +27193,7 @@ x86_function_profiler (FILE *file, int l
     }
 }
 
+#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
 /* We don't have exact information about the insn sizes, but we may assume
    quite safely that we are informed about all 1 byte insns and memory
    address sizes.  This is enough to eliminate unnecessary padding in
@@ -27243,7 +27244,7 @@ min_insn_size (rtx insn)
    window.  */
 
 static void
-ix86_avoid_jump_misspredicts (void)
+ix86_avoid_jump_mispredicts (void)
 {
   rtx insn, start = get_insns ();
   int nbytes = 0, njumps = 0;
@@ -27257,15 +27258,52 @@ ix86_avoid_jump_misspredicts (void)
 
      The smallest offset in the page INSN can start is the case where START
      ends on the offset 0.  Offset of INSN is then NBYTES - sizeof (INSN).
-     We add p2align to 16byte window with maxskip 17 - NBYTES + sizeof (INSN).
+     We add p2align to 16byte window with maxskip 15 - NBYTES + sizeof (INSN).
      */
-  for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
+  for (insn = start; insn; insn = NEXT_INSN (insn))
     {
+      int min_size;
 
-      nbytes += min_insn_size (insn);
+      if (GET_CODE (insn) == CODE_LABEL)
+	{
+	  int align = label_to_alignment (insn);
+	  int max_skip = label_to_max_skip (insn);
+
+	  if (max_skip > 15)
+	    max_skip = 15;
+	  /* If align > 3, only up to 16 - max_skip - 1 bytes can be
+	     already in the current 16 byte page, because otherwise
+	     ASM_OUTPUT_MAX_SKIP_ALIGN could skip max_skip or fewer
+	     bytes to reach 16 byte boundary.  */
+	  if (align <= 0
+	      || (align <= 3 && max_skip != (1 << align) - 1))
+	    max_skip = 0;
+	  if (dump_file)
+	    fprintf (dump_file, "Label %i with max_skip %i\n",
+		     INSN_UID (insn), max_skip);
+	  if (max_skip)
+	    {
+	      while (nbytes + max_skip >= 16)
+		{
+		  start = NEXT_INSN (start);
+		  if ((JUMP_P (start)
+		       && GET_CODE (PATTERN (start)) != ADDR_VEC
+		       && GET_CODE (PATTERN (start)) != ADDR_DIFF_VEC)
+		      || CALL_P (start))
+		    njumps--, isjump = 1;
+		  else
+		    isjump = 0;
+		  nbytes -= min_insn_size (start);
+		}
+	    }
+	  continue;
+	}
+
+      min_size = min_insn_size (insn);
+      nbytes += min_size;
       if (dump_file)
-        fprintf(dump_file, "Insn %i estimated to %i bytes\n",
-		INSN_UID (insn), min_insn_size (insn));
+	fprintf (dump_file, "Insn %i estimated to %i bytes\n",
+		 INSN_UID (insn), min_size);
       if ((JUMP_P (insn)
 	   && GET_CODE (PATTERN (insn)) != ADDR_VEC
 	   && GET_CODE (PATTERN (insn)) != ADDR_DIFF_VEC)
@@ -27289,7 +27327,7 @@ ix86_avoid_jump_misspredicts (void)
       gcc_assert (njumps >= 0);
       if (dump_file)
         fprintf (dump_file, "Interval %i to %i has %i bytes\n",
-		INSN_UID (start), INSN_UID (insn), nbytes);
+		 INSN_UID (start), INSN_UID (insn), nbytes);
 
       if (njumps == 3 && isjump && nbytes < 16)
 	{
@@ -27302,6 +27340,7 @@ ix86_avoid_jump_misspredicts (void)
 	}
     }
 }
+#endif
 
 /* AMD Athlon works faster
    when RET is not destination of conditional jump or directly preceded
@@ -27364,9 +27403,14 @@ ix86_reorg (void)
   if (TARGET_PAD_RETURNS && optimize
       && optimize_function_for_speed_p (cfun))
     ix86_pad_returns ();
+#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
+  /* `align' insn expands to nothing if ASM_OUTPUT_MAX_SKIP_ALIGN
+     is not defined, so it makes no sense to do this optimization
+     in that case.  */
   if (TARGET_FOUR_JUMP_LIMIT && optimize
       && optimize_function_for_speed_p (cfun))
-    ix86_avoid_jump_misspredicts ();
+    ix86_avoid_jump_mispredicts ();
+#endif
 }
 
 /* Return nonzero when QImode register that must be represented via REX prefix
--- gcc/final.c.jj	2009-05-05 08:33:20.000000000 +0200
+++ gcc/final.c	2009-05-05 16:37:13.000000000 +0200
@@ -553,7 +553,17 @@ static int min_labelno, max_labelno;
 int
 label_to_alignment (rtx label)
 {
-  return LABEL_TO_ALIGNMENT (label);
+  if (CODE_LABEL_NUMBER (label) <= max_labelno)
+    return LABEL_TO_ALIGNMENT (label);
+  return 0;
+}
+
+int
+label_to_max_skip (rtx label)
+{
+  if (CODE_LABEL_NUMBER (label) <= max_labelno)
+    return LABEL_TO_MAX_SKIP (label);
+  return 0;
 }
 
 #ifdef HAVE_ATTR_length
--- gcc/output.h.jj	2009-05-05 08:33:20.000000000 +0200
+++ gcc/output.h	2009-05-05 16:37:13.000000000 +0200
@@ -1,7 +1,7 @@
 /* Declarations for insn-output.c.  These functions are defined in recog.c,
    final.c, and varasm.c.
    Copyright (C) 1987, 1991, 1994, 1997, 1998, 1999, 2000, 2001, 2002,
-   2003, 2004, 2005, 2006, 2007, 2008 Free Software Foundation, Inc.
+   2003, 2004, 2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc.
 
 This file is part of GCC.
 
@@ -94,6 +94,10 @@ extern int insn_current_reference_addres
    Defined in final.c.  */
 extern int label_to_alignment (rtx);
 
+/* Find the alignment maximum skip associated with a CODE_LABEL.
+   Defined in final.c.  */
+extern int label_to_max_skip (rtx);
+
 /* Output a LABEL_REF, or a bare CODE_LABEL, as an assembler symbol.  */
 extern void output_asm_label (rtx);
 

	Jakub

Attachment: test4jmp.awk
Description: Text document

