This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



[PATCH] Don't schedule prefetch insns - take 2


Hi,

back in May I discussed a problem where the insn scheduler shuffled my
carefully placed PREFETCH insns around, thereby making them pointless.
I proposed a patch which made prefetches a general scheduling barrier:

http://gcc.gnu.org/ml/gcc-patches/2009-05/msg01233.html

That approach was considered too big a hammer for the problem, and I
agree, but ...

I think we cannot handle all prefetches the same way here.  There are
three main origins of prefetches:

1. Prefetches issued by a GCC optimizer.  That pass uses intricate
heuristics in order to place one or more prefetches inside a loop
body.

2. Prefetches issued by a programmer using __builtin_prefetch.

3. Prefetches issued by a back-end for a certain kind of code snippet.


I agree that for case 1 a prefetch should not be a scheduling barrier.
The pass already has enough trouble doing its job properly, and turning
a possibly misplaced prefetch into a full scheduling barrier would
pessimize the code further.

I'm not sure about case 2.  A programmer optimizing for a certain
architecture might want to issue a prefetch very precisely in order to
get a specific effect, and would probably like the prefetch to stay
exactly at the intended location.  On the other hand, one purpose of
the builtin is to make prefetches somewhat platform independent, so
that kind of cycle-precise optimization is doomed to fail one way or
the other.

But regarding case 3 I'm absolutely sure that a back-end author wants
the prefetch to stay exactly where it was placed :)  Whether or not the
other instructions access memory, they should never be moved past the
prefetch insn.  Maxim proposed creating scheduler dependencies in the
back-end to achieve that (btw, sorry for never answering back).  For
the purpose I had in mind I would have to create such dependencies for
all the other insns in the basic block, and I must admit I don't like
that idea very much.

The attached patch simply uses the RTX flag 'volatil' to mark a
prefetch insn as a full scheduling barrier.  That way every piece of
code issuing a prefetch can decide whether it has to be a barrier or
not.
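With the patch applied, a back-end that wants its prefetch pinned could
mark it roughly like this (a hypothetical GCC-internals sketch, not
standalone code; 'addr' is assumed to be a valid address rtx already
prepared by the back-end):

```c
/* Sketch only: emit a prefetch rtx and mark it as a scheduling
   barrier via the PREFETCH_VOLATILE_P macro this patch introduces.  */
rtx prefetch = gen_rtx_PREFETCH (VOIDmode, addr,
				 GEN_INT (0),  /* read prefetch */
				 GEN_INT (3)); /* high temporal locality */
PREFETCH_VOLATILE_P (prefetch) = 1;  /* keep other insns from being
					moved across this prefetch */
emit_insn (prefetch);
```

Prefetches emitted without setting the flag keep today's behavior.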

Bootstrapping and testing are still running on s390x (but the patch
should be a no-op for now anyway).

Ok for mainline if regression testing passes?

Bye,

-Andreas-



2009-09-02  Andreas Krebbel  <Andreas.Krebbel@de.ibm.com>

	* rtl.h (PREFETCH_VOLATILE_P): New macro.	
	* sched-deps.c (sched_analyze_2): Make prefetches a hard barrier
	when volatile flag is set.
	* doc/rtl.texi (PREFETCH_VOLATILE_P): Add documentation pieces.


Index: gcc/rtl.h
===================================================================
*** gcc/rtl.h.orig	2009-08-17 11:17:57.000000000 +0200
--- gcc/rtl.h	2009-09-02 15:44:32.000000000 +0200
*************** do {						\
*** 1382,1387 ****
--- 1382,1390 ----
     offset within that block.  */
  #define SYMBOL_REF_BLOCK_OFFSET(RTX) (BLOCK_SYMBOL_CHECK (RTX)->offset)
  
+ #define PREFETCH_VOLATILE_P(RTX)					\
+   (RTL_FLAG_CHECK1("PREFETCH_VOLATILE_P", (RTX), PREFETCH)->volatil)
+ 
  /* Indicate whether the machine has any sort of auto increment addressing.
     If not, we can avoid checking for REG_INC notes.  */
  
Index: gcc/sched-deps.c
===================================================================
*** gcc/sched-deps.c.orig	2009-09-02 15:16:54.000000000 +0200
--- gcc/sched-deps.c	2009-09-02 15:40:10.000000000 +0200
*************** sched_analyze_2 (struct deps *deps, rtx 
*** 2132,2137 ****
--- 2132,2142 ----
        flush_pending_lists (deps, insn, true, false);
        break;
  
+     case PREFETCH:
+       if (PREFETCH_VOLATILE_P (x))
+ 	reg_pending_barrier = TRUE_BARRIER;
+       break;
+ 
      case UNSPEC_VOLATILE:
        flush_pending_lists (deps, insn, true, true);
        /* FALLTHRU */
Index: gcc/doc/rtl.texi
===================================================================
*** gcc/doc/rtl.texi.orig	2009-05-25 10:20:38.000000000 +0200
--- gcc/doc/rtl.texi	2009-09-02 15:42:09.000000000 +0200
*************** Stored in the @code{volatil} field and p
*** 882,887 ****
--- 882,895 ----
  Most uses of @code{SYMBOL_REF_FLAG} are historic and may be subsumed
  by @code{SYMBOL_REF_FLAGS}.  Certainly use of @code{SYMBOL_REF_FLAGS}
  is mandatory if the target requires more than one bit of storage.
+ 
+ @findex PREFETCH_VOLATILE_P
+ @cindex @code{prefetch} and @samp{/v}
+ @cindex @code{volatile}, in @code{prefetch}
+ @item PREFETCH_VOLATILE_P (@var{x})
+ In a @code{prefetch}, indicates that the prefetch is a scheduling barrier.
+ No other INSNs will be moved over it.
+ Stored in the @code{volatil} field and printed as @samp{/v}.
  @end table
  
  These are the fields to which the above macros refer:
*************** In an @code{insn}, 1 means the insn has 
*** 1034,1039 ****
--- 1042,1050 ----
  In @code{label_ref} and @code{reg_label} expressions, 1 means a reference
  to a non-local label.
  
+ In @code{prefetch} expressions, 1 means that the containing insn is a
+ scheduling barrier.
+ 
  In an RTL dump, this flag is represented as @samp{/v}.
  @end table
  

