[PATCH] Don't schedule prefetch insns - take 2

Wed Sep 2 14:50:00 GMT 2009

On Wed, Sep 2, 2009 at 4:33 PM, Andreas
Krebbel<krebbel@linux.vnet.ibm.com> wrote:
> Hi,
>
> back in May I discussed a problem where the insn scheduler shuffled my
> carefully placed PREFETCH insns around while making them pointless. I
> proposed a patch which made prefetches a general scheduling barrier.
>
> http://gcc.gnu.org/ml/gcc-patches/2009-05/msg01233.html
>
> That step was considered to be a too big hammer to solve the problem
> and I agree to that but ...
>
> I think we cannot handle all prefetches the same here.  There are
> mainly 3 origins of prefetches:
>
> 1. Prefetches issued by a GCC optimizer.  That pass uses intricate
> heuristics in order to place one or more prefetches inside a loop
> body.
>
> 2. Prefetches issued by a programmer using __builtin_prefetch.
>
> 3. Prefetches issued by a back-end for a certain kind of code snippet.
>
>
> I agree that for 1. a prefetch should not be a scheduling barrier.
> The pass already has quite some trouble to do its job properly and
> making a perhaps misplaced prefetch a full scheduling barrier would
> further pessimize the code.
>
> I'm not sure about 2. . A programmer optimizing for a certain
> architecture might want to issue a prefetch very precisely in order to
> get a specific effect.  That programmer would probably like the
> prefetch to stay exactly at the location he had in mind. Otherwise one
> purpose of the builtin is to make prefetches somewhat platform
> independent so that kind of cycle precise optimization is doomed to
> fail some way or the other.
>
> But regarding 3. I'm absolutely sure that a back-end coder wants the
> prefetch to stay exactly where he wants it to be :)  No matter if the
> other instructions access memory or not they should never pass the
> prefetch INSN. Maxim proposed to create scheduler dependencies in the
> back-end in order to achieve that (btw. sorry for never answering
> back).  For the purpose I had in mind I would have to create such deps
> for all the other insns in the basic block. I must admit that I don't
> like that idea very much.
>
> The attached patch simply uses the RTX flag volatil to mark a prefetch
> insn as full barrier.  That way every code issuing a prefetch is able
> to decide whether it has to be barrier or not.
>
> Bootstrapping and testing still running on s390x (but the patch anyway
> should be a NOP for now).
>
> Ok for mainline if passing regression test?

That would fix case 3, yes?  I think its a reasonable approach for that.

Thus, the patch is ok if the scheduler maintainers don't complain within
the next days.

Thanks,
Richard.

> Bye,
>
> -Andreas-
>
>
>
> 2009-09-02  Andreas Krebbel  <Andreas.Krebbel@de.ibm.com>
>
>        * rtl.h (PREFETCH_VOLATILE_P): New macro.
>        * sched-deps.c (sched_analyze_2): Make prefetches a hard barrier
>        when volatile flag is set.
>        * doc/rtl.texi (PREFETCH_VOLATILE_P): Add documentation pieces.
>
>
> Index: gcc/rtl.h
> ===================================================================
> *** gcc/rtl.h.orig      2009-08-17 11:17:57.000000000 +0200
> --- gcc/rtl.h   2009-09-02 15:44:32.000000000 +0200
> *************** do {                                            \
> *** 1382,1387 ****
> --- 1382,1390 ----
>     offset within that block.  */
>  #define SYMBOL_REF_BLOCK_OFFSET(RTX) (BLOCK_SYMBOL_CHECK (RTX)->offset)
>
> + #define PREFETCH_VOLATILE_P(RTX)                                      \
> +   (RTL_FLAG_CHECK1("PREFETCH_VOLATILE_P", (RTX), PREFETCH)->volatil)
> +
>  /* Indicate whether the machine has any sort of auto increment addressing.
>     If not, we can avoid checking for REG_INC notes.  */
>
> Index: gcc/sched-deps.c
> ===================================================================
> *** gcc/sched-deps.c.orig       2009-09-02 15:16:54.000000000 +0200
> --- gcc/sched-deps.c    2009-09-02 15:40:10.000000000 +0200
> *************** sched_analyze_2 (struct deps *deps, rtx
> *** 2132,2137 ****
> --- 2132,2142 ----
>        flush_pending_lists (deps, insn, true, false);
>        break;
>
> +     case PREFETCH:
> +       if (PREFETCH_VOLATILE_P (x))
> +       reg_pending_barrier = TRUE_BARRIER;
> +       break;
> +
>      case UNSPEC_VOLATILE:
>        flush_pending_lists (deps, insn, true, true);
>        /* FALLTHRU */
> Index: gcc/doc/rtl.texi
> ===================================================================
> *** gcc/doc/rtl.texi.orig       2009-05-25 10:20:38.000000000 +0200
> --- gcc/doc/rtl.texi    2009-09-02 15:42:09.000000000 +0200
> *************** Stored in the @code{volatil} field and p
> *** 882,887 ****
> --- 882,895 ----
>  Most uses of @code{SYMBOL_REF_FLAG} are historic and may be subsumed
>  by @code{SYMBOL_REF_FLAGS}.  Certainly use of @code{SYMBOL_REF_FLAGS}
>  is mandatory if the target requires more than one bit of storage.
> +
> + @findex PREFETCH_VOLATILE_P
> + @cindex @code{prefetch} and @samp{/v}
> + @cindex @code{volatile}, in @code{prefetch}
> + @item PREFETCH_VOLATILE_P (@var{x})
> + In a @code{prefetch}, indicates that the prefetch is a scheduling barrier.
> + No other INSNs will be moved over it.
> + Stored in the @code{volatil} field and printed as @samp{/v}.
>  @end table
>
>  These are the fields to which the above macros refer:
> *************** In an @code{insn}, 1 means the insn has
> *** 1034,1039 ****
> --- 1042,1050 ----
>  In @code{label_ref} and @code{reg_label} expressions, 1 means a reference
>  to a non-local label.
>
> + In @code{prefetch} expressions, 1 means that the containing insn is a
> + scheduling barrier.
> +
>  In an RTL dump, this flag is represented as @samp{/v}.
>  @end table
>
>