[PATCH] Don't schedule prefetch insns - take 2
Richard Guenther
richard.guenther@gmail.com
Wed Sep 2 14:50:00 GMT 2009
On Wed, Sep 2, 2009 at 4:33 PM, Andreas
Krebbel<krebbel@linux.vnet.ibm.com> wrote:
> Hi,
>
> back in May I discussed a problem where the insn scheduler shuffled my
> carefully placed PREFETCH insns around while making them pointless. I
> proposed a patch which made prefetches a general scheduling barrier.
>
> http://gcc.gnu.org/ml/gcc-patches/2009-05/msg01233.html
>
> That step was considered to be a too big hammer to solve the problem
> and I agree to that but ...
>
> I think we cannot handle all prefetches the same here. There are
> mainly 3 origins of prefetches:
>
> 1. Prefetches issued by a GCC optimizer. That pass uses intricate
> heuristics in order to place one or more prefetches inside a loop
> body.
>
> 2. Prefetches issued by a programmer using __builtin_prefetch.
>
> 3. Prefetches issued by a back-end for a certain kind of code snippet.
>
>
> I agree that for 1. a prefetch should not be a scheduling barrier.
> The pass already has quite some trouble to do its job properly and
> making a perhaps misplaced prefetch a full scheduling barrier would
> further pessimize the code.
>
> I'm not sure about 2. . A programmer optimizing for a certain
> architecture might want to issue a prefetch very precisely in order to
> get a specific effect. That programmer would probably like the
> prefetch to stay exactly at the location he had in mind. Otherwise one
> purpose of the builtin is to make prefetches somewhat platform
> independent so that kind of cycle precise optimization is doomed to
> fail some way or the other.
>
> But regarding 3. I'm absolutely sure that a back-end coder wants the
> prefetch to stay exactly where he wants it to be :) No matter if the
> other instructions access memory or not they should never pass the
> prefetch INSN. Maxim proposed to create scheduler dependencies in the
> back-end in order to achieve that (btw. sorry for never answering
> back). For the purpose I had in mind I would have to create such deps
> for all the other insns in the basic block. I must admit that I don't
> like that idea very much.
>
> The attached patch simply uses the RTX flag volatil to mark a prefetch
> insn as full barrier. That way every code issuing a prefetch is able
> to decide whether it has to be barrier or not.
>
> Bootstrapping and testing still running on s390x (but the patch anyway
> should be a NOP for now).
>
> Ok for mainline if passing regression test?
That would fix case 3, yes? I think its a reasonable approach for that.
Thus, the patch is ok if the scheduler maintainers don't complain within
the next days.
Thanks,
Richard.
> Bye,
>
> -Andreas-
>
>
>
> 2009-09-02 Andreas Krebbel <Andreas.Krebbel@de.ibm.com>
>
> * rtl.h (PREFETCH_VOLATILE_P): New macro.
> * sched-deps.c (sched_analyze_2): Make prefetches a hard barrier
> when volatile flag is set.
> * doc/rtl.texi (PREFETCH_VOLATILE_P): Add documentation pieces.
>
>
> Index: gcc/rtl.h
> ===================================================================
> *** gcc/rtl.h.orig 2009-08-17 11:17:57.000000000 +0200
> --- gcc/rtl.h 2009-09-02 15:44:32.000000000 +0200
> *************** do { \
> *** 1382,1387 ****
> --- 1382,1390 ----
> offset within that block. */
> #define SYMBOL_REF_BLOCK_OFFSET(RTX) (BLOCK_SYMBOL_CHECK (RTX)->offset)
>
> + #define PREFETCH_VOLATILE_P(RTX) \
> + (RTL_FLAG_CHECK1("PREFETCH_VOLATILE_P", (RTX), PREFETCH)->volatil)
> +
> /* Indicate whether the machine has any sort of auto increment addressing.
> If not, we can avoid checking for REG_INC notes. */
>
> Index: gcc/sched-deps.c
> ===================================================================
> *** gcc/sched-deps.c.orig 2009-09-02 15:16:54.000000000 +0200
> --- gcc/sched-deps.c 2009-09-02 15:40:10.000000000 +0200
> *************** sched_analyze_2 (struct deps *deps, rtx
> *** 2132,2137 ****
> --- 2132,2142 ----
> flush_pending_lists (deps, insn, true, false);
> break;
>
> + case PREFETCH:
> + if (PREFETCH_VOLATILE_P (x))
> + reg_pending_barrier = TRUE_BARRIER;
> + break;
> +
> case UNSPEC_VOLATILE:
> flush_pending_lists (deps, insn, true, true);
> /* FALLTHRU */
> Index: gcc/doc/rtl.texi
> ===================================================================
> *** gcc/doc/rtl.texi.orig 2009-05-25 10:20:38.000000000 +0200
> --- gcc/doc/rtl.texi 2009-09-02 15:42:09.000000000 +0200
> *************** Stored in the @code{volatil} field and p
> *** 882,887 ****
> --- 882,895 ----
> Most uses of @code{SYMBOL_REF_FLAG} are historic and may be subsumed
> by @code{SYMBOL_REF_FLAGS}. Certainly use of @code{SYMBOL_REF_FLAGS}
> is mandatory if the target requires more than one bit of storage.
> +
> + @findex PREFETCH_VOLATILE_P
> + @cindex @code{prefetch} and @samp{/v}
> + @cindex @code{volatile}, in @code{prefetch}
> + @item PREFETCH_VOLATILE_P (@var{x})
> + In a @code{prefetch}, indicates that the prefetch is a scheduling barrier.
> + No other INSNs will be moved over it.
> + Stored in the @code{volatil} field and printed as @samp{/v}.
> @end table
>
> These are the fields to which the above macros refer:
> *************** In an @code{insn}, 1 means the insn has
> *** 1034,1039 ****
> --- 1042,1050 ----
> In @code{label_ref} and @code{reg_label} expressions, 1 means a reference
> to a non-local label.
>
> + In @code{prefetch} expressions, 1 means that the containing insn is a
> + scheduling barrier.
> +
> In an RTL dump, this flag is represented as @samp{/v}.
> @end table
>
>
More information about the Gcc-patches
mailing list