This is the mail archive of the mailing list for the GCC project.


Re: [PATCH RFC]Pair load store instructions using a generic scheduling fusion pass

On 10/10/14 21:32, Bin.Cheng wrote:
> Mike already gave great answers, here are just some of my thoughts on
> the specific questions.  See embedded below.

Thanks to both of you for your answers.

Fundamentally, what I see is this scheme requires us to be able to come up with a key based solely on information in a particular insn. To get fusion, another insn has to have the same or a closely related key.

This implies that the two candidates for fusion are related, even if there isn't a data dependency between them. The canonical example would be two loads with reg+d addressing modes. If they use the same base register and the displacements differ by a word, then we don't have a data dependency between the insns, but the insns are closely related by their address computations and we can compute a key to ensure those two related insns end up consecutive. At any given call to the hook, the only context we can directly see is the current insn.
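
To make the reg+d example concrete, here is a rough sketch in plain C of a key computed from a single insn, and of how sorting on that key places related loads next to each other. It is not the patch's code: `struct mem_insn`, `struct fusion_key`, `compute_key` and `sort_for_fusion` are hypothetical names, and mapping the base register to the group key and the displacement to the in-group order is an assumption made only for illustration.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the facts a backend could read off one
   reg+d memory insn.  Not a GCC data structure.  */
struct mem_insn {
    int id;        /* position in the original instruction stream */
    int base_reg;  /* base register number of the address */
    long disp;     /* displacement from the base register, in bytes */
};

/* Two-part key derived from a single insn (assumed layout).  */
struct fusion_key {
    int fusion_pri; /* equal for insns that are fusion candidates */
    long pri;       /* orders insns inside one candidate group */
};

/* Looks only at INSN itself, never at its neighbours.  */
static struct fusion_key compute_key(const struct mem_insn *insn)
{
    struct fusion_key k;
    k.fusion_pri = insn->base_reg; /* same base register => same group */
    k.pri = insn->disp;            /* neighbouring displacements sort together */
    return k;
}

/* qsort comparator: group key first, then in-group order, then the
   original position as a tie-break.  */
static int compare_by_key(const void *pa, const void *pb)
{
    const struct mem_insn *a = pa;
    const struct mem_insn *b = pb;
    struct fusion_key ka = compute_key(a);
    struct fusion_key kb = compute_key(b);

    if (ka.fusion_pri != kb.fusion_pri)
        return (ka.fusion_pri > kb.fusion_pri) - (ka.fusion_pri < kb.fusion_pri);
    if (ka.pri != kb.pri)
        return (ka.pri > kb.pri) - (ka.pri < kb.pri);
    return (a->id > b->id) - (a->id < b->id);
}

/* Reorder INSNS so that insns with related keys become consecutive.  */
void sort_for_fusion(struct mem_insn *insns, size_t n)
{
    qsort(insns, n, sizeof *insns, compare_by_key);
}
```

Sorting four loads such as r1+8, r2+0, r1+4, r2+4 with this comparator puts the two r1 loads side by side and the two r2 loads side by side, even though no load depends on another. A real scheduler also has to respect dependencies and its normal priorities, which this sketch ignores.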

I'm pretty sure if I were to tweak the ARM bits ever-so-slightly it could easily model the load-load or store-store special case on the PA7xxx[LC] processors. Normally a pair of loads or stores can't dual issue. But if the two loads (or two stores) hit the upper and lower half of a double-word object, then the instructions can dual issue.
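
The pairing condition described above could be checked along these lines. This is only a sketch under assumptions that are mine, not taken from any processor manual or from the patch: a word is 4 bytes, a double-word is 8 bytes, and the base register is double-word aligned so the displacement alone decides which half is touched. `struct mem_ref` and `halves_of_same_doubleword` are hypothetical names.

```c
#include <stdbool.h>

/* Assumed sizes, for illustration only.  */
enum { WORD_BYTES = 4, DWORD_BYTES = 8 };

/* Hypothetical description of one reg+d memory access.  */
struct mem_ref {
    int base_reg;  /* base register number */
    long disp;     /* displacement in bytes */
    bool is_load;  /* true for a load, false for a store */
};

/* Return true when A and B are both loads or both stores, use the same
   base register, and touch the two halves of one aligned double-word.
   Assumes the base register itself is double-word aligned.  */
bool halves_of_same_doubleword(const struct mem_ref *a,
                               const struct mem_ref *b)
{
    if (a->is_load != b->is_load)
        return false;
    if (a->base_reg != b->base_reg)
        return false;

    long lo = a->disp < b->disp ? a->disp : b->disp;
    long hi = a->disp < b->disp ? b->disp : a->disp;

    /* The lower access must start a double-word and the other access
       must be exactly one word above it.  */
    return hi - lo == WORD_BYTES && lo % DWORD_BYTES == 0;
}
```

With these assumptions, accesses at r3+8 and r3+12 qualify in either order, while r3+12 and r3+16 do not, because they straddle two double-words.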

I'd forgotten about that special case scheduling opportunity until I started looking at some unrelated enhancement for prefetching.

Your code would also appear to allow significant cleanup of the old caller-save code that had a fair amount of bookkeeping added to issue double-word memory loads/stores rather than single word operations. This *greatly* improved performance on the old sparc processors which had no call-saved FP registers.

However, your new code doesn't handle fusing instructions which are totally independent and of different static types. There just isn't a good way to compute a key that I can see. And this is OK -- that case, if we cared to improve it, would be best handled by the SCHED_REORDER hooks.

I guess another way to ask the question, are fusion priorities static based on the insn/alternative, or can they vary?  And if they can vary, can they vary each tick of the scheduler?

> Though this pass works on predefined fusion types and priorities now,
> there might be two possible fixes for this specific problem.
> 1) Introduce another exclusive_pri, so the key becomes "fusion_pri,
> priority, exclusive_pri".  The first one is assigned to mark
> instructions belonging to the same fusion type.  The second is assigned to
> fuse each pair of consecutive instructions together.  The last one is
> assigned to prevent a specific pair of instructions from being fused,
> just like "BC" mentioned.
> 2) Extend the idea by using the hook function
> fusion_pri in the first place, making sure instructions in the same fusion
> type will be adjacent to each other, then we can change priority (thus
> reorder the ready list) at the back-end's wish, even per each tick of the
> scheduler.

#2 would be the best solution for the case I was pondering, but I don't think solving that case is terribly important given the processors for which it was profitable haven't been made for a very long time.

