[PATCH RFC] Pair load store instructions using a generic scheduling fusion pass

Thu Oct 30 20:22:00 GMT 2014

On 10/10/14 21:32, Bin.Cheng wrote:
> Mike already gave great answers, here are just some of my thoughts on
> the specific questions.  See embedded below.
Thanks to both of you for your answers.

Fundamentally, what I see is that this scheme requires us to be able to 
come up with a key based solely on information in a particular insn.  To 
get fusion, another insn has to have the same or a closely related key.

This implies that the two candidates for fusion are related, even if 
there isn't a data dependency between them.  The canonical example would 
be two loads with reg+d addressing modes.  If they use the same base 
register and the displacements differ by a word, then we don't have a 
data dependency between the insns, but the insns are closely related by 
their address computations and we can compute a key to ensure those two 
related insns end up consecutive.  At any given call to the hook, the 
only context we can directly see is the current insn.
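
To make the key idea concrete, here is a rough C sketch.  It is not code 
from the patch: struct mem_insn, struct fusion_key, compute_fusion_key 
and keys_pair_up are hypothetical names, and the way the base register 
and displacement are folded into the key is only an assumption for 
illustration.  The point is just that the key is derived from a single 
insn, and that two reg+d accesses with the same base and displacements 
one word apart end up with closely related keys.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a memory insn; not a GCC type.  */
struct mem_insn
{
  bool is_load;     /* true for a load, false for a store */
  int base_regno;   /* base register of the reg+d address */
  long disp;        /* displacement d */
};

/* Two-part key computed from one insn alone (assumed layout).  */
struct fusion_key
{
  int fusion_pri;   /* equal for insns that are fusion candidates */
  long pri;         /* orders the candidates within one group */
};

struct fusion_key
compute_fusion_key (const struct mem_insn *insn, int max_pri)
{
  struct fusion_key key;

  assert (insn && max_pri > 0);

  /* Group by access kind and base register: nothing outside INSN
     is consulted.  */
  key.fusion_pri
    = max_pri - (insn->base_regno * 2 + (insn->is_load ? 0 : 1));

  /* Within a group, a lower displacement gets a higher priority, so
     accesses sort by ascending address.  */
  key.pri = max_pri - insn->disp;
  return key;
}

/* True if two keys describe same-kind accesses off the same base
   whose displacements differ by exactly WORD_SIZE.  */
bool
keys_pair_up (struct fusion_key a, struct fusion_key b, long word_size)
{
  long delta = a.pri - b.pri;

  if (delta < 0)
    delta = -delta;
  return a.fusion_pri == b.fusion_pri && delta == word_size;
}
```

With a 4-byte word, loads from 8(r3) and 12(r3) get the same fusion_pri 
and pri values that are 4 apart, so a scheduler sorting on (fusion_pri, 
pri) would place them next to each other.  A load from 12(r4) or a store 
to 12(r3) lands in a different group.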

I'm pretty sure that if I were to tweak the ARM bits ever-so-slightly, 
they could easily model the load-load or store-store special case on the 
PA7xxx[LC] processors.  Normally a pair of loads or stores can't dual 
issue.  But if the two loads (or two stores) hit the upper and lower 
halves of a double-word object, then the instructions can dual issue.
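
As a rough illustration of that condition, the check could look 
something like the C below.  This is a sketch under assumptions, not 
anything from the PA backend: struct mem_ref and 
halves_of_same_doubleword are made-up names, the word size is taken to 
be 4 bytes, and the base register is assumed to be double-word aligned 
so that alignment can be judged from the displacement alone.

```c
#include <assert.h>
#include <stdbool.h>

#define WORD_SIZE 4   /* assumption: 4-byte words, 8-byte double words */

/* Hypothetical description of a reg+d memory access.  */
struct mem_ref
{
  bool is_load;     /* true for a load, false for a store */
  int base_regno;   /* base register */
  long disp;        /* displacement from the base */
};

/* True if A and B are both loads or both stores off the same base
   and touch the two halves of one aligned double word.  Assumes the
   base register itself is double-word aligned.  */
bool
halves_of_same_doubleword (const struct mem_ref *a, const struct mem_ref *b)
{
  long lo, hi;

  assert (a && b);

  if (a->is_load != b->is_load || a->base_regno != b->base_regno)
    return false;

  lo = a->disp < b->disp ? a->disp : b->disp;
  hi = a->disp < b->disp ? b->disp : a->disp;

  /* Adjacent words, with the lower one starting a double word.  */
  return hi - lo == WORD_SIZE && lo % (2 * WORD_SIZE) == 0;
}
```

Under those assumptions, loads at displacements 0 and 4 qualify, while 
loads at 4 and 8 do not, since they straddle two different double words.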

I'd forgotten about that special case scheduling opportunity until I 
started looking at some unrelated enhancement for prefetching.

Your code would also appear to allow significant cleanup of the old 
caller-save code that had a fair amount of bookkeeping added to issue 
double-word memory loads/stores rather than single word operations. 
This *greatly* improved performance on the old sparc processors which 
had no call-saved FP registers.

However, your new code doesn't handle fusing instructions which are 
totally independent and of different static types.  There just isn't a 
good way to compute a key that I can see.  And this is OK -- that case, 
if we cared to improve it, would be best handled by the SCHED_REORDER hooks.

>>>
>>> I guess another way to ask the question, are fusion priorities static based on the insn/alternative, or can they vary?  And if they can vary, can they vary each tick of the scheduler?
>
> Though this pass works on predefined fusion types and priorities now,
> there might be two possible fixes for this specific problem.
> 1) Introduce another exclusive_pri, now it's like "fusion_pri,
> priority, exclusive_pri".  The first one is assigned to mark
> instructions belonging to same fusion type.  The second is assigned to
> fuse each pair of consecutive instructions together.  The last one is
> assigned to prevent specific pair of instructions from being fused,
> just like "BC" mentioned.
> 2) Extend the idea by using hook function
> TARGET_SCHED_REORDER/TARGET_SCHED_REORDER2.  Now we can assign
> fusion_pri in the first place, making sure instructions of the same
> fusion type will be adjacent to each other, then we can change priority
> (thus reorder the ready list) at the back end's wish, even on each tick
> of the scheduler.
#2 would be the best solution for the case I was pondering, but I don't 
think solving that case is terribly important given the processors for 
which it was profitable haven't been made for a very long time.

Jeff