[PATCH 2/2, AArch64, v2] Pipeline model for APM XGene-1.

Thu Nov 20 11:01:00 GMT 2014

Kyrill,

> I don't mind it being in config/arm if you plan to wire it up later, good to know.
> Another comment inline….

I’ll clean up the missing xgene1_ and the mistyped xgene_ prefix and resubmit.
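
As a rough sketch of what the cleaned-up reservation could look like (the names xgene1_div and xgene1_divide are assumptions here, simply applying the xgene1_ prefix consistently; the resubmitted patch may name them differently, and both units are assumed to be declared elsewhere in the pipeline description):

```
;; Sketch only: names are assumed, not taken from the resubmitted patch.
;; The reservation name gains the missing xgene1_ prefix, and the
;; mistyped xgene_divide unit becomes xgene1_divide.
(define_insn_reservation "xgene1_div" 2
  (and (eq_attr "tune" "xgene1")
       (eq_attr "type" "sdiv,udiv"))
  "xgene1_decode1op,xgene1_divide")
```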

>> +(define_insn_reservation "div" 2
>> +  (and (eq_attr "tune" "xgene1")
>> +       (eq_attr "type" "sdiv,udiv"))
>> +  "xgene1_decode1op,xgene_divide")
> 
> The dangerous part was the reservation duration (the xgene_divide*<large number>).
> The latency number (2 in this version, 66 in the previous) is not harmful to the automaton size
> and can be as high as needed (if this operation is high latency)....

It doesn’t really matter for any workload we’ve encountered, as the hardware is better at dealing with ‘div’ latencies than the scheduler. In particular, ‘div’ is variable-latency, so any fixed guess we make will be wrong; we’ll likely add a scheduling hook function in the future.
The more important thing is to keep the cost of divides high enough in the cost-model.

In other words: 66 would be the worst case and will normally not be correct anyway. Furthermore, it’s rather implausible that we would find 264 instructions (i.e. 66 cycles at 4 instructions per cycle, for this worst-case scenario) to fill the scheduling bubble between the div insn and the use of its result.
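
For reference, the distinction between latency and reservation duration can be sketched as follows. This is an illustration only, not what I intend to submit: the xgene1_-prefixed names are assumed, and 66 is just the worst-case figure from the previous version.

```
;; Sketch only: names are assumed.
;; The second operand (66) is the default latency. It only affects
;; dependence distances and does not grow the automaton.
(define_insn_reservation "xgene1_div" 66
  (and (eq_attr "tune" "xgene1")
       (eq_attr "type" "sdiv,udiv"))
  "xgene1_decode1op,xgene1_divide")

;; By contrast, a reservation string such as
;;   "xgene1_decode1op,xgene1_divide*N"
;; keeps the divide unit reserved for N cycles; a large N there is
;; what makes the automaton grow.
```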

Best,
Philipp.