[PATCH 2/2, AArch64, v2] Pipeline model for APM XGene-1.

Thu Nov 20 11:01:00 GMT 2014

Hi Philipp,

On 20/11/14 10:47, Dr. Philipp Tomsich wrote:
> Kyrill,
>
>> I don't mind it being in config/arm if you plan to wire it up later, good to know.
>> Another comment inline….
> I’ll clean up the missing xgene1_ and the mistyped xgene_ prefix and resubmit.
>
>>> +(define_insn_reservation "div" 2
>>> +  (and (eq_attr "tune" "xgene1")
>>> +       (eq_attr "type" "sdiv,udiv"))
>>> +  "xgene1_decode1op,xgene_divide")
>> The dangerous part was the reservation duration (the xgene_divide*<large number>).
>> The latency number (2 in this version, 66 in the previous) is not harmful to the automaton size
>> and can be as high as needed (if this operation is high latency)....
> It doesn’t really matter for any workload we’ve encountered, as the hardware is better at dealing with ‘div’ latencies than the scheduler (especially as ‘div’ is variable-latency and any guess we make will be wrong… we’ll likely add a scheduling hook function in the future).
> The more important thing is to keep the cost of divides high enough in the cost-model.
>
> In other words: 66 would be the worst case and will normally not be correct anyway. Furthermore, it’s rather implausible that we would find 264 instructions (for this worst-case scenario) to fill the scheduling bubble between the div insn and the use of its result.

Ok, makes sense. I just thought that 2 is a bit too low, but if your
benchmarking showed it to be reasonable I won't complain ;)
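
For the archives, a rough sketch of the distinction I was drawing
(illustrative only, not a proposed change): the latency operand can be
large without growing the automaton, as long as the reservation string
stays short. The reservation name xgene1_div and the cleaned-up
xgene1_divide unit name below are assumptions on my part, and 66 is just
the worst-case figure from the previous version.

```
;; Illustrative sketch only; names are assumed, not taken from the patch.
;; The second operand (66) is the latency seen by dependent insns.
;; The reservation string occupies xgene1_divide for a single cycle,
;; so the automaton stays small despite the large latency.
(define_insn_reservation "xgene1_div" 66
  (and (eq_attr "tune" "xgene1")
       (eq_attr "type" "sdiv,udiv"))
  "xgene1_decode1op,xgene1_divide")
```

What blew up the automaton earlier was repeating the unit in the
reservation string (the xgene_divide*<large number> form), not the
latency value itself.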

Kyrill

>
> Best,
> Philipp.
>