Kyrill,
I don't mind it being in config/arm if you plan to wire it up later, good to know.
Another comment inlineâ.
Iâll clean up the missing xgene1_ and the mistyped xgene_ prefix and resubmit.
+(define_insn_reservation "div" 2
+ (and (eq_attr "tune" "xgene1")
+ (eq_attr "type" "sdiv,udiv"))
+ "xgene1_decode1op,xgene_divide")
The dangerous part was the reservation duration (the xgene_divide*<large number>).
The latency number (2 in this version, 66 in the previous) is not harmful to the automaton size
and can be as high as needed (if this operation is high latency)....
It doesnât really matter for any workload weâve encountered, as the hardware is better at dealing with âdivâ-latencies than the scheduler (especially, as âdivâ is variable latency and any guess we have will be wrongâ weâll likely add scheduling hook function in the future).
The more important thing is to keep the cost of divides high enough in the cost-model.
In other words: 66 would be the worst case and will normally not be correct anyway. Furthermore, itâs rather unplausible, that we find 264 instructions (for this worst-case scenario) to fill the scheduling bubble between the div-insn and its result usage.