As far as madd goes, I think it would be better to either
(a) get combine to handle this situation or (b) get expand
to generate a fused multiply-add from the outset.
(b) sounds like it might be useful in its own right. At the moment we
treat the generation of floating-point multiply-adds as an optimisation,
but in some applications it's critical not to round the intermediate
result. (I don't know whether there's a Bugzilla entry about this.)
If we treated fused multiply-add as a primitive operation, we could
extend it to integer types too. In this case we'd also need to
handle widening multiplications, but we already need to do that
for stand-alone multiplications.
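
As a rough source-level picture of the integer case, the sketch below shows the widening multiply-add shape I have in mind. The function names are hypothetical, and whether a given target turns either function into a single instruction depends on the port; the code only pins down the semantics.

```c
#include <stdint.h>

/* Signed widening multiply-add: the 32-bit operands are extended to
   64 bits before the multiply, so the product cannot overflow, and
   the result is accumulated in 64 bits.
   (Hypothetical name, for illustration only.) */
int64_t widening_madd(int32_t a, int32_t b, int64_t acc)
{
    return acc + (int64_t)a * (int64_t)b;
}

/* Unsigned counterpart: zero-extend instead of sign-extend.
   (Hypothetical name, for illustration only.) */
uint64_t widening_umadd(uint32_t a, uint32_t b, uint64_t acc)
{
    return acc + (uint64_t)a * (uint64_t)b;
}
```

For example, widening_madd(100000, 100000, 1) yields 10000000001, a value that a 32-bit multiply could not represent. The stand-alone widening multiplication is just the same operation with acc set to 0, which is why the two cases ought to share their handling.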