I don't think my suggestion prevents optimizing (2). :-) In case (2),
__builtin_sin will be serialized out, read in by LTO, and optimized.
That makes sense. I didn't know that we transform sin to
"__builtin_sin" even at "-O0".
We do not generally convert calls to sin() to calls to __builtin_sin(), but
if the middle-end introduces new calls to sin() it generally uses the
builtin variant. We would need to start doing that conversion to make
Paolo's suggestion work, which I think would be a sensible thing to do to
address this problem.
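
For readers following along, here is a minimal C sketch of the two source
spellings being discussed. It is an illustration only: the helper names
(sin_by_name, sin_by_builtin, spellings_agree) are hypothetical, and it says
nothing about what gets serialized for LTO or which form the middle-end
picks. It only shows that both spellings name the same math function.

```c
#include <assert.h>
#include <math.h>

/* Plain library spelling: a call to sin() as declared in <math.h>. */
double sin_by_name(void)
{
    return sin(1.0);
}

/* Explicit builtin spelling, as accepted by GCC-compatible compilers. */
double sin_by_builtin(void)
{
    return __builtin_sin(1.0);
}

/* Returns 1 if both spellings give the same value within a small
   tolerance (the tolerance is an assumption, chosen to be generous). */
int spellings_agree(void)
{
    double d = sin_by_name() - sin_by_builtin();
    return d < 1e-12 && d > -1e-12;
}
```

Depending on the toolchain, this may need to be linked with -lm.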