GCC should optimize printf("%s", foo) and printf("foo") into fputs(foo, stdout)
and fputs("foo", stdout), respectively. As noted here:
We can capture stdout in an inline function using fixincl, perhaps adding the
__always_inline__ attribute. Then do the above transformation.
In at least the printf("%s", foo) case, the result fputs(foo,stdout) has the
same number of arguments, so it might not even be a -Os problem.
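For illustration, here is a minimal sketch of the two forms side by side. The function names emit_with_printf and emit_with_fputs are made up for this example and are not part of GCC or libc. Both calls take two arguments and write the same bytes to stdout, but the return values differ (printf returns the number of characters written, fputs only a non-negative value on success), so presumably the transformation is only safe when the result of printf is unused.

```c
#include <stdio.h>

/* Hypothetical helper: what the programmer writes today.
   Two arguments: the format string and foo. */
static int emit_with_printf(const char *foo)
{
    return printf("%s", foo);
}

/* Hypothetical helper: what the proposed transformation would emit.
   Still two arguments: foo and the stream. */
static int emit_with_fputs(const char *foo)
{
    return fputs(foo, stdout);
}
```

Calling emit_with_printf("abc") and emit_with_fputs("abc") both put "abc" on stdout; only the returned int differs.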
Confirmed, and yes we need to do something about stdout :).
Getting stdout wrapped in an inline function is not hard. I can add something to fixincl, or use some similar mechanism, to capture it. The part I don't know how to do is expand that inline function's body into the code stream from fold_builtin_printf or expand_builtin_printf.
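A rough sketch of what such a wrapper might look like, assuming a fixincl-style hook could inject it into stdio.h. The names sketch_get_stdout and sketch_print_string are hypothetical, and the attribute syntax is the GCC extension form; this only shows the user-level shape, not how the builtin folder would expand it.

```c
#include <stdio.h>

/* Hypothetical wrapper that captures whatever "stdout" expands to on
   this platform (a macro, a global, a function call, ...). */
static inline __attribute__((__always_inline__)) FILE *
sketch_get_stdout(void)
{
    return stdout;
}

/* Hypothetical result of transforming printf("%s", foo): the stream
   argument comes from the wrapper instead of naming stdout directly. */
static int sketch_print_string(const char *foo)
{
    return fputs(foo, sketch_get_stdout());
}
```

The point of the wrapper is that the compiler would only need to know the name of one function, rather than how each libc happens to define stdout.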
Simply passing the inline function call as the stream argument to fputs and calling expand() used to work when we had the RTL inliner, because that ran after builtin expansion. With the tree inliner, inlining has already happened by that point, so we would need some additional step.
Anybody have ideas on this? It might also help with PR 24729.