While attempting to port a profiler designed to work with the Microsoft compiler's /Gh and /GH (generate _penter and _pexit) options, I have rediscovered a problem that was previously filed as bug #23296: "When compiling code with gcc 4.0.0 or 4.0.1 and when specifying both the -finstrument-functions and the -O3 options, then __cyg_profile_func_enter() is called two or more times successively with exactly the same arguments (called function pointer and call site pointer). This should never happen." This was marked INVALID with the comment: "This is not a bug, this is how it works now in 4.0.0 and above with respect with the inliner."
However, this makes it more difficult to implement an efficient and accurate profiler. The inlined functions look just like regular ones, and can't be billed to the called function, because the called function doesn't appear in the binary, and its address wouldn't be given to __cyg_profile_func_enter even if it were. So the only semantics a profiler could get out of this are "a function would have been called here if it hadn't been inlined, but we don't know what it was in any case." And when lots of functions are inlined, we'll take a significant time hit (and possibly disturb pipelining and optimization) to get this effectively useless knowledge. The best we can do is say "-fno-inline" to turn off inlining altogether, and accept a slower and less accurate profiler.
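To make the symptom concrete, here is a minimal sketch of a pair of hooks that counts how often __cyg_profile_func_enter() is entered twice in a row with identical arguments and no exit in between. The counters and the have_last flag are hypothetical bookkeeping of my own, not anything GCC provides, and the check is only a heuristic: direct recursion from a single call site would also trip it.

```c
#include <stddef.h>

/* Hypothetical bookkeeping for this sketch; none of it comes from GCC. */
static void *last_fn;
static void *last_site;
static int have_last;                  /* 1 if the previous event was an enter */
static unsigned long enter_count;
static unsigned long duplicate_enters;

/* The hooks themselves must not be instrumented, or they would recurse. */
void __cyg_profile_func_enter(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *this_fn, void *call_site)
    __attribute__((no_instrument_function));

void __cyg_profile_func_enter(void *this_fn, void *call_site)
{
    /* Two enters in a row with the same arguments are what the report
       describes for inlined callees. */
    if (have_last && this_fn == last_fn && call_site == last_site)
        duplicate_enters++;

    last_fn = this_fn;
    last_site = call_site;
    have_last = 1;
    enter_count++;
}

void __cyg_profile_func_exit(void *this_fn, void *call_site)
{
    (void)this_fn;
    (void)call_site;
    /* An exit in between means the next enter is a fresh call, not a duplicate. */
    have_last = 0;
}
```

Linking this into a program built with -finstrument-functions and comparing duplicate_enters with and without -O3 should show whether a given compiler version exhibits the behaviour.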
I would like to have another option (or to change the semantics of the existing option) that would cause inlined function calls to behave as if they had __attribute__((no_instrument_function)). If this were too hard, it would also be acceptable to make it a little more heuristic and cause possibly-inlined function calls to behave as if they had __attribute__((no_instrument_function)). Although I'm not familiar with the gcc codebase, I can imagine this working by making a change in expand_function_start in gcc/function.c:
from
    && ! DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT (subr));
to
    && ! DECL_NO_INSTRUMENT_FUNCTION_ENTRY_EXIT (subr)
    && ! tree_inlinable_function_p (subr));
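Until something like that exists, the requested behaviour can be approximated by hand for functions whose source is under one's control. The sketch below marks a small helper with no_instrument_function, so no enter/exit calls are generated for it whether or not it is inlined. The function names are hypothetical and only serve as illustration.

```c
/* Hypothetical helper: excluded from instrumentation by hand, which is
   what the proposed option would do automatically for inlined calls. */
static inline int square(int x) __attribute__((no_instrument_function));

static inline int square(int x)
{
    return x * x;
}

/* Ordinary function: still instrumented under -finstrument-functions,
   so the cost of the inlined helper is billed here. */
int sum_of_squares(int a, int b)
{
    return square(a) + square(b);
}
```

This obviously does not scale to a large codebase, and it does not help with functions defined in headers one cannot edit, which is why a compiler option would be preferable.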
(In reply to comment #0)
> The best we can do is say
> "-fno-inline" to turn off inlining altogether, and accept a slower and less
> accurate profiler.
This is problematic as well, due to the existence of __attribute__((always_inline)), which causes functions to be inlined even when -fno-inline is specified. In particular, my code calls functions defined by the system header emmintrin.h as always inlined.
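For illustration, a function declared the way such headers declare theirs looks roughly like the sketch below. The names are hypothetical and not taken from emmintrin.h. Because of always_inline, the helper is expanded into its caller regardless of -fno-inline.

```c
/* Hypothetical helper declared always_inline, in the style of an
   intrinsic wrapper; -fno-inline does not prevent it from being inlined. */
static inline int clamp_to_byte(int v) __attribute__((always_inline));

static inline int clamp_to_byte(int v)
{
    return v < 0 ? 0 : (v > 255 ? 255 : v);
}

/* Caller: the body of clamp_to_byte() ends up inside this function. */
int brighten(int pixel, int delta)
{
    return clamp_to_byte(pixel + delta);
}
```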
Confirmed, it is a little hard as -finstrument-functions is now applied before inlining.
Would it be feasible to pass a different function address for inlined functions? In particular, it should not be the same address as the parent function.
I seem to recall that in previous versions (2.95.x, 3.x) of GCC __cyg_profile_func_enter() used the call location of an inlined function as
This is somewhere in the middle of the parent function, which makes sense somehow. Any invalid address could be used as well.
In the end, the user could be aware of a function being inlined.