This is the mail archive of the
mailing list for the GCC project.
Re: [RFC] Memcpy/memset profiling infrastructure
- From: "Richard Guenther" <richard dot guenther at gmail dot com>
- To: "Jan Hubicka" <jh at suse dot cz>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Thu, 26 Oct 2006 13:44:23 +0200
- Subject: Re: [RFC] Memcpy/memset profiling infrastructure
- References: <20061026113337.GL610@kam.mff.cuni.cz>
On 10/26/06, Jan Hubicka <email@example.com> wrote:
I am sending this patch as RFC because it would need at least a mechanism to hide internal builtin functions from the user.
What I am shooting for is to annotate histogram information during profiling
about the expected size and alignment of memcpy/memset blocks, to be used later
at RTL expansion time. I do have a patch that allows choosing the proper memcpy
algorithm (i.e. rep/movs, loop, unrolled loop or libcall) in the x86 backend based
on this info.
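
A minimal sketch of the kind of decision described above, assuming the profile yields an expected block size and alignment. The enum, the function name and every threshold are hypothetical and are not taken from the patch; they only illustrate how such hints could steer the choice between the four strategies.

```c
#include <stddef.h>

/* Hypothetical strategy names, mirroring the options listed above. */
enum copy_strategy {
    COPY_REP_MOVS,
    COPY_LOOP,
    COPY_UNROLLED_LOOP,
    COPY_LIBCALL
};

/* Pick a strategy from profile hints.
   expected_size == 0 is assumed to mean "no profile data".
   All thresholds are illustrative assumptions, not tuned values. */
enum copy_strategy choose_copy_strategy(size_t expected_size,
                                        unsigned expected_align)
{
    if (expected_size == 0)
        return COPY_LIBCALL;            /* nothing known: stay generic */
    if (expected_size <= 32 && expected_align >= 4)
        return COPY_UNROLLED_LOOP;      /* small, aligned block */
    if (expected_size <= 256)
        return COPY_LOOP;               /* small or unaligned block */
    if (expected_size <= 8192)
        return COPY_REP_MOVS;           /* medium block */
    return COPY_LIBCALL;                /* large block */
}
```

A real backend would presumably derive such thresholds per CPU model rather than hard-code them.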
The problem lies in annotating the call with the histogram. What I do is
simply add variants builtin_memcpy_hints/builtin_memset_hints that accept
this extra information as additional arguments. This is very non-intrusive to
the rest of the middle-end but has the disadvantage that it works only for explicit
memset/memcpy calls (i.e. not for structure assignments, where alignment would
still be interesting, but not as much as in the generic case) and it would be
moderately painful to add similar profiling to other builtins (in my profiling
code, not included in the patch, only memset/memcpy/bzero is profiled) because new
alternatives need to be introduced.
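
To illustrate the shape of such a variant, here is a sketch in plain C. The function name memcpy_hints, its argument order and the hint types are assumptions made for illustration; the actual signature of the builtins in the patch is not shown in this message. The point is only that the hinted call behaves exactly like memcpy, with the profile data carried along as extra arguments.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical user-level stand-in for a hinted memcpy variant.
   expected_size and expected_align are profile-derived hints; they do
   not change the result, only (in a compiler) how the copy is expanded. */
void *memcpy_hints(void *dst, const void *src, size_t n,
                   size_t expected_size, unsigned expected_align)
{
    (void) expected_size;   /* unused in this sketch */
    (void) expected_align;  /* unused in this sketch */
    return memcpy(dst, src, n);
}
```

Because the hints ride on an explicit call, a plain structure assignment such as `a = b;` has no call to attach them to, which is the limitation noted above.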
I've discussed this briefly at the GCC summit with Rth and we didn't find a better
way around. Possibly if we get rid of TER, it would be more convenient to
attach the profiles to statements and use them at expansion time, but even that
has problems, since updating the histograms would need some care.
If no one comes up with a good idea, I will add the bits to keep the function from
being user visible (how is this best done, BTW?) and send an updated patch at the
beginning of next week.
I agree that this is the most sensible way of doing it - did you verify
that statement annotations for the call do not survive until expansion?
We at least have stuff stored for value profiling there (but it is used
before expansion in a