This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
- From: Teresa Johnson <tejohnson at google dot com>
- To: Jan Hubicka <hubicka at ucw dot cz>
- Cc: Bernhard Reutner-Fischer <rep dot dot dot nop at gmail dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Steven Bosscher <stevenb dot gcc at gmail dot com>, Jeff Law <law at redhat dot com>, "marxin.liska" <marxin dot liska at gmail dot com>, Sriraman Tallam <tmsriram at google dot com>, Rong Xu <xur at google dot com>
- Date: Sun, 11 Aug 2013 06:25:33 -0700
- Subject: Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
- References: <CAAe5K+XM_RCJBmtKSyQCfD86hi2Ls_BCyuZkd2WKSoGUFm84DA at mail dot gmail dot com> <20130802150529 dot GC15776 at kam dot mff dot cuni dot cz> <CAAe5K+WiR02Rs1jYMFRF2F8ey60UO7LwRa8WWq7coQ5Pq8HhiQ at mail dot gmail dot com> <CAAe5K+WToUznwFFfm5beapXAOOrOgxHR8LXmYBTL70C4VVsT+w at mail dot gmail dot com> <20130808222332 dot GA31755 at kam dot mff dot cuni dot cz> <CAAe5K+W+8borbPkt4BB1ayRgDbFBtd6oyZsGuUiC854o9t0Rjg at mail dot gmail dot com> <20130809095843 dot GC31755 at kam dot mff dot cuni dot cz> <CAAe5K+XXT6t5CXBDXPWMNSrLWwqfw8F_J2fNUAN2afqb5qPhzQ at mail dot gmail dot com> <20130809152804 dot GA6579 at kam dot mff dot cuni dot cz> <CAAe5K+UcMevsDcXOeq5tu0+u-FrLVVWR6Wd20ZhBHNWdNsQ4Zw at mail dot gmail dot com> <20130811122143 dot GE22678 at kam dot mff dot cuni dot cz>
Cc'ing Rong since he is also working on trying to address the comdat
profile issue. Rong, you may need to see an earlier message for more
context:
http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00558.html
Teresa
On Sun, Aug 11, 2013 at 5:21 AM, Jan Hubicka <hubicka@ucw.cz> wrote:
>>
>> I see, yes LTO can deal with this better since it has global
>> information. In non-LTO mode (including LIPO) we have the issue.
>
> Either Martin or me will implement merging of the multiple copies at
> LTO link time. This is needed for Martin's code unification patch anyway.
>
> Theoretically gcov runtime can also have symbol names and cfg checksums of
> comdats in the static data and at exit produce buckets based on matching
> names+checksums+counter counts, merge all data into in each bucket to one
> representative by the existing merging routines and then memcpy them to
> all the oriignal copiles. This way all compilation units will receive same
> results.
>
> I am not very keen about making gcov runtime bigger and more complex than it
> needs to be, but having sane profile for comdats seems quite important.
> Perhaps, in GNU toolchain, ordered subsections can be used to make linker to
> produce ordered list of comdats, so the runtime won't need to do hashing +
> lookups.
>
> Honza
>>
>> I take it gimp is built with LTO and therefore shouldn't be hitting
>> this comdat issue?
>>
>> Let me do a couple things:
>> - port over my comdat inlining fix from the google branch to trunk and
>> send it for review. If you or Martin could try it to see if it helps
>> with function splitting to avoid the hits from the cold code that
>> would be great
>> - I'll add some new sanity checking to try to detect non-zero blocks
>> in the cold section, or 0 blocks reached by non-zero edges and see if
>> I can flush out any problems with my tests or a profiledbootstrap or
>> gimp.
>> - I'll try building and profiling gimp myself to see if I can
>> reproduce the issue with code executing out of the cold section.
>>
>> Thanks,
>> Teresa
>>
>> >>
>> >> Also, can you send me reproduction instructions for gimp? I don't
>> >> think I need Martin's patch, but which version of gimp and what is the
>> >> equivalent way for me to train it? I have some scripts to generate a
>> >> similar type of instruction heat map graph that I have been using to
>> >> tune partitioning and function reordering. Essentially it uses linux
>> >> perf to sample on instructions_retired and then munge the data in
>> >> several ways to produce various stats and graphs. One thing that has
>> >> been useful has been to combine the perf data with nm output to
>> >> determine which cold functions are being executed at runtime.
>> >
>> > Martin?
>> >
>> >>
>> >> However, for this to tell me which split cold bbs are being executed I
>> >> need to use a patch that Sri sent for review several months back that
>> >> gives the split cold section its own name:
>> >> http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01571.html
>> >> Steven had some follow up comments that Sri hasn't had a chance to address yet:
>> >> http://gcc.gnu.org/ml/gcc-patches/2013-05/msg00798.html
>> >> (cc'ing Sri as we should probably revive this patch soon to address
>> >> gdb and other issues with detecting split functions properly)
>> >
>> > Intresting, I used linker script for this purposes, but that his GNU ld only...
>> >
>> > Honza
>> >>
>> >> Thanks!
>> >> Teresa
>> >>
>> >> >
>> >> > Honza
>> >> >>
>> >> >> Thanks,
>> >> >> Teresa
>> >> >>
>> >> >> > I think we are really looking primarily for dead parts of the functions (sanity checks/error handling)
>> >> >> > that should not be visited by train run. We can then see how to make the heuristic more aggressive?
>> >> >> >
>> >> >> > Honza
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
>> >>
>> >>
>> >>
>> >> --
>> >> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413
--
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413