This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Add working-set size and hotness information to fdo summary (issue6465057)


> 
> This is useful for large applications with a long tail. The
> instruction working set for those applications is very large, and the
> inliner and unroller need to be aware of that so that good heuristics
> can be developed to throttle aggressive code bloat transformations. For
> the inliner, it is kind of like the global budget but more application
> dependent. In the long run, we will collect more advanced fdo summary
> information regarding the working set -- it will be the working set
> size for each code region (locality region).

I see, so you use it to estimate the size of the working set and the effect of
bloating optimizations on cache size. This sounds interesting. What are your
experiences with this?

What concerns me is that it is greatly inaccurate - you have no idea how many
instructions a given counter is guarding, and it can differ quite a lot. Also,
inlining/optimization makes working sets significantly different (by a factor
of 100 for tramp3d). But on the other hand, any solution at this level will be
greatly inaccurate. So I am curious how reliable the data you get from this is.
How do you take this into account in the heuristics?

It seems to me that for this use perhaps the simple logic in histogram merging
of maximizing the number of BBs for a given bucket will work well?  It is
inaccurate, but we are working with greatly inaccurate data anyway.
Except for degenerate cases, the small and unimportant runs will have small BB
counts, while large runs will have larger counts, and those are the ones we
optimize for anyway.
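
The per-bucket maximum merge suggested above could look roughly like the
following. This is only a minimal sketch, not the libgcov implementation: the
type ws_bucket, the constant WS_BUCKETS and the function
ws_histogram_merge_max are hypothetical names, and it assumes every run
produces a histogram with the same bucket boundaries, where each bucket records
how many BB counters fall into that counter-value range.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bucket count; a real histogram may use a different size.  */
#define WS_BUCKETS 8

/* Hypothetical histogram bucket: the number of BB counters whose value
   falls into this bucket's counter-value range.  */
struct ws_bucket
{
  unsigned num_bbs;
};

/* Merge SRC into DST by keeping, for each bucket, the larger BB count.  */
void
ws_histogram_merge_max (struct ws_bucket *dst, const struct ws_bucket *src)
{
  size_t i;

  assert (dst != NULL && src != NULL);
  for (i = 0; i < WS_BUCKETS; i++)
    if (src[i].num_bbs > dst[i].num_bbs)
      dst[i].num_bbs = src[i].num_bbs;
}
```

Because the maximum is commutative and associative, the result of this merge
does not depend on the order in which the runs finish.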
> 
> 
> >  2) Do we plan to add some features in the near future that will require global locking anyway?
> >     I guess LIPO itself does not count, since it streams its data into an independent file as you
> >     mentioned earlier, and locking the LIPO file is not that hard.
> >     Does LIPO stream everything into that common file, or does it use a combination of gcda files
> >     and a common summary?
> 
> Actually, LIPO module grouping information is stored in gcda files.
> It is also stored in a separate .imports file (one per object) ---
> this is primarily used by our build system for dependence information.

I see, getting LIPO safe WRT parallel updates will be fun. How does LIPO behave
on a GCC bootstrap? (i.e. it does a lot more work in the libgcov module per
invocation, so I am curious if it is practically useful at all).

With an LTO-based solution, a lot can probably be pushed to link time? Before
the actual GCC starts from the linker plugin, the LIPO module can read gcov
CFGs from gcda files and do all the merging/updating/CFG construction that is
currently performed at runtime, right?
> 
> 
> >
> >     What other stuff does Google plan to merge?
> >     (In general I would be curious about merging plans WRT profile stuff, so we get more
> >     synchronized and effective at getting patches in. We have about two months to get it done
> >     in stage1 and it would be nice to get in as much as possible. Obviously some of the patches will
> >     need a bit of discussion, like this one. Hope you do not find it frustrating; I actually think
> >     this is an important feature).
> 
> We plan to merge in the new LIPO implementation based on LTO
> streaming. Rong Xu finished this in a 4.6-based compiler, and he needs
> to port it to 4.8.

Good.  Looks like a lot of work ahead. It would be nice if we could perhaps
start by merging the libgcov infrastructure updates prior to the LIPO changes.
From what I saw on the LIPO branch some time ago, it has a lot of stuff that is
not exactly LIPO specific.

Honza
> 
> 
> thanks,
> 
> David
> 
> >
> > I also realized today that the common value counters (used by switch, indirect
> > call and div/mod value profiling) are unstable WRT different merging orders
> > (i.e.  parallel make in the train run).  I do not think there is an actual
> > solution to that except for not merging the counter sections of this type in
> > libgcov and merging them in some canonical order at profile feedback time.
> > Perhaps we just want to live with this, since the discrepancy here is small
> > (i.e. these counters are quite rare and their outcome has just a local effect
> > on the final binary, unlike the global summaries/edge counters).
> >
> > Honza
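
The merge-order dependence described in the quoted paragraph above can be
illustrated with a small model. This is a sketch under stated assumptions, not
the libgcov code: the struct one_value and the functions one_value_merge and
merge_in_order are hypothetical, and the merge rule is a majority-vote-style
rule assumed for illustration (matching candidate values add their counts,
otherwise the candidate with the larger count survives with the difference).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a single-value counter: the current candidate
   value, a confidence count for it, and the total number of executions.  */
struct one_value
{
  long value;
  long count;
  long all;
};

/* Merge SRC into DST with a majority-vote-style rule (an assumption made
   for illustration): matching candidates add up, otherwise the stronger
   candidate survives with the difference of the counts.  */
void
one_value_merge (struct one_value *dst, const struct one_value *src)
{
  assert (dst != NULL && src != NULL);
  if (dst->value == src->value)
    dst->count += src->count;
  else if (src->count > dst->count)
    {
      dst->value = src->value;
      dst->count = src->count - dst->count;
    }
  else
    dst->count -= src->count;
  dst->all += src->all;
}

/* Merge N runs into a zeroed accumulator in the order given by ORDER.  */
struct one_value
merge_in_order (const struct one_value *runs, const size_t *order, size_t n)
{
  struct one_value acc = { 0, 0, 0 };
  size_t i;

  for (i = 0; i < n; i++)
    one_value_merge (&acc, &runs[order[i]]);
  return acc;
}
```

With three runs reporting (value, count) pairs (1, 5), (2, 3) and (3, 4),
merging them in that order leaves value 3 as the candidate, while merging them
in the reverse order leaves value 1. The total execution count is the same in
both cases; only the candidate value and its count depend on the order.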
> >>
> >> Thanks,
> >> Teresa
> >>
> >> >
> >> > Honza
> >> >>
> >> >> -Andi
> >>
> >>
> >>
> >> --
> >> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413

