[PATCH] Relax invalidation of TOP N counters in PGO.

Martin Liška mliska@suse.cz
Wed Jan 8 13:19:00 GMT 2020


On 1/8/20 1:24 PM, Jan Hubicka wrote:
>> On 1/8/20 11:35 AM, Jan Hubicka wrote:
>>> Hi,
>>> Just to explain better what I am worried about.  The overall sum of
>>> counters in TOPN does not have very good meaning if you have more than N
>>> target.
>>>
>>> Lets for simplicity assume that we have TOPN for N=1 (i.e. old code). It
>>> guarantees if target X is taken by more than 50% of times, it will win,
>>> however its count can be arbitrarily low.
>>>
>>> Consider following sequence of call targets
>>>
>>> Y  Z  Z  Y  Y  Z  Z  X  X  X  X  X  X  X  X  X
>>> The TOPN counter will behave as follows:
>>> Y1 Y0 Z1 Z0 Y1 Y0 Z1 Z0 X1 X2 X3 X4 X5 X6 X7 X8
>>>
>>> Now for sequence
>>>
>>> Y  Y  Y  X  X  X  Z  Z  Z  X  X  X  Z  X  X  X
>>> Y1 Y2 Y3 Y2 Y1 Y0 Z1 Z2 Z3 Z2 Z1 Z0 Z1 Z0 X1 X2
>>>
>>> There are 16 calls, 9 of them calling X, 3 calls of Y and 4 calls of Z.
>>> Depending on order the counter of X can be anywhere between 2 and 8. So
>>> setting threshold without trashing a lot of valid cases would be hard
>>> and if you have something like 40k of parallel invocations of clang I
>>> would expect quite high divergence between the runs.
>>
>> Thank you for the example. I'm aware of these sequences that can lead
>> to a significant divergence in profiles.
>>
>>>
>>> Similar sequence exists for N>1, you only need to populate other entries
>>> by new targets.
>>
>> Yes, but it's much more harder to rapidly decrease a dominant value if we
>> want to assume that only 10% of samples were populated.
>>
>>>
>>> However based on observation above we could probably scale up the
>>> probabilites assuming that the missing part of histogram has pretty much
>>> the same distribution as known part.
>>
>> I hope so.
>>
>> I'm also opened for a lower default param value. If you can provide any numbers
>> for clang I would appreciate that.
> 
> Well, my problem is that as soon as the parameter is non-0 you will
> inherently get different counts on counters you accept. Even if you cut
> them to differ by less than 10% they will still differ from run to run.
> This will lead to different speculation probabilities in each build
> which may or may not produce different code.  For complex callgraphs
> even relatively small differences in weights leads to reordering of
> inline decisions which may lead to different inline decision but even if
> it does not it definitly leads to different order of callgraph,
> different UIDs of decl which give different hashtable values in late
> optimizers and make them to diverge.

Fine, I don't insist on the patch, it was actually reaction to your PR.

> 
> If I want to get reproducible builds of very large code base (like
> clang) i am not even sure how to test it: look for value which leads to
> no difference in say 1000 builds and do it each time GCC or clang
> updates?

Yep, testing that is very difficult of course.

> 
> I would still preffer invalidation before streaming (which is fully
> deterministic) and possibly have option

Do you mean __gcov_merge_topn?

> -fno-deterministic-profile-merging (or something similar) for users who
> do not care and for threaded programs where this is impossible
> (which needs to trigger different merging in libgcov).

Fine, let's postpone that for GCC 11.

Martin

> 
> Honza
> 



More information about the Gcc-patches mailing list