This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition


On Thu, Aug 8, 2013 at 3:23 PM, Jan Hubicka <hubicka@ucw.cz> wrote:
> Hi,
> Martin Liska was kind enough to generate disk seeking graph of gimp statrup with his function reordering.
> His code simply measures time of firest execution of a function and orders functions in the given order.
> The functions stay in the subsections (unlikely/startup/exit/hot/normal) that are then glued together
> in this order.
>
> I am attaching disk seeking with and without -freorder-blocks-and-partition (with your patch).
>
> In 2.pdf you can see two increasing sequences in the text segment.  If I am not mistaken the bottom
> one comes for hot and the top one for normal section.  The big unused part on bottom is unlikely
> section since most of gimp is not trained.

2.pdf is reordered with Martin's technique?

>
> Now 1.pdf is with -freorder-blocks-and-partition and your patch.  You can see there is third sequence
> near bottom of the text seciton. that is beggining of unlikely section, so it tracks jumps where we
> fall into cold section of function.

1.pdf is generated using the usual FDO +
-freorder-blocks-and-partition (i.e. not using Martin's technique)?

>
> It still seems rather bad (i.e. good part of unlikely section is actually used).  I think the dominator
> based approach is not going to work too reliably (I can "fix" my testcase to contain multiple nested
> conditionals and then the heuristic about predecestors won't help).

Yes, this doesn't look good. Did you use the latest version of my
patch that doesn't walk the dominators?

Do you know how many training runs are done for this benchmark? I
think a lot of the issues that you pointed out with the hot loop
preceded by non-looping conditional code as in your earlier example,
or multiple nested conditionals, comes from the fact that the cold
cutoff is not 0, but some number less than the number of training
runs. Perhaps the cutoff for splitting should be 0. Then the main
issue that needs to be corrected is profile insanities, not code that
is executed once (since that would not be marked cold).

The only other issue that I can think of here is that the training
data was not representative and didn't execute these blocks.

>
> What about simply walking the CFG from entry through all edges with non-0 counts and making all reachable
> blocks hot + forcingly make any hot blocks not reachable this way reachable?

Is this different than what I currently have + changing the cold
cutoff to 0? In that case any blocks reachable through non-0 edges
should be non-0 and marked hot, and the current patch forces the most
frequent paths to all hot blocks to be hot.

Thanks,
Teresa

> I think we are really looking primarily for dead parts of the functions (sanity checks/error handling)
> that should not be visited by train run.  We can then see how to make the heuristic more aggressive?
>
> Honza



-- 
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]