This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

From: Jan Hubicka <hubicka at ucw dot cz>
To: Teresa Johnson <tejohnson at google dot com>
Cc: Jan Hubicka <hubicka at ucw dot cz>, Bernhard Reutner-Fischer <rep dot dot dot nop at gmail dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Steven Bosscher <stevenb dot gcc at gmail dot com>, Jeff Law <law at redhat dot com>, "marxin.liska" <marxin dot liska at gmail dot com>, Sriraman Tallam <tmsriram at google dot com>
Date: Wed, 28 Aug 2013 18:58:30 +0200
Subject: Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
Authentication-results: sourceware.org; auth=none
References: <20130809095843 dot GC31755 at kam dot mff dot cuni dot cz> <CAAe5K+XXT6t5CXBDXPWMNSrLWwqfw8F_J2fNUAN2afqb5qPhzQ at mail dot gmail dot com> <20130809152804 dot GA6579 at kam dot mff dot cuni dot cz> <CAAe5K+UcMevsDcXOeq5tu0+u-FrLVVWR6Wd20ZhBHNWdNsQ4Zw at mail dot gmail dot com> <CAAe5K+W+VEERrsaCCjCD=n40_MO2EPashDp5qnKoS8SLaSjBjQ at mail dot gmail dot com> <20130817204408 dot GA16557 at kam dot mff dot cuni dot cz> <CAAe5K+XGKvd+_8Ukp0kpOWvc2k165F=4fdemf-iDz+QkirLPmg at mail dot gmail dot com> <20130819150942 dot GA28264 at kam dot mff dot cuni dot cz> <CAAe5K+UnqBfxYrZxSkjRudq8NYni_9ih+t=us+2hH+UPsrwDLA at mail dot gmail dot com> <CAAe5K+UhCzfwm_WomE1yk+ET1tBiNT5svfn_LAc57MUSqJbnaQ at mail dot gmail dot com>

Hi,
with Martin we did bit of progress on analyzing the problems.  We now have
COMDAT profile merging for FDO and we also noticed that forks can make your
basic blocks appear never executed even though they are executed every run:
the fork is accounted as 3 independnet runs of the program.  First run
is until fork, the second 2 runs are both variant.

I have patch to track this.  Moreover vforks seems to produce repeated
merging of results.

These two factors seems to help to riddle enought the firefox profiles
so it took us weeks to understand what happens.
> +         if (changed)
> +            {
> +              /* Edge forwarding in particular can cause hot blocks previously
> +                 reached by both hot and cold blocks to become dominated only
> +                 by cold blocks. This will cause the verification
> below to fail,
> +                 and lead to now cold code in the hot section. This is not easy
> +                 to detect and fix during edge forwarding, and in some cases
> +                 is only visible after newly unreachable blocks are deleted,
> +                 which will be done in fixup_partitions.  */
> +              fixup_partitions ();

Is it really necessary to run this from internal loop of the cfgcleanup?  It seems
you will play back and forth game where edge forwarding will remove your fallthru
and you will be re-adding it?

I would wait for cfgcleanup to finish its job (I don't really think it needs to be
iterative) and then do the fixup possibly cleaning up when the blocks was repoisitoined
(I suppose it is only case where the code above introduces new cfgcleanup oppurtunities).

> +      /* Walk the preds/succs and check if there is at least one already
> +         marked hot. Keep track of the most frequent pred/succ so that we
> +         can mark it hot if we don't find one.  */
> +      FOR_EACH_EDGE (e, ei, edges)
> +        {
> +          basic_block reach_bb = walk_up ? e->src : e->dest;
> +
> +          if (e->flags & EDGE_DFS_BACK)
> +            continue;
> +
> +          if (BB_PARTITION (reach_bb) != BB_COLD_PARTITION)
> +          {
> +            found = true;
> +            break;
> +          }
> +          if (e->probability > highest_probability)
> +            highest_probability = e->probability;

When doing predecestor walk if you have two predecestors, one executing 100000
times and getting to the block with probability 1%, you want to chose it over
block executing once and getting to you with probability 100%.

You probably want to look for most likely predecestor here.  You need to look
for highest e->count and if they are all 0 for highest EDGE_FREQUENCY and for
maximal probability only if all fails?

> +        }
> +
> +      /* If bb is reached by (or reaches, in the case of !WALK_UP) another hot
> +         block (or unpartitioned, e.g. the entry block) then it is ok. If not,
> +         then the most frequent pred (or succ) needs to be adjusted.  In the
> +         case where multiple preds/succs have the same probability (e.g. a
> +         50-50 branch), then both will be adjusted.  */
> +      if (found)
> +        continue;
> +
> +      FOR_EACH_EDGE (e, ei, edges)
> +        {
> +          if (e->flags & EDGE_DFS_BACK)
> +            continue;
> +          if (e->probability < highest_probability)
> +            continue;

Again for predecestor walk you need to wind down the bit crazy logic described above.
> Index: predict.c
> ===================================================================
> --- predict.c   (revision 202021)
> +++ predict.c   (working copy)
> @@ -241,6 +241,22 @@ probably_never_executed_bb_p (struct function *fun
>    return false;
>  }
> 
> +
> +/* Return true in case edge E is probably never executed.  */
> +
> +bool
> +probably_never_executed_edge_p (struct function *fun, edge e)
> +{
> +  gcc_checking_assert (fun);
> +  if (profile_info && flag_branch_probabilities)
> +    return ((e->count + profile_info->runs / 2) / profile_info->runs) == 0;
> +  if ((!profile_info || !flag_branch_probabilities)
> +      && (cgraph_get_node (fun->decl)->frequency
> +         == NODE_FREQUENCY_UNLIKELY_EXECUTED))
> +    return true;
> +  return false;
Instead of duplicating the conditional, break out the tests into
probably_never_executed_count_p, like we have for maybe_hot_count_p.

It would be nice to extend this to work w/o profile: probably_never_executed_edge_p
can return true for EH edges, setjmp edges and we can then walk bodies ignoring
EH/setjmp flagging blocks that are probably never executed enabling the partitioning
to do a job w/o profile.

Otherwise the patch look OK to me. Thanks for working on this. Do we have agreement on C++ way of mixing
declarations and code?

Honza

Follow-Ups:
- Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
  - From: Teresa Johnson

References:
- Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
  - From: Jan Hubicka
- Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
  - From: Teresa Johnson
- Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
  - From: Jan Hubicka
- Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
  - From: Teresa Johnson
- Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
  - From: Teresa Johnson
- Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
  - From: Jan Hubicka
- Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
  - From: Teresa Johnson
- Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
  - From: Jan Hubicka
- Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
  - From: Teresa Johnson
- Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition
  - From: Teresa Johnson

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]