This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] cfg.c: Remove cached_make_edge.
- From: Jeffrey A Law <law at redhat dot com>
- To: Kazu Hirata <kazu at cs dot umass dot edu>
- Cc: gcc-patches at gcc dot gnu dot org
- Date: Mon, 29 Nov 2004 13:22:00 -0700
- Subject: Re: [RFC] cfg.c: Remove cached_make_edge.
- Organization: Red Hat, Inc
- References: <20041129.124145.71117592.kazu@cs.umass.edu>
- Reply-to: law at redhat dot com
On Mon, 2004-11-29 at 12:41 -0500, Kazu Hirata wrote:
> Hi,
>
> Attached is a patch to remove cached_make_edge.
>
> cached_make_edge takes an adjacency matrix of the CFG to quickly
> determine if an edge that is requested for addition already exists.
> In GCC, this matrix is called edge_cache, and was originaly meant to
> speed up testcases with a lot of computed gotos.
>
> http://gcc.gnu.org/ml/gcc/1999-10n/msg00078.html
>
> However, retesting on computed-goto-intensive testcases like _num.i
> (PR ????) and pi.i (from PR 2001) show that the edge cache does not
> help these days. This could be due to the fact that computed gotos
> are factored.
>
> The patch removes cached_make_edge and replaces all uses of it with
> make_edge, which performs a check using find_edge. Note that the
> performance of find_edge is not too bad thanks to Jeff's work as long
> as one end of an edge has a low degree.
>
> Here is a timing in seconds on 5 runs of ./cc1 on _numi.i.
>
> original patched
> real: 204.254 203.445 (0.397% down)
> user: 199.371 199.208 (0.081% down)
>
> Here is a timing in seconds on 5 runs of ./cc1 on pi.i.
>
> original patched
> real: 5.563 4.717 (17.935% down)
> user: 5.454 4.628 (17.847% down)
>
> Here is a timing in seconds on 5 runs of ./cc1 on insn-attrtab.i.
>
> original patched
> real: 192.644 190.258 (1.254% down)
> user: 189.373 187.822 (0.825% down)
>
> Here is a timing in seconds on a single run on cc1-i files.
>
> original patched
> real: 232.116 231.826 (0.125% down)
> user: 227.405 226.879 (0.231% down)
>
> Having put all these benchmarks, I should not hide the fact that we
> still have a lot of attempts to create duplicate edges. Here is a
> coverage test done on cc1-i files using GCC with this patch applied.
>
> -: 288:edge
> -: 289:make_edge (basic_block src, basic_block dest, int flags)
> function make_edge called 2677949 returned 100% blocks executed 100%
> 2677949: 290:{
> 2677949: 291: edge e = find_edge (src, dest);
> -: 292:
> -: 293: /* Make sure we don't add duplicate edges. */
> 2677949: 294: if (e)
> -: 295: {
> 1461894: 296: e->flags |= flags;
> 1461894: 297: return NULL;
> -: 298: }
> -: 299:
> 1216055: 300: return unchecked_make_edge (src, dest, flags);
> -: 301:}
>
> Note that more than 50% of calls to make_edge are the duplicate case.
> To see where these duplicate come from, I instrumented make_edge and
> got the following numbers.
>
> 802648 cfgbuild.c:196:make_label_edge trying to create a duplicate edge.
> 620708 cfgbuild.c:383:make_edges trying to create a duplicate edge.
> 15868 cfgbuild.c:244:make_edges trying to create a duplicate edge.
> 7742 cfgbuild.c:379:make_edges trying to create a duplicate edge.
> 6534 cfgbuild.c:321:make_edges trying to create a duplicate edge.
> 4361 cfgbuild.c:336:make_edges trying to create a duplicate edge.
> 1 tree-cfg.c:604:make_cond_expr_edges trying to create a duplicate edge.
>
> The first column is the number of times the location at the second
> column attempts to create a duplicate edge. Note that most duplicates
> come from the CFG builder at RTL level. We cannot assume too much
> about how expanders expand trees, so it may not be easy to figure out
> how we can avoid redundant calls to make_edge.
>
> Another approach, which I have not tried, is to improve the semantics
> of edge_cache.
>
> Currently, it's "one-sided" cache. That is, if a bit in the bitmap is
> set, that tells you that the corresponding edge definitely exists.
> Otherwise, it does not tell you anything useful, in which case,
> cached_make_edge resorts to find_edge.
>
> My proposed not-yet-tried improvement would make it "double-sided"
> cache. That is, if a bit is set *if and only if* the corresponding
> bit is set. This would avoid all calls to find_edge while the edge
> cache is available. Of course, the down side is that we have to
> initialize the adjacency matrix completely.
>
> Any comment?
>
> Kazu Hirata
>
> 2004-11-29 Kazu Hirata <kazu@cs.umass.edu>
>
> * cfg.c (cached_make_edge): Remove.
> (make_edge): Call unchecked_make_edge instead.
> * basic-block.h: Remove the prototype for cached_make_edge.
> * cfgbuild.c (make_label_edge): Remove the first argument.
> Call make_edge instead of cached_make_edge
> (rtl_make_eh_edge): Likewise.
> (make_edges): Don't allocate edge_cache. Remove the last
> argument.
> Call make_edge instead of cached_make_edge
> Adjust calls to rtl_make_eh_edge and make_label_edge.
> (find_basic_blocks): Adjust calls to make_edges.
> (find_many_sub_basic_blocks): Likewise.
> (find_sub_basic_blocks): Likewise.
> * except.c (finish_eh_generation): Adjust a call to
> rtl_make_eh_edge.
I'd pondered killing the edge cache myself recently. It would improve
15524 as well.
But I couldn't convince myself it was actually a wise thing to do;
because we could have nodes in the CFG with a high out degree in which
their successors have a high in degree. That would trigger poor
behavior in find_edge. It wouldn't be terribly hard to construct such
code with a switch statement and some gotos.
An idea I just had would be to kill the edge cache except for critical
edges where the head of the edge has a high out degree and the tail of
the edge has a high in degree. That captures precisely the cases where
find_edge would be potentially useful. The question then becomes can we
build & maintain the edge cache for just those edges and what would be
the cost compared to what we do now.
jeff