[PATCH 2/6] Implement a new partitioner for parallel compilation

Richard Biener richard.guenther@gmail.com
Mon Aug 31 09:25:30 GMT 2020


On Fri, Aug 21, 2020 at 12:00 AM Giuliano Belinassi
<giuliano.belinassi@usp.br> wrote:
>
> When using the LTO infrastructure to compile files in parallel, we
> can't simply use any of the LTO partitioner, once extra dependency
> analysis is required to ensure that some nodes are correctly
> partitioned together.
>
> Therefore, here we implement a new partitioner called
> "lto_merge_comdat_map" that does all these required analysis.
> The partitioner works as follows:
>
> 1. We create a number of disjoint sets and inserts each node into a
>    separate set, which may be merged together in the future.
>
> 2. Find COMDAT groups, and mark them to be partitioned together.
>
> 3. Check all nodes that would require any COMDAT group to be
>    copied to its partition (which we name "COMDAT frontier"),
>    and mark them to be partitioned together.
>    This avoids duplication of COMDAT groups and crashes on the LTO
>    partitioning infrastructure.
>
> 4. Check if the user allows the partitioner to promote non-public
>    functions or variables to global to improve parallelization
>    opportunity with a cost of modifying the output code layout.
>
> 5. Balance generated partitions for performance unless not told to.
>
> The choice of 1. was by design, so we could use a union-find
> data structure, which are know for being very fast on set unite
> operations.
>
> For 3. to work properly, we also had to modify
> lto_promote_cross_file_statics to handle this case.
>
> The parameters --param=promote-statics and --param=balance-partitions
> control 4. and 5., respectively

Just a few comments ontop of Honzas remarks:

> gcc/ChangeLog:
> 2020-08-20  Giuliano Belinassi  <giuliano.belinassi@usp.br>
>
>         * Makefile.in: Add lto-partition.o
>         * cgraph.h (struct symtab_node::aux2): New variable.
>         * lto-partition.c: Move from gcc/lto/lto-partition.c
>         (add_symbol_to_partition_1): Only compute insn size
>         if information is available.
>         (node_cmp): Same as above.
>         (class union_find): New.
>         (ds_print_roots): New function.
>         (balance_partitions): New function.
>         (build_ltrans_partitions): New function.
>         (merge_comdat_nodes): New function.
>         (merge_static_calls): New function.
>         (merge_contained_symbols): New function.
>         (lto_merge_comdat_map): New function.
>         (privatize_symbol_name_1): Handle when WPA is not enabled.
>         (privatize_symbol_name): Same as above.
>         (lto_promote_cross_file_statics): New parameter to select when
>         to promote to global.
>         (lto_check_usage_from_other_partitions): New function.
>         * lto-partition.h: Move from gcc/lto/lto-partition.h
>         (lto_promote_cross_file_statics): Update prototype.
>         (lto_check_usage_from_other_partitions): Declare.
>         (lto_merge_comdat_map): Declare.
>
> gcc/lto/ChangeLog:
> 2020-08-20  Giuliano Belinassi  <giuliano.belinassi@usp.br>
>
>         * lto-partition.c: Move to gcc/lto-partition.c.
>         * lto-partition.h: Move to gcc/lto-partition.h.
>         * lto.c: Update call to lto_promote_cross_file_statics.
>         * Makefile.in: Remove lto-partition.o.
> ---
>  gcc/Makefile.in               |   1 +
>  gcc/cgraph.h                  |   1 +
>  gcc/{lto => }/lto-partition.c | 463 +++++++++++++++++++++++++++++++++-
>  gcc/{lto => }/lto-partition.h |   4 +-
>  gcc/lto/Make-lang.in          |   4 +-
>  gcc/lto/lto.c                 |   2 +-
>  gcc/params.opt                |   8 +
>  gcc/tree.c                    |  23 +-
>  8 files changed, 489 insertions(+), 17 deletions(-)
>  rename gcc/{lto => }/lto-partition.c (78%)
>  rename gcc/{lto => }/lto-partition.h (89%)
>
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 79e854aa938..be42b15f4ff 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1457,6 +1457,7 @@ OBJS = \
>         lra-spills.o \
>         lto-cgraph.o \
>         lto-streamer.o \
> +       lto-partition.o \
>         lto-streamer-in.o \
>         lto-streamer-out.o \
>         lto-section-in.o \
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index 0211f08964f..b4a7871bd3d 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -615,6 +615,7 @@ public:
>    struct lto_file_decl_data * lto_file_data;
>
>    PTR GTY ((skip)) aux;
> +  int aux2;
>
>    /* Comdat group the symbol is in.  Can be private if GGC allowed that.  */
>    tree x_comdat_group;
> diff --git a/gcc/lto/lto-partition.c b/gcc/lto-partition.c
> similarity index 78%
> rename from gcc/lto/lto-partition.c
> rename to gcc/lto-partition.c
> index 8e0488ab13e..ca962e69b5d 100644
> --- a/gcc/lto/lto-partition.c
> +++ b/gcc/lto-partition.c
> @@ -170,7 +170,11 @@ add_symbol_to_partition_1 (ltrans_partition part, symtab_node *node)
>      {
>        struct cgraph_edge *e;
>        if (!node->alias && c == SYMBOL_PARTITION)
> -       part->insns += ipa_size_summaries->get (cnode)->size;
> +       {
> +         /* FIXME: Find out why this is being returned NULL in some cases.  */
> +         if (ipa_size_summaries->get (cnode))
> +           part->insns += ipa_size_summaries->get (cnode)->size;
> +       }
>
>        /* Add all inline clones and callees that are duplicated.  */
>        for (e = cnode->callees; e; e = e->next_callee)
> @@ -372,6 +376,402 @@ lto_max_map (void)
>      new_partition ("empty");
>  }
>
> +/* Class implementing a union-find algorithm.  */
> +
> +class union_find
> +{
> +public:
> +
> +  int *parent;
> +  int *rank;
> +  int n;
> +  int successful_unions;
> +
> +  union_find (int num_nodes)
> +    {
> +      n = num_nodes;
> +      parent = XNEWVEC (int, n);
> +      rank   = XNEWVEC (int, n);
> +
> +      for (int i = 0; i < n; ++i)
> +       parent[i] = i;
> +
> +      memset (rank, 0, n*sizeof(*rank));
> +      successful_unions = 0;
> +    }
> +
> +  ~union_find ()
> +    {
> +      free (parent);
> +      free (rank);
> +    }
> +
> +  int find (int x)
> +    {
> +      while (parent[x] != x)
> +       {
> +         parent[x] = parent[parent[x]];
> +         x = parent[x];
> +       }
> +      return x;
> +    }
> +
> +  void unite (int x, int y)
> +    {
> +      int x_root = find (x);
> +      int y_root = find (y);
> +
> +      if (x_root == y_root) /* If x and y are in same set.  */
> +       return;
> +
> +      successful_unions++;
> +
> +      if (rank[x_root] > rank[y_root]) /* Get which ones have greater rank.  */
> +       {
> +         x_root ^= y_root; /* Swap.  */
> +         y_root ^= x_root;
> +         x_root ^= y_root;

You can use std::swap (x_root, y_root); which is nicer to read.

> +       }
> +
> +      parent[y_root] = x_root;
> +      if (rank[x_root] == rank[y_root])
> +       rank[x_root]++;
> +    }
> +
> +  void print_roots ()
> +    {
> +      int i;
> +      for (i = 0; i < n; ++i)
> +       printf ("%d, ", find (i));
> +      printf ("\n");
> +    }
> +};
> +
> +static union_find *ds;
> +
> +DEBUG_FUNCTION void ds_print_roots (void)
> +{
> +  ds->print_roots ();
> +}
> +
> +static bool
> +privatize_symbol_name (symtab_node *);
> +
> +static void
> +promote_symbol (symtab_node *);
> +
> +/* Quickly balance partitions, trying to reach target_size in each of
> +   them.  Returns true if something was done, or false if we decided
> +   that it is not worth.  */
> +
> +static bool
> +balance_partitions (union_find *ds, int n, int jobs)
> +{
> +  int *sizes, i, j;
> +  int total_size = 0, max_size = -1;
> +  int target_size;
> +  const int eps = 0;
> +
> +  symtab_node *node;
> +
> +  sizes = (int *) alloca (n * sizeof (*sizes));
> +  memset (sizes, 0, n * sizeof (*sizes));
> +
> +  /* Compute costs.  */
> +  i = 0;
> +  FOR_EACH_SYMBOL (node)
> +    {
> +      int root = ds->find (i);
> +
> +      if (cgraph_node *cnode = dyn_cast<cgraph_node *> (node))
> +       {
> +         ipa_size_summary *summary = ipa_size_summaries->get (cnode);
> +         if (summary)
> +           sizes[root] += summary->size;
> +         else
> +           sizes[root] += 10;
> +       }
> +      else
> +       sizes[root] += 10;
> +
> +
> +      i++;
> +    }
> +
> +  /* Compute the total size and maximum size.  */
> +  for (i = 0; i < n; ++i)
> +    {
> +      total_size += sizes[i];
> +      max_size    = MAX (max_size, sizes[i]);
> +    }
> +
> +  /* Quick return if total size is small.  */
> +  if (total_size < param_min_partition_size)
> +    return false;
> +
> +  target_size = total_size / (jobs + 1);
> +
> +  /* Unite small partitions.  */
> +  for (i = 0, j = 0; j < n; ++j)
> +    {
> +      if (sizes[j] == 0)
> +       continue;
> +
> +      if (i == -1)
> +       i = j;
> +      else
> +       {
> +         if (sizes[i] + sizes[j] < target_size + eps)
> +           {
> +             ds->unite (i, j);
> +             sizes[i] += sizes[j];
> +             sizes[j] = 0;
> +           }
> +         else
> +             i = j;
> +       }
> +    }
> +  return true;
> +}
> +
> +/* Builds the LTRANS partitions, or return if not needed.  */
> +
> +static int
> +build_ltrans_partitions (union_find *ds, int n)
> +{
> +  int i, n_partitions;
> +  symtab_node *node;
> +
> +  int *compression = (int *) alloca (n * sizeof (*compression));
> +  for (i = 0; i < n; ++i)
> +    compression[i] = -1; /* Invalid value.  */
> +
> +  i = 0, n_partitions = 0;
> +  FOR_EACH_SYMBOL (node)
> +    {
> +      int root = ds->find (i);
> +      node->aux2 = root;
> +      node->aux = NULL;
> +
> +      if (node->get_partitioning_class () == SYMBOL_PARTITION
> +         && compression[root] < 0)
> +       compression[root] = n_partitions++;
> +      i++;
> +    }
> +
> +  if (dump_file)
> +    fprintf (dump_file, "n_partitions = %d\n", n_partitions);
> +
> +  if (n_partitions <= 1)
> +    return false;
> +
> +  /* Create LTRANS partitions.  */
> +  ltrans_partitions.create (n_partitions);
> +  for (i = 0; i < n_partitions; i++)
> +    new_partition ("");
> +
> +  FOR_EACH_SYMBOL (node)
> +    {
> +      if (node->get_partitioning_class () != SYMBOL_PARTITION
> +         || symbol_partitioned_p (node))
> +         continue;
> +
> +      int p = compression[node->aux2];
> +      if (dump_file)
> +       fprintf (dump_file, "p = %d\t;; %s\n", p, node->dump_name ());
> +      add_symbol_to_partition (ltrans_partitions[p], node);
> +    }
> +
> +  return true;
> +}
> +
> +/* Partition COMDAT groups together, and also bring together nodes that
> +   requires them. Such nodes that are not in the COMDAT group that have
> +   references to COMDAT grouped nodes are called the COMDAT frontier.  */
> +
> +static bool
> +merge_comdat_nodes (symtab_node *node, int set)
> +{
> +  enum symbol_partitioning_class c = node->get_partitioning_class ();
> +  bool ret = false;
> +  symtab_node *node1;
> +  cgraph_edge *e;
> +
> +  /* If node is already analysed, quickly return.  */
> +  if (node->aux)
> +    return false;
> +
> +  /* Mark as analysed.  */
> +  node->aux = (void *) 1;
> +
> +
> +  /* Aglomerate the COMDAT group into the same partition.  */
> +  if (node->same_comdat_group)
> +    {
> +      for (node1 = node->same_comdat_group;
> +          node1 != node; node1 = node1->same_comdat_group)
> +       if (!node->alias)
> +         {
> +           ds->unite (node1->aux2, set);
> +           merge_comdat_nodes (node1, set);
> +           ret = true;
> +         }
> +    }
> +
> +  /* Look at nodes that can reach the COMDAT group, and aglomerate to the
> +     same partition.  These nodes are called the "COMDAT Frontier".  The
> +     idea is that every unpartitioned node that reaches a COMDAT group MUST
> +     go through the COMDAT frontier before reaching it.  Therefore, only
> +     nodes in the frontier are exported.  */
> +  if (node->same_comdat_group || c == SYMBOL_DUPLICATE)
> +    {
> +      int i;
> +      struct ipa_ref *ref = NULL;
> +
> +      if (cgraph_node *cnode = dyn_cast <cgraph_node *> (node))
> +       {
> +         /* Add all inline clones and callees that are duplicated.  */
> +         for (e = cnode->callers; e; e = e->next_caller)
> +           if (!e->inline_failed || c == SYMBOL_DUPLICATE)
> +             {
> +               ds->unite (set, e->caller->aux2);
> +               merge_comdat_nodes (e->caller, set);
> +               ret = true;
> +             }
> +
> +         /* Add all thunks associated with the function.  */
> +         for (e = cnode->callees; e; e = e->next_callee)
> +           if (e->caller->thunk.thunk_p && !e->caller->inlined_to)
> +             {
> +               ds->unite (set, e->callee->aux2);
> +               merge_comdat_nodes (e->callee, set);
> +               ret = true;
> +             }
> +       }
> +
> +      for (i = 0; node->iterate_referring (i, ref); i++)
> +       {
> +         symtab_node *node1 = ref->referring;
> +         ds->unite (node1->aux2, set);
> +         ret = true;
> +
> +         if (node1->get_partitioning_class () == SYMBOL_DUPLICATE)
> +           merge_comdat_nodes (node1, set);
> +       }
> +    }
> +
> +  return ret;
> +}
> +
> +/* Bring together static nodes that are called by static functions, so
> +   promotion of statics to globals are not required.  This *MIGHT* negatively
> +   impact the number of partitions, and even generate very umbalanced
> +   partitions that can't be fixed.  */
> +
> +static bool
> +merge_static_calls (symtab_node *node, int set)
> +{
> +  bool ret = false;
> +  enum symbol_partitioning_class c = node->get_partitioning_class ();
> +
> +  if (node->aux)
> +    return false;
> +
> +  node->aux = (void *) 1;
> +
> +
> +  if (!TREE_PUBLIC (node->decl) || c == SYMBOL_DUPLICATE)
> +    {
> +      int i;
> +      struct ipa_ref *ref = NULL;
> +
> +      if (cgraph_node *cnode = dyn_cast <cgraph_node *> (node))
> +       {
> +         for (cgraph_edge *e = cnode->callers; e; e = e->next_caller)
> +           {
> +             /* FIXME: In theory, inlined functions should be a criteria to not
> +                merge partitions.  */
> +             ds->unite (node->aux2, e->caller->aux2);
> +             merge_static_calls (e->caller, set);
> +             ret = true;
> +           }
> +
> +       }
> +
> +      for (i = 0; node->iterate_referring (i, ref); ++i)
> +       {
> +         symtab_node *node1 = ref->referring;
> +         ds->unite (node1->aux2, set);
> +         merge_static_calls (node1, set);
> +         ret = true;
> +       }
> +    }
> +
> +  return ret;
> +}
> +
> +static bool
> +merge_contained_symbols (symtab_node *node, int set)
> +{
> +  bool ret = false;
> +  symtab_node *node1;
> +
> +  while ((node1 = contained_in_symbol (node)) != node)
> +    {
> +      node = node1;
> +      ds->unite (node->aux2, set);
> +      ret = true;
> +    }
> +
> +  return ret;
> +}
> +
> +/* Partition the program into several partitions with a restriction that
> +   COMDATS are partitioned together with all nodes requiring them.  If
> +   promote_statics is false, we also partition together static functions
> +   and nodes that call eachother, so non-public functions are not promoted
> +   to globals.  */
> +
> +void
> +lto_merge_comdat_map (bool balance, bool promote_statics, int jobs)
> +{
> +  symtab_node *node;
> +  int n = 0;
> +
> +  /* Initialize each not into its own distinct disjoint sets.  */
> +  FOR_EACH_SYMBOL (node)
> +    node->aux2 = n++;
> +
> +  union_find disjoint_sets = union_find (n);
> +  ds = &disjoint_sets;
> +
> +  /* First look at COMDATs.  */
> +  FOR_EACH_SYMBOL (node)
> +    {
> +      if (node->same_comdat_group)
> +       merge_comdat_nodes (node, node->aux2);
> +      merge_contained_symbols (node, node->aux2);
> +    }
> +
> +  FOR_EACH_SYMBOL (node)
> +    node->aux = NULL;
> +
> +  /* Then look at STATICs, if needed.  */
> +  if (!promote_statics)
> +    FOR_EACH_SYMBOL (node)
> +      if (!TREE_PUBLIC (node->decl))
> +       merge_static_calls (node, node->aux2);
> +
> +  FOR_EACH_SYMBOL (node)
> +    node->aux = NULL;
> +
> +  if (balance && !balance_partitions (&disjoint_sets, n, jobs))
> +    return;
> +
> +  build_ltrans_partitions (&disjoint_sets, n);
> +}
> +
> +
>  /* Helper function for qsort; sort nodes by order.  */
>  static int
>  node_cmp (const void *pa, const void *pb)
> @@ -931,7 +1331,7 @@ static hash_map<const char *, unsigned> *lto_clone_numbers;
>     represented by DECL.  */
>
>  static bool
> -privatize_symbol_name_1 (symtab_node *node, tree decl)
> +privatize_symbol_name_1 (symtab_node *node, tree decl, bool wpa)
>  {
>    const char *name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (decl));
>
> @@ -939,11 +1339,19 @@ privatize_symbol_name_1 (symtab_node *node, tree decl)
>      return false;
>
>    name = maybe_rewrite_identifier (name);
> -  unsigned &clone_number = lto_clone_numbers->get_or_insert (name);
> -  symtab->change_decl_assembler_name (decl,
> -                                     clone_function_name (
> -                                         name, "lto_priv", clone_number));
> -  clone_number++;
> +  if (wpa)
> +    {
> +      gcc_assert (lto_clone_numbers);
> +
> +      unsigned &clone_number = lto_clone_numbers->get_or_insert (name);
> +      symtab->change_decl_assembler_name (decl,
> +                                         clone_function_name (
> +                                             name, "lto_priv", clone_number));
> +      clone_number++;
> +    }
> +  else
> +    symtab->change_decl_assembler_name (decl, get_file_function_name
> +                                       (node->asm_name ()));
>
>    if (node->lto_file_data)
>      lto_record_renamed_decl (node->lto_file_data, name,
> @@ -968,7 +1376,9 @@ privatize_symbol_name_1 (symtab_node *node, tree decl)
>  static bool
>  privatize_symbol_name (symtab_node *node)
>  {
> -  if (!privatize_symbol_name_1 (node, node->decl))
> +  bool wpa = !split_outputs;
> +
> +  if (!privatize_symbol_name_1 (node, node->decl, wpa))
>      return false;
>
>    return true;
> @@ -1117,7 +1527,7 @@ rename_statics (lto_symtab_encoder_t encoder, symtab_node *node)
>     all inlinees are added.  */
>
>  void
> -lto_promote_cross_file_statics (void)
> +lto_promote_cross_file_statics (bool promote)
>  {
>    unsigned i, n_sets;
>
> @@ -1147,10 +1557,17 @@ lto_promote_cross_file_statics (void)
>            lsei_next (&lsei))
>          {
>            symtab_node *node = lsei_node (lsei);
> +         cgraph_node *cnode = dyn_cast <cgraph_node *> (node);
>
>           /* If symbol is static, rename it if its assembler name
>              clashes with anything else in this unit.  */
>           rename_statics (encoder, node);
> +         if (cnode)
> +           {
> +             bool in_partition = lsei.encoder->nodes[lsei.index].in_partition;
> +             if (!in_partition)
> +               cnode->local = false;
> +           }
>
>           /* No need to promote if symbol already is externally visible ... */
>           if (node->externally_visible
> @@ -1163,8 +1580,12 @@ lto_promote_cross_file_statics (void)
>               validize_symbol_for_target (node);
>               continue;
>             }
> -
> -          promote_symbol (node);
> +         if (promote)
> +           {
> +             promote_symbol (node);
> +             if (cnode && split_outputs)
> +               cnode->local = false;
> +           }
>          }
>      }
>    delete lto_clone_numbers;
> @@ -1186,3 +1607,23 @@ lto_promote_statics_nonwpa (void)
>      }
>    delete lto_clone_numbers;
>  }
> +
> +/* Check if a variable is accessed across partitions.  If yesm then update
> +   used_from_other_partition.  */
> +
> +void
> +lto_check_usage_from_other_partitions (void)
> +{
> +  unsigned int i, j;
> +  for (i = 0; i < ltrans_partitions.length (); i++)
> +    {
> +      vec<lto_encoder_entry> &nodes = (ltrans_partitions[i])->encoder->nodes;
> +
> +      for (j = 0; j < nodes.length (); j++)
> +       {
> +         symtab_node *node = nodes[j].node;
> +         if (node && !nodes[j].in_partition)
> +           node->used_from_other_partition = true;
> +       }
> +    }
> +}
> diff --git a/gcc/lto/lto-partition.h b/gcc/lto-partition.h
> similarity index 89%
> rename from gcc/lto/lto-partition.h
> rename to gcc/lto-partition.h
> index 42b5ea8c80c..4a1b17fa728 100644
> --- a/gcc/lto/lto-partition.h
> +++ b/gcc/lto-partition.h
> @@ -36,6 +36,8 @@ extern vec<ltrans_partition> ltrans_partitions;
>  void lto_1_to_1_map (void);
>  void lto_max_map (void);
>  void lto_balanced_map (int, int);
> -void lto_promote_cross_file_statics (void);
> +void lto_promote_cross_file_statics (bool promote);
>  void free_ltrans_partitions (void);
>  void lto_promote_statics_nonwpa (void);
> +void lto_check_usage_from_other_partitions (void);
> +void lto_merge_comdat_map (bool, bool, int);
> diff --git a/gcc/lto/Make-lang.in b/gcc/lto/Make-lang.in
> index 0b73f9ef7bb..46b52cff183 100644
> --- a/gcc/lto/Make-lang.in
> +++ b/gcc/lto/Make-lang.in
> @@ -24,9 +24,9 @@ LTO_EXE = lto1$(exeext)
>  LTO_DUMP_EXE = lto-dump$(exeext)
>  LTO_DUMP_INSTALL_NAME := $(shell echo lto-dump|sed '$(program_transform_name)')
>  # The LTO-specific object files inclued in $(LTO_EXE).
> -LTO_OBJS = lto/lto-lang.o lto/lto.o lto/lto-object.o attribs.o lto/lto-partition.o lto/lto-symtab.o lto/lto-common.o
> +LTO_OBJS = lto/lto-lang.o lto/lto.o lto/lto-object.o attribs.o lto/lto-symtab.o lto/lto-common.o
>  lto_OBJS = $(LTO_OBJS)
> -LTO_DUMP_OBJS = lto/lto-lang.o lto/lto-object.o attribs.o lto/lto-partition.o lto/lto-symtab.o lto/lto-dump.o lto/lto-common.o
> +LTO_DUMP_OBJS = lto/lto-lang.o lto/lto-object.o attribs.o lto/lto-symtab.o lto/lto-dump.o lto/lto-common.o
>  lto_dump_OBJS = $(LTO_DUMP_OBJS)
>
>  # this is only useful in a LTO bootstrap, but this does not work right
> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
> index 1c37814bde4..803b9920e35 100644
> --- a/gcc/lto/lto.c
> +++ b/gcc/lto/lto.c
> @@ -515,7 +515,7 @@ do_whole_program_analysis (void)
>    /* Find out statics that need to be promoted
>       to globals with hidden visibility because they are accessed from multiple
>       partitions.  */
> -  lto_promote_cross_file_statics ();
> +  lto_promote_cross_file_statics (true);
>    if (dump_file)
>       dump_end (partition_dump_id, dump_file);
>    dump_file = NULL;
> diff --git a/gcc/params.opt b/gcc/params.opt
> index f39e5d1a012..00fc58cd5cc 100644
> --- a/gcc/params.opt
> +++ b/gcc/params.opt
> @@ -366,6 +366,14 @@ Minimal size of a partition for LTO (in estimated instructions).
>  Common Joined UInteger Var(param_lto_partitions) Init(128) IntegerRange(1, 65536) Param
>  Number of partitions the program should be split to.
>
> +-param=promote-statics=
> +Common Joined UInteger Var(param_promote_statics) Init(0) IntegerRange(0, 1) Param
> +Allow statics and non-public functions to be promoted as public when compiling in parallel.
> +
> +-param=balance-partitions=
> +Common Joined UInteger Var(param_balance_partitions) Init(1) IntegerRange(0, 1) Param
> +When compiling in parallel, try to balance the partitions for compilation performance.
> +
>  -param=max-average-unrolled-insns=
>  Common Joined UInteger Var(param_max_average_unrolled_insns) Init(80) Param Optimization
>  The maximum number of instructions to consider to unroll in a loop on average.
> diff --git a/gcc/tree.c b/gcc/tree.c
> index d0202c3f785..3ca162d5070 100644
> --- a/gcc/tree.c
> +++ b/gcc/tree.c
> @@ -9595,6 +9595,24 @@ make_anon_name ()
>    return id;
>  }
>
> +/* Filter the input name removing characters that may confuse the linker.  */
> +
> +static void
> +filter_name (char *name)
> +{
> +  char *p = name;
> +
> +  while (*p != '\0')
> +    {
> +      switch (*p)
> +       {
> +         case '*':
> +           *p = '_';
> +       }
> +      p++;
> +    }
> +}
> +
>  /* Generate a name for a special-purpose function.
>     The generated name may need to be unique across the whole link.
>     Changes to this function may also require corresponding changes to
> @@ -9651,8 +9669,7 @@ get_file_function_name (const char *type)
>        q = (char *) alloca (9 + 19 + len + 1);
>        memcpy (q, file, len + 1);
>
> -      snprintf (q + len, 9 + 19 + 1, "_%08X_" HOST_WIDE_INT_PRINT_HEX,
> -               crc32_string (0, name), get_random_seed (false));
> +      snprintf (q + len, 9 + 19 + 1, "_%08X", crc32_string (0, name));
>
>        p = q;
>      }
> @@ -9665,7 +9682,9 @@ get_file_function_name (const char *type)
>       Use a global object (which is already required to be unique over
>       the program) rather than the file name (which imposes extra
>       constraints).  */
> +
>    sprintf (buf, FILE_FUNCTION_FORMAT, type, p);
> +  filter_name (buf);

I wonder why you need this - none of the other callers of
get_file_function_name do so?

I'd rather have you not use get_file_function_name but instead
modify the partitioners symbol promotion code to append a
hash computed once by the partitioning code.  Like maybe simply
hash the cgraph structure somehow:  for-each cgraph node in
UID oder hash UID and merge hashes of all callees (or if we
have some simple way, hash node UIDs in PRE order of the
graph?).

I wonder why the LTO code does not run into collisions - maybe
we do not try hard enough?  Guess doing LTO bootstrap with a modified
-flto that randomly turns itself on/off might show some cases.

Btw, I'd default to the symbol promotion being enabled for
explicit -fparallel.

Richard.

>
>    return get_identifier (buf);
>  }
> --
> 2.28.0
>


More information about the Gcc-patches mailing list