This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
- From: Richard Biener <rguenther at suse dot de>
- To: Prathamesh Kulkarni <prathamesh dot kulkarni at linaro dot org>
- Cc: Jan Hubicka <hubicka at ucw dot cz>, gcc Patches <gcc-patches at gcc dot gnu dot org>, Ramana Radhakrishnan <ramana dot radhakrishnan at arm dot com>
- Date: Tue, 5 Apr 2016 14:58:01 +0200 (CEST)
- Subject: Re: [RFC] introduce --param max-lto-partition for having an upper bound on partition size
- References: <CAAgBjMkjMcf7XTvBOtnTr-zcKu-rED3WcLY=a4iYoijOw3V3vQ at mail dot gmail dot com> <C74AF7F0-5036-4524-99BE-E24A4243FC88 at suse dot de> <CAAgBjM=gBmgxQCXnvvjXYpVoU5p4wmkpkO0tMepjzz7yiB38vw at mail dot gmail dot com> <alpine dot LSU dot 2 dot 11 dot 1604041021170 dot 13384 at t29 dot fhfr dot qr> <CAAgBjMk=qWCt8VB7f_4+x-Ck6OwsV_CXtnyX1QGKuqta4sPKYA at mail dot gmail dot com> <alpine dot LSU dot 2 dot 11 dot 1604041342040 dot 13384 at t29 dot fhfr dot qr> <20160404120030 dot GD14122 at kam dot mff dot cuni dot cz> <CAAgBjM=3k7YMB4AvDeFAgrBJpLKeJiQPGFHtyNT9pXqLqo2LGQ at mail dot gmail dot com> <20160404141436 dot GB95176 at kam dot mff dot cuni dot cz> <CAAgBjMn7TX0ZkPrvc78qX2doNNivwLFsC8ubqH=TkTz4+6fnRg at mail dot gmail dot com> <alpine dot LSU dot 2 dot 11 dot 1604051324100 dot 13384 at t29 dot fhfr dot qr> <CAAgBjMn+JGNgkFMwhUtYCVRhKfGRCicOK0F5o6ppwgbbJMkzRA at mail dot gmail dot com>
On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> On 5 April 2016 at 16:58, Richard Biener <rguenther@suse.de> wrote:
> > On Tue, 5 Apr 2016, Prathamesh Kulkarni wrote:
> >
> >> On 4 April 2016 at 19:44, Jan Hubicka <hubicka@ucw.cz> wrote:
> >> >
> >> >> diff --git a/gcc/lto/lto-partition.c b/gcc/lto/lto-partition.c
> >> >> index 9eb63c2..bc0c612 100644
> >> >> --- a/gcc/lto/lto-partition.c
> >> >> +++ b/gcc/lto/lto-partition.c
> >> >> @@ -511,9 +511,20 @@ lto_balanced_map (int n_lto_partitions)
> >> >> varpool_order.qsort (varpool_node_cmp);
> >> >>
> >> >> /* Compute partition size and create the first partition. */
> >> >> + if (PARAM_VALUE (MIN_PARTITION_SIZE) > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> + fatal_error (input_location, "min partition size cannot be greater than max partition size");
> >> >> +
> >> >> partition_size = total_size / n_lto_partitions;
> >> >> if (partition_size < PARAM_VALUE (MIN_PARTITION_SIZE))
> >> >> partition_size = PARAM_VALUE (MIN_PARTITION_SIZE);
> >> >> + else if (partition_size > PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> + {
> >> >> + n_lto_partitions = total_size / PARAM_VALUE (MAX_PARTITION_SIZE);
> >> >> + if (total_size % PARAM_VALUE (MAX_PARTITION_SIZE))
> >> >> + n_lto_partitions++;
> >> >> + partition_size = total_size / n_lto_partitions;
> >> >> + }
> >> >
> >> > lto_balanced_map actually works by looking for the cheapest cutpoint in the range
> >> > 3/4*partition_size to 2*partition_size and picking the cheapest one.
> >> > Setting partition_size to this value will thus not cause the partitioner to produce
> >> > only smaller partitions. I suppose you could modify the conditional:
> >> >
> >> > /* Partition is too large, unwind into step when best cost was reached and
> >> > start new partition. */
> >> > if (partition->insns > 2 * partition_size)
> >> >
> >> > and/or in the code above set the partition_size to half of total_size/max_size.
> >> >
> >> > I know this is somewhat sloppy. This was really just a first-cut implementation
> >> > many years ago. I expected to reimplement it smarter soon, but then there was
> >> > never really a need for it (I am trying to avoid late IPA optimizations so the
> >> > partitioning decisions should mostly affect compile time performance only).
> >> > If ARM is more sensitive to partitioning, perhaps it would make sense to try to
> >> > look for something smarter.
> >> >
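A minimal sketch in C of the window described above, to make the consequence for a size cap concrete. It is not GCC code: cut_window, cut_window_for and target_size_for_cap are hypothetical names, and the only facts taken from the thread are the 3/4*partition_size lower bound and the 2*partition_size upper bound.

```c
#include <assert.h>

/* Hypothetical helper, not part of GCC: the range in which the balanced
   partitioner is described as searching for its cheapest cut point.  */
struct cut_window
{
  long lo;  /* 3/4 of the target partition size */
  long hi;  /* twice the target partition size */
};

struct cut_window
cut_window_for (long partition_size)
{
  assert (partition_size > 0);
  struct cut_window w;
  w.lo = 3 * partition_size / 4;
  w.hi = 2 * partition_size;
  return w;
}

/* Largest target size whose window upper bound does not exceed MAX_SIZE.
   This mirrors the "set partition_size to half" suggestion; it is an
   assumption about how one might apply it, not the committed change.  */
long
target_size_for_cap (long max_size)
{
  assert (max_size >= 2);
  return max_size / 2;
}
```

With a target of 1000 the window is 750 to 2000, so a cap of 1000 is only respected if the target itself is set to 500.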
> >> >> +
> >> >> npartitions = 1;
> >> >> partition = new_partition ("");
> >> >> if (symtab->dump_file)
> >> >> diff --git a/gcc/lto/lto.c b/gcc/lto/lto.c
> >> >> index 9dd513f..294b8a4 100644
> >> >> --- a/gcc/lto/lto.c
> >> >> +++ b/gcc/lto/lto.c
> >> >> @@ -3112,6 +3112,12 @@ do_whole_program_analysis (void)
> >> >> timevar_pop (TV_WHOPR_WPA);
> >> >>
> >> >> timevar_push (TV_WHOPR_PARTITIONING);
> >> >> +
> >> >> + if (flag_lto_partition != LTO_PARTITION_BALANCED
> >> >> + && PARAM_VALUE (MAX_PARTITION_SIZE) != INT_MAX)
> >> >> + fatal_error (input_location, "--param max-lto-partition should only"
> >> >> + " be used with balanced partitioning\n");
> >> >> +
> >> >
> >> > I think we should wire in a reasonable MAX_PARTITION_SIZE default. The value you
> >> > found experimentally may be a good start. For that reason we can't really
> >> > refuse a value when !LTO_PARTITION_BALANCED. Just document it as a parameter for
> >> > balanced partitioning only and add a parameter to lto_balanced_map specifying whether
> >> > this param should be honored (because the same path is used for partitioning to one partition)
> >> >
> >> > Otherwise the patch looks good to me modulo missing documentation.
> >> Thanks for the review. I have updated the patch.
> >> Does this version look OK ?
> >> I had randomly chosen 10000, not sure if that's an appropriate value
> >> for default.
> >
> > I think it's way too small. This is roughly the number of GIMPLE stmts
> > (thus roughly the number of instructions). So with, say, an 8-byte
> > instruction format it is on the order of 80kB. You'd want to have a
> > default of at least several tens of times large-unit-insns (also 10000).
> > I'd choose something like 1000000 (one million). I find the lto-min-partition
> > number quite small as well (and would up it by a factor of 10).
> Done in this version.
I'd do that separately.
Please no default parameter for lto_balanced_map (), instead change
all callers.
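
As an illustration of that request, here is a minimal sketch in C where the flag is a required argument and each caller states its choice. The function balanced_partition_size, its callers, and the two constants are hypothetical stand-ins, not the real lto_balanced_map () interface; the clamping is simplified compared with the quoted patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; the thread discusses one million as a
   possible maximum, the minimum here is an assumption.  */
enum { SKETCH_MIN_SIZE = 10000, SKETCH_MAX_SIZE = 1000000 };

/* Hypothetical stand-in for the size computation in lto_balanced_map ().
   HONOR_MAX has no default, so every call site must pass it.  */
long
balanced_partition_size (long total_size, int n_partitions, bool honor_max)
{
  assert (total_size >= 0 && n_partitions > 0);
  long size = total_size / n_partitions;
  if (size < SKETCH_MIN_SIZE)
    size = SKETCH_MIN_SIZE;
  else if (honor_max && size > SKETCH_MAX_SIZE)
    size = SKETCH_MAX_SIZE;
  return size;
}

/* Balanced partitioning: the cap applies.  */
long
size_for_balanced (long total_size, int n_partitions)
{
  return balanced_partition_size (total_size, n_partitions, true);
}

/* Single-partition path through the same code: the cap is ignored.  */
long
size_for_single (long total_size)
{
  return balanced_partition_size (total_size, 1, false);
}
```

Spelling the flag out at both call sites keeps the single-partition case visibly exempt from the cap instead of relying on a silent default.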
> Is it OK after bootstrap+test ?
Note that this is for stage1 only. I'll leave approval to Honza
(also verification of the default max param - not sure if for example
chromium or firefox should/will be split to more than 32 partitions
with the patch)
Richard.
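
For the verification mentioned above, a rough sketch in C of how the partition count would respond to a cap, using the same ceiling division as the quoted patch. partitions_needed is a hypothetical helper, and the sizes passed to it are made-up inputs, not measurements of any real program.

```c
#include <assert.h>

/* Hypothetical helper: number of partitions after applying MAX_SIZE.
   If the requested count already keeps each partition within the cap,
   it is returned unchanged; otherwise the count is
   ceil (total_size / max_size), as computed in the quoted patch.  */
int
partitions_needed (long total_size, int requested, long max_size)
{
  assert (total_size >= 0 && requested > 0 && max_size > 0);
  if (total_size / requested <= max_size)
    return requested;
  long n = total_size / max_size;
  if (total_size % max_size)
    n++;
  return (int) n;
}
```

Under these assumptions, with 32 requested partitions and a cap of one million, the count only rises above 32 once the estimated total size passes roughly 32 million.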
> Thanks,
> Prathamesh
> >
> > Richard.
> >
> >> I have a silly question about partitioning: Does it hamper
> >> transformations on ipa optimizations if caller and
> >> callee get placed in separate partitions ? For instance if callee is
> >> supposed to be inlined
> >> into caller, would inlining still take place if callee and caller get
> >> placed in separate partitions ?
> >> I tried with a trivial example with -flto-partition=max
> >> which created 3 partitions for 3 functions (bar, foo and main), and it was
> >> able to inline bar into foo and foo into main. I am not sure how that happens.
> >> I thought ltrans can perform transformations on functions only within
> >> a single partition
> >> and not across partitions ?
> >>
> >> Thanks,
> >> Prathamesh
> >> >
> >> > Honza
> >>
> >
> > --
> > Richard Biener <rguenther@suse.de>
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)
>
--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)