This is the mail archive of the mailing list for the GCC project.

Re: [RFC] lto partitioning of varpool_nodes for section anchors

On 4 July 2016 at 13:51, Andrew Pinski <> wrote:
> On Mon, Jul 4, 2016 at 12:58 AM, Prathamesh Kulkarni
> <> wrote:
>> Hi,
>> I have attached a "quick and dirty" prototype patch (var-partition-1.diff),
>> that attempts to partition variables to reduce the number of
>> external references and to increase usage of section-anchors
>> to CSE address computation of global variables.
>> We could put a variable in a partition that has max references for it,
>> however it doesn't lend itself directly to section anchor optimization.
>> For instance if a partition has max references for variables 'a' and 'b',
>> but no function in that partition references both 'a', and 'b' then AFAIU
>> it doesn't make any difference from section anchors perspective to have them
>> in same partition.
>> The patch tries to assign a set of variables (>= 2)
>> to a partition whose functions have maximum references for that set.
>> Functions within the partition that reference the variables
>> in the set can take advantage of section-anchors. Functions
>> referencing the variables in the set outside the partition
>> would need to load them as external references (using movw/movt),
>> however since we are placing the set in the partition that has maximal
>> references for it, the overall number of external references should be
>> reduced.
>> Partitioning is gated by -flto-var-partition and enabled
>> only for arm and aarch64.
> Why only for arm and aarch64?  Shouldn't it be enabled for all section
> anchor targets?
AFAIK the only targets supporting section anchors are arm, aarch64 and powerpc.
I didn't enable it for ppc64 because I am not sure how profitable
it is for that target.
Honza mentioned to me some time back that the effect of partitioning on
powerpc was nearly zero.
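
For readers following the thread, the assignment heuristic described in the quoted message can be illustrated with a small standalone sketch. This is not the code from var-partition-1.diff. It assumes, purely for illustration, that candidate sets are pairs of variables referenced together by at least one function, and that a pair goes to the partition containing the most functions that reference both. The names `assign_variables`, `partitions` and `refs` are hypothetical.

```python
from collections import Counter
from itertools import combinations


def assign_variables(partitions, refs):
    """Greedy sketch: place co-referenced variable pairs in their best partition.

    partitions: {partition_name: [function_name, ...]}
    refs:       {function_name: set of variable names it references}
    Returns {variable_name: partition_name} for the variables that were placed.
    """
    # For each pair of variables, count per partition how many functions
    # reference both members of the pair (assumed candidate-set definition).
    pair_counts = {}
    for part, funcs in partitions.items():
        for fn in funcs:
            for pair in combinations(sorted(refs.get(fn, ())), 2):
                pair_counts.setdefault(pair, Counter())[part] += 1

    # Visit pairs with the strongest single-partition support first;
    # ties are broken by name so the result is deterministic.
    ranked = sorted(
        pair_counts.items(),
        key=lambda item: (-max(item[1].values()), item[0]),
    )

    placement = {}
    for pair, counts in ranked:
        if any(var in placement for var in pair):
            continue  # one member already has a home; leave this pair alone
        best = min(counts, key=lambda p: (-counts[p], p))
        for var in pair:
            placement[var] = best
    return placement


# Hypothetical input: 'a' and 'b' are used together twice in p0, once in p1.
example_partitions = {"p0": ["f", "g"], "p1": ["h"]}
example_refs = {"f": {"a", "b"}, "g": {"a", "b"}, "h": {"a", "b", "c"}}
example_placement = assign_variables(example_partitions, example_refs)
```

In this toy input, 'a' and 'b' land in p0, so `h` in p1 would reach them as external references, while 'c' is left for whatever default placement applies. A real implementation would also have to respect the balanced-partitioning cost metrics mentioned in the quoted message, which this sketch ignores.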

> Thanks,
> Andrew
>> As per previous discussion [1], I haven't
>> touched function partitioning. Does this approach look ok
>> especially regarding correctness ?
>> So far, I have cross-tested patch on arm*-*-*, aarch64*-*-*.
>> I haven't yet managed to benchmark the patch.
>> As a cheap measurement, I tried to measure number of external
>> references with and without patch by writing a small ipa pass
>> which is run during ltrans and simply walks over varpool nodes
>> and counts number of varpool_nodes for which DECL_EXTERNAL (vnode->decl) is true
>> and vnode->definition is 0. Is that a sufficient condition to determine
>> if a variable is externally defined ? I have attached the pass
>> (count-external-refs.diff)
>> and the comparison done with it for SPEC2000 [2]. The entries
>> in "before" and "after" column contain summation of number of
>> external refs (total_count) across all partitions before and after applying
>> the patch. Does the comparison hold any merit ?
>> I was wondering if we could use a better way of
>> statically measuring the effects of variable partitioning ?
>> I hope also to get done with benchmarking soon.
>> I have not yet figured out how to integrate it with existing cost metrics for
>> balanced partitioning, I am looking into that.
>> I would be grateful for suggestions on the patch.
>> [1]
>> [2] SPEC2000 comparison:
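
As a rough model of the measurement described in the quoted message, the sketch below counts, per partition, the variable nodes that are marked external and have no definition, then sums the counts across partitions. It is not the pass from count-external-refs.diff. The `VarNode` class and its fields are hypothetical stand-ins for `DECL_EXTERNAL (vnode->decl)` and `vnode->definition`.

```python
from dataclasses import dataclass


@dataclass
class VarNode:
    name: str
    decl_external: bool  # stand-in for DECL_EXTERNAL (vnode->decl)
    definition: bool     # stand-in for vnode->definition


def count_external_refs(nodes):
    """Count nodes that are external and not defined in this partition."""
    return sum(1 for n in nodes if n.decl_external and not n.definition)


def total_count(partitions):
    """Sum the per-partition counts, as in the before/after columns."""
    return sum(count_external_refs(nodes) for nodes in partitions.values())


# Hypothetical data: 'y' is defined in p0 and seen as external from p1.
example = {
    "p0": [VarNode("x", True, False), VarNode("y", False, True)],
    "p1": [VarNode("y", True, False), VarNode("z", True, False)],
}
example_total = total_count(example)
```

Note that this counts each external variable once per partition, not once per referencing function, so it only approximates how many address loads are affected.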
