This is the mail archive of the
mailing list for the GCC project.
Re: [PATCH][ARM] Switch to default sched pressure algorithm
- From: Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>
- To: Richard Earnshaw <Richard dot Earnshaw at arm dot com>, Ramana Radhakrishnan <ramana dot radhakrishnan at foss dot arm dot com>, Christophe Lyon <christophe dot lyon at linaro dot org>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, nd <nd at arm dot com>, Kyrylo Tkachov <Kyrylo dot Tkachov at arm dot com>
- Date: Tue, 30 Jul 2019 15:15:53 +0000
- Subject: Re: [PATCH][ARM] Switch to default sched pressure algorithm
>On 30/07/2019 10:31, Ramana Radhakrishnan wrote:
>> On 30/07/2019 10:08, Christophe Lyon wrote:
>>> Hi Wilco,
>>> Do you know which benchmarks were used when this was checked-in?
>>> It isn't clear from
>> It was from my time in Linaro and thus would have been a famous embedded
>> benchmark, coremark, spec2000 - all tested probably on Cortex-A9 and
>> Cortex-A15. In addition to this I would like to see what the impact of
>> this is on something like Cortex-A53 as the issue rates are likely to be
>> different on the schedulers causing different behaviour.
Obviously there are differences between various schedulers, but the general
issue is that register pressure is increased many times beyond the spilling limit
(a few cases I looked at had a pressure well over 120 when there are only 14
integer registers - this causes panic spilling in the register allocator).
In fact the spilling overhead between the two algorithms is almost identical on
Cortex-A53 and Cortex-A57, so the issue isn't directly related to the pipeline
model used. It seems more related to the scheduler being too aggressive
and not caring about register pressure at all (for example lifting a load 100
instructions before its use so it must be spilled).
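To illustrate why hoisting loads far above their uses raises register pressure, here is a minimal sketch in C. It is a toy model and not GCC code: the function name max_pressure and the representation (one definition position and one last-use position per value in a linear schedule) are assumptions made for this example only.

```c
#include <assert.h>

/* Toy model (hypothetical, not GCC's implementation): value i is defined
   by the instruction at position def[i] and last used by the instruction
   at position last_use[i] in a linear schedule of len instructions.
   A value occupies a register at every position p with
   def[i] <= p < last_use[i].  Returns the peak number of live values. */
int max_pressure(const int *def, const int *last_use, int nvalues, int len)
{
    assert(nvalues >= 0 && len >= 0);

    int peak = 0;
    for (int p = 0; p < len; p++) {
        int live = 0;
        for (int i = 0; i < nvalues; i++) {
            if (def[i] <= p && p < last_use[i])
                live++;
        }
        if (live > peak)
            peak = live;
    }
    return peak;
}
```

In this model, four loads at positions 0, 2, 4, 6 that are each consumed by the next instruction give a peak of 1. The same four loads hoisted to positions 0 to 3, with their uses at positions 4 to 7, give a peak of 4. Once the peak exceeds the number of available registers, the allocator has to spill.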
>> I don't have all the notes today for that - maybe you can look into the
>> linaro wiki.
>> I am concerned about taking this patch in without some more data across
>> a variety of cores.
> My concern is the original patch
> (https://gcc.gnu.org/ml/gcc-patches/2012-07/msg00706.html) is lacking in
> any real detail as to the reasons for the choice of the second algorithm
> over the first.
> - It's not clear what the win was
> - It's not clear what outliers there were and whether they were significant.
> And finally, it's not clear if, 7 years later, this is still the best choice.
> If the second algorithm really is better, why is no other target using
> it by default?
> I think we need a bit more information (both ways). In particular I'm
> concerned not just by the overall benchmark average, but also the amount
> of variance across the benchmarks. I think the default needs to avoid
> significant outliers if at all possible, even if it is marginally less
> good on the average.
The results clearly show that algorithm 1 works best on Arm today - I haven't
seen a single benchmark where algorithm 2 results in less spilling. We could
tune algorithm 2 so it switches back to algorithm 1 when register pressure is
high or a basic block is large. However, until it is fixed, the evidence is that
algorithm 1 is the best choice for current cores.
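The fallback idea mentioned above (use algorithm 2 unless register pressure is high or the basic block is large) could look roughly like the following sketch. Nothing here exists in GCC: the names choose_algorithm and fallback_limits and both thresholds are hypothetical, and no tuned values are implied. Only the numbering 1 and 2 follows the existing --param sched-pressure-algorithm setting.

```c
#include <assert.h>

/* Numbering follows GCC's --param sched-pressure-algorithm (1 or 2). */
enum pressure_algorithm {
    PRESSURE_ALGORITHM_1 = 1,
    PRESSURE_ALGORITHM_2 = 2
};

/* Hypothetical tuning knobs for this sketch; not real GCC parameters. */
struct fallback_limits {
    int max_block_insns; /* blocks larger than this use algorithm 1 */
    int pressure_slack;  /* tolerated excess over the available registers */
};

/* Pick algorithm 2 only for small blocks whose estimated pressure stays
   within avail_regs + pressure_slack; otherwise fall back to algorithm 1. */
enum pressure_algorithm
choose_algorithm(int block_insns, int est_pressure, int avail_regs,
                 const struct fallback_limits *lim)
{
    assert(lim != 0);

    if (block_insns > lim->max_block_insns)
        return PRESSURE_ALGORITHM_1;
    if (est_pressure > avail_regs + lim->pressure_slack)
        return PRESSURE_ALGORITHM_1;
    return PRESSURE_ALGORITHM_2;
}
```

With 14 integer registers and an estimated pressure of 120, as in the cases described earlier, this sketch would select algorithm 1 regardless of block size.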