This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Fix computation of register limit for -fsched-pressure


On 10/17/2016 08:17 AM, Maxim Kuvyrkov wrote:
>> The patch here, https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01872.html, attempted, for the case where the block in question executes as frequently as the entry block, to scale the register limit used by -fsched-pressure down to just the call_clobbered (i.e. call_used) regs. But the code is actually scaling toward call_saved registers. The following patch corrects that by computing call_saved regs per class and subtracting out some scaled portion of that.
>> > 
>> > Bootstrap/regtest on powerpc64le with no new failures. Ok for trunk?
> Hi Pat,
> 
> I stared at your patch and the current code for a good 30 minutes, and I still don't see what is wrong with the current code.
> 
> With your patch the number of registers from class CL that the scheduler has at its disposal for a single-basic-block function will be:
> 
> sched_call_regs_num[CL] = ira_class_hard_regs_num[CL] - call_saved_regs_num[CL];
> 
> where call_saved_regs_num is the number of registers in class CL that need to be saved in the prologue (i.e., "free" registers).  I can see some logic in setting
> 
> sched_call_regs_num[CL] = call_saved_regs_num[CL];
> 
> but not in subtracting number of such registers from the number of total available hard registers.
> 
> Am I missing something?
> 

Your original patch gave the following reasoning:

"At the moment the scheduler does not account for spills in the prologues and restores in the epilogue, which occur from use of call-used registers.  The current state is, essentially, optimized for case when there is a hot loop inside the function, and the loop executes significantly more often than the prologue/epilogue.  However, on the opposite end, we have a case when the function is just a single non-cyclic basic block, which executes just as often as prologue / epilogue, so spills in the prologue hurt performance as much as spills in the basic block itself.  In such a case the scheduler should throttle-down on the number of available registers and try to not go beyond call-clobbered registers."

But the misunderstanding is that call-used registers do NOT cause any save/restore. That is to say, call-used == call-clobbered. Your last sentence explains the goal for a single-block function: to not go beyond call-clobbered (i.e. call-used) registers, which makes perfect sense. My patch implements that goal by subtracting out call_saved_regs_num (those that require a prologue/epilogue save/restore) from the total regs, and using that as the target number of registers to be used for the block.
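For what it's worth, the intended scaling can be sketched roughly as follows. This is a toy model, not GCC's actual code: the function and parameter names here are invented for illustration, and the real implementation works in terms of ira_class_hard_regs_num and basic-block frequencies. The idea is that a block that runs as often as the entry block pays the full prologue/epilogue cost, so the scheduler's limit should drop all the way down to the call-clobbered count (total minus call-saved), while a block that runs far more often than the entry (a hot loop) amortizes that cost and can use nearly all hard registers:

```c
/* Hypothetical sketch of the scaled register limit for one class.
   hard_regs_num       - total allocatable hard regs in the class
                         (stands in for ira_class_hard_regs_num[cl])
   call_saved_regs_num - regs in the class needing a prologue save
   bb_freq, entry_freq - execution frequencies of the block and of
                         the function's entry block.  */
static int
sched_class_regs_limit (int hard_regs_num, int call_saved_regs_num,
                        int bb_freq, int entry_freq)
{
  /* Scale the call-saved count by how hot the entry block is relative
     to this block; clamp so we never subtract more than exists.  */
  int scaled_saved = call_saved_regs_num * entry_freq / bb_freq;
  if (scaled_saved > call_saved_regs_num)
    scaled_saved = call_saved_regs_num;
  return hard_regs_num - scaled_saved;
}
```

With 32 hard regs of which 18 are call-saved, a single-block function (bb_freq == entry_freq) gets a limit of 14, i.e. exactly the call-clobbered regs; a loop body nine times hotter than the entry gets 30, close to the full set.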


> Also, could you share the testcase that you used to investigate the problem with register-aware scheduling?  I wonder if there is a problem lurking.

I don't have a testcase. I'm currently trying to get -fsched-pressure to be beneficial for PowerPC and was familiarizing myself with the code when I spotted the issue.

Thanks,
Pat

