[PATCH] Add a bit dislike for separate mem alternative when op is REG_P.

Liu, Hongtao hongtao.liu@intel.com
Mon May 30 02:52:46 GMT 2022



> -----Original Message-----
> From: Alexander Monakov <amonakov@ispras.ru>
> Sent: Friday, May 27, 2022 5:39 PM
> To: Liu, Hongtao <hongtao.liu@intel.com>
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] Add a bit dislike for separate mem alternative when op is
> REG_P.
> 
> On Wed, 25 May 2022, liuhongt via Gcc-patches wrote:
> 
> > Right now, mem_cost for separate mem alternative is 1 * frequency which
> > is pretty small and caused the unnecessary SSE spill in the PR, I've
> > tried to rework backend cost model, but RA still not happy with
> > that(regress somewhere else). I think the root cause of this is cost
> > for separate 'm' alternative cost is too small, especially considering
> > that the mov cost of gpr are 2(default for REGISTER_MOVE_COST). So this
> > patch increase mem_cost to 2*frequency, also increase 1 for reg_class
> > cost when m alternative.
> 
> In the PR, the spill happens in the initial basic block of the function, i.e.
> the one with the highest frequency.
> 
> Also as noted in the PR, swapping the 'unlikely' branch to 'likely' avoids the spill,
> even though it does not affect the frequency of the initial basic block, and
> makes the block with the use more rarely executed.

The spill is mainly decided by three insns that involve r92:

(insn 3 61 4 2 (set (reg/v:SF 92 [ x ])
        (reg:SF 102)) "test3.c":7:1 142 {*movsf_internal}
     (expr_list:REG_DEAD (reg:SF 102)

(insn 9 4 12 2 (set (reg:SI 89 [ _11 ])
        (subreg:SI (reg/v:SF 92 [ x ]) 0)) "test3.c":3:36 81 {*movsi_internal}
     (nil))

And
(insn 28 27 29 5 (set (reg:DF 98)
        (float_extend:DF (reg/v:SF 92 [ x ]))) "test3.c":11:13 163 {*extendsfdf2}
     (expr_list:REG_DEAD (reg/v:SF 92 [ x ])
        (nil)))
(insn 29 28 30 5 (s

The frequency of INSN 3 and INSN 9 is not affected, but the frequency of INSN 28 drops from 805 to 89 after swapping "unlikely" and "likely".
Because of that, the GPR cost decreases a lot, which finally makes the RA choose GPR instead of MEM.

GENERAL_REGS:2356,2356 
SSE_REGS:6000,6000
MEM:4089,4089

Dump of 301.ira:
a4(r92,l0) costs: AREG:2356,2356 DREG:2356,2356 CREG:2356,2356 BREG:2356,2356 SIREG:2356,2356 DIREG:2356,2356 AD_REGS:2356,2356 CLOBBERED_REGS:2356,2356 Q_REGS:2356,2356 NON_Q_REGS:2356,2356 TLS_GOTBASE_REGS:2356,2356 GENERAL_REGS:2356,2356 SSE_FIRST_REG:6000,6000 NO_REX_SSE_REGS:6000,6000 SSE_REGS:6000,6000 \
   MMX_REGS:19534,19534 INT_SSE_REGS:19534,19534 ALL_REGS:214534,214534 MEM:4089,4089

And although there's no spill, there's an extra VMOVD in the later BB, which looks suboptimal (I guess we can live with that since it's cold).

        vmovd   %eax, %xmm2
        vcvtss2sd       %xmm2, %xmm2, %xmm1
        vmulsd  %xmm0, %xmm1, %xmm0
        vcvtsd2ss       %xmm0, %xmm0, %xmm0
> 
> Do you have a root cause analysis that explains the above?
> 
> Alexander

