This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Minor improvement for double-word register allocations
- From: Jeff Law <law at redhat dot com>
- To: gcc-patches <gcc-patches at gcc dot gnu dot org>
- Date: Thu, 22 Oct 2009 21:43:21 -0600
- Subject: Minor improvement for double-word register allocations
If you compile mulvsi3 from libgcc with -O2 -fPIC on i686-pc-linux-gnu,
you'll see a poor register selection by IRA that ultimately leads to
unnecessary reloading.
Pass 0 for finding pseudo/allocno costs
a1 (r69,l0) best GENERAL_REGS, cover GENERAL_REGS
a2 (r67,l0) best AD_REGS, cover GENERAL_REGS
a0 (r60,l0) best AD_REGS, cover GENERAL_REGS
a0(r60,l0) costs: AD_REGS:0,0 CLOBBERED_REGS:2000,2000
Q_REGS:2000,2000 NON_Q_REGS:2000,2000 GENERAL_REGS:2000,2000 MEM:16997
a1(r69,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 SIREG:0,0 DIREG:0,0
AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0
GENERAL_REGS:0,0 MEM:13000
a2(r67,l0) costs: AREG:0,0 DREG:0,0 CREG:2000,2000 SIREG:2000,2000
DIREG:2000,2000 AD_REGS:0,0 CLOBBERED_REGS:2000,2000 Q_REGS:2000,2000
NON_Q_REGS:2000,2000 GENERAL_REGS:2000,2000 MEM:4000
Pass 1 for finding pseudo/allocno costs
r69: preferred GENERAL_REGS, alternative NO_REGS, cover GENERAL_REGS
r67: preferred AD_REGS, alternative GENERAL_REGS, cover GENERAL_REGS
r60: preferred AD_REGS, alternative GENERAL_REGS, cover GENERAL_REGS
a0(r60,l0) costs: AD_REGS:0,0 CLOBBERED_REGS:2000,2000
Q_REGS:2000,2000 NON_Q_REGS:2000,2000 GENERAL_REGS:2000,2000 MEM:16997
a1(r69,l0) costs: AREG:0,0 DREG:0,0 CREG:0,0 SIREG:0,0 DIREG:0,0
AD_REGS:0,0 CLOBBERED_REGS:0,0 Q_REGS:0,0 NON_Q_REGS:0,0
GENERAL_REGS:0,0 MEM:13000
a2(r67,l0) costs: AREG:0,0 DREG:0,0 CREG:2000,2000 SIREG:2000,2000
DIREG:2000,2000 AD_REGS:0,0 CLOBBERED_REGS:2000,2000 Q_REGS:2000,2000
NON_Q_REGS:2000,2000 GENERAL_REGS:2000,2000 MEM:4000
+++Allocating 8 bytes for conflict table (uncompressed size 12)
;; a0(r60,l0) conflicts: a1(r69,l0)
;; total conflict hard regs:
;; conflict hard regs:
;; a1(r69,l0) conflicts: a0(r60,l0)
;; total conflict hard regs:
;; conflict hard regs:
;; a2(r67,l0) conflicts:
;; total conflict hard regs:
;; conflict hard regs:
**** Allocnos coloring:
Loop 0 (parent -1, header bb0, depth 0)
bbs: 4 3 2
all: 0r60 1r69 2r67
modified regnos: 60 67 69
border:
Pressure: GENERAL_REGS=3
Pushing a2(r67,l0)
Pushing a0(r60,l0)
Pushing a1(r69,l0)
Popping a1(r69,l0) -- assign reg 1
Popping a0(r60,l0) -- (memory is more profitable 16997 vs
1000015) spill
Popping a2(r67,l0) -- assign reg 0
r60 and r69 are the key pseudos. r60 is a double-word pseudo that
we'd really like to allocate into the r0/r1 pair. Unfortunately, IRA
selects r1 for r69, which makes r0/r1 unavailable for r60 and ultimately
leads to r60 not getting a hard register.
The full_costs array just prior to allocation of r69 looks like:
(gdb) p full_costs[0]@8
$1 = {65487827, 0, 0, 0, 0, 0, 0, 0}
Only r0 carries a cost, so IRA has no reason not to select r1 for
pseudo 69. That makes r0/r1 unavailable for pseudo 60, which ends up
with no hard register; reload then does its thing and makes a mess of
the code.
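As a rough illustration of why taking r1 blocks the pair (this is a toy
model, not code from IRA; regs_are_free and NUM_HARD_REGS are names made
up for this sketch), a value that is nregs registers wide can only start
at a hard register if every register in that run is still free:

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_HARD_REGS 8   /* Assumed size, matching the 8-entry gdb print.  */

/* Hypothetical helper: return true if a value occupying NREGS
   consecutive hard registers can be placed starting at FIRST, given
   which hard registers are already TAKEN.  */
static bool
regs_are_free (const bool taken[NUM_HARD_REGS], int first, int nregs)
{
  assert (first >= 0 && nregs >= 1);
  if (first + nregs > NUM_HARD_REGS)
    return false;
  for (int i = 0; i < nregs; i++)
    if (taken[first + i])
      return false;
  return true;
}
```

In this model, once taken[1] is set for pseudo 69,
regs_are_free (taken, 0, 2) is false, so the double-word pseudo 60 can
no longer start at r0.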
The fix is pretty simple. When we adjust costs for a single register
class operand, we need to account for the size of the operand: a
double-word operand occupies two hard registers, so both need to be
charged. If we make that fix, we get a full_costs array for pseudo 69
that looks like:
(gdb) p full_costs[0]@8
$2 = {65487827, 65487827, 0, 0, 0, 0, 0, 0}
That makes r0 and r1 equally bad selections for pseudo 69, so IRA
(reasonably) assigns r2 to pseudo 69 and r0 to pseudo 60, resulting in a
perfect allocation with no reloads:
Loop 0 (parent -1, header bb0, depth 0)
bbs: 4 3 2
all: 0r60 1r69 2r67
modified regnos: 60 67 69
border:
Pressure: GENERAL_REGS=3
Pushing a2(r67,l0)
Pushing a0(r60,l0)
Pushing a1(r69,l0)
Popping a1(r69,l0) -- assign reg 2
Popping a0(r60,l0) -- assign reg 0
Popping a2(r67,l0) -- assign reg 0
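The effect of the size adjustment on the two full_costs arrays above can
be sketched as follows. This is only a simplified model, not the
attached patch: penalize_operand_regs and cheapest_reg are hypothetical
names, and picking the lowest-numbered register of minimal cost is an
assumption that merely happens to match the dumps.

```c
#include <assert.h>

#define NUM_HARD_REGS 8   /* Assumed size, matching the 8-entry gdb print.  */

/* Hypothetical helper, not an IRA function: charge PENALTY to every
   hard register that an operand NREGS registers wide would occupy
   when it starts at hard register FIRST.  */
static void
penalize_operand_regs (int costs[NUM_HARD_REGS], int first, int nregs,
                       int penalty)
{
  assert (first >= 0 && nregs >= 1 && first + nregs <= NUM_HARD_REGS);
  for (int i = 0; i < nregs; i++)
    costs[first + i] += penalty;
}

/* Simplifying assumption: choose the lowest-numbered register among
   those with the minimal cost.  */
static int
cheapest_reg (const int costs[NUM_HARD_REGS])
{
  int best = 0;
  for (int r = 1; r < NUM_HARD_REGS; r++)
    if (costs[r] < costs[best])
      best = r;
  return best;
}
```

Charging only one register (nregs of 1) leaves r1 free of cost, so
cheapest_reg returns 1, as in the first dump. Charging both registers
of the double-word operand (nregs of 2) makes it return 2, as in the
second.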
This also looks to be a (minor) improvement for a variety of other code
that uses DImode values.
Bootstrapped and regression tested on i686-pc-linux-gnu. Approved by
Vlad in a private email.
Attachment:
PATCH
Description: Text document