This is the mail archive of the gcc@gcc.gnu.org
mailing list for the GCC project.
Improve offset combination during LRA virtual register elimination?
- From: Jiong Wang <jiong dot wang at arm dot com>
- To: "gcc at gcc dot gnu dot org" <gcc at gcc dot gnu dot org>
- Date: Wed, 22 Apr 2015 12:24:26 +0100
- Subject: Improve offset combination during LRA virtual register elimination?
While investigating PR62173, another issue I found is that GCC does poor
offset combination during LRA virtual register elimination.
For example, suppose we access one element of a local array, A[i].
Normally we get (sequence1):
rA = sfp + rB
rC = MEM[rA + off0]
rB contains the index "i", and off0 is the stack offset of this local array.
LRA virtual register elimination then eliminates sfp to "hard_fp + off1",
so the final insn sequence becomes (sequence2):
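For concreteness, source of roughly the following shape is the kind of code I mean. This is only an illustrative sketch: the function name and array size are made up, it is not the PR62173 testcase, and whether a given target and optimization level emits exactly the sequence above is an assumption.

```c
/* Hypothetical example, not the PR62173 testcase.
   Requires 0 <= i < 64. */
int load_local_elem(int i)
{
    int A[64];                      /* local array, lives in the stack frame */

    /* Initialize the array so the load below is well defined. */
    for (int k = 0; k < 64; k++)
        A[k] = k * 2;

    /* Address of A[i] = frame base + (stack offset of A) + scaled index,
       i.e. the "sfp + rB" followed by "MEM[rA + off0]" pattern. */
    return A[i];
}
```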
rT = hard_fp + off1
rA = rT + rB
rC = MEM [rA + off0]
off1 and off0 could be combined into one offset, but LRA doesn't do that.
I have modified TARGET_LEGITIMIZE_ADDRESS to legitimize sequence1 into (sequence3):
rA = sfp + off0
rC = MEM[rA + rB]
As LRA does combine constants for a simple "sfp + const", this is eliminated
into sequence4, with no extra instruction introduced:
rA = hard_fp + off3 (off3 = off0 + off1)
rC = MEM[rA + rB]
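To make the equivalence explicit, here is a small sketch of the address arithmetic only. The function names are hypothetical and the registers are modelled as plain integers; it just shows that the uncombined form and the combined form compute the same address, with one fewer addition in the latter.

```c
#include <stdint.h>

/* Uncombined form: hard_fp + off1, then + rB, then + off0 in the MEM.
   off1 and off0 stay separate, so an extra add is needed. */
intptr_t addr_uncombined(intptr_t hard_fp, intptr_t rB,
                         intptr_t off0, intptr_t off1)
{
    intptr_t rT = hard_fp + off1;
    intptr_t rA = rT + rB;
    return rA + off0;
}

/* Combined form: the two constants are folded into off3 up front
   (at compile time in the real case), leaving one add plus the
   register-indexed access. */
intptr_t addr_combined(intptr_t hard_fp, intptr_t rB,
                       intptr_t off0, intptr_t off1)
{
    intptr_t off3 = off0 + off1;
    intptr_t rA = hard_fp + off3;
    return rA + rB;
}
```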
But the problem, as noted in comment 8 of PR62173, is that it's not always
good to generate sequence3: it's not friendly to CSE and it increases register pressure.
For example, suppose we have three local arrays, A[i], B[i], C[i]. Then
the instruction sequence will be:
rA0 = sfp + off0
rC0 = MEM[rA0 + rB]
rA1 = sfp + off1
rC1 = MEM[rA1 + rB]
rA2 = sfp + off2
rC2 = MEM[rA2 + rB]
While the old one will be:
rA0 = sfp + rB
rC0 = MEM[rA0 + off0]
rA1 = sfp + rB
rC1 = MEM[rA1 + off1]
rA2 = sfp + rB
rC2 = MEM[rA2 + off2]
sfp + rB will be CSEd, giving lower register pressure.
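The trade-off can be sketched in C as below. This is only a model under my own assumptions: the frame is an int array, offsets are in elements rather than bytes, and all names are hypothetical. The first function needs a single shared base, the second needs three distinct base values live at once.

```c
/* Hypothetical stand-in for a stack frame holding three small arrays. */
const int demo_frame[8] = { 0, 10, 20, 30, 40, 50, 60, 70 };

/* Old form: "sfp + rB" is computed once and reused; the per-array
   offsets are immediates in the memory accesses. */
int sum_shared_base(const int *sfp, int rB, int off0, int off1, int off2)
{
    const int *base = sfp + rB;     /* one add, one live base register */
    return base[off0] + base[off1] + base[off2];
}

/* New form: each array gets its own "sfp + offN" base, indexed by rB.
   Nothing is shared, so three base registers are needed. */
int sum_per_array_base(const int *sfp, int rB, int off0, int off1, int off2)
{
    const int *rA0 = sfp + off0;
    const int *rA1 = sfp + off1;
    const int *rA2 = sfp + off2;
    return rA0[rB] + rA1[rB] + rA2[rB];
}
```

Both functions load the same three elements; only the number of distinct base addresses differs.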
IMO, if such instruction sequences (which are quite common on RISC targets)
occur in a loop, then we should always legitimize them into the form of
sequence3, as that facilitates constant combination during LRA virtual
register elimination and creates one more loop iv as a side effect, thus
normally saving two instructions in the loop.
If such instruction sequences do not occur in a loop, however, any thoughts
on how to teach LRA to combine the two constants in sequence2?
(I have filed https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64082)