This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Performance regression
- From: Dale Johannesen <dalej at apple dot com>
- To: gcc at gcc dot gnu dot org, gcc-patches at gcc dot gnu dot org
- Cc: Dale Johannesen <dalej at apple dot com>
- Date: Tue, 24 Sep 2002 12:31:19 -0700
- Subject: Performance regression
This patch:
2002-07-10 Roger Sayle <roger@eyesopen.com>
PR c/2454
* combine.c (nonzero_bits): LOAD_EXTEND_OP should only apply
to SUBREGs of MEMs. (num_sign_bit_copies): Likewise.
introduces a performance regression on the following code (ppc
[on which LOAD_EXTEND_OP==ZERO_EXTEND and WORD_REGISTER_OPERATIONS]
and probably other RISCs):
struct x { unsigned char c; };
unsigned char foo(struct x* x, unsigned char q) {
return x->c + (unsigned int)(255 - x->c)*q;
}
An extra and unnecessary zero-extend-byte-to-word insn is now emitted.
Formerly, combine improved this RTL [where reg 120 is also used later]:
(insn 15 7 16 (set (reg:QI 120)
(mem/s/j:QI (reg/v/f:SI 116) [0 <variable>.c+0 S1 A8])) 314
{*rs6000.md:8046} (insn_list 4 (nil))
(expr_list:REG_DEAD (reg/v/f:SI 116)
(nil)))
(insn 16 15 18 (set (reg:SI 119)
(zero_extend:SI (reg:QI 120))) 18 {*rs6000.md:1247} (insn_list
15 (nil))
(nil))
(insn 18 16 20 (set (reg:SI 121)
(minus:SI (const_int 255 [0xff])
(reg:SI 119))) 44 {*rs6000.md:1908} (insn_list 16 (nil))
(expr_list:REG_DEAD (reg:SI 119)
(nil)))
into this:
(insn 15 7 16 (set (reg:QI 120)
(mem/s/j:QI (reg:SI 3 r3) [0 <variable>.c+0 S1 A8])) 314
{*rs6000.md:8046} (nil)
(expr_list:REG_DEAD (reg:SI 3 r3)
(nil)))
(note 16 15 18 NOTE_INSN_DELETED)
(insn 18 16 20 (set (reg:SI 121)
(minus:SI (const_int 255 [0xff])
(subreg:SI (reg:QI 120) 0))) 44 {*rs6000.md:1908}
(insn_list 15 (nil))
(nil))
With the changed semantics of SUBREG, this no longer happens. And
since the merged instruction 18 above no longer has the same semantics
as the 2 insns, it should not happen. (Transforming zero_extend
to subreg now loses information; unfortunately disabling this seems
impractical, since the subreg RTL is needed to match a lot of
register_operand constraints.) I'm not suggesting reverting the patch
because it seems to be necessary for correctness, and that's more
important; but I'd like to get this fixed. Suggestions?