Improve code generated by cmpstrsi patterns
Jeffrey A Law
law@cygnus.com
Sun Jul 16 23:35:00 GMT 2000
In message < 20000715174343.X21403@wolery.cumb.org >you write:
> On Sun, Jul 16, 2000 at 01:17:45AM +0200, Jan Hubicka wrote:
> I am not sure I understand your description, but I can see the problem
> in Jakub's test case. We've got a triad of insns like so:
>
> (insn/i 16 14 17 (set (reg/v:QI 44)
> (const_int 5 [0x5])) 44 {*movqi_1} (nil)
> (expr_list:REG_EQUAL (const_int 5 [0x5])
> (nil)))
>
> (insn/i 99 98 100 (set (reg:CCZ 17 flags)
> (compare:CCZ (zero_extract:SI (subreg:SI (reg/v:QI 44) 0)
> (const_int 1 [0x1])
> (const_int 2 [0x2]))
> (const_int 0 [0x0]))) 161 {*testqi_ext_3} (nil)
> (nil))
>
> (jump_insn/i 100 99 196 (set (pc)
> (if_then_else (eq (reg:CCZ 17 flags)
> (const_int 0 [0x0]))
> (label_ref 104)
> (pc))) 272 {*jcc_1} (insn_list 99 (nil))
> (expr_list:REG_DEAD (reg:CCZ 17 flags)
> (expr_list:REG_BR_PROB (const_int 3999 [0xf9f])
> (nil))))
>
> and combine should have simplified it down to
>
> (set (pc) (pc))
>
> but there's no such insn. Later, reload blows up trying to set up
> zero_extract of a constant.
FWIW, cse.c and gcse.c handles this by allowing (set (pc) (pc)) to be
created, then zapping them (via the jump optimizer) when the pass is
finished.
[ They aren't deleted earlier since that would likely require the cfg
to be rebuilt. ]
> But this is not the entire story, because adding
>
> (define_insn "*pc_nop_move"
> [(set (pc) (pc))]
> ""
> "* abort ();"
> [(set_attr "type" "other")])
>
> to i386.md does no good at all. Part of the problem is a bug in
> simplify_ternary_operation, such that zero_extract operations are
> never simplified.
The changes to simplify-rtx.c seem pretty reasonable. It may be worth
breaking them into their own patch which can be installed independently
of the others.
> The other part of the problem is that combine never considers merging
> insn 99 with insn 16, so we never get as far as that. It does
> consider merging insns 99 and 100, but only with a register inside the
> zero_extract. I don't know why that is, but my guess would be:
> before combine begins, we have
>
> (insn/i 98 97 99 (parallel[
> (set (reg:QI 58)
> (and:QI (reg/v:QI 44)
> (const_int 4 [0x4])))
> (clobber (reg:CC 17 flags))
> ] ) 167 {*andqi_1} (nil)
> (expr_list:REG_UNUSED (reg:CC 17 flags)
> (nil)))
> (gdb) call debug_rtx (i3)
>
> (insn/i 99 98 100 (set (reg:CCZ 17 flags)
> (compare:CCZ (reg:QI 58)
> (const_int 0 [0x0]))) 5 {cmpqi_ccz_1} (insn_list 98 (nil))
> (expr_list:REG_DEAD (reg:QI 58)
> (nil)))
>
> Combine does manage to merge those two insns. But notice that there
> is no LOG_LINK on insn 98 pointing to insn 16. So combine has no idea
> that (reg/v:QI 44) has a known value. The link should've been put
> there by life analysis, but it didn't - I wonder if this is because
> insn 72 got it.
The LOG_LINKS are not complete def-use chains, it may be the case that
the link you care about is not represented in the LOG_LINKS chain
explicitly for some reason.
> I do think it's a good idea for combine to be able to delete no-op
> moves. When I tried to do it, however, I ran into one problem after
> another. The cleanup block simply isn't expecting I3 to disappear.
You may be better off turning it into a nop move or something like that;
or (set (pc) (pc)), then zapping it when combine is finished.
jeff
More information about the Gcc-patches
mailing list