[PATCH, combine] Try REG_EQUAL for nonzero_bits
Eric Botcazou
ebotcazou@adacore.com
Sat Nov 22 11:46:00 GMT 2014
> The patch tries to use REG_EQUAL to get more precise info for nonzero_bits,
> which helps to remove unnecessary zero_extend.
>
> Here is an example when compiling Coremark, we have rtx like,
>
> (insn 1244 386 388 47 (set (reg:SI 263 [ D.5767 ])
> (reg:SI 384 [ D.5767 ])) 786 {*thumb2_movsi_insn}
> (expr_list:REG_EQUAL (zero_extend:SI (mem:QI (reg/v/f:SI 271 [ memblock
> ]) [0 *memblock_13(D)+0 S1 A8]))
> (nil)))
>
> from "reg:SI 384", we can only know it is a 32-bit value. But from
> REG_EQUAL, we can know it is an 8-bit value. Then for the following rtx seq,
>
> (insn 409 407 410 50 (set (reg:SI 308)
> (plus:SI (reg:SI 263 [ D.5767 ])
> (const_int -48 [0xffffffffffffffd0]))) core_state.c:170 4
> {*arm_addsi3}
> (nil))
> (insn 410 409 411 50 (set (reg:SI 309)
> (zero_extend:SI (subreg:QI (reg:SI 308) 0))) core_state.c:170 812
> {thumb2_zero_extendqisi2_v6}
> (expr_list:REG_DEAD (reg:SI 308)
> (nil)))
>
> the zero_extend for r309 can be optimized by combine pass.
This sounds like a good idea.
> 2014-11-21 Zhenqiang Chen <zhenqiang.chen@arm.com>
>
> * combine.c (set_nonzero_bits_and_sign_copies): Try REG_EQUAL note.
>
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 6a7d16b..68a883b 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -1713,7 +1713,15 @@ set_nonzero_bits_and_sign_copies (rtx x, const_rtx
> set, void *data)
>
> /* Don't call nonzero_bits if it cannot change anything. */
> if (rsp->nonzero_bits != ~(unsigned HOST_WIDE_INT) 0)
> - rsp->nonzero_bits |= nonzero_bits (src, nonzero_bits_mode);
> + {
> + rtx reg_equal = insn ? find_reg_note (insn, REG_EQUAL,
> NULL_RTX)
> + : NULL_RTX;
> + if (reg_equal)
> + rsp->nonzero_bits |= nonzero_bits (XEXP (reg_equal, 0),
> + nonzero_bits_mode);
> + else
> + rsp->nonzero_bits |= nonzero_bits (src, nonzero_bits_mode);
> + }
> num = num_sign_bit_copies (SET_SRC (set), GET_MODE (x));
> if (rsp->sign_bit_copies == 0
>
> || rsp->sign_bit_copies > num)
Use find_reg_equal_equiv_note instead. Are you sure that this won't yield
inferior results in very peculiar cases? IOW, why not use both sources?
Note that 'src' is massaged just above if SHORT_IMMEDIATES_SIGN_EXTEND is
defined so we should probably do it for the note datum too, for example by
factoring the code into a function and invoking it.
Why not do the same for num_sign_bit_copies?
--
Eric Botcazou
More information about the Gcc-patches
mailing list