This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
RE: [PATCH, combine] Try REG_EQUAL for nonzero_bits
- From: "Zhenqiang Chen" <zhenqiang dot chen at arm dot com>
- To: "'Eric Botcazou'" <ebotcazou at adacore dot com>
- Cc: <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 24 Nov 2014 16:21:12 +0800
- Subject: RE: [PATCH, combine] Try REG_EQUAL for nonzero_bits
- Authentication-results: sourceware.org; auth=none
- References: <000001d00546$33445430$99ccfc90$ at arm dot com> <2022755 dot rYv7BUgZ2N at polaris>
> -----Original Message-----
> From: Eric Botcazou [mailto:ebotcazou@adacore.com]
> Sent: Saturday, November 22, 2014 6:15 PM
> To: Zhenqiang Chen
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH, combine] Try REG_EQUAL for nonzero_bits
>
> > The patch tries to use REG_EQUAL to get more precise info for
> > nonzero_bits, which helps to remove unnecessary zero_extend.
> >
> > Here is an example when compiling Coremark, we have rtx like,
> >
> > (insn 1244 386 388 47 (set (reg:SI 263 [ D.5767 ])
> > (reg:SI 384 [ D.5767 ])) 786 {*thumb2_movsi_insn}
> > (expr_list:REG_EQUAL (zero_extend:SI (mem:QI (reg/v/f:SI 271 [
> > memblock
> > ]) [0 *memblock_13(D)+0 S1 A8]))
> > (nil)))
> >
> > from "reg:SI 384", we can only know it is a 32-bit value. But from
> > REG_EQUAL, we can know it is an 8-bit value. Then for the following
> > rtx seq,
> >
> > (insn 409 407 410 50 (set (reg:SI 308)
> > (plus:SI (reg:SI 263 [ D.5767 ])
> > (const_int -48 [0xffffffffffffffd0]))) core_state.c:170 4
> > {*arm_addsi3}
> > (nil))
> > (insn 410 409 411 50 (set (reg:SI 309)
> > (zero_extend:SI (subreg:QI (reg:SI 308) 0))) core_state.c:170
> > 812 {thumb2_zero_extendqisi2_v6}
> > (expr_list:REG_DEAD (reg:SI 308)
> > (nil)))
> >
> > the zero_extend for r309 can be optimized by combine pass.
>
> This sounds like a good idea.
>
> > 2014-11-21 Zhenqiang Chen <zhenqiang.chen@arm.com>
> >
> > * combine.c (set_nonzero_bits_and_sign_copies): Try REG_EQUAL
> note.
> >
> > diff --git a/gcc/combine.c b/gcc/combine.c index 6a7d16b..68a883b
> > 100644
> > --- a/gcc/combine.c
> > +++ b/gcc/combine.c
> > @@ -1713,7 +1713,15 @@ set_nonzero_bits_and_sign_copies (rtx x,
> > const_rtx set, void *data)
> >
> > /* Don't call nonzero_bits if it cannot change anything. */
> > if (rsp->nonzero_bits != ~(unsigned HOST_WIDE_INT) 0)
> > - rsp->nonzero_bits |= nonzero_bits (src, nonzero_bits_mode);
> > + {
> > + rtx reg_equal = insn ? find_reg_note (insn, REG_EQUAL,
> > NULL_RTX)
> > + : NULL_RTX;
> > + if (reg_equal)
> > + rsp->nonzero_bits |= nonzero_bits (XEXP (reg_equal, 0),
> > + nonzero_bits_mode);
> > + else
> > + rsp->nonzero_bits |= nonzero_bits (src,
> nonzero_bits_mode);
> > + }
> > num = num_sign_bit_copies (SET_SRC (set), GET_MODE (x));
> > if (rsp->sign_bit_copies == 0
> >
> > || rsp->sign_bit_copies > num)
>
> Use find_reg_equal_equiv_note instead. Are you sure that this won't yield
> inferior results in very peculiar cases? IOW, why not use both sources?
Thanks for the comments. I will compare the two nonzero_bits from src and
REG_EQUAL. Then select the smaller one.
> Note that 'src' is massaged just above if SHORT_IMMEDIATES_SIGN_EXTEND
> is defined so we should probably do it for the note datum too, for example
> by factoring the code into a function and invoking it.
>
> Why not do the same for num_sign_bit_copies?
Do you know why it use " SET_SRC (set)" other than "src" for
num_sign_bit_copies?
If it is "src", I should do the same for num_sign_bit_copies with REG_EQUAL
info.
Thanks!
-Zhenqiang