[PATCH] Fold (add -1; zero_ext; add +1) operations to zero_ext when not zero (PR37451, PR61837)

Richard Sandiford richard.sandiford@arm.com
Wed Apr 15 09:18:16 GMT 2020


luoxhu--- via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> From: Xionghu Luo <luoxhu@linux.ibm.com>
>
> This "subtract/extend/add" sequence has existed for a long time and still
> annoys us (PR37451, PR61837) when converting from 32 bits to 64 bits, as the
> ctr register is 64 bits wide on powerpc64.  Andrew Pinski had a patch, but it
> caused some issues and was reverted by Joseph S. Myers (PR37451, PR37782).
>
> Andrew:
> http://gcc.gnu.org/ml/gcc-patches/2008-09/msg01070.html
> http://gcc.gnu.org/ml/gcc-patches/2008-10/msg01321.html
> Joseph:
> https://gcc.gnu.org/legacy-ml/gcc-patches/2011-11/msg02405.html
>
> However, the doloop code has improved a lot over the years.
> gcc.c-torture/execute/doloop-1.c is no longer a simple loop with a constant
> desc->niter_expr, since r125:SI#0 is not SImode, so it is not a valid doloop
> and doloop no longer transforms it.  Thus we can simplify
> "subtract/extend/add" to just the extend, as the doloop condition will
> never be false thanks to the optimization done by loop ch.
> What's more, this patch differs slightly from Andrew's implementation:
> the ZERO_EXT and SImode checks ensure the count is not changed for the
> char/short cases, so the cases that timed out on slow platforms before are
> not affected.
> Any comments?  Thanks.
>
> doloop-1.c.257r.loop2_doloop
> ...
> 12: [r129:DI]=r123:SI
>   REG_DEAD r129:DI
>   REG_DEAD r123:SI
> 13: r125:SI=r120:DI#0-0x1
>   REG_DEAD r120:DI
> 14: r120:DI=zero_extend(r125:SI#0)
>   REG_DEAD r125:SI
> 16: r126:CC=cmp(r120:DI,0)
> 17: pc={(r126:CC!=0)?L43:pc}
>   REG_DEAD r126:CC
> ...
>
> Bootstrapped and regression tested on Power8-LE.
>
> gcc/ChangeLog
>
> 	2020-04-15  Xiong Hu Luo  <luoxhu@linux.ibm.com>
>
> 	PR rtl-optimization/37451, PR target/61837
> 	loop-doloop.c (doloop_modify): Simplify (add -1; zero_ext; add +1)
> 	to zero_ext.
> ---
>  gcc/loop-doloop.c | 26 +++++++++++++++++++++++++-
>  1 file changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/loop-doloop.c b/gcc/loop-doloop.c
> index db6a014e43d..9f967fa3a0b 100644
> --- a/gcc/loop-doloop.c
> +++ b/gcc/loop-doloop.c
> @@ -477,7 +477,31 @@ doloop_modify (class loop *loop, class niter_desc *desc,
>      }
>  
>    if (increment_count)
> -    count = simplify_gen_binary (PLUS, mode, count, const1_rtx);
> +    {
> +      /* Fold (add -1; zero_ext; add +1) operations to zero_ext, assuming
> +	 addop0 is never zero: the gimple loop ch pass already simplifies a
> +	 loop whose condition is false into no loop at all.  */

IMO the code needs to prove this, rather than just assume that previous
passes have made it so.
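
To illustrate why the nonzero condition matters, here is a minimal C sketch
of the arithmetic only.  It is not part of the patch and does not use any GCC
internals; the function names are hypothetical, and uint32_t/uint64_t stand
in for SImode/DImode.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the unfolded sequence: subtract 1 in 32 bits,
   zero-extend to 64 bits, then add 1 in 64 bits.  */
static uint64_t
unfolded_count (uint32_t n)
{
  uint32_t dec = n - 1u;	/* Wraps to 0xffffffff when n == 0.  */
  return (uint64_t) dec + 1u;	/* The addition happens in 64 bits.  */
}

/* Hypothetical model of the folded form: a plain zero-extension.  */
static uint64_t
folded_count (uint32_t n)
{
  return (uint64_t) n;
}

/* Compare the two forms on a few inputs; aborts if an expectation fails.  */
static void
check_fold (void)
{
  /* For nonzero inputs the 32-bit subtraction does not wrap, so the
     two forms agree.  */
  assert (unfolded_count (1u) == folded_count (1u));
  assert (unfolded_count (0xffffffffu) == folded_count (0xffffffffu));

  /* For zero they differ: 2^32 versus 0.  */
  assert (unfolded_count (0u) == UINT64_C (0x100000000));
  assert (folded_count (0u) == 0);
}
```

Zero is the only input on which the two forms disagree, so the fold is valid
exactly when the operand is known to be nonzero at this point.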

Thanks,
Richard

> +      bool simplify_zext = false;
> +      rtx extop0 = XEXP (count, 0);
> +      if (mode == E_DImode
> +	  && GET_CODE (count) == ZERO_EXTEND
> +	  && GET_CODE (extop0) == PLUS)
> +	{
> +	  rtx addop0 = XEXP (extop0, 0);
> +	  rtx addop1 = XEXP (extop0, 1);
> +	  if (CONST_SCALAR_INT_P (addop1)
> +	      && GET_MODE (addop0) == E_SImode
> +	      && addop1 == GEN_INT (-1))
> +	    {
> +	      count = simplify_gen_unary (ZERO_EXTEND, mode, addop0,
> +					  GET_MODE (addop0));
> +	      simplify_zext = true;
> +	    }
> +	}
> +
> +      if (!simplify_zext)
> +	count = simplify_gen_binary (PLUS, mode, count, const1_rtx);
> +    }
>  
>    /* Insert initialization of the count register into the loop header.  */
>    start_sequence ();

