[Bug target/93453] PPC: rldimi not taken into account to avoid shift+or

segher at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Mon Nov 22 19:51:05 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93453

--- Comment #5 from Segher Boessenkool <segher at gcc dot gnu.org> ---
(In reply to HaoChen Gui from comment #4)
> (define_insn_and_split "*rotl<mode>3_insert_8"
>   [(set (match_operand:GPR 0 "gpc_reg_operand" "=r")
>         (plus_ior_xor:GPR (ashift:GPR (match_operand:GPR 1 "gpc_reg_operand" "r")
>                                       (match_operand:SI 2 "const_int_operand" "n"))
>                           (match_operand:GPR 3 "gpc_reg_operand" "0")))]
>   "INTVAL (operands[2]) > 0
>    && (nonzero_bits (operands[3], <MODE>mode)
>        < HOST_WIDE_INT_1U << INTVAL (operands[2]))"
> {
>   if (<MODE>mode == SImode)
>     return "rlwimi %0,%1,%h2,0,31-%h2";
>   else
>     return "rldimi %0,%1,%H2,0";
> }
>   "&& 1"
>   [(set (match_dup 0)
>         (ior:GPR (and:GPR (match_dup 3)
>                           (match_dup 4))
>                  (ashift:GPR (match_dup 1)
>                              (match_dup 2))))]
> {
>   operands[4] = GEN_INT ((HOST_WIDE_INT_1U << INTVAL (operands[2])) - 1);
> }
>   [(set_attr "type" "insert")])
> 
> But I found that nonzero_bits can't return an exact value except in combine
> pass.

It can return a different value after the combine pass, yes.  But making the
version of nonzero_bits used by combine be the generic version would be a big
regression, and the version used by combine cannot be used anywhere else (it
was an extension of flow.c originally, but this wasn't ported to the dataflow
framework).

I planned to fix this for GCC 12, but we are in stage 3 already :-)

> So the pattern finally can't be split to pattern of
> 'rotl<mode>3_insert_3'. Also if the pass after combine changes the insn, it
> can't be recognized as the nonzero_bits doesn't return exact value in that
> pass.
> 
> I am thinking if we can convert third operand to "reg and a mask" when the
> nonzero_bits is known in combine pass. Thus the pattern can be directly
> combined to 'rotl<mode>3_insert_3'.  
> 
> (set (reg:DI 123)
>     (ior:DI (ashift:DI (reg:DI 125)
>             (const_int 32 [0x20]))
>         (reg:DI 126)))
> 
> (set (reg:DI 123)
>     (ior:DI (ashift:DI (reg:DI 125)
>                        (const_int 32 [0x20]))
>             (and:DI (reg:DI 126)
>                     (const_int 4294967295 [0xffffffff]))))

Will nothing modify it back?
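
For readers following the condition in the quoted pattern: when all nonzero
bits of operand 3 lie below the shift count, the shifted value and operand 3
share no set bits, so plus, ior and xor all give the same result, and that
result is what rldimi with MB=0 computes.  A minimal C sketch of that
equivalence follows.  The helper names rldimi_mb0 and insert_forms_agree are
made up for illustration and are not GCC functions, and the rldimi model
assumes a shift count between 1 and 63.

```c
#include <stdint.h>

/* Hypothetical model of "rldimi RA,RS,SH,0": rotate RS left by SH and
   insert it into RA, keeping only the low SH bits of RA.
   Assumes 1 <= sh <= 63.  */
static uint64_t
rldimi_mb0 (uint64_t ra, uint64_t rs, unsigned sh)
{
  uint64_t low_mask = (UINT64_C (1) << sh) - 1;
  uint64_t rot = (rs << sh) | (rs >> (64 - sh));
  return (rot & ~low_mask) | (ra & low_mask);
}

/* Returns 1 if plus, ior and xor of (rs << sh) with ra all match the
   rldimi model.  This is expected whenever ra < (1 << sh), which is what
   the nonzero_bits test in the quoted pattern tries to establish.  */
static int
insert_forms_agree (uint64_t ra, uint64_t rs, unsigned sh)
{
  uint64_t shifted = rs << sh;
  uint64_t want = rldimi_mb0 (ra, rs, sh);
  return (shifted | ra) == want
	 && (shifted + ra) == want
	 && (shifted ^ ra) == want;
}
```

With sh = 32, a value of ra that fits in the low 32 bits makes all three forms
agree, while an ra with a bit set at position 32 or above does not; that is
the case where the explicit (and ... mask) form is needed.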

