This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] combine: Convert subreg-of-lshiftrt to zero_extract properly (PR78390)


Hi,

On Thu, 24 Nov 2016, Dominik Vogt wrote:

> On s390 (31-Bit) we get two (easily fixable) test regression
> supposedly because of the original path (+ this fix), and I don't
> know what to do about it.  Perhaps these point to two situations,
> where the change from lshiftrt to zero_extract sould not be done?

The _extract forms are considered to be the canonical ones for backends 
that support them (I don't know why).  It's just that the canonicalization 
wasn't applied as often as it could and hence some backends aren't 
prepared to see it for also trivial cases.

> 1) (subtests f29 and f31 in s390/risbg-ll-1.c)
> 
> An unpatched Gcc would generate this:
> 
> --
> (insn 19 17 20 2 (set (reg:DI 2 %r2 [+-4 ])
>         (lshiftrt:DI (reg:DI 66 [ v_shl ])
>             (const_int 32 [0x20])))
> 
> (insn 20 19 21 2 (set (reg:SI 3 %r3)
>         (subreg:SI (reg:DI 66 [ v_shl ]) 4))
> --
> 
> A patched Gcc generates
> 
> --
> (insn 19 17 20 2 (parallel [
>             (set (reg:DI 2 %r2 [+-4 ])
>                 (zero_extract:DI (reg:DI 66 [ v_shl ])
>                     (const_int 32 [0x20])
>                     (const_int 0 [0])))
>             (clobber (reg:CC 33 %cc))

Well, one insn instead of two, seems sensible to me.

> ==> Should the change from lshiftrt to zero_extract be skipped if
> it's just a 32 bit shift right operation in DImode, or should this
> be "fixed" in the backend by adding a special case to the
> "extzv<mode>" pattern?

Special cases in extzv IMHO.

> 2) There's not committed test case for the second finding yet.  In
> 
> Rtl before combine is:
> 
> --
> (insn 4 3 5 2 (set (reg:DI 65 [ v_x ])
>         (subreg:DI (reg:SI 67 [ v_x+4 ]) 0))
> 
> (insn 5 4 7 2 (parallel [
>             (set (zero_extract:DI (reg:DI 65 [ v_x ])
>                     (const_int 32 [0x20])
>                     (const_int 0 [0]))
>                 (subreg:DI (reg:SI 66 [ v_x ]) 0))
>             (clobber (reg:CC 33 %cc))
> 
> (insn 10 7 11 2 (set (reg:DI 69)
>         (lshiftrt:DI (reg:DI 65 [ v_x ])
>             (const_int 12 [0xc])))
> 
> (insn 11 10 16 2 (parallel [
>             (set (reg:SI 68 [ v_and ])
>                 (and:SI (subreg:SI (reg:DI 69) 4)
>                     (const_int 10 [0xa])))
>             (clobber (reg:CC 33 %cc))
> --
> 
> Before the patch, combine generated
> 
> --
> Trying 5, 10 -> 11:
> Failed to match this instruction:
> (parallel [
> ...
> Failed to match this instruction:
> (set (reg:SI 68 [ v_and ])
>     (and:SI (subreg:SI (lshiftrt:DI (reg:DI 65 [ v_x ])
>                 (const_int 12 [0xc])) 4)
>         (const_int 10 [0xa])))
> Successfully matched this instruction:
> (set (reg:DI 69)
>     (lshiftrt:DI (reg:DI 65 [ v_x ])  <---------
>         (const_int 12 [0xc])))
> Successfully matched this instruction:
> (set (reg:SI 68 [ v_and ])
>     (and:SI (subreg:SI (reg:DI 69) 4)
>         (const_int 10 [0xa])))
> allowing combination of insns 5, 10 and 11
> ...
> --
> 
> With the patch it uses zero_extract instead of lshiftrt (which is
> the whole point of the patch?):
> 
> --
> Trying 5, 10 -> 11:
> Failed to match this instruction:
> ...
> Successfully matched this instruction:
> (set (reg:DI 69)
>     (zero_extract:DI (reg:DI 65 [ v_x ])   <---------
>         (const_int 32 [0x20])
>         (const_int 20 [0x14])))

Hmm, this transformation (from v_x>>12 to zext(v_x,32,20) is only valid 
when the top 20 bits of v_x are known to be zero already, or alternatively 
if it's know that the top 32bits of reg 69 won't matter in the future.  I 
wonder where that knowledge is coming from.

> Successfully matched this instruction:
> (set (reg:SI 68 [ v_and ])
>     (and:SI (subreg:SI (reg:DI 69) 4)
>         (const_int 10 [0xa])))
> allowing combination of insns 5, 10 and 11
> --
> 
> Regardless of whether zeroes are shifted into the word or
> zero_extract explicitly extends the extracted bits, none of the
> bits set to zero are used at all because of the following "(and
> ... (const_int 10))".
> 
> In this specific case zero_extract has more (unnecessary) "side
> effects" than lshiftrt, and s390 needs a more complicated
> instruction to do the extraction (risbg) than it needs for the
> shift (slrg).

The above zero-extract is equivalent to the shift under the above 
conditions so it could also be special cased in the extzv pattern again.  
As said I'm not sure where the knowledge of that condition comes from and 
if it's available to the backend.


Ciao,
Michael.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]