This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix wrong code on aarch64 due to paradoxical subreg


Hi,

On 07/27/16 23:31, Jeff Law wrote:
> So you're stumbling into another really interesting area.
>

Absolutely, I am just too curious what's going on here ;-)


> I can hazard a guess that the reason we create the paradoxical SUBREG
> rather than a zero or sign extension is because various optimizers know
> that the upper bits in a paradoxical are don't care bits and may make
> transformations with that knowledge.
>
> Once you turn that into a zero/sign extend, then the rest of the
> optimizers must preserve the zero/sign extension property -- they have
> no way of knowing the bits were don't cares.
>
> So it's not necessarily clear that your change results in generally
> better code or not.
>
> More importantly, it really feels like you're papering over a real bug
> elsewhere.  AFAICT the use of a paradoxical subreg here is safe.  So the
> real bug ought to be downstream of this point.  Several folks have
> pointed at reload, which would certainly be a point ripe for a
> paradoxical subreg problem.


I have looked again at the test case of PR 71779 and discovered
something that I wanted to discuss here.

I have tried to figure out what made combine change this:


(insn 1048 1047 1049 101 (set (zero_extract:DI (reg/v:DI 191 [ 
obj1D.17943 ])
             (const_int 32 [0x20])
             (const_int 0 [0]))
         (reg/f:DI 481)) isl_input.c:2496 691 {*insv_regdi}
      (nil))
(insn 1049 1048 1050 101 (set (zero_extract:DI (reg/v:DI 191 [ 
obj1D.17943 ])
             (const_int 32 [0x20])
             (const_int 32 [0x20]))
         (reg/v/f:DI 145 [ SR.327D.17957 ])) isl_input.c:2496 691 
{*insv_regdi}
      (expr_list:REG_DEAD (reg/v/f:DI 145 [ SR.327D.17957 ])
         (nil)))
(insn 1050 1049 1051 101 (set (reg:DI 1 x1)
         (reg/v:DI 191 [ obj1D.17943 ])) isl_input.c:2496 50 
{*movdi_aarch64}
      (expr_list:REG_DEAD (reg/v:DI 191 [ obj1D.17943 ])
         (nil)))


into that (which is wrong because reg 481 has undefined bits
in insn 1050):


(insn 1048 1047 1049 101 (set (zero_extract:DI (reg/v:DI 191 [ 
obj1D.17943 ])
             (const_int 32 [0x20])
             (const_int 0 [0]))
         (reg/f:DI 481)) isl_input.c:2496 691 {*insv_regdi}
      (nil))
(insn 1049 1048 1050 101 (set (zero_extract:DI (reg/v:DI 191 [ 
obj1D.17943 ])
             (const_int 32 [0x20])
             (const_int 32 [0x20]))
         (reg/v/f:DI 145 [ SR.327D.17957 ])) isl_input.c:2496 691 
{*insv_regdi}
      (expr_list:REG_DEAD (reg/v/f:DI 145 [ SR.327D.17957 ])
         (nil)))
(insn 1050 1049 1051 101 (set (reg:DI 1 x1)
         (ior:DI (ashift:DI (reg/v/f:DI 145 [ SR.327D.17957 ])
                 (const_int 32 [0x20]))
             (reg/f:DI 481))) isl_input.c:2496 50 {*movdi_aarch64}
      (nil))


The interesting fact is that combine did that while it
was only considering 1048, 1049 -> 1050,
and not the instruction with the paradoxical subreg:

(insn 1047 1044 1048 101 (set (reg/f:DI 481)
         (subreg:DI (reg/f:SI 545) 0)) isl_input.c:2496 50 {*movdi_aarch64}
      (nil))

I found by debugging that expand_field_assignment
was trying to build:

(and:DI (reg/f:DI 481)
         (const_int -4294967296 [0xffffffff00000000]))

with this statement:

       masked = simplify_gen_binary (ASHIFT, compute_mode,
                                     simplify_gen_binary (
                                       AND, compute_mode,
                                       gen_lowpart (compute_mode, 
SET_SRC (x)),
                                       mask),
                                     pos);

but the result is just (reg/f:DI 481) !

I debugged and found the first wrong result in
rtlanal.c (nonzero_bits1) where the following if-statement
tells us that the upper 32 bits of reg 481 must be zero:

   switch (code)
     {
     case REG:
#if defined(POINTERS_EXTEND_UNSIGNED)
       /* If pointers extend unsigned and this is a pointer in Pmode, 
say that
          all the bits above ptr_mode are known to be zero.  */
       /* As we do not know which address space the pointer is referring to,
          we can do this only if the target does not support different 
pointer
          or address modes depending on the address space.  */
       if (target_default_pointer_address_modes_p ()
           && POINTERS_EXTEND_UNSIGNED && GET_MODE (x) == Pmode
           && REG_POINTER (x)
           && !targetm.have_ptr_extend ())
         nonzero &= GET_MODE_MASK (ptr_mode);
#endif


Pmode = DImode, ptr_mode=SImode, and REG_POINTER(x) is true.

Yes, the value is actually a pointer!

Which means, that it is not safe to extend a pointer from SI to DI
with anything but a zero-extend.

And if I remove this statement the wrong code is still not fixed.
So there must be more places where the similar assumptions are
made.

However, PR 70903 is not about pointers, and uses a mode where
Pmode=ptr_mode=DImode, so there must be a different as hard to track
down reason why this did not work out right.


So the question is, if the paradoxical subreg is really such a good
idea, when it triggers so many different bugs all over the place?


Bernd.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]