This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix wrong code on aarch64 due to paradoxical subreg


On 07/29/16 15:03, Bernd Edlinger wrote:
> On 07/29/16 09:02, Segher Boessenkool wrote:
>> On Thu, Jul 28, 2016 at 08:59:44PM +0000, Bernd Edlinger wrote:
>>> (insn 1047 1044 1048 101 (set (reg/f:DI 481)
>>>           (subreg:DI (reg/f:SI 545) 0)) isl_input.c:2496 50
>>> {*movdi_aarch64}
>>>        (nil))
>>
>> In your first mail you showed reg 481 as _not_ being REG_POINTER:
>>
>> (insn 1047 1046 1048 (set (reg:DI 481)
>>          (subreg:DI (reg/f:SI 479) 0)) y.c:12702 -1
>>       (nil))
>>
>> (note the lack of /f).  So which is it?  REG_POINTER here is not correct
>> as far as I can see.
>>
>
> Oh yes, that's an interesting point, in expand I still see this:
>
>
> (insn 1047 1046 1048 (set (reg:DI 481)
>          (subreg:DI (reg/f:SI 479) 0)) isl_input.c:2496 -1
>       (nil))
>
> But in the last dump before combine I have this:
>
> (insn 1047 1044 1048 101 (set (reg/f:DI 481)
>          (subreg:DI (reg/f:SI 545) 0)) isl_input.c:2496 50 {*movdi_aarch64}
>       (nil))
>
>
> However I was not really surpised by that, because the reg 545 does
> in deed hold a pointer value: &isl_obj_map_vtable
>
> (insn 22 17 23 51 (set (reg/f:SI 544)
>          (high:SI (symbol_ref:SI ("isl_obj_map_vtable") [flags 0xc0]
> <var_decl 0x7f83d3273f30 isl_obj_map_vtable>))) isl_input.c:2415 49
> {*movsi_aarch64}
>       (nil))
> (insn 23 22 24 51 (set (reg/f:SI 545)
>          (lo_sum:SI (reg/f:SI 544)
>              (symbol_ref:SI ("isl_obj_map_vtable") [flags 0xc0]
> <var_decl 0x7f83d3273f30 isl_obj_map_vtable>))) isl_input.c:2415 917
> {add_losym_si}
>       (expr_list:REG_DEAD (reg/f:SI 544)
>          (expr_list:REG_EQUAL (symbol_ref:SI ("isl_obj_map_vtable")
> [flags 0xc0] <var_decl 0x7f83d3273f30 isl_obj_map_vtable>)
>              (nil))))
>
> The "reg/f:DI 481" first appeared in cse1.
>
>
> I'll try to see what's happening there next....


Ok, I the incorrect REG_POINTER is done here:

cse_main -> reg_scan -> reg_scan_mark_refs -> set_reg_attrs_from_value

and here I see a bug, because if POINTERS_EXTEND_UNSIGNED
can_be_reg_pointer is only set to false if x is SIGN_EXTEND but not
if x is a SUBREG as in this case.

So I think that should be fixed this way:

Index: emit-rtl.c
===================================================================
--- emit-rtl.c	(revision 238891)
+++ emit-rtl.c	(working copy)
@@ -1155,7 +1155,7 @@ set_reg_attrs_from_value (rtx reg, rtx x)
  	 || (GET_CODE (x) == SUBREG && subreg_lowpart_p (x)))
      {
  #if defined(POINTERS_EXTEND_UNSIGNED)
-      if (((GET_CODE (x) == SIGN_EXTEND && POINTERS_EXTEND_UNSIGNED)
+      if (((GET_CODE (x) != ZERO_EXTEND && POINTERS_EXTEND_UNSIGNED)
  	   || (GET_CODE (x) != SIGN_EXTEND && ! POINTERS_EXTEND_UNSIGNED))
  	  && !targetm.have_ptr_extend ())
  	can_be_reg_pointer = false;


What do you think does this look like the right fix?

With this patch the code the reg/f:DI 481 does no longer appear,
and also the invalid combine does no longer happen.

However the test case from pr70903 does not get fixed by this.

But when I look at the dumps, I see again the first incorrect
transformation in cse2 (again cse!):

(insn 20 19 21 2 (set (reg:DI 94)
         (subreg:DI (reg:QI 93) 0)) pr.c:8 50 {*movdi_aarch64}
      (expr_list:REG_EQUAL (const_int 255 [0xff])
         (expr_list:REG_DEAD (reg:QI 93)
             (nil))))

but that is simply wrong, because later optimization passes
expect reg 94 to be 0x000000ff but the upper bits are unspecified,
so that REG_EQUAL should better not exist.

When I looked at cse.c I saw a comment in #if 0, which exactly
describes the problem that we have with paradoxical subreg here:

Index: cse.c
===================================================================
--- cse.c	(revision 238891)
+++ cse.c	(working copy)
@@ -4716,10 +4716,6 @@ cse_insn (rtx_insn *insn)
  	    }
  	}

-#if 0
-      /* It is no longer clear why we used to do this, but it doesn't
-	 appear to still be needed.  So let's try without it since this
-	 code hurts cse'ing widened ops.  */
        /* If source is a paradoxical subreg (such as QI treated as an SI),
  	 treat it as volatile.  It may do the work of an SI in one context
  	 where the extra bits are not being used, but cannot replace an SI
@@ -4726,7 +4722,6 @@ cse_insn (rtx_insn *insn)
  	 in general.  */
        if (paradoxical_subreg_p (src))
  	sets[i].src_volatile = 1;
-#endif

        /* Locate all possible equivalent forms for SRC.  Try to replace
           SRC in the insn with each cheaper equivalent.


removing the #if 0 seems to fix the sample from pr70903, but generates
3 more instructions than with my initial patch:

@@ -6,12 +6,15 @@
  foo:
  	adrp	x0, .LC0
  	add	x0, x0, :lo12:.LC0
+	mov	x5, -1
  	mov	x1, 0
  	ldp	x4, x3, [x0, 8]
  	ldr	x2, [x0, 24]
-	mov	x0, 255
-	bfi	x1, x0, 0, 32
+	mov	x0, 0
+	bfi	x0, x5, 0, 8
  	stp	x3, x2, [x8, 16]
+	bfi	x1, x0, 0, 32
+	mov	x0, 255
  	bfi	x1, x0, 32, 32
  	stp	x1, x4, [x8]
  	ret

- = code without paradoxical subreg
+ = code with paradoxical subreg and above patch.


The generated code looks correct, but I can not run the test cases,
so I am still waiting for confirmation that this actually works io
the target.

When I looked at svn blame cse.c, I saw that this #if 0 was introduced
here:

svn log -r1614 -v
------------------------------------------------------------------------
r1614 | kenner | 1992-07-17 11:57:24 +0200 (Fri, 17 Jul 1992) | 2 lines
Changed paths:
    M /trunk/gcc/cse.c
    M /trunk/gcc/libgcc2.c
    M /trunk/gcc/reload.c
    M /trunk/gcc/reload1.c

*** empty log message ***

------------------------------------------------------------------------


Again, a check-in without comment, that fixes at least 3 different
bugs, without test case, and of course the gcc-patches archives do only
go back to 1998...

I am pretty sure, that this has to be reverted, and that it can
generate less efficient code.

What do you think?


Bernd.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]