Bug 32273

Summary: 'restrict' is forgotten after loop unrolling
Product: gcc Reporter: Tomash Brechko <tomash.brechko>
Component: tree-optimizationAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED DUPLICATE    
Severity: enhancement CC: acahalan, dberlin, dnovillo, fang, gcc-bugs, gcc, hoogerbrugge, ian, martin, pinskia, rguenth, tim, tomash.brechko
Priority: P3 Keywords: alias, missed-optimization, TREE
Version: 4.2.0   
Target Milestone: ---   
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2007-06-11 00:21:57

Description Tomash Brechko 2007-06-10 14:47:49 UTC
The following two functions are equivalent (especially after loop unrolling):

void
foo(const int *restrict a, int *restrict b, int *restrict c)
{
  b[0] += a[0];
  c[0] += a[0];

  b[1] += a[1];
  c[1] += a[1];
}

void
bar(const int *restrict a, int *restrict b, int *restrict c)
{
  for (int i = 0; i < 2; ++i)
    {
      b[i] += a[i];
      c[i] += a[i];
    }
}


However gcc forgets about 'restrict' after the first iteration of the loop, and foo() and bar() produce different code:

foo:
	pushl	%ebx
	movl	8(%esp), %ebx
	movl	12(%esp), %eax
	movl	16(%esp), %edx
	movl	(%ebx), %ecx
	addl	%ecx, (%eax)
	addl	%ecx, (%edx)     ;; Correct: no reloading of (%ebx) is needed.
	movl	4(%ebx), %ecx
	addl	%ecx, 4(%eax)
	addl	%ecx, 4(%edx)    ;; Correct: no reloading of 4(%ebx) is needed.
	popl	%ebx
	ret

bar:
	pushl	%ebx
	movl	8(%esp), %ebx
	movl	12(%esp), %edx
	movl	16(%esp), %ecx
	movl	(%ebx), %eax
	addl	%eax, (%edx)
	addl	%eax, (%ecx)    ;; Correct: no reloading of (%ebx) is needed.
	movl	4(%ebx), %eax
	addl	%eax, 4(%edx)
	movl	4(%ebx), %eax   ;; BUG: unnecessary reloading of 4(%ebx).
	addl	%eax, 4(%ecx)
	popl	%ebx
	ret

For any number of iterations only the first iteration honors the 'restrict' qualifier.  This is wrong, because 'restrict' is a property of a pointer, not data, so if p and q pointers reference different objects, then (p + OFF1) and (q + OFF2) also expected to reference different objects.  Correct assembler for foo() supports that.
Comment 1 Richard Biener 2007-06-10 20:07:18 UTC
Danny, as looked at restrict handling a few days ago - maybe you know instantly
why it doesn't work ;)  (apart from us not recomputing aliasing after loop
optimizations on the tree level -- and the complete unrolling happens there)
Comment 2 Daniel Berlin 2007-06-10 22:41:19 UTC
Complete guess:

alias.c relies not on TYPE_RESTRICT, but on DECL_BASED_ON_RESTRICT_P
I never noticed we even had such a thing :)

My guess is that loop unrolling makes new ssa names, and when they get transformed during un-ssa, this flag no longer exists on them.

Realistically, may-alias should propagate the DECL_* stuff to SSA_NAME_PTR_INFO, which loop unrolling copies.

When they get un-ssa'd, we should then copy the restrict info from the ssa name back to the base variable we create.
Comment 3 Andrew Pinski 2007-06-10 22:55:45 UTC
This works on the pointer_plus branch :)  Also Predictive commoning fixes it up even without unrolling at the tree level so it works at -O3 (this is on the pointer_plus branch I have not tried on the mainline).
Comment 4 Andrew Pinski 2007-06-11 00:21:56 UTC
Yes this is fixed on the pointer_plus branch, the pointer_plus branch is better at keeping track of what the decl is the restrict pointer's base.

-;; *D.1537 = *D.1539 + *D.1537
+;; *D.1538 = *D.1541 + *D.1538
 (insn 14 13 15 t.c:16 (set (reg:SI 66)
-        (mem:SI (reg:SI 59 [ D.1539 ]) [8 S4 A32])) -1 (nil)
+        (mem:SI (reg:SI 59 [ D.1541 ]) [2 S4 A32])) -1 (nil)
     (nil))

 (insn 15 14 0 t.c:16 (parallel [
-            (set (mem:SI (reg:SI 60 [ D.1537 ]) [7 S4 A32])
-                (plus:SI (mem:SI (reg:SI 60 [ D.1537 ]) [7 S4 A32])
+            (set (mem:SI (reg:SI 60 [ D.1538 ]) [2 S4 A32])
+                (plus:SI (mem:SI (reg:SI 60 [ D.1538 ]) [2 S4 A32])
                     (reg:SI 66)))
             (clobber (reg:CC 17 flags))
         ]) -1 (nil)
-    (expr_list:REG_EQUAL (plus:SI (mem:SI (reg:SI 60 [ D.1537 ]) [7 S4 A32])
-            (mem:SI (reg:SI 59 [ D.1539 ]) [8 S4 A32]))
+    (expr_list:REG_EQUAL (plus:SI (mem:SI (reg:SI 60 [ D.1538 ]) [2 S4 A32])
+            (mem:SI (reg:SI 59 [ D.1541 ]) [2 S4 A32]))
         (nil)))


See how the - has different aliasing sets than the +, the - has the correct aliasing set.

So this is now mine.
Comment 5 Andrew Pinski 2007-06-16 05:49:45 UTC
Fixed at revision 125755 on the trunk by the merge of the pointer_plus.  Though this is now a TREE level missed optimization so unassigning me.
Comment 6 Richard Biener 2008-10-01 14:35:57 UTC
PR14187 again.

*** This bug has been marked as a duplicate of 14187 ***