14192 – Restrict pointers don't help

Bug 14192 - Restrict pointers don't help

Summary: Restrict pointers don't help

Status:	RESOLVED DUPLICATE of bug 14187

Alias:	None

Product:	gcc
Classification:	Unclassified
Component:	middle-end (show other bugs)
Version:	4.0.0

Importance:	P2 normal
Target Milestone:	---
Assignee:	Not yet assigned to anyone

URL:
Keywords:	alias, missed-optimization

Duplicates (2):	14193 43907 (view as bug list)
Depends on:	14187
Blocks:	16913
	Show dependency tree / graph

Reported:	2004-02-18 13:55 UTC by Jan Hoogerbrugge
Modified:	2013-06-26 16:15 UTC (History)
CC List:	10 users (show)

See Also:
Host:	linux
Target:	trimedia
Build:	linux
Known to work:
Known to fail:
Last reconfirmed:	2005-09-28 05:21:37

Attachments
Add an attachment (proposed patch, testcase, etc.)

Note You need to log in before you can comment on or make changes to this bug.

Description Jan Hoogerbrugge 2004-02-18 13:55:31 UTC

I have two problems. It could be that they are related.

First. In the following code there is no dependence between the two stores 
which is what I would expect:

void restricttest2(int *restrict p, int *restrict q)
{
        *p = 0;
        *q = 0;
}

However, if only one of the two pointers is restrict then gcc generates a 
dependence. 

void restricttest1(int *restrict p, int *q)
{
        *p = 0;
        *q = 0;
}

According to my interpretation of the restrict spec this is not necessary. 
Also, two other (non-gcc) compilers that I have as a reference don't generate 
this dependence.

The second case is the following loop:

void copy(int *restrict pb, int *restrict qb)
{
        int i;

        for(i = 0; i < 100; i++)
        {
                qb[i] = pb[i]; i++;
                qb[i] = pb[i]; i++;
                qb[i] = pb[i]; i++;
                qb[i] = pb[i];
        }
}

When compiled with -fno-unroll-loops I see lots of dependences between the 
loads and stores. 

_copy:
        .function
        /***************************/
        ident r6 -> r35
        ident r5 -> r34
        ident r0 -> r33
        iaddi(99) r0 -> r36
L5:     asli(2) r33 -> r51
        iaddi(1) r33 -> r52
        asli(2) r52 -> r46
        iaddi(2) r33 -> r47
        iadd r51 r34 -> r50
        asli(2) r47 -> r41
        iaddi(3) r33 -> r42
        iadd r51 r35 -> r48
        ld32 r50 -> r49         [ID=1125][REGION=2]
        iadd r46 r34 -> r45
        st32 r48 r49            [ID=1031][REGION=2]
        asli(2) r42 -> r37
        iadd r46 r35 -> r43
        ld32 r45 -> r44         [ID=1126][REGION=2]
        iadd r41 r34 -> r40
        st32 r43 r44            [ID=1040][REGION=2]
        iadd r41 r35 -> r38
        ld32 r40 -> r39         [ID=1127][REGION=2]
        iadd r37 r34 -> r9
        st32 r38 r39            [ID=1049][REGION=2]
        iaddi(4) r33 -> r33
        iadd r37 r35 -> r8
        ld32 r9 -> r7           [ID=1128][REGION=2]
        st32 r8 r7              [ID=1058][REGION=2]
        igtr r33 r36 -> r6
        if !r6 ijmpi(L5)                [PROB=96]
        /***************************/
        ijmpf r0 r2     /* return */
        .depend 1128 -> 1058
        .depend 1049 -> 1058
        .depend 1127 -> 1049
        .depend 1040 -> 1049 1058
        .depend 1126 -> 1040
        .depend 1031 -> 1040 1049 1058
        .depend 1125 -> 1031
        .endfunction

(The .depend line indicate dependences between instructions (IDs))

However, in this loop there are no dependences at all:

void zero(int *restrict qb)
{
        int i;

        for(i = 0; i < 100; i++)
        {
                qb[i] = 0; i++;
                qb[i] = 0; i++;
                qb[i] = 0; i++;
                qb[i] = 0;
        }
}

I use "gcc version 3.5.0 20040119 (experimental)"

I observe the dependences in the INSN_DEPEND(insn) list in sched_finish().

Comment 1 Andrew Pinski 2004-02-18 14:10:59 UTC

*** Bug 14193 has been marked as a duplicate of this bug. ***

Comment 2 Andrew Pinski 2004-02-18 14:15:43 UTC

I cannot reproduce the problem you are having on powerpc-apple-darwin or i686-pc-linux-gnu with 
3.5-tree-ssa 20040208 (merged 20040126) or 3.5.0 20040204.

Comment 3 Jan Hoogerbrugge 2004-02-18 15:33:33 UTC

I just did a update to version 3.5.0 20040218 and the problem is still there.

Are you not able to reproduce it because you don't have the sources of the my 
target port? If so, should I sent sources of the port?

Comment 4 Jan Hoogerbrugge 2004-02-19 14:02:46 UTC

I can demonstrate the problem on ia64. Here are three cases:

int zero_restrict_pointers(int *p, int *q)
{
        *p = 0;
        return *q;
}

int one_restrict_pointer(int *restrict p, int *q)
{
        *p = 0;
        return *q;
}

int two_restrict_pointers(int *restrict p, int *restrict q)
{
        *p = 0;
        return *q;
}

This gives the following ia64 code (version 20040217):

        .pred.safe_across_calls p1-p5,p16-p63
        .text
        .align 16
        .global zero_restrict_pointers#
        .proc zero_restrict_pointers#
zero_restrict_pointers:
        .prologue
        .body
        .mmb
        nop 0
        st4 [r32] = r0
        nop 0
        .mbb
        ld4 r8 = [r33]
        nop 0
        br.ret.sptk.many b0
        .endp zero_restrict_pointers#
        .align 16
        .global one_restrict_pointer#
        .proc one_restrict_pointer#
one_restrict_pointer:
        .prologue
        .body
        .mmb
        nop 0
        st4 [r32] = r0
        nop 0
        .mbb
        ld4 r8 = [r33]
        nop 0
        br.ret.sptk.many b0
        .endp one_restrict_pointer#
        .align 16
        .global two_restrict_pointers#
        .proc two_restrict_pointers#
two_restrict_pointers:
        .prologue
        .body
        .mmb
        nop 0
        ld4 r8 = [r33]
        nop 0
        .mbb
        st4 [r32] = r0
        nop 0
        br.ret.sptk.many b0
        .endp two_restrict_pointers#
        .ident  "GCC: (GNU) 3.5.0 20040217 (experimental)"


As you can imagine, the scheduler wants to schedule the long latency
load above the store. As you can see, this only happens in the case 
of two restrict pointers. I would also like to see it happen in the
case of one restrict pointer.

Comment 5 Giovanni Bajo 2004-10-11 10:56:18 UTC

Zack, do you confirm that the semantics of restrict (in C99 and/or GNU C) allow 
us to schedule the load before the store even if there is only one restrict 
pointer? Otherwise, this bug report is invalid.

Comment 6 Steven Bosscher 2005-01-23 18:48:42 UTC

Zack, Joseph, could one of you comment on Jan's interpretation of 
the semantics of restrict? 
 
This restrict keyword keeps coming back to haunt us, bah.

Comment 7 Steven Bosscher 2005-01-23 18:51:33 UTC

We should probably have a metabug for restrict.  We don't take advantage 
of the restrict keyword very well either in some (unfortunately proprierty) 
benchmark suites.

Comment 8 jsm-csl@polyomino.org.uk 2005-01-23 20:59:40 UTC

Subject: Re:  Restrict pointers don't help

On Sun, 23 Jan 2005, steven at gcc dot gnu dot org wrote:

> Zack, Joseph, could one of you comment on Jan's interpretation of 
> the semantics of restrict? 

       6.7.3.1  Formal definition of restrict

       [#1]  Let  D be a declaration of an ordinary identifier that
       provides a means of designating an object P as  a  restrict-
       qualified pointer to type T.

       [#2]  If  D appears inside a block and does not have storage
       class extern, let B denote the block.  If D appears  in  the
       list of parameter declarations of a function definition, let
       B denote the associated block.  Otherwise, let B denote  the
       block  of  main (or the block of whatever function is called
       at program startup in a freestanding environment).

I believe this much is straightforward.

       [#3] In what follows, a pointer expression E is said  to  be
       based  on  object  P  if  (at  some  sequence  point  in the
       execution of B prior to the evaluation of E) modifying P  to
       point  to  a copy of the array object into which it formerly
       pointed   would  change  the  value  of  E.117)   Note  that
       ``based'' is  defined  only  for  expressions  with  pointer
       types.

Although things may not be clear in some cases where the representation of 
pointers is inspected bitwise, in plausibly optimizable cases I think the 
meaning of "based on" is OK.

       [#4]  During  each  execution of B, let L be any lvalue that
       has &L based on P.  If L is used to access the value of  the
       object  X that it designates, and X is also modified (by any
       means), then the following requirements apply: T  shall  not
       be  const-qualified.   Every other lvalue used to access the
       value of X shall also have its address based  on  P.   Every

So if something is accessed through a restricted pointer, and it is 
modified, all accesses are based on that pointer: all pointer dereferences 
not based on that pointer are independent of accesses to that which is 
modified.  (It is still valid to have two restricted pointers to the same 
array of char, one used to modify the odd numbered bytes and one to modify 
the even numbered bytes: but a byte accessed through one pointer and 
modified can't be accessed through any pointer not based on the restricted 
pointer, whether or not that other pointer is restricted.)

       access that modifies X shall be considered also to modify P,
       for the purposes of this subclause.  If P  is  assigned  the

For example, given

int *restrict *restrict p;
int *restrict *restrict q;

with **p modified, *p is deemed modified, so even if q == p you can't then 
modify **q as an alias to **p that goes through the same restricted 
pointer object *p.  But given

int *restrict *p;
int *restrict *q;

(without the second level of "restrict") it is possible that p == q and so 
you can modify both **p and **q, as both those expressions involve the 
same restricted pointer object *p (where &*p == &*q == p == q).

       value  of  a  pointer  expression E that is based on another
       restricted pointer object P2, associated with block B2, then
       either  the execution of B2 shall begin before the execution
       of B, or  the  execution  of  B2  shall  end  prior  to  the
       assignment.   If  these  requirements  are not met, then the
       behavior is undefined.

The aliasing information provided by "restrict" only applies during the 
execution of the block which contains the declaration providing the 
"restrict" information.  It is possible to copy the restricted pointer to 
an outside non-restricted pointer, and once the block has finished 
executing that pointer might freely alias; so that outside pointer, of 
similar type but lacking "restrict", can't be treated as completely 
interchangable with the inner pointer although it may have the same value.  

The restrictions on assignment between restricted pointers reduce the 
possible ways in which one restricted pointer might dynamically become 
based on another, but I'm not sure exactly what optimizations this is 
intended to facilitate.  As optimizations don't respect block boundaries, 
both the fact that "restrict" only provides aliasing information during 
its block's execution and the possibility of restricted pointers becoming 
based on other restricted pointers are things to bear in mind.

       [#5] Here an execution  of  B  means  that  portion  of  the
       execution  of  the  program  that  would  correspond  to the
       lifetime of an object with scalar type and automatic storage
       duration associated with B.

       [#6]  A  translator  is  free  to ignore any or all aliasing
       implications of uses of restrict.

Comment 9 Ian Lance Taylor 2005-09-28 05:21:37 UTC

We are not waiting for anything in this bug, as far as I can tell.

Comment 10 Andrew Pinski 2007-06-11 00:34:04 UTC

> The second case is the following loop:

This is just caused by how we represent pointer addition.  I have a fix for that one, we now get the correct aliasing sets for it.

Comment 11 Richard Biener 2008-10-01 14:28:18 UTC

Only two_restrict_pointers is valid.  This is a dup of PR14187.

*** This bug has been marked as a duplicate of 14187 ***

Comment 12 Andrew Pinski 2010-04-27 18:08:24 UTC

*** Bug 43907 has been marked as a duplicate of this bug. ***

Comment 13 Alexey Salmin 2010-04-28 15:53:08 UTC

Sorry, but I still don't get it :( Why exactly we can't remove the second load of "*b"?

void f(int *a, const int *restrict b) {
        *a++ = *b + 1;
        *a++ = *b + 1;
}

Comment 14 Chris King 2013-06-26 16:15:58 UTC

> Only two_restrict_pointers is valid.

> Sorry, but I still don't get it

I agree.  None of the above responses clearly explain why one_restrict_pointer does not represent a valid bug.  The missed optimization still exists in 4.8.0.