Bug 22485 - pointer +- integer is never NULL
Status: SUSPENDED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.0.0
Importance: P2 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks: 78655
Reported: 2005-07-14 13:10 UTC by Mattias Engdegård
Modified: 2021-08-12 05:17 UTC (History)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2012-01-11 00:00:00


Description Mattias Engdegård 2005-07-14 13:10:31 UTC
The code

void stuff(void);
void f(int *p, int x)
{
        int *q = p + x;
        if (!q)
                stuff();
}

should never call stuff() - the test is unnecessary since pointer +/- integer is
undefined when the pointer does not point to an object or just past the end of
one (6.5.6 paragraph 8). This is important in cases such as:

static inline struct foo *lookup(struct foo *table, int x)
{
    if (match(table, x))
        return table + x;
    else
        return NULL;
}
...
    struct foo *e = lookup(tbl, x);
    if (e) ...

The code that calls the above function ends up checking for NULL twice: once
inside the (inlined) function, and once after the call. Were Q = P +- I
recognised as implying that P != NULL, Q != NULL (as we are allowed to do
according to the Standard), then the extraneous NULL test could be eliminated.
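
A self-contained sketch of the pattern described above. The report does not
show the definition of struct foo or of match(); the versions below, and the
names TABLE_LEN and key_or_minus_one, are assumptions made only so that the
example compiles.

```c
#include <stddef.h>

/* Hypothetical element type; the report does not show its definition. */
struct foo {
    int key;
};

#define TABLE_LEN 4

/* Hypothetical predicate: x is a valid index whose entry is in use. */
static int match(const struct foo *table, int x)
{
    return x >= 0 && x < TABLE_LEN && table[x].key != 0;
}

static inline struct foo *lookup(struct foo *table, int x)
{
    if (match(table, x))
        return table + x;   /* pointer + integer: the result the report
                               argues can be assumed non-NULL */
    else
        return NULL;
}

/* Caller: once lookup() is inlined, "if (e)" repeats a decision that
   match() has already made on the table + x path. */
static int key_or_minus_one(struct foo *tbl, int x)
{
    struct foo *e = lookup(tbl, x);
    if (e)
        return e->key;
    return -1;
}
```

On the path where match() succeeds, e is table + x, so a compiler that knew
table + x to be non-NULL could jump straight to the "found" branch in the
caller instead of testing e again.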
Comment 1 Mattias Engdegård 2005-07-14 13:13:00 UTC
I forgot to state the version (4.0.0), but I have not seen any gcc version
optimising this case.
Comment 2 Andrew Pinski 2005-07-14 13:14:28 UTC
This is invalid, and here is why: if both p and x are NULL/0, then p+x will always be NULL, so this is an
invalid optimization.
Comment 3 Falk Hueffner 2005-07-14 13:19:05 UTC
(In reply to comment #2)
> This is invalid, and here is why: if both p and x are NULL/0, then p+x will
> always be NULL, so this is an invalid optimization.

Huh? nullpointer + 0 is undefined, it doesn't matter what happens in that
case. Reopening.
Comment 4 Daniel Berlin 2005-07-14 13:38:17 UTC
Hmmmm.
I swear we just had this discussion for VRP purposes, and that a bug was
recently fixed so that we don't assume that pointer + - integer is NULL. 

But if it's really undefined, maybe we should optimize it.
Diego?
Comment 5 Falk Hueffner 2005-07-14 13:45:15 UTC
(In reply to comment #4)
> Hmmmm.
> I swear we just had this discussion for VRP purposes, and that a bug was
> recently fixed so that we don't assume that pointer + - integer is NULL. 

Well, I checked C99 and C++, and they explain a few cases, never mentioning
null pointers, and then say everything else is undefined. Maybe this was
different in C89? Seems unlikely, though...
Comment 6 Gabriel Dos Reis 2005-07-14 15:17:03 UTC
Subject: Re:  pointer +- integer is never NULL

"dberlin at gcc dot gnu dot org" <gcc-bugzilla@gcc.gnu.org> writes:

| Hmmmm.
| I swear we just had this discussion for VRP purposes, and that a bug was
| recently fixed so that we don't assume that pointer + - integer is NULL. 

Please notice that there are still implementations of the offset macros
out there, in one form or another, that rely on pointer arithmetic
and NULL.

| But if it's really undefined, maybe we should optimize it.

I would recommend caution.

-- Gaby
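
For readers unfamiliar with the idiom being referred to, here is a sketch of
a traditional offsetof-style macro. The names MY_OFFSETOF and struct sample
are hypothetical and do not come from any particular library. The macro
forms a member address from a null pointer constant, which the C standard
does not define, so it works only because compilers have historically
accepted it.

```c
#include <stddef.h>

/* Hypothetical name for a traditional, pre-<stddef.h> style definition.
   Assumption: the compiler evaluates this as plain address arithmetic. */
#define MY_OFFSETOF(type, member) ((size_t)&((type *)0)->member)

struct sample {
    char tag;
    int value;
    double weight;
};

/* Portable code should use the standard macro from <stddef.h>. */
static size_t value_offset_portable(void)
{
    return offsetof(struct sample, value);
}

/* The traditional form that relies on arithmetic from a null pointer. */
static size_t value_offset_traditional(void)
{
    return MY_OFFSETOF(struct sample, value);
}
```

Note that the macro starts from a null pointer *constant*, which is the
distinction raised later in the thread.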
Comment 7 Gabriel Dos Reis 2005-07-14 15:19:39 UTC
Subject: Re:  pointer +- integer is never NULL

"falk at debian dot org" <gcc-bugzilla@gcc.gnu.org> writes:

| (In reply to comment #4)
| > Hmmmm.
| > I swear we just had this discussion for VRP purposes, and that a bug was
| > recently fixed so that we don't assume that pointer + - integer is NULL. 
| 
| Well, I checked C99 and C++, and they explain a few cases, never mentioning
| null pointers, and then say everything else is undefined. Maybe this was
| different in C89? Seems unlikely, though...

I'm failing to find anything in the C++ standard that suggests that the
following shall be undefined

   (reinterpret_cast<int*>(0) + 5) - 5

-- Gaby
Comment 8 Falk Hueffner 2005-07-14 15:37:00 UTC
(In reply to comment #7)

> I'm failing to find anything in the C++ standard that suggests that the
> following shall be undefined
> 
>    (reinterpret_cast<int*>(0) + 5) - 5

If (reinterpret_cast<int*>(0) + 5) - 5 is not undefined, then neither is
reinterpret_cast<int*>(0) + 5. Then what is its result, by which paragraph
in the standard?
Comment 9 Gabriel Dos Reis 2005-07-14 19:38:19 UTC
Subject: Re:  pointer +- integer is never NULL

"falk at debian dot org" <gcc-bugzilla@gcc.gnu.org> writes:

| ------- Additional Comments From falk at debian dot org  2005-07-14 15:37 -------
| (In reply to comment #7)
| 
| > I'm failing to find anything in the C++ standard that suggests that the
| > following shall be undefined
| > 
| >    (reinterpret_cast<int*>(0) + 5) - 5
| 
| If (reinterpret_cast<int*>(0) + 5) - 5 is not undefined, then neither is
| reinterpret_cast<int*>(0) + 5. Then what is its result, by which paragraph
| in the standard?

The standard says that the mapping used by reinterpret_cast to turn an
integer into a pointer is *implementation-defined*.  It is not undefined.
GCC uses the "obvious" mapping, which is reinterpret_cast<int*>(0) is
the null pointer.

-- Gaby
Comment 10 Falk Hueffner 2005-07-14 21:47:20 UTC
Subject: Re:  pointer +- integer is never NULL

Gabriel Dos Reis <gdr@integrable-solutions.net> writes:

> "falk at debian dot org" <gcc-bugzilla@gcc.gnu.org> writes:
>
> | ------- Additional Comments From falk at debian dot org  2005-07-14 15:37 -------
> | (In reply to comment #7)
> | 
> | > I'm failing to find anything in the C++ standard that suggests that the
> | > following shall be undefined
> | > 
> | >    (reinterpret_cast<int*>(0) + 5) - 5
> | 
> | If (reinterpret_cast<int*>(0) + 5) - 5 is not undefined, then neither is
> | reinterpret_cast<int*>(0) + 5. Then what is its result, by which paragraph
> | in the standard?
>
> The standard says that the mapping used by reinterpret_cast to turn an
> integer into a pointer is *implementation-defined*.  It is not undefined.
> GCC uses the "obvious" mapping, which is reinterpret_cast<int*>(0) is
> the null pointer.

So your example boils down further to the question of whether
((int*)0) + 5 is undefined, but you didn't answer my question
yet. What is the result of ((int*)0) + 5, by which paragraph in the
standard?

Comment 11 Gabriel Dos Reis 2005-07-14 23:03:56 UTC
Subject: Re:  pointer +- integer is never NULL

"falk at debian dot org" <gcc-bugzilla@gcc.gnu.org> writes:

[...]

| > | (In reply to comment #7)
| > | 
| > | > I'm failing to find anything in the C++ standard that suggests that the
| > | > following shall be undefined
| > | > 
| > | >    (reinterpret_cast<int*>(0) + 5) - 5
| > | 
| > | If (reinterpret_cast<int*>(0) + 5) - 5 is not undefined, then neither is
| > | reinterpret_cast<int*>(0) + 5. Then what is its result, by which paragraph
| > | in the standard?
| >
| > The standard says that the mapping used by reinterpret_cast to turn an
| > integer into a pointer is *implementation-defined*.  It is not undefined.
| > GCC uses the "obvious" mapping, which is reinterpret_cast<int*>(0) is
| > the null pointer.
| 
| So your example boils down further to the question of whether
| ((int*)0) + 5 is undefined, but you didn't answer my question
| yet.

No.  I think I made an indirect observation.  

There is no requirement from the standard that (int *)0 is the same as
reinterpret_cast<int*>(0) -- yes, I do not know of many practical
compilers that do things differently, but it needs pointing out.
Therefore, you cannot directly reduce reinterpret_cast<int*>(0) + 5 to
(int *)0 + 5. 

However, for practical purposes, GCC uses the obvious mapping
following the standard intent 5.2.10/4 for reinterpret_cast:

  [...] [Note: it is intended to be unsurprising to those who know the 
  addressing structure of the underlying machine. ]

My indirect observation was that reinterpret_cast is intended for specific
needs that cannot adequately be expressed at the purely object type level.
The result is intended to be unsurprising to those who know the
addressing structure.  Consequently it takes a creative compiler to
make reinterpret_cast<int*>(0) + 5 undefined.  Furthermore, given the
mapping chosen by GCC, it takes an even more creative compiler to make
(int *)0 + 5 also undefined. 

There is still reasonable system-programming code out there
that needs to play with null pointers -- we, GCC,
even used to distribute such things in the past.

-- Gaby
Comment 12 Falk Hueffner 2005-07-15 06:41:42 UTC
Subject: Re:  pointer +- integer is never NULL

"gdr at integrable-solutions dot net" <gcc-bugzilla@gcc.gnu.org> writes:

> My indirect observation was that reinterpret_cast is intended for
> specific needs that cannot adequately be expressed at the purely
> object type level.  The result is intended to be unsurprising to
> those who know the addressing structure.  Consequently it takes a
> creative compiler to make reinterpret_cast<int*>(0) + 5 undefined.

Sorry, I cannot follow you. I'd find it massively unsurprising if
reinterpret_cast<int*>(0) produces a null pointer, and if I then get
undefined behavior for doing something with it that is undefined for a
null pointer. In fact I'd find it very *surprising* if
reinterpret_cast<int*>(0) behaves differently from a normally
constructed null pointer anywhere.

> Furthermore, given the mapping chosen by GCC, it takes an even more
> creative compiler to make (int *)0 + 5 also undefined.

And I don't see how that follows, either.

As it seems, arguing with different levels of surprisingness seems to
be somewhat subjective, so I don't think this leads us anywhere.

> There is still reasonable system-programming code out there
> that needs to play with null pointers -- we, GCC,
> even used to distribute such things in the past.

This is a more relevant point. I don't think this optimization would
break offsetof-like macros, since they'd use null pointer *constants*,
which we could easily avoid tagging as non-null.

Comment 13 Gabriel Dos Reis 2005-07-15 08:10:04 UTC
Subject: Re:  pointer +- integer is never NULL

"falk at debian dot org" <gcc-bugzilla@gcc.gnu.org> writes:

| ------- Additional Comments From falk at debian dot org  2005-07-15 06:41 -------
| Subject: Re:  pointer +- integer is never NULL
| 
| "gdr at integrable-solutions dot net" <gcc-bugzilla@gcc.gnu.org> writes:
| 
| > My indirect observation was that reinterpret_cast is intended for
| > specific needs that cannot adequately be expressed at the purely
| > object type level.  The result is intended to be unsurprising to
| > those who know the addressing structure.  Consequently it takes a
| > creative compiler to make reinterpret_cast<int*>(0) + 5 undefined.
| 
| Sorry, I cannot follow you. I'd find it massively unsurprising if
| reinterpret_cast<int*>(0) produces a null pointer, and if I then get
| undefined behavior for doing something with it that is undefined for a
| null pointer.

But, if I used reinterpret_cast to turn an integer value 0 into a
pointer, there is no reason why the compiler would assume that I do not
know the underlying machine and what I'm doing with the pointer.

| In fact I'd find it very *surprising* if
| reinterpret_cast<int*>(0) behaves differently from a normally
| constructed null pointer anywhere.

At least, you get that part of my indirect observation! :-)

| > Furthermore, given the mapping chosen by GCC, it takes an even more
| > creative compiler to make (int *)0 + 5 also undefined.
| 
| And I don't see how that follows, either.

It follows from your surprise that reinterpret_cast<int*> does
something different from the null pointer constant (int*)0.

| As it seems, arguing with different levels of surprisingness seems to
| be somewhat subjective, so I don't think this leads us anywhere.

I'm not actually arguing on different level of surprisingness.  I'm
just looking at reinterpret_cast and its implication. 

| > There is still reasonable system-programming code out there
| > that needs to play with null pointers -- we, GCC,
| > even used to distribute such things in the past.
| 
| This is a more relevant point. I don't think this optimization would
| break offsetof-like macros, since they'd use null pointer *constants*,
                                                            ^^^^^^^^^^^

For the offsetof *macro*, yes.
But that is not the case for code that uses
reinterpret_cast<int*>(expr), where expr is an integer expression with
value 0.  Scanning a region of memory starting from zero is not
exactly the kind of thing never done in practice.

| which we could easily avoid tagging as non-null.

so you would have to pretend that a null pointer constant is not null?
That is even more bizarre arithmetic.

-- Gaby
Comment 14 Mattias Engdegård 2005-07-15 09:12:53 UTC
It could be made an option so the user can tell GCC whether to make
standard-conforming code go as fast as possible or if arithmetic on null
pointers (as a gcc extension, say) is needed. -fnull-pointer-arith?

(In reply to comment #13)
> Scanning a region of memory starting from zero, is not
> exactly the kind of thing never done in practice.

True, but that sort of code is already in danger, since GCC assumes that in

      x = *p;
      if (!p) shout();

the condition is never true (even if it is possible to read from location 0).
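
A compilable sketch of the dereference-then-test pattern mentioned above.
GCC documents this behaviour under the -fdelete-null-pointer-checks option
(it can be turned off with -fno-delete-null-pointer-checks). The names
read_then_check and shout_count are illustrative only.

```c
/* Counts how often shout() ran, so the effect can be observed. */
static int shout_count;

static void shout(void)
{
    shout_count++;
}

/* Dereference first, test afterwards.  Because *p is only defined for a
   non-null p, an optimizer may treat the later "!p" as always false and
   drop the call to shout(). */
static int read_then_check(int *p)
{
    int x = *p;
    if (!p)
        shout();
    return x;
}
```

For any pointer to a real object the test is false with or without the
optimisation; the two differ only when p is null, which is exactly the case
the standard leaves undefined.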
Comment 15 Gabriel Dos Reis 2005-07-15 10:26:48 UTC
Subject: Re:  pointer +- integer is never NULL

"mattias at virtutech dot se" <gcc-bugzilla@gcc.gnu.org> writes:

| ------- Additional Comments From mattias at virtutech dot se  2005-07-15 09:12 -------
| It could be made an option so the user can tell GCC whether to make
| standard-conforming code go as fast as possible or if arithmetic on null
| pointers (as a gcc extension, say) is needed. -fnull-pointer-arith?
| 
| (In reply to comment #13)
| > Scanning a region of memory starting from zero, is not
| > exactly the kind of thing never done in practice.
| 
| True, but that sort of code is already in danger, since GCC assumes that in
| 
|       x = *p;
|       if (!p) shout();
| 
| the condition is never true (even if it is possible to read from location 0).

True, but that is not the kind of code I'm talking about.

-- Gaby
Comment 16 Pawel Sikora 2005-07-15 10:35:22 UTC
minor comment:
e.g. on ARM7 I can read data from address 0x00000000 (-> exception vector table).

int* p = (int*)0x00000004;
int d = -1;
int *m = p + d;     // address 0x00000000; it's valid for ARM arch.
Comment 17 Falk Hueffner 2005-07-15 14:22:21 UTC
(In reply to comment #13)
> Subject: Re:  pointer +- integer is never NULL
> 
> "falk at debian dot org" <gcc-bugzilla@gcc.gnu.org> writes:
> | Sorry, I cannot follow you. I'd find it massively unsurprising if
> | reinterpret_cast<int*>(0) produces a null pointer, and if I then get
> | undefined behavior for doing something with it that is undefined for a
> | null pointer.
> 
> But, if I used reinterpret_cast to turn an integer value 0 into a
> pointer, there is no reason why the compiler would assume that I do not
> know the underlying machine and what I'm doing with the pointer.

The note merely requires the result of the mapping to be unsurprising;
it does not say anything about further operations of this result. Therefore,
it is completely irrelevant here.

> | As it seems, arguing with different levels of surprisingness seems to
> | be somewhat subjective, so I don't think this leads us anywhere.
> 
> I'm not actually arguing on different level of surprisingness.  I'm
> just looking at reinterpret_cast and its implication. 

I don't see you bringing any argument here except one based on a side note
about surprisingness, which IMHO doesn't even apply here. So I am still
convinced that nullpointer+0 is clearly undefined.

> | This is a more relevant point. I don't think this optimization would
> | break offsetof-like macros, since they'd use null pointer *constants*,
>                                                             ^^^^^^^^^^^
> 
> For the offsetof *macro*, yes.
> But that is not the case for code that uses
> reinterpret_cast<int*>(expr), where expr is an integer expression with
> value 0.  Scanning a region of memory starting from zero is not
> exactly the kind of thing never done in practice.

Can you give a complete example where this optimization would fail, that you
would consider reasonable and realistic?

> | which we could easily avoid tagging as non-null.
> 
> so you would have to pretend that a null pointer constant is not null?
> That is even more bizarre arithmetic.

I have no trouble doing bizarre arithmetic when the user gives invalid input.
Comment 18 Gabriel Dos Reis 2005-07-15 14:43:12 UTC
Subject: Re:  pointer +- integer is never NULL

"falk at debian dot org" <gcc-bugzilla@gcc.gnu.org> writes:

| (In reply to comment #13)
| > Subject: Re:  pointer +- integer is never NULL
| > 
| > "falk at debian dot org" <gcc-bugzilla@gcc.gnu.org> writes:
| > | Sorry, I cannot follow you. I'd find it massively unsurprising if
| > | reinterpret_cast<int*>(0) produces a null pointer, and if I then get
| > | undefined behavior for doing something with it that is undefined for a
| > | null pointer.
| > 
| > But, if I used reinterpret_cast to turn an integer value 0 into a
| > pointer, there is no reason why the compiler would assume that I do not
| > know the underlying machine and what I'm doing with the pointer.
| 
| The note merely requires the result of the mapping to be unsurprising;
| it does not say anything about further operations of this result. Therefore,
| it is completely irrelevant here.

The "side" notes were written by people who know what they
intend. Therefore their inputs are completely relevant here.

-- Gaby
Comment 19 Falk Hueffner 2005-07-15 15:05:08 UTC
(In reply to comment #18)

> The "side" notes were written by people who know what they
> intend. Therefore their inputs are completely relevant here.

This is going nowhere. I give up.
Comment 20 Mattias Engdegård 2005-07-15 15:29:53 UTC
(In reply to comment #18)
> The "side" notes were written by people who know what they
> intend. Therefore their inputs are completely relevant here.

Even if you could show that these optimisations would contradict the letter
and/or spirit of the C++ standard, this does not mean the same thing for C,
where this clearly (my judgement) is a missed optimisation opportunity for
standard-conforming code.

(It is really two related optimisations: that Q=P+I implies P!=NULL and that it
implies Q!=NULL.)
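
The two implications can be illustrated separately. This is only a sketch of
what the requested optimisation would permit, not a description of what any
GCC version does; all function names are hypothetical.

```c
/* Counts how often stuff() ran, so the effect can be observed. */
static int stuff_calls;

static void stuff(void)
{
    stuff_calls++;
}

/* Implication 1: forming p + x would let the compiler assume p != NULL,
   so the later test on p could be dropped. */
static int *advance_then_test_p(int *p, int x)
{
    int *q = p + x;
    if (!p)
        stuff();
    return q;
}

/* Implication 2: forming p + x would let the compiler assume q != NULL,
   so the test on the result could be dropped (the case from the report). */
static int *advance_then_test_q(int *p, int x)
{
    int *q = p + x;
    if (!q)
        stuff();
    return q;
}

/* What both functions would reduce to if the tests were removed. */
static int *advance_unchecked(int *p, int x)
{
    return p + x;
}
```

For pointers into an array, or one past its end, all three functions return
the same value and stuff() is never called, so the difference is observable
only for inputs on which C99 leaves p + x undefined.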
Comment 21 Gabriel Dos Reis 2005-09-24 05:37:20 UTC
(In reply to comment #0)
> The code
> 
> void stuff(void);
> void f(int *p, int x)
> {
>         int *q = p + x;
>         if (!q)
>                 stuff();
> }
> 
> should never call stuff() - the test is unnecessary since pointer +/- integer is
> undefined when the pointer does not point to an object or just past the end of
> one (6.5.6 paragraph 8). This is important in cases such as:
> 
> static inline struct foo *lookup(struct foo *table, int x)
> {
>     if (match(table, x))
>         return table + x;
>     else
>         return NULL;
> }
> ...
>     struct foo *e = lookup(tbl, x);
>     if (e) ...
> 
> The code that calls the above function ends up checking for NULL twice: once
> inside the (inlined) function, and once after the call. Were Q = P +- I
> recognised as implying that P != NULL, Q != NULL (as we are allowed to do
> according to the Standard), then the extraneous NULL test could be eliminated.


There have been lots of messages exchanged on this topic.  It was just
pointed out to me that the C++ standard -- unlike the C99 standard -- has the
following wording in 5.7/7:

   If the value 0 is added to or subtracted from a pointer value, the result
   compares equal to the original pointer value. If two pointers point to the
   same object or both point one past the end of the same array or both are
   null, and the two pointers are subtracted, the result compares equal to the
   value 0 converted to the type ptrdiff_t.
  
-- Gaby

  

   
Comment 22 Falk Hueffner 2005-09-24 09:39:28 UTC
(In reply to comment #21)

> There have been lots of messages exchanged on this topic.  It was just
> pointed out to me that the C++ standard -- unlike the C99 standard -- has the
> following wording in 5.7/7:

Hmm, I missed that. So can we agree now the following implications are valid:

int *p, *q, x;

C99:
q = p +- x formed => p nonnull, q nonnull
x = p - q  formed => p nonnull, q nonnull

C++:
q = p +- x formed, p nonnull => q nonnull
q = p +- x formed, x nonzero => p nonnull, q nonnull
x = p - q  formed, p nonnull => q nonnull
x = p - q  formed, q nonnull => p nonnull
Comment 23 Richard Biener 2012-01-11 14:10:51 UTC
Suspending.  As a matter of QOI, treating NULL + 0 as non-NULL does not sound like
a good idea.
Comment 24 Andrew Pinski 2021-08-12 05:17:16 UTC
https://twitter.com/__phantomderp/status/1419067136983638017 is definitely a Twitter thread that should be read.