Bug 39753 - [4.6/4.7/4.8 Regression] Objective-C(++) and C90 strict-aliasing interaction bug
Summary: [4.6/4.7/4.8 Regression] Objective-C(++) and C90 strict-aliasing interaction bug
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: objc
Version: unknown
Importance: P5 normal
Target Milestone: 4.6.4
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-04-13 17:45 UTC by John Engelhart
Modified: 2013-03-05 23:16 UTC
CC List: 8 users

See Also:
Host:
Target:
Build:
Known to work: 2.95.3
Known to fail: 3.0, 3.1, 3.2.3, 3.3.3, 3.4.0, 4.1.0, 4.3.0, 4.4.0
Last reconfirmed: 2009-04-13 17:51:29


Description John Engelhart 2009-04-13 17:45:51 UTC
Please keep in mind that I'm not a GCC internals expert, and this really requires some analysis from an ObjC maintainer (and expert) along with someone who is familiar with the details of how -fstrict-aliasing works.

See also: http://gcc.gnu.org/ml/gcc/2009-04/msg00290.html

The short version is this:  Currently, it would appear that the compiler (and I'm assuming all versions of the compiler that perform -fstrict-aliasing) applies C99 strict-aliasing rules to pointers to Objective-C objects.  Although there is no formal language specification for Objective-C, it would seem that the most appropriate way to treat pointers to Objective-C objects is the same as 'char *' in terms of C99 aliasing rules.  That is to say, a pointer to an Objective-C object can alias anything.
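For readers less familiar with the rule being appealed to: C99 6.5p7 allows an lvalue of character type to access the stored value of any object, which is the sense in which 'char *' "can alias anything".  A minimal sketch in plain C follows.  The function name count_nonzero_bytes is made up for illustration; it is not part of GCC or of any Objective-C runtime.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Walk the bytes of an arbitrary object through an unsigned char pointer.
   Character-type accesses are exempt from the type-based aliasing rules,
   so this is well defined whatever the declared type of *obj is. */
size_t count_nonzero_bytes(const void *obj, size_t size)
{
    assert(obj != NULL);

    const unsigned char *bytes = obj;
    size_t nonzero = 0;

    for (size_t i = 0; i < size; i++) {
        if (bytes[i] != 0)
            nonzero++;
    }
    return nonzero;
}
```

The request in this report amounts to giving pointers to Objective-C objects the same exemption that the unsigned char pointer enjoys here.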

As an experiment, I added the following lines to c-common.c / c_common_get_alias_set() to determine how often the compiler is applying incorrect aliasing rules, while informing the alias analyzer that the pointer can alias anything.

if ((c_language == clk_objc || c_language == clk_objcxx)
    && ((TYPE_LANG_SPECIFIC (t) && TYPE_LANG_SPECIFIC (t)->objc_info)
        || objc_is_object_ptr (t)))
  {
    warning (OPT_Wstrict_aliasing,
             "Caught and returning 'can alias anything' for objc type");
    return 0;
  }

right before the following line:

if (c_language != clk_c || flag_isoc99)

This returned a number (actually, a lot) of warnings when compiling Objective-C code at -O2 and -Wstrict-aliasing.

In the mailing-list post referenced above, someone mentions that they think that GNUstep uses '-fno-strict-aliasing' when compiling code.  This seems like a good test to see how effective a patch like this is.

Recommendation:

Apply the above 'patch' (or functional equivalent) to the compiler, minus the warning() line, to all versions of the compiler that apply C99 strict-aliasing rules.
Comment 1 Andrew Pinski 2009-04-13 17:51:29 UTC
Confirmed, the aliasing rules were also in ISO C90 (aka ANSI C89).  This has been broken since 3.0 when strict aliasing was enabled by default.
Comment 2 Steven Bosscher 2009-04-13 21:59:05 UTC
Is this really "broken" when the Apple compiler has the same behavior (assuming we all accept that the Apple Objective-C semantics are the de facto standard)?
Comment 3 Andrew Pinski 2009-04-13 22:00:40 UTC
(In reply to comment #2)
> Is this really "broken" when the Apple compiler has the same behavior (assuming
> we all accept that the Apple Objective-C semantics are the de facto standard)?

Apple's compiler does not have the same behavior because they disable strict aliasing by default for all languages.
Comment 4 John Engelhart 2009-04-14 01:15:45 UTC
Another point to consider is whether or not C99's 'restrict' is a legitimate type qualifier for pointers to Objective-C objects.  This is really more of an observation that pragmatically, very little can be said about which object a particular objc object pointer points to at any point in time.  This is beyond the normal 'points to' ambiguity one deals with in C code due to the very dynamic nature of Objective-C. IMHO, a programmer would have an extremely difficult time keeping the promise that 'restrict' implies, even if one wanted to.

I would guess that the 'safest' way of dealing with restrict qualified pointers to objc objects would be to silently drop the restrict qualifier internally, just make it a no-op.

The same could also be said of 'const'.
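
As a reminder of what the 'restrict' promise looks like in plain C, here is a minimal sketch.  The function add_into is hypothetical and only illustrates the contract; it says nothing about how GCC handles restrict-qualified Objective-C object pointers.

```c
#include <assert.h>
#include <stddef.h>

/* By qualifying both parameters with restrict, the caller promises that
   the ints written through dst are not also reached through src during
   the call.  The compiler may rely on that promise when scheduling the
   loads and stores; breaking it is undefined behavior. */
void add_into(int *restrict dst, const int *restrict src, size_t n)
{
    assert(dst != NULL && src != NULL);

    for (size_t i = 0; i < n; i++)
        dst[i] += src[i];
}
```

With two distinct arrays the promise is easy to keep.  The observation above is that with dynamically typed object pointers it is much harder to know that two pointers never refer to the same object.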
Comment 5 Richard Biener 2009-08-04 12:30:04 UTC
GCC 4.3.4 is being released, adjusting target milestone.
Comment 6 Richard Biener 2010-05-22 18:13:32 UTC
GCC 4.3.5 is being released, adjusting target milestone.
Comment 7 Iain Sandoe 2011-03-15 11:29:29 UTC
Can the language lawyers take a look at this so that we can decide on a way forward?
Comment 8 Richard Biener 2011-03-15 11:49:12 UTC
The easiest way is to attach the may_alias attribute to all object types that
should not be subject to TBAA.  Bonus point if you transition that attribute
to use a flag instead.
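
For reference, a minimal sketch of how the may_alias type attribute is used in ordinary C, modeled on the example in the GCC manual.  The typedef name short_alias and the function clear_one_half are illustrative only, and the code assumes a 4-byte int and a 2-byte short.

```c
#include <assert.h>

/* GCC extension: accesses through a pointer to short_alias are not
   subject to type-based alias analysis and are treated like char
   accesses, i.e. they may alias an object of any type. */
typedef short __attribute__((__may_alias__)) short_alias;

/* Clear one 16-bit half of an int through a short_alias pointer and
   return the modified value.  Which half is cleared depends on the
   byte order of the target. */
int clear_one_half(int value)
{
    assert(sizeof(int) == 4 && sizeof(short) == 2);  /* layout assumption */

    short_alias *halves = (short_alias *) &value;
    halves[1] = 0;
    return value;
}
```

Without the attribute, writing to an int through a plain short pointer would violate the aliasing rules, and the compiler would be entitled to assume the int is unchanged.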
Comment 9 mrs@gcc.gnu.org 2011-03-15 18:36:33 UTC
So, I'm sort of skeptical of this problem.  Please engineer a test case that shows bad code.  I think you'll find it is quite a bit harder than you expect.  I think most dynamic things happen at the end of a function call that you can't see into (the Objective-C run-time), and those things that happen before that point must happen before that point, and those things that happen after that point can't come before it.  Objective-C adds a ton of these types of calls all over the place, which control just how far the optimizer can move anything.  Escape analysis should quickly realize that it doesn't own much of anything, which further prevents almost anything from happening.  As for an individual pointer, statically, the type should always be reasonable, though we do expect to up-cast and down-cast pointers.  Some on the C side of things might disagree, but once the C people realize that up-casting and down-casting are both valid, then even this is fine.  Once you combine all these factors, there is no wiggle room left for the optimizer to do anything.  If you can find any, test case please.  We can then address the specific concern.

Until then, we'll wait for a testcase.

As far as missing language definition bits, please describe a missing bit, be specific.  I can't think of any off the top of my head.
Comment 10 Nicola Pero 2011-03-19 00:22:52 UTC
Mike,

to clarify, the problem is that if you do not use -fno-strict-aliasing
when compiling Objective-C, then compiling any largish Objective-C project
(with perfectly correct Objective-C code) will generate many strict aliasing 
warnings.

The rumours that GNUstep compiles everything with -fno-strict-aliasing are 
correct - that is the case.  The reason is simply to avoid the warnings.

But I guess that this means that all C code that is scattered inside 
Objective-C source files is generally not optimized as much as it could be. :-(

So, it would be nice to clarify the problem once and for all, and make sure
it is safe to use -fstrict-aliasing in Objective-C (and it doesn't generate 
warnings), then GNUstep could stop using -fno-strict-aliasing and people
could get the full benefit of -O2. :-)

The next step is producing a few testcases showing the actual warnings, so we
have something to discuss about. :-)

Thanks
Comment 11 Nicola Pero 2011-03-19 01:24:47 UTC
Having looked at some of the warnings generated in GNUstep if you compile
with -fstrict-aliasing, they seem to be C warnings with little relevance
to Objective-C (they mostly seem to be due to casting to (void **)).
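
To make the (void **) case concrete, here is a plain C sketch of the pattern and a warning-free alternative.  The names get_object and get_int_object are hypothetical and are not taken from GNUstep; the warning GCC typically emits for the cast form is "dereferencing type-punned pointer will break strict-aliasing rules".

```c
#include <assert.h>
#include <stddef.h>

static int answer = 42;   /* hypothetical object handed out by the API */

/* Generic out-parameter API of the kind that invites (void **) casts. */
void get_object(void **out)
{
    assert(out != NULL);
    *out = &answer;
}

/* Warning-prone form (not compiled here):
       int *p;
       get_object((void **) &p);
   This stores a void * into storage declared as int *.

   Warning-free form: pass a real void * and convert afterwards. */
int *get_int_object(void)
{
    void *tmp = NULL;

    get_object(&tmp);
    return tmp;   /* void * converts to int * implicitly in C */
}
```

The temporary keeps every store and load on an object of its declared type, so nothing here depends on -fno-strict-aliasing.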

So on that side (the warnings) I guess I am inclined to join Mike and become 
skeptical of this issue.

Unless the original report was that using -fstrict-aliasing would miscompile
Objective-C code ?  We need an example though.

Thanks
Comment 12 Mike Stump 2011-03-19 03:58:25 UTC
Any warnings generated that are invalid are bugs.  These bugs should be filed, and we'll fix them.  Please attach an example file that generates warnings.
Comment 13 John Engelhart 2011-03-19 04:06:28 UTC
I'm the original reporter.

At this point, I no longer have any specific examples that demonstrate this problem (this was filed quite some time ago).

mrs@gcc.gnu.org:

> So, I'm sort of skeptical of this problem.  Please engineer a test case that shows bad code.  

Are the fundamental principles and reasoning for why this could be a problem sound?

Specifically: Can pointers to Objective-C objects break C99's strict aliasing rules?

// objc method rewritten as a C function, for those not familiar with ObjC
void objc_method(
  MYSuperMutableArray *self, // subclass of NSMutableArray
  SEL _cmd,
  NSMutableArray *mutableArray,
  NSArray *array) {
  int myCount = count; // compiler rewrites as self->count
  int arrayCount = array->count;
  int mutableArrayCount = mutableArray->count;

  int totalCount = mutableArray->count++;
  totalCount += array->count;
  totalCount += count;
  
}

Does this break C99 strict aliasing rules?

What if self, array, and mutableArray all point to the same thing (i.e., self = array = mutableArray)?

The fact is that Objective-C uses pointers to objects in a way that violates C99 strict aliasing rules.  I don't think anyone has disputed this point.

If this is indeed the case, then there exists the possibility of generating bad optimized code when you apply C99's strict type-based aliasing rules to pointers to Objective-C objects.
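
A plain C analogue of the concern, as a sketch only: the struct names below are hypothetical stand-ins, and this is not a test case that reproduces a miscompile, nor a claim about how GCC actually lays out or types Objective-C objects.

```c
#include <assert.h>
#include <stddef.h>

/* Two distinct struct types that happen to share the same layout. */
struct array_obj   { void *isa; int count; };
struct mutable_obj { void *isa; int count; };

/* Under type-based alias analysis the compiler may assume that a and m
   never refer to the same storage, because their struct types differ.
   It may therefore reuse the first load of a->count for the second one,
   even though m->count is modified in between. */
int total_count(struct array_obj *a, struct mutable_obj *m)
{
    assert(a != NULL && m != NULL);

    int before = a->count;
    m->count++;
    return before + a->count;
}
```

Called with two distinct objects this is well defined.  Called with the same object passed through casts to both parameters, the behavior is undefined in C99, and whether the second load sees the increment is up to the optimizer.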

> I think you'll find it is quite a bit harder than you expect.

How difficult it is, or how unlikely it might be in practice, does not change the fact that C99's strict aliasing rules are not compatible with Objective-C's use of pointers to objects.  As someone suggested, the easiest solution is to simply automatically tag them with "may alias" and be done with it.

This turns it from a "maybe / hypothetically / phase of the moon" into "can not possibly be a problem / correct by design."

> I think most dynamic things happen at the end of a function call, that you
> can't see into (the Objective-C run-time), and those things that happen before
> that point, must happen before that point, and those things that happen after
> that point, can't come before it.  Objective-C adds a ton of these type of
> calls all over the place, which controls just how far the optimizer can move
> anything.  Escape analysis should quickly realize that it doesn't own much of
> anything, which further prevents almost anything from happening.

This is true of any violation of C99's strict aliasing rules: certain bits of code, or usage styles, can make a huge difference as to whether or not a strict aliasing violation ends up manifesting itself as a genuine problem.

With pure C99 code, the programmer has basically two options: Don't do something that the C99 spec says is undefined behavior, or resort to compiler hacks, like adding __attribute__((may_alias)) or -fno-strict-aliasing.  You might not agree with the C99 spec, but the spec is Correct(tm), even the parts that are wrong.

Objective-C has no formal spec.  This is a bit of a problem, for a lot of reasons.  However, it is often defined as a "strict super-set of C".  If you use -std=(gnu|c)99, I interpret this to mean "a strict super-set of C99".  I won't get into definitions, but both "super-set" and the qualifier "strict" have very specific meanings.  The short version is that it doesn't really leave you a whole lot of wiggle room to pick and choose which parts of the C99 spec you want to include in your interpretation of Objective-C: you're basically stuck importing the C99 spec wholesale.

This particular bug is due to the fact that C99 is not Objective-C.  C99 was not written (to my knowledge) in a way that took the needs of Objective-C into account.

For better or for worse, C99's strict aliasing and Objective-C (as it is used today) just do not get along.  It's unreasonable to expect the vast body of Objective-C code to change to a C99 sanctioned way of accomplishing object oriented inheritance (don't, or use "unions"!), so we're sort of stuck with grand-fathering in Objective-C's wild C99 strict-aliasing violating ways.
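
For comparison, here is a sketch of one way plain C99 does sanction inheritance-like layouts.  Note that this shows first-member embedding rather than the union approach mentioned above, and the struct and function names are hypothetical.  C99 6.7.2.1p13 guarantees that a pointer to a structure, suitably converted, points to its initial member.

```c
#include <assert.h>
#include <stddef.h>

struct base {
    int count;
};

/* The "subclass" embeds its "superclass" as the first member. */
struct derived {
    struct base super;
    int extra;
};

/* Up-cast by taking the address of the embedded member.  No cast is
   needed, and every later access goes through the declared type. */
struct base *derived_as_base(struct derived *d)
{
    assert(d != NULL);
    return &d->super;
}

int base_count(const struct base *b)
{
    assert(b != NULL);
    return b->count;
}
```

Existing Objective-C code does not spell its class hierarchies this way at the source level, which is the point being made: the language's idiom and C99's sanctioned idiom differ.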

> As for an
> individual pointer, statically, the type should always be reasonable, though,
> we do expect to up-cast and downcast pointers.

In Objective-C?  Sure.  But this (usually) violates C99's strict aliasing rules.  This is why this bug exists: Objective-C needs to be able to do this, but it also needs to let the C99 part of the compiler know that the usual type-based strict-aliasing rules don't apply to a pointer to an Objective-C object (i.e., treat it as void * or char *, aliases anything).

http://mail-index.netbsd.org/tech-kern/2003/08/11/0001.html
http://mail.python.org/pipermail/python-dev/2003-July/036898.html
http://lkml.org/lkml/2003/2/26/158
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html

From Apple's gcc-5664/gcc/config/i386.c:

void
optimization_options (int level, int size ATTRIBUTE_UNUSED)
{
  /* APPLE LOCAL begin disable strict aliasing; breaks too much existing code.  */
#if TARGET_MACHO
  flag_strict_aliasing = 0;
#endif
  /* APPLE LOCAL end disable strict aliasing; breaks too much existing code.  */

You should take particular note of this last little bit: you can't even enable -fstrict-aliasing on Apple's compilers, which are arguably the compilers used by the majority of Objective-C developers.  I'd argue that this little detail is masking how big this problem is in practice (i.e., the fact that no one is reporting it can't be used as an indicator of how widespread the problem is in real-world code).

> Some on the C side of things
> might disagree, but, once the C people realize that up-casting and down-casting
> are both valid, then even this is fine.

Errgh, sorry, but if by "C side of things" you mean C99 as it is defined in WG14/N1256 ISO/IEC 9899:TC3 (aka, the C99 spec), then I'm afraid you're wrong.

> Once you combine all these factors,
> there is no wiggle room left for the optimizer to do anything.

... and?  This is like saying "Most of the time the C99 strict aliasing rules won't bite you in the ass.  And most of the time is similar to almost never.  And almost never has the word never in it, so it's obviously never a problem."

> If you can find
> any, test case please.  We can then address the specific concern.
>
> Until then, we'll wait for a testcase.

What?

Sorry, but it seems to me like you acknowledge that C99's strict aliasing rules could potentially / theoretically cause a problem, then you did a lot of hand waving in which you rationalized that because the specific hand waving scenario you came up with probably doesn't suffer the problem, the problem must not exist.

> As far as missing language definition bits, please describe a missing bit, be
> specific.  I can't think of any off the top of my head.

Sorry, I'm not sure what you mean by "language definition bits".  Do you mean an Objective-C language definition?  There literally is none, unfortunately (as an aside, when this has been brought to Apple's attention, the official response was "You should file a documentation bug."  Seriously.)
Comment 14 Jack Howarth 2011-03-19 04:53:53 UTC
(In reply to comment #13)

Have you asked these questions on the cfe-dev mailing list? Isn't clang effectively defining the Objective-C language now?
Comment 15 mrs@gcc.gnu.org 2011-03-19 06:13:31 UTC
I can't do anything about this until we have a test case.  The last email has a lot of interesting discussion potential, but the bug database is a poor place for it, so, I'll resist the temptation.
Comment 16 Richard Biener 2011-06-27 12:14:36 UTC
4.3 branch is being closed, moving to 4.4.7 target.
Comment 17 Jakub Jelinek 2012-03-13 12:48:05 UTC
4.4 branch is being closed, moving to 4.5.4 target.
Comment 18 Steven Bosscher 2013-03-05 23:16:58 UTC
In WAITING mode since 2011-03-19 06:13:31 UTC 
No test case.
=> INVALID