Bug 66110

Summary: uint8_t memory access not optimized
Product: gcc Reporter: Kevin OConnor <kevin>
Component: middle-endAssignee: Richard Biener <rguenth>
Status: RESOLVED FIXED    
Severity: normal CC: jsm28, rguenth, sjames
Priority: P3 Keywords: alias, missed-optimization
Version: 5.1.0   
Target Milestone: 6.0   
Host: Target:
Build: Known to work:
Known to fail: Last reconfirmed: 2015-05-12 00:00:00

Description Kevin OConnor 2015-05-11 17:02:54 UTC
It appears that gcc does not do a good job of optimizing memory accesses to 'uint8_t' variables.  In particular, it appears as if "strict aliasing" is not done on uint8_t variables, and it appears it is not done even if the uint8_t is in a struct.

============ GCC version:

$ ~/src/install-5.1.0/bin/gcc -v
Using built-in specs.
COLLECT_GCC=/home/kevin/src/install-5.1.0/bin/gcc
COLLECT_LTO_WRAPPER=/home/kevin/src/install-5.1.0/libexec/gcc/x86_64-unknown-linux-gnu/5.1.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-5.1.0/configure --prefix=/home/kevin/src/install-5.1.0 --enable-languages=c
Thread model: posix
gcc version 5.1.0 (GCC) 

=========== Compile command line:

~/src/install-5.1.0/bin/gcc -v -save-temps -O2 -Wall u8alias.c -c

=========== Contents of u8alias.c:

typedef __UINT8_TYPE__ uint8_t;
typedef __UINT16_TYPE__ uint16_t;

struct s1 {
    uint16_t f1;
    uint8_t f2;
};

struct s2 {
    struct s1 *p1;
};

void f1(struct s2 *p)
{
    p->p1->f2 = 9;
    p->p1->f2 = 10;
}

=========== Contents of u8alias.i:

# 1 "u8alias.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "<command-line>" 2
# 1 "u8alias.c"
typedef unsigned char uint8_t;
typedef short unsigned int uint16_t;

struct s1 {
    uint16_t f1;
    uint8_t f2;
};

struct s2 {
    struct s1 *p1;
};

void f1(struct s2 *p)
{
    p->p1->f2 = 9;
    p->p1->f2 = 10;
}

=========== Results of compilation:

$ objdump -ldr u8alias.o

0000000000000000 <f1>:
f1():
   0:   48 8b 07                mov    (%rdi),%rax
   3:   c6 40 02 09             movb   $0x9,0x2(%rax)
   7:   48 8b 07                mov    (%rdi),%rax
   a:   c6 40 02 0a             movb   $0xa,0x2(%rax)
   e:   c3                      retq   

=========== Expected results:

I expected the second redundant load to be eliminated - for example, clang produces this assembler (after replacing the gcc specific uint8_t typedefs with an include of <stdint.h>):

$ clang -Wall -O2 u8alias.c -c
$ objdump -ldr u8alias.o

0000000000000000 <f1>:
f1():
   0:   48 8b 07                mov    (%rdi),%rax
   3:   c6 40 02 0a             movb   $0xa,0x2(%rax)
   7:   c3                      retq   

=========== Other notes:

If the code is changed so that there are two redundant writes to ->f1 then gcc does properly optimize away the first store.  Also, if the above definition of f2 is changed to an 8-bit bitfield (ie, "uint8_t f2 : 8;") then gcc does properly optimize away the first store.

This occurs on other platforms besides x86_64.  (In particular, it happens on avr-gcc where 8-bit integers are very common.)  I reproduced the above on gcc 5.1.0, but I've also seen it on variants of gcc 4.8 and gcc 4.9.

My guess is that the above is the result of an interpretation of the C99 specification - in particular section 6.5:

  An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
[...]
      -- a character type.

However, I do not think that should apply to the above test case for either of the two following reasons:

1 - the memory access was not to a character type, it was to an integer type that happened to be 1 byte in size (ie, a uint8_t type)
2 - the memory access was not to a character type, it was to a member of 'struct s1'.

As noted above, clang (eg, 3.4.2) does perform the expected optimization.
Comment 1 Andrew Pinski 2015-05-11 17:59:03 UTC
Uint8_t is a typedef by the c standard. And we typedef it to unsigned char which means this is a character type.
Comment 2 Kevin OConnor 2015-05-11 18:10:33 UTC
I understand that it is not incorrect to produce two redundant stores with the sample code.  However, I do believe it would be correct to produce a single store.

As such, I think this is an opportunity to improve gcc's optimization.  That is, I think gcc could validly interpret the C spec so that two redundant uint8_t stores could be replaced with one.

If this is not the best place to discuss that, what would be the best forum?
Comment 3 Richard Biener 2015-05-12 08:51:09 UTC
So what happens is that GCC sees

  _3 = p_2(D)->p1;
  _3->f2 = 9;
  _5 = p_2(D)->p1;
  _5->f2 = 10;

and to remove the first store it first has to prove that _3 and _5 are equal.
CSE cannot prove this because it thinks the store to _3->f2 can clobber the
value at p_2(D)->p1 (if p->p1 points to p).

get_alias_set (_3->f2) returns 0, the alias oracle has special code to also
consider the alias set of the base objects:

  /* Do type-based disambiguation.  */
  if (base1_alias_set != base2_alias_set
      && !alias_sets_conflict_p (base1_alias_set, base2_alias_set))
    return false;

but their alias sets happen to conflict because struct s2 alias set is a subset
of the struct s1 alias set (which is because alias set zero is a subset of
the struct s1 alias set...):

struct GTY(()) alias_set_entry_d {
  /* The alias set number, as stored in MEM_ALIAS_SET.  */
  alias_set_type alias_set;

  /* Nonzero if would have a child of zero: this effectively makes this
     alias set the same as alias set zero.  */
  int has_zero_child;

(I've always questioned this... - the code that looks at has_zero_child in
alias_set_subset_of / alias_sets_conflict_p).

Index: gcc/alias.c
===================================================================
--- gcc/alias.c (revision 222996)
+++ gcc/alias.c (working copy)
@@ -470,15 +470,13 @@ alias_sets_conflict_p (alias_set_type se
   /* See if the first alias set is a subset of the second.  */
   ase = get_alias_set_entry (set1);
   if (ase != 0
-      && (ase->has_zero_child
-         || ase->children->get (set2)))
+      && ase->children->get (set2))
     return 1;
 
   /* Now do the same, but with the alias sets reversed.  */
   ase = get_alias_set_entry (set2);
   if (ase != 0
-      && (ase->has_zero_child
-         || ase->children->get (set1)))
+      && ase->children->get (set1))
     return 1;
 
   /* The two alias sets are distinct and neither one is the

fixes this.  I don't remember what broke (but I remember trying this for a few
times - maybe with also changing alias_set_subset_of which isn't that obvious).
Comment 4 Richard Biener 2015-05-12 09:17:53 UTC
Ah, it for example breaks bootstrap via (broken...) -Wstrict-aliasing:

In file included from /space/rguenther/src/svn/trunk/gcc/../libdecnumber/bid/decimal128Local.h:1:0,
                 from /space/rguenther/src/svn/trunk/gcc/dfp.c:43:
/space/rguenther/src/svn/trunk/gcc/dfp.c: In function ‘bool decimal_real_arithmetic(real_value*, tree_code, const real_value*, const real_value*)’:
/space/rguenther/src/svn/trunk/gcc/../libdecnumber/dpd/decimal128Local.h:40:8: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
   { (d)->bytes[WORDS_BIGENDIAN ? 0 : 15] ^= 0x80; }
        ^
/space/rguenther/src/svn/trunk/gcc/dfp.c:704:2: note: in expansion of macro ‘decimal128FlipSign’
  decimal128FlipSign ((decimal128 *) r->sig);
  ^
/space/rguenther/src/svn/trunk/gcc/../libdecnumber/dpd/decimal128Local.h:36:8: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
   { (d)->bytes[WORDS_BIGENDIAN ? 0 : 15] &= ~0x80; }
        ^
/space/rguenther/src/svn/trunk/gcc/dfp.c:714:2: note: in expansion of macro ‘decimal128ClearSign’
  decimal128ClearSign ((decimal128 *) r->sig);
  ^
/space/rguenther/src/svn/trunk/gcc/dfp.c: In function ‘void decimal_real_maxval(real_value*, int, machine_mode)’:
/space/rguenther/src/svn/trunk/gcc/../libdecnumber/dpd/decimal128Local.h:32:8: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
   { (d)->bytes[WORDS_BIGENDIAN ? 0 : 15] |= ((unsigned) (b) << 7); }
        ^
/space/rguenther/src/svn/trunk/gcc/dfp.c:754:5: note: in expansion of macro ‘decimal128SetSign’
     decimal128SetSign ((decimal128 *) r->sig, 1);
     ^
cc1plus: all warnings being treated as errors
make[3]: *** [dfp.o] Error 1


I remember dfp.c _does_ violate strict-aliasing.  Trying --disable-werror.
Comment 5 Richard Biener 2015-05-12 13:19:13 UTC
Bootstrapped / tested on x86_64 with --disable-werror

Running target unix/
FAIL: gcc.dg/alias-2.c  (test for warnings, line 14)
FAIL: gcc.dg/alias-2.c (test for excess errors)

Running target unix//-m32
FAIL: gcc.dg/alias-2.c  (test for warnings, line 14)
FAIL: gcc.dg/alias-2.c (test for excess errors)
Comment 6 Richard Biener 2015-05-13 10:54:15 UTC
Author: rguenth
Date: Wed May 13 10:53:42 2015
New Revision: 223126

URL: https://gcc.gnu.org/viewcvs?rev=223126&root=gcc&view=rev
Log:
2015-05-13  Richard Biener  <rguenther@suse.de>

	PR middle-end/66110
	* alias.c (alias_sets_conflict_p): Do not treat has_zero_child
	specially.
	* Makefile.in (dfp.o-warn): Add -Wno-strict-aliasing.

	* gcc.dg/alias-2.c: Adjust.
	* gcc.dg/tree-ssa/ssa-dse-17.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-17.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/Makefile.in
    trunk/gcc/alias.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/testsuite/gcc.dg/alias-2.c
Comment 7 Richard Biener 2015-05-13 10:54:41 UTC
Fixed on trunk.
Comment 8 Kevin OConnor 2015-05-13 15:43:41 UTC
Thanks!  I can confirm the latest trunk performs the desired optimization.

However, this test case still isn't fully optimized:

void f2(struct s1 *ps1, uint8_t *pi8)
{
    ps1->f1 = 3;
    *pi8 = 8;
    ps1->f1 += 2;
}

That is, an "uint8_t*" still aliases with every other type. The "struct" optimization is more important for my usage, but it is unfortunate that uint8_t*/int8_t* are pessimized.  (In particular, there does not appear to be any way to declare a pointer to an 8 bit integer that doesn't alias every other type.)

I can open a separate bugzilla entry on the above.
Comment 9 Richard Biener 2015-05-18 08:51:21 UTC
(In reply to Kevin OConnor from comment #8)
> Thanks!  I can confirm the latest trunk performs the desired optimization.
> 
> However, this test case still isn't fully optimized:
> 
> void f2(struct s1 *ps1, uint8_t *pi8)
> {
>     ps1->f1 = 3;
>     *pi8 = 8;
>     ps1->f1 += 2;
> }
> 
> That is, an "uint8_t*" still aliases with every other type. The "struct"
> optimization is more important for my usage, but it is unfortunate that
> uint8_t*/int8_t* are pessimized.  (In particular, there does not appear to
> be any way to declare a pointer to an 8 bit integer that doesn't alias every
> other type.)
> 
> I can open a separate bugzilla entry on the above.

Well, so it ultimiatively boils down to GCC defining

#define __UINT8_TYPE__ unsigned char

and that being a character type.  I'm not sure whether the C or C++ standards
mandate that uint8_t be a 'character type' as far as type-based aliasing goes
but I would expect that user-code may rely on that property.  So yes, there
would be no 1-byte type that doesn't get alias-set zero.  Well - you could
use

struct byte { uint8_t b; };

and make sure to use aggregate assignments everywhere, only ever extracting
the actual value from temporary aggregates...
Comment 10 Kevin OConnor 2015-05-18 15:41:41 UTC
I've looked through the C specs (both C99 and C11) and I don't see anything that requires uint8_t (nor int8_t) to be considered "character types".  I do see three relevant sections in the spec which I'm including below.

For aliasing there is section 6.5:

  7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
  [...]
      -- a character type.

For "character types" there is section 6.2.5:

  15 The three types char, signed char, and unsigned char are collectively called the character types. [...]

For uint8_t there is section 7.20.1.1:

     7.20.1.1 Exact-width integer types

  1 The typedef name intN_t designates a signed integer type with width N , no padding bits, and a two's complement representation. Thus, int8_t denotes such a signed integer type with a width of exactly 8 bits.

  2 The typedef name uintN_t designates an unsigned integer type with width N and no padding bits. Thus, uint24_t denotes such an unsigned integer type with a width of exactly 24 bits.

So, my read is that uint8_t must be considered an integer type, but is not required to be considered a "character type".

That said, I understand that changing uint8_t may cause problems for some existing user-code.  I'd think those enabling -fstrict-aliasing would want the optimization benefit though.

I certainly understand if the cost of introducing another integer type and the potential cost of causing issues for existing code is considered too high.  It is unfortunate, though, that there doesn't appear to currently be any way to declare a pointer to a non-aliasing 8-bit integer (that doesn't involve 'struct' hacks).
Comment 11 Andreas Schwab 2015-05-18 15:52:49 UTC
Since typedef does not create a new type the effect of uint8_t is exactly the same as the type it is defined from.  Thus if uint8_t is defined from unsigned char then uint8_t is a character type.
Comment 12 Kevin OConnor 2015-05-18 16:05:00 UTC
(In reply to Andreas Schwab from comment #11)
> Since typedef does not create a new type the effect of uint8_t is exactly
> the same as the type it is defined from.  Thus if uint8_t is defined from
> unsigned char then uint8_t is a character type.

Yes, of course.  My point is that gcc does not need to define uint8_t / __UINT8_TYPE__ as 'unsigned char'.  Instead it could define it as a new integer type (eg, __gcc_uint8_t) that is an 8-bit integer that does not alias.

As before, I understand if the cost of doing this is too high, but it's unfortunate that there currently does not appear to be any way to define a pointer to an 8-bit integer that doesn't alias.
Comment 13 jsm-csl@polyomino.org.uk 2015-05-18 20:43:23 UTC
I concur that it would be valid to define those typedefs to be extended 
integer types withing the special aliasing properties.  The first 
suggestion of that on the GCC lists that I know of is 
<https://gcc.gnu.org/ml/gcc/2000-05/msg01106.html>.  However, it was noted 
in <https://gcc.gnu.org/ml/gcc/2000-07/msg00155.html> that there would be 
C++ issues.  I see that C++ does now have extended integer types, which it 
didn't at that time, but there would still be questions of mangling 
(changing typedefs from standard headers breaks the library ABI for 
anything using those types in C++ interfaces, because of changed 
mangling).  And of course the question of what expectations might be 
broken in C by making these typedefs into extended integer types.

Cf <https://gcc.gnu.org/ml/gcc-patches/2002-01/msg01941.html> implementing 
a char_no_alias_everything attribute.
Comment 14 Kevin OConnor 2015-05-18 22:28:19 UTC
(In reply to joseph@codesourcery.com from comment #13)
> I concur that it would be valid to define those typedefs to be extended 
> integer types withing the special aliasing properties.  The first 
> suggestion of that on the GCC lists that I know of is 
> <https://gcc.gnu.org/ml/gcc/2000-05/msg01106.html>.  However, it was noted 
> in <https://gcc.gnu.org/ml/gcc/2000-07/msg00155.html> that there would be 
> C++ issues.  I see that C++ does now have extended integer types, which it 
> didn't at that time, but there would still be questions of mangling 
> (changing typedefs from standard headers breaks the library ABI for 
> anything using those types in C++ interfaces, because of changed 
> mangling).  And of course the question of what expectations might be 
> broken in C by making these typedefs into extended integer types.

Many thanks.  Those links and the background information is quite helpful.  I agree that breaking the C++ ABI would not make sense on the major platforms.

> Cf <https://gcc.gnu.org/ml/gcc-patches/2002-01/msg01941.html> implementing 
> a char_no_alias_everything attribute.

I do think having something like "char_no_alias_everything" would be useful.  My interest in this discussion was due to the code generation on the embedded AVR platform (where 8-bit integers are very common).  If non-aliasing 8-bit integer types existed, I suspect dealing with ABI breakage would be more tenable on AVR.  (And if nothing else, utilizing them within an individual AVR project would not be difficult.)
Comment 15 rguenther@suse.de 2015-05-19 08:50:09 UTC
On Mon, 18 May 2015, kevin at koconnor dot net wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66110
> 
> --- Comment #12 from Kevin OConnor <kevin at koconnor dot net> ---
> (In reply to Andreas Schwab from comment #11)
> > Since typedef does not create a new type the effect of uint8_t is exactly
> > the same as the type it is defined from.  Thus if uint8_t is defined from
> > unsigned char then uint8_t is a character type.
> 
> Yes, of course.  My point is that gcc does not need to define uint8_t /
> __UINT8_TYPE__ as 'unsigned char'.  Instead it could define it as a new integer
> type (eg, __gcc_uint8_t) that is an 8-bit integer that does not alias.

Yes, agreed.  Like {signed,unsigned,} __gcc_int8_t or so...

> As before, I understand if the cost of doing this is too high, but it's
> unfortunate that there currently does not appear to be any way to define a
> pointer to an 8-bit integer that doesn't alias.

Another possibility would be to introduce an attribute similar to
may_alias, "not_alias" so you can do

typedef unsigned char uint8_t __attribute__((no_alias));

or sth like that.  As that wouldn't alias with

typedef unsigned char my_uint8_t __attribute__((no_alias));

the effects might be surprising though.
Comment 16 Pedro Alves 2016-05-18 20:53:26 UTC
I ran into this, and thought that:

 typedef unsigned int __attribute__ ((__mode__(__byte__))) byte;

or:

 typedef unsigned int __attribute__ ((__mode__(QI))) byte;

might be syntax that already worked for creating an 8-bit type that doesn't alias.  Alas, it does not work.  Maybe it should.
Comment 17 Richard Biener 2016-05-19 07:37:21 UTC
(In reply to Pedro Alves from comment #16)
> I ran into this, and thought that:
> 
>  typedef unsigned int __attribute__ ((__mode__(__byte__))) byte;
> 
> or:
> 
>  typedef unsigned int __attribute__ ((__mode__(QI))) byte;
> 
> might be syntax that already worked for creating an 8-bit type that doesn't
> alias.  Alas, it does not work.  Maybe it should.

The reason it does not work is that we handle it by querying the frontend
for a type corresponding to that mode so it returns unsigned char.