It appears that gcc does not do a good job of optimizing memory accesses to 'uint8_t' variables. In particular, it appears as if "strict aliasing" is not done on uint8_t variables, and it appears it is not done even if the uint8_t is in a struct. ============ GCC version: $ ~/src/install-5.1.0/bin/gcc -v Using built-in specs. COLLECT_GCC=/home/kevin/src/install-5.1.0/bin/gcc COLLECT_LTO_WRAPPER=/home/kevin/src/install-5.1.0/libexec/gcc/x86_64-unknown-linux-gnu/5.1.0/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ../gcc-5.1.0/configure --prefix=/home/kevin/src/install-5.1.0 --enable-languages=c Thread model: posix gcc version 5.1.0 (GCC) =========== Compile command line: ~/src/install-5.1.0/bin/gcc -v -save-temps -O2 -Wall u8alias.c -c =========== Contents of u8alias.c: typedef __UINT8_TYPE__ uint8_t; typedef __UINT16_TYPE__ uint16_t; struct s1 { uint16_t f1; uint8_t f2; }; struct s2 { struct s1 *p1; }; void f1(struct s2 *p) { p->p1->f2 = 9; p->p1->f2 = 10; } =========== Contents of u8alias.i: # 1 "u8alias.c" # 1 "<built-in>" # 1 "<command-line>" # 1 "/usr/include/stdc-predef.h" 1 3 4 # 1 "<command-line>" 2 # 1 "u8alias.c" typedef unsigned char uint8_t; typedef short unsigned int uint16_t; struct s1 { uint16_t f1; uint8_t f2; }; struct s2 { struct s1 *p1; }; void f1(struct s2 *p) { p->p1->f2 = 9; p->p1->f2 = 10; } =========== Results of compilation: $ objdump -ldr u8alias.o 0000000000000000 <f1>: f1(): 0: 48 8b 07 mov (%rdi),%rax 3: c6 40 02 09 movb $0x9,0x2(%rax) 7: 48 8b 07 mov (%rdi),%rax a: c6 40 02 0a movb $0xa,0x2(%rax) e: c3 retq =========== Expected results: I expected the second redundant load to be eliminated - for example, clang produces this assembler (after replacing the gcc specific uint8_t typedefs with an include of <stdint.h>): $ clang -Wall -O2 u8alias.c -c $ objdump -ldr u8alias.o 0000000000000000 <f1>: f1(): 0: 48 8b 07 mov (%rdi),%rax 3: c6 40 02 0a movb $0xa,0x2(%rax) 7: c3 retq =========== Other notes: If the code is changed so that there are two redundant writes to ->f1 then gcc does properly optimize away the first store. Also, if the above definition of f2 is changed to an 8-bit bitfield (ie, "uint8_t f2 : 8;") then gcc does properly optimize away the first store. This occurs on other platforms besides x86_64. (In particular, it happens on avr-gcc where 8-bit integers are very common.) I reproduced the above on gcc 5.1.0, but I've also seen it on variants of gcc 4.8 and gcc 4.9. My guess is that the above is the result of an interpretation of the C99 specification - in particular section 6.5: An object shall have its stored value accessed only by an lvalue expression that has one of the following types: [...] -- a character type. However, I do not think that should apply to the above test case for either of the two following reasons: 1 - the memory access was not to a character type, it was to an integer type that happened to be 1 byte in size (ie, a uint8_t type) 2 - the memory access was not to a character type, it was to a member of 'struct s1'. As noted above, clang (eg, 3.4.2) does perform the expected optimization.
Uint8_t is a typedef by the c standard. And we typedef it to unsigned char which means this is a character type.
I understand that it is not incorrect to produce two redundant stores with the sample code. However, I do believe it would be correct to produce a single store. As such, I think this is an opportunity to improve gcc's optimization. That is, I think gcc could validly interpret the C spec so that two redundant uint8_t stores could be replaced with one. If this is not the best place to discuss that, what would be the best forum?
So what happens is that GCC sees _3 = p_2(D)->p1; _3->f2 = 9; _5 = p_2(D)->p1; _5->f2 = 10; and to remove the first store it first has to prove that _3 and _5 are equal. CSE cannot prove this because it thinks the store to _3->f2 can clobber the value at p_2(D)->p1 (if p->p1 points to p). get_alias_set (_3->f2) returns 0, the alias oracle has special code to also consider the alias set of the base objects: /* Do type-based disambiguation. */ if (base1_alias_set != base2_alias_set && !alias_sets_conflict_p (base1_alias_set, base2_alias_set)) return false; but their alias sets happen to conflict because struct s2 alias set is a subset of the struct s1 alias set (which is because alias set zero is a subset of the struct s1 alias set...): struct GTY(()) alias_set_entry_d { /* The alias set number, as stored in MEM_ALIAS_SET. */ alias_set_type alias_set; /* Nonzero if would have a child of zero: this effectively makes this alias set the same as alias set zero. */ int has_zero_child; (I've always questioned this... - the code that looks at has_zero_child in alias_set_subset_of / alias_sets_conflict_p). Index: gcc/alias.c =================================================================== --- gcc/alias.c (revision 222996) +++ gcc/alias.c (working copy) @@ -470,15 +470,13 @@ alias_sets_conflict_p (alias_set_type se /* See if the first alias set is a subset of the second. */ ase = get_alias_set_entry (set1); if (ase != 0 - && (ase->has_zero_child - || ase->children->get (set2))) + && ase->children->get (set2)) return 1; /* Now do the same, but with the alias sets reversed. */ ase = get_alias_set_entry (set2); if (ase != 0 - && (ase->has_zero_child - || ase->children->get (set1))) + && ase->children->get (set1)) return 1; /* The two alias sets are distinct and neither one is the fixes this. I don't remember what broke (but I remember trying this for a few times - maybe with also changing alias_set_subset_of which isn't that obvious).
Ah, it for example breaks bootstrap via (broken...) -Wstrict-aliasing: In file included from /space/rguenther/src/svn/trunk/gcc/../libdecnumber/bid/decimal128Local.h:1:0, from /space/rguenther/src/svn/trunk/gcc/dfp.c:43: /space/rguenther/src/svn/trunk/gcc/dfp.c: In function ‘bool decimal_real_arithmetic(real_value*, tree_code, const real_value*, const real_value*)’: /space/rguenther/src/svn/trunk/gcc/../libdecnumber/dpd/decimal128Local.h:40:8: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing] { (d)->bytes[WORDS_BIGENDIAN ? 0 : 15] ^= 0x80; } ^ /space/rguenther/src/svn/trunk/gcc/dfp.c:704:2: note: in expansion of macro ‘decimal128FlipSign’ decimal128FlipSign ((decimal128 *) r->sig); ^ /space/rguenther/src/svn/trunk/gcc/../libdecnumber/dpd/decimal128Local.h:36:8: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing] { (d)->bytes[WORDS_BIGENDIAN ? 0 : 15] &= ~0x80; } ^ /space/rguenther/src/svn/trunk/gcc/dfp.c:714:2: note: in expansion of macro ‘decimal128ClearSign’ decimal128ClearSign ((decimal128 *) r->sig); ^ /space/rguenther/src/svn/trunk/gcc/dfp.c: In function ‘void decimal_real_maxval(real_value*, int, machine_mode)’: /space/rguenther/src/svn/trunk/gcc/../libdecnumber/dpd/decimal128Local.h:32:8: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing] { (d)->bytes[WORDS_BIGENDIAN ? 0 : 15] |= ((unsigned) (b) << 7); } ^ /space/rguenther/src/svn/trunk/gcc/dfp.c:754:5: note: in expansion of macro ‘decimal128SetSign’ decimal128SetSign ((decimal128 *) r->sig, 1); ^ cc1plus: all warnings being treated as errors make[3]: *** [dfp.o] Error 1 I remember dfp.c _does_ violate strict-aliasing. Trying --disable-werror.
Bootstrapped / tested on x86_64 with --disable-werror Running target unix/ FAIL: gcc.dg/alias-2.c (test for warnings, line 14) FAIL: gcc.dg/alias-2.c (test for excess errors) Running target unix//-m32 FAIL: gcc.dg/alias-2.c (test for warnings, line 14) FAIL: gcc.dg/alias-2.c (test for excess errors)
Author: rguenth Date: Wed May 13 10:53:42 2015 New Revision: 223126 URL: https://gcc.gnu.org/viewcvs?rev=223126&root=gcc&view=rev Log: 2015-05-13 Richard Biener <rguenther@suse.de> PR middle-end/66110 * alias.c (alias_sets_conflict_p): Do not treat has_zero_child specially. * Makefile.in (dfp.o-warn): Add -Wno-strict-aliasing. * gcc.dg/alias-2.c: Adjust. * gcc.dg/tree-ssa/ssa-dse-17.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-17.c Modified: trunk/gcc/ChangeLog trunk/gcc/Makefile.in trunk/gcc/alias.c trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/alias-2.c
Fixed on trunk.
Thanks! I can confirm the latest trunk performs the desired optimization. However, this test case still isn't fully optimized: void f2(struct s1 *ps1, uint8_t *pi8) { ps1->f1 = 3; *pi8 = 8; ps1->f1 += 2; } That is, an "uint8_t*" still aliases with every other type. The "struct" optimization is more important for my usage, but it is unfortunate that uint8_t*/int8_t* are pessimized. (In particular, there does not appear to be any way to declare a pointer to an 8 bit integer that doesn't alias every other type.) I can open a separate bugzilla entry on the above.
(In reply to Kevin OConnor from comment #8) > Thanks! I can confirm the latest trunk performs the desired optimization. > > However, this test case still isn't fully optimized: > > void f2(struct s1 *ps1, uint8_t *pi8) > { > ps1->f1 = 3; > *pi8 = 8; > ps1->f1 += 2; > } > > That is, an "uint8_t*" still aliases with every other type. The "struct" > optimization is more important for my usage, but it is unfortunate that > uint8_t*/int8_t* are pessimized. (In particular, there does not appear to > be any way to declare a pointer to an 8 bit integer that doesn't alias every > other type.) > > I can open a separate bugzilla entry on the above. Well, so it ultimiatively boils down to GCC defining #define __UINT8_TYPE__ unsigned char and that being a character type. I'm not sure whether the C or C++ standards mandate that uint8_t be a 'character type' as far as type-based aliasing goes but I would expect that user-code may rely on that property. So yes, there would be no 1-byte type that doesn't get alias-set zero. Well - you could use struct byte { uint8_t b; }; and make sure to use aggregate assignments everywhere, only ever extracting the actual value from temporary aggregates...
I've looked through the C specs (both C99 and C11) and I don't see anything that requires uint8_t (nor int8_t) to be considered "character types". I do see three relevant sections in the spec which I'm including below. For aliasing there is section 6.5: 7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types: [...] -- a character type. For "character types" there is section 6.2.5: 15 The three types char, signed char, and unsigned char are collectively called the character types. [...] For uint8_t there is section 7.20.1.1: 7.20.1.1 Exact-width integer types 1 The typedef name intN_t designates a signed integer type with width N , no padding bits, and a two's complement representation. Thus, int8_t denotes such a signed integer type with a width of exactly 8 bits. 2 The typedef name uintN_t designates an unsigned integer type with width N and no padding bits. Thus, uint24_t denotes such an unsigned integer type with a width of exactly 24 bits. So, my read is that uint8_t must be considered an integer type, but is not required to be considered a "character type". That said, I understand that changing uint8_t may cause problems for some existing user-code. I'd think those enabling -fstrict-aliasing would want the optimization benefit though. I certainly understand if the cost of introducing another integer type and the potential cost of causing issues for existing code is considered too high. It is unfortunate, though, that there doesn't appear to currently be any way to declare a pointer to a non-aliasing 8-bit integer (that doesn't involve 'struct' hacks).
Since typedef does not create a new type the effect of uint8_t is exactly the same as the type it is defined from. Thus if uint8_t is defined from unsigned char then uint8_t is a character type.
(In reply to Andreas Schwab from comment #11) > Since typedef does not create a new type the effect of uint8_t is exactly > the same as the type it is defined from. Thus if uint8_t is defined from > unsigned char then uint8_t is a character type. Yes, of course. My point is that gcc does not need to define uint8_t / __UINT8_TYPE__ as 'unsigned char'. Instead it could define it as a new integer type (eg, __gcc_uint8_t) that is an 8-bit integer that does not alias. As before, I understand if the cost of doing this is too high, but it's unfortunate that there currently does not appear to be any way to define a pointer to an 8-bit integer that doesn't alias.
I concur that it would be valid to define those typedefs to be extended integer types withing the special aliasing properties. The first suggestion of that on the GCC lists that I know of is <https://gcc.gnu.org/ml/gcc/2000-05/msg01106.html>. However, it was noted in <https://gcc.gnu.org/ml/gcc/2000-07/msg00155.html> that there would be C++ issues. I see that C++ does now have extended integer types, which it didn't at that time, but there would still be questions of mangling (changing typedefs from standard headers breaks the library ABI for anything using those types in C++ interfaces, because of changed mangling). And of course the question of what expectations might be broken in C by making these typedefs into extended integer types. Cf <https://gcc.gnu.org/ml/gcc-patches/2002-01/msg01941.html> implementing a char_no_alias_everything attribute.
(In reply to joseph@codesourcery.com from comment #13) > I concur that it would be valid to define those typedefs to be extended > integer types withing the special aliasing properties. The first > suggestion of that on the GCC lists that I know of is > <https://gcc.gnu.org/ml/gcc/2000-05/msg01106.html>. However, it was noted > in <https://gcc.gnu.org/ml/gcc/2000-07/msg00155.html> that there would be > C++ issues. I see that C++ does now have extended integer types, which it > didn't at that time, but there would still be questions of mangling > (changing typedefs from standard headers breaks the library ABI for > anything using those types in C++ interfaces, because of changed > mangling). And of course the question of what expectations might be > broken in C by making these typedefs into extended integer types. Many thanks. Those links and the background information is quite helpful. I agree that breaking the C++ ABI would not make sense on the major platforms. > Cf <https://gcc.gnu.org/ml/gcc-patches/2002-01/msg01941.html> implementing > a char_no_alias_everything attribute. I do think having something like "char_no_alias_everything" would be useful. My interest in this discussion was due to the code generation on the embedded AVR platform (where 8-bit integers are very common). If non-aliasing 8-bit integer types existed, I suspect dealing with ABI breakage would be more tenable on AVR. (And if nothing else, utilizing them within an individual AVR project would not be difficult.)
On Mon, 18 May 2015, kevin at koconnor dot net wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66110 > > --- Comment #12 from Kevin OConnor <kevin at koconnor dot net> --- > (In reply to Andreas Schwab from comment #11) > > Since typedef does not create a new type the effect of uint8_t is exactly > > the same as the type it is defined from. Thus if uint8_t is defined from > > unsigned char then uint8_t is a character type. > > Yes, of course. My point is that gcc does not need to define uint8_t / > __UINT8_TYPE__ as 'unsigned char'. Instead it could define it as a new integer > type (eg, __gcc_uint8_t) that is an 8-bit integer that does not alias. Yes, agreed. Like {signed,unsigned,} __gcc_int8_t or so... > As before, I understand if the cost of doing this is too high, but it's > unfortunate that there currently does not appear to be any way to define a > pointer to an 8-bit integer that doesn't alias. Another possibility would be to introduce an attribute similar to may_alias, "not_alias" so you can do typedef unsigned char uint8_t __attribute__((no_alias)); or sth like that. As that wouldn't alias with typedef unsigned char my_uint8_t __attribute__((no_alias)); the effects might be surprising though.
I ran into this, and thought that: typedef unsigned int __attribute__ ((__mode__(__byte__))) byte; or: typedef unsigned int __attribute__ ((__mode__(QI))) byte; might be syntax that already worked for creating an 8-bit type that doesn't alias. Alas, it does not work. Maybe it should.
(In reply to Pedro Alves from comment #16) > I ran into this, and thought that: > > typedef unsigned int __attribute__ ((__mode__(__byte__))) byte; > > or: > > typedef unsigned int __attribute__ ((__mode__(QI))) byte; > > might be syntax that already worked for creating an 8-bit type that doesn't > alias. Alas, it does not work. Maybe it should. The reason it does not work is that we handle it by querying the frontend for a type corresponding to that mode so it returns unsigned char.