The function: int f(int *pi, long *pl) { *pi = 1; // (1) *pl = 0; // (2) return *(char *)pi; // (3) } is optimized (with -O2) to always return 1: f: movl $1, (%rdi) movl $1, %eax movq $0, (%rsi) ret This is wrong if pi and pl both point to the same allocated block. In this case the function should return 0. The first impression could be that it's invalid to call this function with equal arguments as it violates strict aliasing rules. This is wrong. Suppose the function is called like this: void *p = malloc(sizeof(long)); int result = f(p, p); Then (1) is clearly Ok. (2) is fine too because allocated memory can be repurposed freely. C11, 6.5p6, reads: "If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value." (3) is fine according to 6.5p7 because a character type can be used to access anything. Full example with a function: ---------------------------------------------------------------------- extern void *malloc (__SIZE_TYPE__); extern void abort (void); __attribute__((noinline,noclone)) int f(int *pi, long *pl) { *pi = 1; *pl = 0; return *(char *)pi; } int main() { void *p = malloc(sizeof(long)); if (f(p, p) != 0) abort(); } ---------------------------------------------------------------------- Full example with volatile: ---------------------------------------------------------------------- extern void *malloc (__SIZE_TYPE__); extern void abort (void); int main() { void *volatile p = malloc(sizeof(long)); int *pi = p; long *pl = p; *pi = 1; *pl = 0; if (*(char *)pi != 0) abort(); } ---------------------------------------------------------------------- Tested on gcc 6.0.0 20160331. According to https://gcc.godbolt.org/ the bug is present since at least 4.4.7.
Still ok on the GIMPLE level: f (int * pi, long int * pl) { char _6; int _7; <bb 2>: *pi_2(D) = 1; *pl_4(D) = 0; _6 = MEM[(char *)pi_2(D)]; _7 = (int) _6; return _7; confirmed assembler: f: .LFB0: .cfi_startproc movl $1, (%rdi) movl $1, %eax movq $0, (%rsi) ret
It's DSE1 that does this. trying to replace QImode load in insn 9 from SImode store in insn 7 deferring rescan insn with uid = 9. deferring rescan insn with uid = 17. -- replaced the loaded MEM with (reg 92) else if (s_info->rhs) /* Need to see if it is possible for this store to overwrite the value of store_info. If it is, set the rhs to NULL to keep it from being used to remove a load. */ { if (canon_true_dependence (s_info->mem, GET_MODE (s_info->mem), s_info->mem_addr, mem, mem_addr)) { s_info->rhs = NULL; s_info->const_rhs = NULL; } it shouldn't use true_dependence but output_dependence (canon_output_dependence is missing but trivial to add). So the following patch fixes it. Index: gcc/dse.c =================================================================== --- gcc/dse.c (revision 234663) +++ gcc/dse.c (working copy) @@ -1609,10 +1609,7 @@ record_store (rtx body, bb_info_t bb_inf the value of store_info. If it is, set the rhs to NULL to keep it from being used to remove a load. */ { - if (canon_true_dependence (s_info->mem, - GET_MODE (s_info->mem), - s_info->mem_addr, - mem, mem_addr)) + if (output_dependence (s_info->mem, mem)) { s_info->rhs = NULL; s_info->const_rhs = NULL;
4.3.4 works thus this is a regression (possibly DSE got enhanced, I do see the same bogus canon_true_dependence check there). extern void abort (void); int __attribute__((noinline,noclone)) f(int *pi, long *pl) { *pi = 1; *pl = 0; return *(char *)pi; } int main() { char a[sizeof (long)]; if (f ((int *)a, (long *)a) != 0) abort (); return 0; }
Author: rguenth Date: Mon Apr 4 09:30:16 2016 New Revision: 234709 URL: https://gcc.gnu.org/viewcvs?rev=234709&root=gcc&view=rev Log: 2016-04-04 Richard Biener <rguenther@suse.de> PR rtl-optimization/70484 * rtl.h (canon_output_dependence): Declare. * alias.c (canon_output_dependence): New function. * dse.c (record_store): Use canon_output_dependence rather than canon_true_dependence. * gcc.dg/torture/pr70484.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/torture/pr70484.c Modified: trunk/gcc/ChangeLog trunk/gcc/alias.c trunk/gcc/dse.c trunk/gcc/rtl.h trunk/gcc/testsuite/ChangeLog
Fixed on trunk sofar.
Author: rguenth Date: Wed Apr 6 07:45:34 2016 New Revision: 234772 URL: https://gcc.gnu.org/viewcvs?rev=234772&root=gcc&view=rev Log: 2016-04-06 Richard Biener <rguenther@suse.de> Backport from mainline 2016-03-30 Richard Biener <rguenther@suse.de> PR middle-end/70450 * fold-const.c (extract_muldiv_1): Fix thinko in wide_int::from usage. * gcc.dg/torture/pr70450.c: New testcase. 2016-03-22 Richard Biener <rguenther@suse.de> PR middle-end/70333 * fold-const.c (extract_muldiv_1): Properly perform multiplication in the wide type. * gcc.dg/torture/pr70333.c: New testcase. 2016-04-04 Richard Biener <rguenther@suse.de> PR rtl-optimization/70484 * rtl.h (canon_output_dependence): Declare. * alias.c (canon_output_dependence): New function. * dse.c (record_store): Use canon_output_dependence rather than canon_true_dependence. * gcc.dg/torture/pr70484.c: New testcase. 2016-03-31 Richard Biener <rguenther@suse.de> PR c++/70430 * typeck.c (cp_build_binary_op): Fix operand order of vector conditional in truth op handling. * g++.dg/ext/vector30.C: New testcase. Added: branches/gcc-5-branch/gcc/testsuite/g++.dg/ext/vector30.C branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr70333.c branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr70450.c branches/gcc-5-branch/gcc/testsuite/gcc.dg/torture/pr70484.c Modified: branches/gcc-5-branch/gcc/ChangeLog branches/gcc-5-branch/gcc/alias.c branches/gcc-5-branch/gcc/cp/ChangeLog branches/gcc-5-branch/gcc/cp/typeck.c branches/gcc-5-branch/gcc/dse.c branches/gcc-5-branch/gcc/fold-const.c branches/gcc-5-branch/gcc/rtl.h branches/gcc-5-branch/gcc/testsuite/ChangeLog
On 04/04/2016 12:37 PM, rguenth at gcc dot gnu.org wrote: > Fixed on trunk sofar. Thanks. I've checked some variations of the original testcase -- everything works fine now.
Author: rguenth Date: Thu Jul 7 11:46:08 2016 New Revision: 238087 URL: https://gcc.gnu.org/viewcvs?rev=238087&root=gcc&view=rev Log: 2016-07-07 Richard Biener <rguenther@suse.de> Backport from mainline 2016-04-04 Richard Biener <rguenther@suse.de> PR rtl-optimization/70484 * rtl.h (canon_output_dependence): Declare. * alias.c (canon_output_dependence): New function. * dse.c (record_store): Use canon_output_dependence rather than canon_true_dependence. * gcc.dg/torture/pr70484.c: New testcase. 2016-06-08 Richard Biener <rguenther@suse.de> PR tree-optimization/71452 * tree-ssa.c (non_rewritable_lvalue_p): Make sure that the type used for the SSA rewrite has enough precision to cover the dynamic type of the location. * gcc.dg/torture/pr71452.c: New testcase. 2016-05-06 Richard Biener <rguenther@suse.de> PR middle-end/70931 * dwarf2out.c (native_encode_initializer): Skip zero-sized fields. * gfortran.dg/pr70931.f90: New testcase. 2016-03-01 Richard Biener <rguenther@suse.de> PR middle-end/70022 * fold-const.c (fold_indirect_ref_1): Fix range checking for vector BIT_FIELD_REF extract. * gcc.dg/pr70022.c: New testcase. Added: branches/gcc-4_9-branch/gcc/testsuite/g++.dg/torture/pr71452.C branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/pr70022.c branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/torture/pr70484.c branches/gcc-4_9-branch/gcc/testsuite/gcc.dg/torture/pr71452.c branches/gcc-4_9-branch/gcc/testsuite/gfortran.dg/pr70931.f90 Modified: branches/gcc-4_9-branch/gcc/ChangeLog branches/gcc-4_9-branch/gcc/alias.c branches/gcc-4_9-branch/gcc/dse.c branches/gcc-4_9-branch/gcc/dwarf2out.c branches/gcc-4_9-branch/gcc/fold-const.c branches/gcc-4_9-branch/gcc/rtl.h branches/gcc-4_9-branch/gcc/testsuite/ChangeLog branches/gcc-4_9-branch/gcc/tree-ssa.c
Fixed.