This occurs on both powerpc64 LE and BE. I am not sure about other architectures. /home/seurer/gcc/cpu2006b/bin/specinvoke -E -d /home/seurer/gcc/cpu2006b/benchspec/CPU2006/416.gamess/run/run_base_test_base_64.0000 -c 1 -e compare.err -o compare.stdout -f compare.cmd **************************************** Contents of exam29.err **************************************** Note: The following floating-point exceptions are signalling: IEEE_DIVIDE_BY_ZERO IEEE_UNDERFLOW_FLAG STOP IN ABRT **************************************** *** Miscompare of exam29.out; for details see /home/seurer/gcc/cpu2006b/benchspec/CPU2006/416.gamess/run/run_base_test_base_64.0000/exam29.out.mis Invalid run; unable to continue. Contents of /home/seurer/gcc/cpu2006b/benchspec/CPU2006/416.gamess/run/run_base_test_base_64.0000/exam29.out.mis 0452: MEMORY REQUIREMENTS FOR SEGMENTED MP2 TRANSFORMATION INTEGRAL TRANSFORMATION CANNOT ASSIGN A DEFINITE ORBITAL SYMMETRY TO MO 1 ^ 0453: MINIMUM= 52902 WORDS, USING 1 MOLECULAR ORBITAL PER PASS INTEGRAL TRANSFORMATION CANNOT ASSIGN A DEFINITE ORBITAL SYMMETRY TO MO 2 ^ 0454: MAXIMUM= 238032 WORDS, MAKING ONLY 1 INTEGRAL PASS INTEGRAL TRANSFORMATION CANNOT ASSIGN A DEFINITE ORBITAL SYMMETRY TO MO 3 ^ 0455: CHOOSING THE SEGMENTED MP2 TRANSFORMATION... INTEGRAL TRANSFORMATION CANNOT ASSIGN A DEFINITE ORBITAL SYMMETRY TO MO 4 ^ 0456: NUMBER OF MOS/PASS = 11 INTEGRAL TRANSFORMATION CANNOT ASSIGN A DEFINITE ORBITAL SYMMETRY TO MO 5 ^ 0457: NUMBER OF PASSES = 1 INTEGRAL TRANSFORMATION CANNOT ASSIGN A DEFINITE ORBITAL SYMMETRY TO MO 6 ^ 0458: MEMORY USED = 238032 WORDS. INTEGRAL TRANSFORMATION CANNOT ASSIGN A DEFINITE ORBITAL SYMMETRY TO MO 7 ^ 0459: PASS # 1 TOOK 0.00 SECONDS. INTEGRAL TRANSFORMATION CANNOT ASSIGN A DEFINITE ORBITAL SYMMETRY TO MO 8 ^ 0460: DONE WITH PARTIAL TRANSFORMATION (PQ|RS) TO (IA|JB) INTEGRAL TRANSFORMATION CANNOT ASSIGN A DEFINITE ORBITAL SYMMETRY TO MO 9 ^ 0461: RESULTS OF MOLLER-PLESSET 2ND ORDER CORRECTION ARE INTEGRAL TRANSFORMATION CANNOT ASSIGN A DEFINITE ORBITAL SYMMETRY TO MO 11 ^
Please just retry with r232559 or newer. *** This bug has been marked as a duplicate of bug 69336 ***
This still fails on AArch64 in exactly the same way with latest trunk - can someone reopen this? I don't seem to have the right permissions...
Reopening at Wilco's request
So - can you please bisect to a source file (just drop all others to -O0) at least?
This still fails on AArch64 in exactly the same way with latest trunk - can someone reopen this? I don't seem to have the right permissions... (In reply to Richard Biener from comment #4) > So - can you please bisect to a source file (just drop all others to -O0) at > least? Yes I am working on that. It's OK with O2, but starts failing with -O3/-Ofast.
This still fails on AArch64 in exactly the same way with latest trunk - can someone reopen this? I don't seem to have the right permissions... (In reply to Richard Biener from comment #4) > So - can you please bisect to a source file (just drop all others to -O0) at > least? It looks like the failure is in mp2.fppized.o. Compiling it with -O3 -fomit-frame-pointer -fno-aggressive-loop-optimizations -fno-inline causes it to fail the exam29 test with the recent tree-ssa-scopedtables.c changes, while it passes without them. The diff is quite large so it's hard to tell which function it is...
On Mon, 1 Feb 2016, wdijkstr at arm dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > --- Comment #6 from Wilco <wdijkstr at arm dot com> --- > This still fails on AArch64 in exactly the same way with latest trunk - can > someone reopen this? I don't seem to have the right permissions... > (In reply to Richard Biener from comment #4) > > So - can you please bisect to a source file (just drop all others to -O0) at > > least? > > It looks like the failure is in mp2.fppized.o. Compiling it with -O3 > -fomit-frame-pointer -fno-aggressive-loop-optimizations -fno-inline causes it > to fail the exam29 test with the recent tree-ssa-scopedtables.c changes, while > it passes without them. The diff is quite large so it's hard to tell which > function it is... Probably easier to look at a diff backing out the mem-ref hashing changes.
In a few functions GCC decides that the assignments in loops are redundant. The loops still execute but have their loads and stores removed. Eg. the first DO loop in MP2NRG should be: .L1027: add w7, w7, 1 add w8, w8, w10 cmp w7, w26 beq .L1026 .L1029: add w0, w11, w7 add x0, x2, x0, sxtw 3 ldr x1, [x0, -8] add x0, x2, x7, sxtw 3 str x1, [x0, -8] cmp w9, 0 ble .L1027 But with the scopedtables change it becomes: .L1027: add w2, w2, 1 cmp w2, w3 beq .L1026 .L1029: cmp w4, 0 ble .L1027
The loops get optimized away in dom2. The info this phase emits is hard to figure out, so it's not obvious why it thinks the array assignments are redundant (the array is used all over the place so clearly cannot be removed).
The stores are getting optimized out because equal_mem_array_ref_p considers equal pairs of MEM_REFS like fmcom.x[_168] and fmcom.x[_208] That is, a ARRAY_REF whose first operand is a COMPONENT_REF fmcom.x (of a VAR_DECL and a FIELD_DECL), and whose second operand is an SSA_NAME _168 or _208; I don't see anything obvious to suggest that they should be equal). get_ref_base_and_extent then returns base=fmcom, size=64, max_size=64 (so not a variable-sized access), and offset 0 :-(.
Bet we shouldn't use get_ref_base_and_extent comparisons if there is a variable ARRAY_REF index and the corresponding array is flexible array member or poor man's flexible array member.
Otherwise, we happily consider in struct S { char a[1]; int b; char c[1]; }; void foo (struct S *p, int i, int j) { p->c[i] ... p->c[j] } the above two as equivalent, while we should only consider those for p->a[i] and p->a[j], where it is only valid to use i == j == 0. If char c[2]; is used, then max_size would indicate variable access, but for poor man's flexible array member of size 1 array we need to avoid this, both during hashing and during equality comparison.
Seems get_ref_base_and_extent already has code for some cases of the flexible array members, but it has apparently some dead code in it that wasn't really meant to be dead (hint, this check is after a while (1) { switch (TREE_CODE (exp)) { ... } exp = TREE_OPERAND (exp, 0); } loop, thus break inside of the switch does not terminate the loop and the only way to get out of the loop is then goto done;, but that bypasses the stmt): --- gcc/tree-dfa.c.jj 2016-01-21 08:58:44.000000000 +0100 +++ gcc/tree-dfa.c 2016-02-04 11:18:40.061621702 +0100 @@ -588,6 +588,7 @@ get_ref_base_and_extent (tree exp, HOST_ exp = TREE_OPERAND (exp, 0); } + done: /* We need to deal with variable arrays ending structures. */ if (seen_variable_array_ref && maxsize != -1 @@ -597,7 +598,6 @@ get_ref_base_and_extent (tree exp, HOST_ == wi::to_offset (TYPE_SIZE (TREE_TYPE (exp)))))) maxsize = -1; - done: if (!wi::fits_shwi_p (bitsize) || wi::neg_p (bitsize)) { *poffset = 0; But I guess even this doesn't help, while it will help poor man's flexible array members in C/C++, in this Fortran case the problem is that there is COMMON /FMCOM/ X(1) in this TU, while COMMON /FMCOM / X(1000000) in some other TU, and get_ref_base_and_extent caps maxsize at DECL_SIZE - bit_offset (which is generally right, but Fortran COMMON is just weird). So, perhaps we want some flag on the Fortran COMMON decls that would be set on COMMON that ends with an array and would tell get_ref_base_and_extent (and other spots?) that accesses can be beyond end of the decl?
> But I guess even this doesn't help, while it will help poor man's flexible > array members in C/C++, in this Fortran case the problem is that there is > COMMON /FMCOM/ X(1) > in this TU, while > COMMON /FMCOM / X(1000000) > in some other TU, and get_ref_base_and_extent caps maxsize at > DECL_SIZE - bit_offset (which is generally right, but Fortran COMMON > is just weird). This invalid Fortran. > So, perhaps we want some flag on the Fortran COMMON decls that would be > set on COMMON that ends with an array and would tell get_ref_base_and_extent > (and other spots?) that accesses can be beyond end of the decl? Related to/ duplicate of pr44882?
Note the dead code is there since r204391. Perhaps we need goto done; vs. goto done2;, so that for MEM_REF/TARGET_MEM_REF we bypass this check?
(In reply to Dominique d'Humieres from comment #14) > > But I guess even this doesn't help, while it will help poor man's flexible > > array members in C/C++, in this Fortran case the problem is that there is > > COMMON /FMCOM/ X(1) > > in this TU, while > > COMMON /FMCOM / X(1000000) > > in some other TU, and get_ref_base_and_extent caps maxsize at > > DECL_SIZE - bit_offset (which is generally right, but Fortran COMMON > > is just weird). > > This invalid Fortran. Yes, but this is SPEC2k6, so we need some workaround or workaround option or something for that I'm afraid. > > So, perhaps we want some flag on the Fortran COMMON decls that would be > > set on COMMON that ends with an array and would tell get_ref_base_and_extent > > (and other spots?) that accesses can be beyond end of the decl? > > Related to/ duplicate of pr44882?
(In reply to Jakub Jelinek from comment #16) > (In reply to Dominique d'Humieres from comment #14) > > > But I guess even this doesn't help, while it will help poor man's flexible > > > array members in C/C++, in this Fortran case the problem is that there is > > > COMMON /FMCOM/ X(1) > > > in this TU, while > > > COMMON /FMCOM / X(1000000) > > > in some other TU, and get_ref_base_and_extent caps maxsize at > > > DECL_SIZE - bit_offset (which is generally right, but Fortran COMMON > > > is just weird). > > > > This invalid Fortran. > > Yes, but this is SPEC2k6, so we need some workaround or workaround option or > something for that I'm afraid. No, tonto had this same crap and we required patching it. That said, all users of get_ref_base_and_extent are effected here (SRA, FRE, PRE, DSE), so this isn't sth new - it's just that now we catch this as well after loop opts did enough massaging to the IL to expose it. > > > So, perhaps we want some flag on the Fortran COMMON decls that would be > > > set on COMMON that ends with an array and would tell get_ref_base_and_extent > > > (and other spots?) that accesses can be beyond end of the decl? > > > > Related to/ duplicate of pr44882?
(In reply to Jakub Jelinek from comment #13) > Seems get_ref_base_and_extent already has code for some cases of the > flexible array members, but it has apparently some dead code in it that > wasn't really meant to be dead (hint, this check is after a while (1) { > switch (TREE_CODE (exp)) { ... } exp = TREE_OPERAND (exp, 0); } loop, thus > break inside of the switch does not terminate the loop and the only way to > get out of the loop is then goto done;, but that bypasses the stmt): > > --- gcc/tree-dfa.c.jj 2016-01-21 08:58:44.000000000 +0100 > +++ gcc/tree-dfa.c 2016-02-04 11:18:40.061621702 +0100 > @@ -588,6 +588,7 @@ get_ref_base_and_extent (tree exp, HOST_ > exp = TREE_OPERAND (exp, 0); > } > > + done: > /* We need to deal with variable arrays ending structures. */ > if (seen_variable_array_ref > && maxsize != -1 > @@ -597,7 +598,6 @@ get_ref_base_and_extent (tree exp, HOST_ > == wi::to_offset (TYPE_SIZE (TREE_TYPE (exp)))))) > maxsize = -1; > > - done: > if (!wi::fits_shwi_p (bitsize) || wi::neg_p (bitsize)) > { > *poffset = 0; > > But I guess even this doesn't help, while it will help poor man's flexible > array members in C/C++, get_ref_base_and_extent is supposed to properly detect trailing arrays in case the size cannot be adjusted by the underlying decls size. You don't mention which goto done you think is broken here, the only relvant one is that from the [TARGET_]MEM_REF case which sets maxsize to -1 itself. I don't remember what the seen_variable_array_ref case after the while (1) loop body was for. Care to build a C testcase that goes wrong and that your patch would fix? > in this Fortran case the problem is that there is > COMMON /FMCOM/ X(1) > in this TU, while > COMMON /FMCOM / X(1000000) > in some other TU, and get_ref_base_and_extent caps maxsize at DECL_SIZE - > bit_offset (which is generally right, but Fortran COMMON is just weird). > So, perhaps we want some flag on the Fortran COMMON decls that would be set > on COMMON that ends with an array and would tell get_ref_base_and_extent > (and other spots?) that accesses can be beyond end of the decl? We certainly don't want to do that, it will pessimize all code that wants max_size != -1 for commons.
Ok, so after discussions on IRC, I'm going to just remove the unreachable code from get_ref_base_and_extent, and the conclusion is that this is a 416.gamess bug. If the Fortran folks want to provide a workaround on the Fortran FE side, it is possible, otherwise let's close this as INVALID.
Hmmm, hang on. In unport.fppized.f, shouldn't we be using the 'F2C/GCC COMPILER ON PC RUNNING UNIX (LINUX,BSD386,ETC)' version? In which case X has size (1) everywhere?
On Thu, 4 Feb 2016, alalaw01 at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > --- Comment #20 from alalaw01 at gcc dot gnu.org --- > Hmmm, hang on. In unport.fppized.f, shouldn't we be using the 'F2C/GCC COMPILER > ON PC RUNNING UNIX (LINUX,BSD386,ETC)' version? In which case X has size (1) > everywhere? It's called DM elsewhere, just look for /FMCOM/ and compare its contents
This is a dup, our testers have local fixes since that. *** This bug has been marked as a duplicate of bug 53086 ***
Well, this one is not fixed by -fno-aggressive-loop-optimizations.
FIXED??
(In reply to alalaw01 from comment #23) > Well, this one is not fixed by -fno-aggressive-loop-optimizations. No, that just disabled one symptom of the issue at that point in time. Fixing the issue also fixes this occurance (well, I hope so ;))
Author: jakub Date: Thu Feb 4 22:15:33 2016 New Revision: 233153 URL: https://gcc.gnu.org/viewcvs?rev=233153&root=gcc&view=rev Log: PR fortran/69368 * tree-dfa.c (get_ref_base_and_extent): Remove unreachable code. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-dfa.c
(In reply to Richard Biener from comment #25) > (In reply to alalaw01 from comment #23) > > Well, this one is not fixed by -fno-aggressive-loop-optimizations. > > No, that just disabled one symptom of the issue at that point in time. > Fixing the issue also fixes this occurance (well, I hope so ;)) So by "fixing the issue" - we mean, making --std=legacy prevent this (as although against the SPEC, colleagues with more FORTRAN knowledge than I suggest this is common)? SPEC seem to be saying they will not change the source: https://www.spec.org/cpu2006/Docs/faq.html#Run.05 As Jakub suggested in comment #13: > So, perhaps we want some flag on the Fortran COMMON decls that would be set on > COMMON that ends with an array and would tell get_ref_base_and_extent (and > other spots?) that accesses can be beyond end of the decl? but only if --std=legacy ? ? ? Should I raise a new bug for this, as both this and 53068 are CLOSED?
On Fri, 5 Feb 2016, alalaw01 at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > alalaw01 at gcc dot gnu.org changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > Resolution|INVALID |FIXED > > --- Comment #27 from alalaw01 at gcc dot gnu.org --- > (In reply to Richard Biener from comment #25) > > (In reply to alalaw01 from comment #23) > > > Well, this one is not fixed by -fno-aggressive-loop-optimizations. > > > > No, that just disabled one symptom of the issue at that point in time. > > Fixing the issue also fixes this occurance (well, I hope so ;)) > > So by "fixing the issue" - we mean, making --std=legacy prevent this (as > although against the SPEC, colleagues with more FORTRAN knowledge than I > suggest this is common)? SPEC seem to be saying they will not change the > source: https://www.spec.org/cpu2006/Docs/faq.html#Run.05 > > > As Jakub suggested in comment #13: > > > So, perhaps we want some flag on the Fortran COMMON decls that would be set on > COMMON that ends with an array and would tell get_ref_base_and_extent (and > > other spots?) that accesses can be beyond end of the decl? > > but only if --std=legacy ? ? ? > > Should I raise a new bug for this, as both this and 53068 are CLOSED? I think this has been discussed in some other dup already and the Fortran FE folks disagreed (it was never "legal", not even in F77). I also don't see how it can be a FE only fix. Possibly we can implemnet a middle-end switch that tells us that the size of commons is not to be trusted. The FE could then set that flag with -std=legacy. You can, after all, "simulate" the very same failure with C.
(In reply to rguenther@suse.de from comment #28) > On Fri, 5 Feb 2016, alalaw01 at gcc dot gnu.org wrote: > > Should I raise a new bug for this, as both this and 53068 are CLOSED? > > I think this has been discussed in some other dup already and > the Fortran FE folks disagreed (it was never "legal", not even in F77). > > I also don't see how it can be a FE only fix. Possibly we can > implemnet a middle-end switch that tells us that the size of commons > is not to be trusted. The FE could then set that flag with -std=legacy. > > You can, after all, "simulate" the very same failure with C. Isn't there already a special exception for C (array of size 1 at end of structure)? The same exception could be enabled with -std=legacy. You'd only need to do extra FE work if you wanted to just do this for COMMON, but that seems hardly worth the extra effort - how much gain do we really get from the array size 1 optimization apart from repeatedly breaking SPEC benchmarks in various ways? Disabling it will likely remove the need for -fno-aggressive-loop-optimizations as well.
(In reply to Wilco from comment #29) > (In reply to rguenther@suse.de from comment #28) > > On Fri, 5 Feb 2016, alalaw01 at gcc dot gnu.org wrote: > > > > Should I raise a new bug for this, as both this and 53068 are CLOSED? > > > > I think this has been discussed in some other dup already and > > the Fortran FE folks disagreed (it was never "legal", not even in F77). > > > > I also don't see how it can be a FE only fix. Possibly we can > > implemnet a middle-end switch that tells us that the size of commons > > is not to be trusted. The FE could then set that flag with -std=legacy. > > > > You can, after all, "simulate" the very same failure with C. > > Isn't there already a special exception for C (array of size 1 at end of > structure)? Yes, the exception is for flexible array members and what people commonly use as flexible array member replacement (basically [0] sized arrays and all other arrays at the end of structures/unions, unless followed by some other field in outer structure or by another array member. But, all those exceptions are for the case how you can legally use flexible array members, i.e. heap allocated objects (or mmap etc.), so pretty much where the base is a pointer dereference. If the base is a symbol, the only supported case is the GNU extension of initializing flexible array members. Otherwise, if you have size of some decl 32 bytes, trying to access 32th and following byte is considered undefined. And what Richi suggests, some new switch which would allow treating DECL_COMMON bases more conservatively, expecting that if you have a flexible array member or something like that at the end of a DECL_COMMON decl that in (invalid) program it might be unified with a larger DECL_COMMON from other TU.
On Fri, 5 Feb 2016, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > --- Comment #30 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > (In reply to Wilco from comment #29) > > (In reply to rguenther@suse.de from comment #28) > > > On Fri, 5 Feb 2016, alalaw01 at gcc dot gnu.org wrote: > > > > > > Should I raise a new bug for this, as both this and 53068 are CLOSED? > > > > > > I think this has been discussed in some other dup already and > > > the Fortran FE folks disagreed (it was never "legal", not even in F77). > > > > > > I also don't see how it can be a FE only fix. Possibly we can > > > implemnet a middle-end switch that tells us that the size of commons > > > is not to be trusted. The FE could then set that flag with -std=legacy. > > > > > > You can, after all, "simulate" the very same failure with C. > > > > Isn't there already a special exception for C (array of size 1 at end of > > structure)? > > Yes, the exception is for flexible array members and what people commonly use > as flexible array member replacement (basically [0] sized arrays and all other > arrays at the end of structures/unions, unless followed by some other field in > outer structure or by another array member. > But, all those exceptions are for the case how you can legally use flexible > array members, i.e. heap allocated objects (or mmap etc.), so pretty much where > the base is a pointer dereference. If the base is a symbol, the only supported > case is the GNU extension of initializing flexible array members. > Otherwise, if you have size of some decl 32 bytes, trying to access 32th and > following byte is considered undefined. > > And what Richi suggests, some new switch which would allow treating DECL_COMMON > bases more conservatively, expecting that if you have a flexible array member > or something like that at the end of a DECL_COMMON decl that in (invalid) > program it might be unified with a larger DECL_COMMON from other TU. Yes. Note there isn't really a special-casing of [1] sized arrays. It just happens to fall out of the case where maxsize ends up equal to size. Thus a "fix" for the case where treating a[i] as a[0] is the issue would be Index: gcc/tree-dfa.c =================================================================== --- gcc/tree-dfa.c (revision 233172) +++ gcc/tree-dfa.c (working copy) @@ -617,7 +617,11 @@ get_ref_base_and_extent (tree exp, HOST_ if (maxsize == -1 && DECL_SIZE (exp) && TREE_CODE (DECL_SIZE (exp)) == INTEGER_CST) - maxsize = wi::to_offset (DECL_SIZE (exp)) - bit_offset; + { + maxsize = wi::to_offset (DECL_SIZE (exp)) - bit_offset; + if (maxsize == size) + maxsize = -1; + } } else if (CONSTANT_CLASS_P (exp)) { but that wouldn't fix the aggressive-loop optimization issue as that is _not_ looking at DECL_SIZE but at the array types domain. Fortran has a union/record type with one array element of size one. A Fortran FE fix for that would be to make that array have unknown size but still somehow force a size to the decl (the FE doesn't know there is another unit that also defines the decl but with some larger size). We have a similar issue with deriving alignment from DECL_COMMONs where appearantly the ELF spec doesn't say anything here and thus we give up and can't re-align global DECL_COMMONs for better vectorization for example.
(In reply to rguenther@suse.de from comment #31) > > Thus a "fix" for the case where treating a[i] as a[0] is the issue > would be > > Index: gcc/tree-dfa.c > =================================================================== > --- gcc/tree-dfa.c (revision 233172) > +++ gcc/tree-dfa.c (working copy) > @@ -617,7 +617,11 @@ get_ref_base_and_extent (tree exp, HOST_ > if (maxsize == -1 > && DECL_SIZE (exp) > && TREE_CODE (DECL_SIZE (exp)) == INTEGER_CST) > - maxsize = wi::to_offset (DECL_SIZE (exp)) - bit_offset; > + { > + maxsize = wi::to_offset (DECL_SIZE (exp)) - bit_offset; > + if (maxsize == size) > + maxsize = -1; > + } > } > else if (CONSTANT_CLASS_P (exp)) > { Maybe if we only did that for DECL_COMMONs if -std=legacy was in force? Tho as you say: > but that wouldn't fix the aggressive-loop optimization issue as that is > _not_ looking at DECL_SIZE but at the array types domain. I wonder if we can't get both places looking at the same thing (DECL_SIZE or array type domain), but I haven't looked into that at all.
(In reply to rguenther@suse.de from comment #31) > > Thus a "fix" for the case where treating a[i] as a[0] is the issue > would be > > Index: gcc/tree-dfa.c > =================================================================== > --- gcc/tree-dfa.c (revision 233172) > +++ gcc/tree-dfa.c (working copy) > @@ -617,7 +617,11 @@ get_ref_base_and_extent (tree exp, HOST_ > if (maxsize == -1 > && DECL_SIZE (exp) > && TREE_CODE (DECL_SIZE (exp)) == INTEGER_CST) > - maxsize = wi::to_offset (DECL_SIZE (exp)) - bit_offset; > + { > + maxsize = wi::to_offset (DECL_SIZE (exp)) - bit_offset; > + if (maxsize == size) > + maxsize = -1; > + } > } > else if (CONSTANT_CLASS_P (exp)) > { So is there a case where we want this for C ? If I declare a struct with a VLA, and access it through a pointer - GCC recognizes the VLA idiom and keeps the accesses. If I access it from a decl, yes we optimize away the out-of-bounds accesses (in FRE, long before we reach the tree-ssa-scopedtables changes). So OK, if I access it from a extern or __attribute__((weak) decl, which I then get the linker to replace with a bigger decl, then I get "wrong" code (it ignores the extra elements in the bigger decl) - but I'd say that was invalid code. So if this is Fortran-only, we probably have to hook off --std=legacy, right?
On February 8, 2016 4:49:20 PM GMT+01:00, "alalaw01 at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> wrote: >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > >alalaw01 at gcc dot gnu.org changed: > > What |Removed |Added >---------------------------------------------------------------------------- > Status|RESOLVED |REOPENED > Resolution|FIXED |--- > >--- Comment #33 from alalaw01 at gcc dot gnu.org --- >(In reply to rguenther@suse.de from comment #31) >> >> Thus a "fix" for the case where treating a[i] as a[0] is the issue >> would be >> >> Index: gcc/tree-dfa.c >> =================================================================== >> --- gcc/tree-dfa.c (revision 233172) >> +++ gcc/tree-dfa.c (working copy) >> @@ -617,7 +617,11 @@ get_ref_base_and_extent (tree exp, HOST_ >> if (maxsize == -1 >> && DECL_SIZE (exp) >> && TREE_CODE (DECL_SIZE (exp)) == INTEGER_CST) >> - maxsize = wi::to_offset (DECL_SIZE (exp)) - bit_offset; >> + { >> + maxsize = wi::to_offset (DECL_SIZE (exp)) - bit_offset; >> + if (maxsize == size) >> + maxsize = -1; >> + } >> } >> else if (CONSTANT_CLASS_P (exp)) >> { > >So is there a case where we want this for C ? > >If I declare a struct with a VLA, and access it through a pointer - GCC >recognizes the VLA idiom and keeps the accesses. If I access it from a >decl, >yes we optimize away the out-of-bounds accesses (in FRE, long before we >reach >the tree-ssa-scopedtables changes). So OK, if I access it from a extern >or >__attribute__((weak) decl, which I then get the linker to replace with >a bigger >decl, then I get "wrong" code (it ignores the extra elements in the >bigger >decl) - but I'd say that was invalid code. > >So if this is Fortran-only, we probably have to hook off --std=legacy, >right? Yes.
(In reply to rguenther@suse.de from comment #34) > >So if this is Fortran-only, we probably have to hook off --std=legacy, > >right? > > Yes. Ouch. This is not really in line to what std=legacy does. Maybe using a separate Fortran-only option, something like -fbroken-common-blocks, would be better. On the other hand, it might be a good idea to warn if such stores are removed. I suspect that such a warning will catch many bugs in the existing codebase, if it is found in SPEC...
As Richard said, you can do similar (invalid too) stuff in C too, say: struct S { int a[10000]; } s; in one TU and struct S { int a[1]; } s; int foo (int x) { return s.a[x]; } int bar (int x) { return s.a[1 + x] + s.a[0] + s.a[x]; } GCC 5 would compile it to what the author might have meant, while GCC 6 will optimize bar into s.a[0] * 3;
(In reply to Jakub Jelinek from comment #36) > As Richard said, you can do similar (invalid too) stuff in C too, say: > struct S { int a[10000]; } s; > in one TU and > struct S { int a[1]; } s; > > int > foo (int x) > { > return s.a[x]; > } > > int > bar (int x) > { > return s.a[1 + x] + s.a[0] + s.a[x]; > } > > GCC 5 would compile it to what the author might have meant, while GCC 6 will > optimize bar into s.a[0] * 3; Yes, this was what I meant in comment #33. The question is, do we care? (Or, do we only care in the FORTRAN case?) If so, then we presumably want a -fbroken-common-blocks (or something!) that is not FE-specific.
On February 9, 2016 5:28:07 PM GMT+01:00, "jakub at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> wrote: >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > >--- Comment #36 from Jakub Jelinek <jakub at gcc dot gnu.org> --- >As Richard said, you can do similar (invalid too) stuff in C too, say: >struct S { int a[10000]; } s; >in one TU and >struct S { int a[1]; } s; > >int >foo (int x) >{ > return s.a[x]; >} > >int >bar (int x) >{ > return s.a[1 + x] + s.a[0] + s.a[x]; >} > >GCC 5 would compile it to what the author might have meant, while GCC 6 >will >optimize bar into s.a[0] * 3; I think already GCC 4.8 would do that. New in GCC 6 is that DOM will now know this trick and thus you get this after loop opts
Created attachment 37726 [details] Proposed patch (without flag). Here's a prototype patch, that sets TYPE_SIZE to NULL_TREE but leaves DECL_SIZE intact. For the moment I'm applying this universally, rather than gating under a flag, to ease testing check-fortran. Only gfortran.dg/gomp/appendix-a/a.24.1.f90 fails; in practice I think it's OK just to not use the new code in conjunction with -fopenmp. On AArch64, it fixes the 416.gamess issue, and allows compiling 416.gamess without the -fno-aggressive-loop-optimizations previously required. Also bootstraps and passes check-gcc check-fortran check-g++, on aarch64 and x86_64, except as noted above. I expect to add a Fortran-only flag to gate the trans-common.c changes before taking this to gcc-patches@ . The worry is that while many cases in the mid-end were happy with a null TYPE_SIZE, I still had to patch up a couple, so the worry is I might not have got them all. (Indeed, omp-low.c had too many!) I'm not sure this is any worse than adding a new flag to the decl (indicating that the DECL_SIZE is not to be trusted) and then trying to find all the cases where the DECL_SIZE is wrongly relied upon - with the latter approach, the compiler would generate invalid code, rather than "failing fast". Thoughts welcome!
Do you have a reduced test case of the Fortran code we can look at? I am concerned that you are changing things to accommodate invalid Fortran. I don't have 416.gamess to look at. I do see your suggested patch touches the Fortran frontend. Maybe it needs to be put behind -std=legacy? I assume the "(without flag)" means you intend to do something like that?
(In reply to Jerry DeLisle from comment #40) > Do you have a reduced test case of the Fortran code we can look at? See comment 13/14, the same common array is declared with different sizes in various places. > I am concerned that you are changing things to accommodate invalid Fortran. > I don't have 416.gamess to look at. I do see your suggested patch touches > the Fortran frontend. Since SPEC does not want to change the invalid Fortran, GCC will have to accomodate for it somehow. > Maybe it needs to be put behind -std=legacy? > > I assume the "(without flag)" means you intend to do something like that? Yes, but it was suggested that -std=legacy wasn't the right flag in comment 35...
(In reply to Wilco from comment #41) > Yes, but it was suggested that -std=legacy wasn't the right flag in comment > 35... What -std=legacy mostly does is to allow extensioms, not to accept code which was always broken. This is is why I suggested some other option. I do not feel very strongly about this, though.
Yeah, I plan to add a fortran-specific option for this, it's easy enough, but I can't run the gfortran testsuite with that, because there are lots of C files in there too, for which the compiler doesn't accept the option... I'm having trouble writing a testcase though. My subroutine with IMPLICIT DOUBLE PRECISION (X) COMMON /MYCOMMON / X(1) produces "mycommon.x" a COMPONENT_REF, but with "mycommon" being a MEM_REF, which requires only the hunk to tree-dfa.c to handle correctly; whereas in SPEC2006, what looks to me to be equivalent FORTRAN, ends up with "mycommon" being a VAR_DECL, which requires the much-bigger patch to the fortran FE... I've very little fortran experience here, any tips? Thanks, Alan
I don't have access to SPEC, so I can only guess... Is there maybe an equivalence involved, something like COMMON /FOO/ X(1) EQUIVALENCE (X,Y) ?
Consider e.g. one source file INTEGER FOO COMMON /BLK/ K(64) DO I=1,64 K(I)=1 END DO IF (FOO(5,12).NE.51) CALL ABORT END and another one: FUNCTION FOO(I, J) COMMON /BLK/ K(1) FOO = K(I) + K(J) + K(2*I) + K(2*J) END FUNCTION If the second source file is compiled with -O2, it will return "wrong" value, as if all accesses to K were using the same index (because the only valid index is 1). If you compile with -O2 -fno-tree-dominator-opts, then it "works".
On Thu, 18 Feb 2016, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > --- Comment #45 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > Consider e.g. one source file > INTEGER FOO > COMMON /BLK/ K(64) > DO I=1,64 > K(I)=1 > END DO > IF (FOO(5,12).NE.51) CALL ABORT > END > and another one: > FUNCTION FOO(I, J) > COMMON /BLK/ K(1) > FOO = K(I) + K(J) + K(2*I) + K(2*J) > END FUNCTION > If the second source file is compiled with -O2, it will return "wrong" value, > as if all accesses to K were using the same index (because the only valid index > is 1). If you compile with -O2 -fno-tree-dominator-opts, then it "works". Which is odd since FRE already should optimize it that way. Richard.
(In reply to rguenther@suse.de from comment #46) > Which is odd since FRE already should optimize it that way. Bet FRE punts on that because it considers it poor man's flexible array member. But, you know FRE better than me ;)
On Thu, 18 Feb 2016, rguenther at suse dot de wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > --- Comment #46 from rguenther at suse dot de <rguenther at suse dot de> --- > On Thu, 18 Feb 2016, jakub at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > > > --- Comment #45 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > > Consider e.g. one source file > > INTEGER FOO > > COMMON /BLK/ K(64) > > DO I=1,64 > > K(I)=1 > > END DO > > IF (FOO(5,12).NE.51) CALL ABORT > > END > > and another one: > > FUNCTION FOO(I, J) > > COMMON /BLK/ K(1) > > FOO = K(I) + K(J) + K(2*I) + K(2*J) > > END FUNCTION > > If the second source file is compiled with -O2, it will return "wrong" value, > > as if all accesses to K were using the same index (because the only valid index > > is 1). If you compile with -O2 -fno-tree-dominator-opts, then it "works". > > Which is odd since FRE already should optimize it that way. Ok, it does not. Supposedly more complicated situations are needed to trigger sth wrong in FRE.
> Yeah, I plan to add a fortran-specific option for this, it's easy enough, I don't see the point to add yet another option just because "SPEC does not want to change the invalid Fortran". I think SPEC should be run with the option(s) causing the problem disabled. > but I can't run the gfortran testsuite with that, because there are lots > of C files in there too, for which the compiler doesn't accept the option... You can use something along the following --- ../_clean/gcc/testsuite/lib/prune.exp 2016-01-04 19:51:06.000000000 +0100 +++ gcc/testsuite/lib/prune.exp 2016-01-04 20:30:13.000000000 +0100 @@ -51,6 +51,9 @@ proc prune_gcc_output { text } { regsub -all "(^|\n)\[^\n\]*: Additional NOP may be necessary to workaround Itanium processor A/B step errata" $text "" text regsub -all "(^|\n)\[^\n*\]*: Assembler messages:\[^\n\]*" $text "" text + # Ignore warning for gfortran options passed to the C compilers. + regsub -all "(^|\n)cc1(plus)?: warning: command line option .-f\[^\"\]*. is valid for Fortran but not for C\[^\n\]*" $text "" text + # Ignore harmless VTA note. regsub -all "(^|\n)\[^\n\]*: note: variable tracking size limit exceeded with -fvar-tracking-assignments, retrying without\[^\n\]*" $text "" text
Has my fix for 416.gamess source been applied when the failure happened?
On Thu, 18 Feb 2016, hjl.tools at gmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > H.J. Lu <hjl.tools at gmail dot com> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |hjl.tools at gmail dot com > > --- Comment #50 from H.J. Lu <hjl.tools at gmail dot com> --- > Has my fix for 416.gamess source been applied when the failure happened? No.
(In reply to rguenther@suse.de from comment #51) > > --- Comment #50 from H.J. Lu <hjl.tools at gmail dot com> --- > > Has my fix for 416.gamess source been applied when the failure happened? > > No. So, there is nothing to fix in GCC? Why isn't this bug closed as invalid?
(In reply to Thomas Koenig from comment #44) > I don't have access to SPEC, so I can only guess... Is there maybe an equivalence involved, something like Turns out the COMMON is accessed via a MEM_REF in a loop, or as a VAR_DECL inside. Go figure! :) (In reply to Dominique d'Humieres from comment #49) > I don't see the point to add yet another option just because "SPEC does not > want to change the invalid Fortran". I think SPEC should be run with the > option(s) causing the problem disabled. Anecdotally I hear from Fortran-using colleagues this may occur in other places too. Moreover, the list of phases using get_ref_base_and_extent, is long; we could end up compiling with an ever-growing -fno-this -fno-that as more and more phases make use of the "bad" analysis results (that is correct by the language spec after all). In this case, there are a few other equivalences found due to the tree-ssa-scopedtables.c changes, that we'd lose with -fno-tree-dominator-opts, too. (In reply to H.J. Lu from comment #52) > >So, there is nothing to fix in GCC? Why isn't this bug closed as invalid? Not everyone wants to patch SPEC sources.
(In reply to alalaw01 from comment #53) > > > >So, there is nothing to fix in GCC? Why isn't this bug closed as invalid? > > Not everyone wants to patch SPEC sources. If the SPEC source is invalid according to Fortran standard, why should we change GCC for it? This isn't the only invalid source in SPEC CPU 2006. There are at least 3.
(In reply to H.J. Lu from comment #54) > (In reply to alalaw01 from comment #53) > > > > > >So, there is nothing to fix in GCC? Why isn't this bug closed as invalid? > > > > Not everyone wants to patch SPEC sources. > > If the SPEC source is invalid according to Fortran standard, why should > we change GCC for it? This isn't the only invalid source in SPEC CPU > 2006. There are at least 3. The Fortran code is obviously legacy code. I think handling it behind -std=legacy is the correct flag. ----snip---- > > > > --- Comment #45 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > > Consider e.g. one source file > > INTEGER FOO > > COMMON /BLK/ K(64) > > DO I=1,64 > > K(I)=1 !!!!!!! typo should be =I > > END DO > > IF (FOO(5,12).NE.51) CALL ABORT > > END > > and another one: > > FUNCTION FOO(I, J) > > COMMON /BLK/ K(1) > > FOO = K(I) + K(J) + K(2*I) + K(2*J) > > END FUNCTION Note: Typo in the example The correct solution to me is to detect the unequal sizes and convert the COMMON /BLK/ K(1) to the larger size, ie K(64). Currently we accept this code when it is all in one file and give a warning only. Playing with the test case in two files I see the problem is only with K(1), K(2) or anything else gives correct results when optimized -O2, O3 etc. Based on what I am seeing, I see no need for a flag at all, just a warning and create the block to the size of the largest specified.
> > FUNCTION FOO(I, J) > > COMMON /BLK/ K(1) > > FOO = K(I) + K(J) + K(2*I) + K(2*J) > > END FUNCTION This piece of code is also invalid (out of bound access to K): K(I) + K(J) is valid iff I=J=1, but K(2*I) + K(2*J) is always accessing elements outside the array K(1). > The correct solution to me is to detect the unequal sizes and convert > the COMMON /BLK/ K(1) to the larger size, ie K(64). This can be done at link time only if the two TU are in different files.
(In reply to Dominique d'Humieres from comment #56) > > > FUNCTION FOO(I, J) > > > COMMON /BLK/ K(1) > > > FOO = K(I) + K(J) + K(2*I) + K(2*J) > > > END FUNCTION > > This piece of code is also invalid (out of bound access to K): K(I) + K(J) > is valid iff I=J=1, but K(2*I) + K(2*J) is always accessing elements outside > the array K(1). > > > The correct solution to me is to detect the unequal sizes and convert > > the COMMON /BLK/ K(1) to the larger size, ie K(64). > > This can be done at link time only if the two TU are in different files. There are two kinds of people to consider here. One is sPEC. I think they are wrong in their policy, but I doubt we are going to change it. The other are normal users, who have an existing code base relying on such behavior. What can we do for them? I think it would be appropriate to warn (with -Wall) if code - uses this idiom (one-element array as trailing common block member) - uses a non-constant expression to access an element while telling the user how to get around this issue. If we can prove that the access is out of bounds, then we could also issue an error. This appears to be a frequent idiom in old-style Fortran, which is not always easy to fix in legacy code. We would do a large part of our user base a disservice if there was no way to get around this.
On February 19, 2016 6:48:38 AM GMT+01:00, "tkoenig at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> wrote: >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > >--- Comment #57 from Thomas Koenig <tkoenig at gcc dot gnu.org> --- >(In reply to Dominique d'Humieres from comment #56) >> > > FUNCTION FOO(I, J) >> > > COMMON /BLK/ K(1) >> > > FOO = K(I) + K(J) + K(2*I) + K(2*J) >> > > END FUNCTION >> >> This piece of code is also invalid (out of bound access to K): K(I) + >K(J) >> is valid iff I=J=1, but K(2*I) + K(2*J) is always accessing elements >outside >> the array K(1). >> >> > The correct solution to me is to detect the unequal sizes and >convert >> > the COMMON /BLK/ K(1) to the larger size, ie K(64). >> >> This can be done at link time only if the two TU are in different >files. > >There are two kinds of people to consider here. > >One is sPEC. I think they are wrong in their policy, but I doubt we >are >going to change it. > >The other are normal users, who have an existing code base relying >on such behavior. What can we do for them? > >I think it would be appropriate to warn (with -Wall) if code > >- uses this idiom (one-element array as trailing common block member) > >- uses a non-constant expression to access an element > >while telling the user how to get around this issue. > >If we can prove that the access is out of bounds, then we >could also issue an error. > >This appears to be a frequent idiom in old-style Fortran, which is >not always easy to fix in legacy code. We would do a large part >of our user base a disservice if there was no way to get around this. We already warn about mismatches sizes at LTO link time
> We already warn about mismatches sizes at LTO link time Confirmed [Book15] f90/bug% gfc -c -O2 pr69368_a.f90 -flto [Book15] f90/bug% gfc -O2 pr69368_a.o pr69368_b.f90 -flto pr69368_a.f90:3:0: warning: type of 'blk' does not match original declaration [-Wlto-type-mismatch] COMMON /BLK/ K(1) pr69368_b.f90:3:0: note: 'blk' was previously declared here COMMON /BLK/ K(64) pr69368_b.f90:3:0: note: code may be misoptimized unless -fno-strict-aliasing is used and the executable gives the expected output. IMO the second note is bogus (pr68717). If I replace COMMON /BLK/ K(1) with COMMON /BLK/ K(2) the executable gives the expected output also for all the compiling options I have tried. AFAICT the code (invalid for any value of I and J) FUNCTION FOO(I, J) COMMON /BLK/ K(1) FOO = K(I) + K(J) + K(2*I) + K(2*J) END FUNCTION is optimized as FOO = K(I) + K(I) + K(I) + K(I) Although I know that a compiler can do whatever it deems suitable with invalid code, I don't understand the rationale of the above "optimization": it is as invalid as the original code. Another thing I don't understand is that the following code is not "optimized" while as invalid as the one above: INTEGER FUNCTION FOO(I, J, K) INTEGER K(1) FOO = K(I) + K(J) + K(2*I) + K(2*J) END FUNCTION INTEGER FOO INTEGER K(64) DO I=1,64 K(I)=I END DO IF (FOO(5,12,K).NE.51) CALL ABORT END
(In reply to H.J. Lu from comment #54) > (In reply to alalaw01 from comment #53) > > > > > >So, there is nothing to fix in GCC? Why isn't this bug closed as invalid? > > > > Not everyone wants to patch SPEC sources. > > If the SPEC source is invalid according to Fortran standard, why should > we change GCC for it? This isn't the only invalid source in SPEC CPU > 2006. There are at least 3. BTW, how to fix this in SPEC? I also noticed this problem with 416.gamess (but did not report it because my knowledge of Fortran is not enough for correct triage of this bug). In my build the only 'problematic' file is 'aldeci.fppized.f'.
Another oddity of the "optimization" introduced by r232508 is the following test (borrowed from https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01356.html) [Book15] f90/bug% cat pr69368_1_a.f90 SUBROUTINE FOO IMPLICIT DOUBLE PRECISION (X) INTEGER J ! COMMON /MYCOMMON / X(1031) COMMON /MYCOMMON / X(1) DO 10 J=1,1024 X(J+1)=X(J+7) 10 CONTINUE RETURN END [Book15] f90/bug% cat pr69368_1_b.f90 IMPLICIT DOUBLE PRECISION (X) COMMON /MYCOMMON/ X(1031) DO I=1,1031 X(I)=I END DO call FOO() print *, X(1025) IF (X(1025).NE.1031) CALL ABORT END The test fails if pr69368_1_a.f90 is compiled with -O2 (FOO is compiled as a simple RETURN). Now if I replace COMMON /MYCOMMON / X(1) with COMMON /MYCOMMON / X(2) I get at compile tile with -O2 pr69368_1_a_1.f90:7:0: X(J+1)=X(J+7) Warning: iteration 1 invokes undefined behavior [-Waggressive-loop-optimizations] pr69368_1_a_1.f90:6:0: DO 10 J=1,1024 note: within this loop and the test succeeds. Any reason why there is no warning for "COMMON /MYCOMMON / X(1)"? Also the following invalid code [Book15] f90/bug% cat pr69368_1_c.f90 SUBROUTINE FOO(X) IMPLICIT DOUBLE PRECISION (X) INTEGER J DIMENSION X(1) DO 10 J=1,1024 X(J+1)=X(J+7) 10 CONTINUE RETURN END [Book15] f90/bug% cat pr69368_1_d.f90 IMPLICIT DOUBLE PRECISION (X) DIMENSION X(1031) DO I=1,1031 X(I)=I END DO call FOO(X) print *, X(1025) IF (X(1025).NE.1031.0) CALL ABORT END gives the expected result.
(In reply to Mikhail Maltsev from comment #60) > > BTW, how to fix this in SPEC? I also noticed this problem with 416.gamess See PR 53086.
(In reply to Dominique d'Humieres from comment #59) > > We already warn about mismatches sizes at LTO link time > > Confirmed > > [Book15] f90/bug% gfc -c -O2 pr69368_a.f90 -flto > [Book15] f90/bug% gfc -O2 pr69368_a.o pr69368_b.f90 -flto > pr69368_a.f90:3:0: warning: type of 'blk' does not match original > declaration [-Wlto-type-mismatch] > COMMON /BLK/ K(1) > > pr69368_b.f90:3:0: note: 'blk' was previously declared here > COMMON /BLK/ K(64) > > pr69368_b.f90:3:0: note: code may be misoptimized unless > -fno-strict-aliasing is used > > and the executable gives the expected output. IMO the second note is bogus > (pr68717). > > If I replace > > COMMON /BLK/ K(1) > > with > > COMMON /BLK/ K(2) > > the executable gives the expected output also for all the compiling options > I have tried. > > AFAICT the code (invalid for any value of I and J) > > FUNCTION FOO(I, J) > COMMON /BLK/ K(1) > FOO = K(I) + K(J) + K(2*I) + K(2*J) > END FUNCTION > > is optimized as > > FOO = K(I) + K(I) + K(I) + K(I) > > Although I know that a compiler can do whatever it deems suitable with > invalid code, I don't understand the rationale of the above "optimization": > it is as invalid as the original code. There is obviously no "rationale". Fact is that we don't exploit the undefinedness explicitely but just as a side-effect of how CSE works in DOM now. This means we don't propagate '1' as the only valid value of I but just CSE the last three loads to the first as we know they are the same (without knowing the actual value). > Another thing I don't understand is that the following code is not > "optimized" while as invalid as the one above: > > INTEGER FUNCTION FOO(I, J, K) > INTEGER K(1) > FOO = K(I) + K(J) + K(2*I) + K(2*J) > END FUNCTION > > INTEGER FOO > INTEGER K(64) > DO I=1,64 > K(I)=I > END DO > IF (FOO(5,12,K).NE.51) CALL ABORT > END Sth to do with K being passed as a pointer and thus K being treated as a "flexible array" (we can't know whether the caller passed the address of a last member of a struct).
(In reply to Dominique d'Humieres from comment #61) > Another oddity of the "optimization" introduced by r232508 is the following > test (borrowed from https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01356.html) > > [Book15] f90/bug% cat pr69368_1_a.f90 > SUBROUTINE FOO > IMPLICIT DOUBLE PRECISION (X) > INTEGER J > ! COMMON /MYCOMMON / X(1031) > COMMON /MYCOMMON / X(1) > DO 10 J=1,1024 > X(J+1)=X(J+7) > 10 CONTINUE > RETURN > END > [Book15] f90/bug% cat pr69368_1_b.f90 > IMPLICIT DOUBLE PRECISION (X) > COMMON /MYCOMMON/ X(1031) > DO I=1,1031 > X(I)=I > END DO > call FOO() > print *, X(1025) > IF (X(1025).NE.1031) CALL ABORT > END > > The test fails if pr69368_1_a.f90 is compiled with -O2 (FOO is compiled as a > simple RETURN). Now if I replace > > COMMON /MYCOMMON / X(1) > > with > > COMMON /MYCOMMON / X(2) > > I get at compile tile with -O2 > > pr69368_1_a_1.f90:7:0: > > X(J+1)=X(J+7) > > Warning: iteration 1 invokes undefined behavior > [-Waggressive-loop-optimizations] > pr69368_1_a_1.f90:6:0: > > DO 10 J=1,1024 > > note: within this loop > > and the test succeeds. Any reason why there is no warning for "COMMON > /MYCOMMON / X(1)"? Because it's not a loop optimization that optimizes it. > Also the following invalid code > > [Book15] f90/bug% cat pr69368_1_c.f90 > SUBROUTINE FOO(X) > IMPLICIT DOUBLE PRECISION (X) > INTEGER J > DIMENSION X(1) > DO 10 J=1,1024 > X(J+1)=X(J+7) > 10 CONTINUE > RETURN > END > [Book15] f90/bug% cat pr69368_1_d.f90 > IMPLICIT DOUBLE PRECISION (X) > DIMENSION X(1031) > DO I=1,1031 > X(I)=I > END DO > call FOO(X) > print *, X(1025) > IF (X(1025).NE.1031.0) CALL ABORT > END > > gives the expected result. Well, undefined behavior doesn't guarantee a "not expected" result ;)
> There is obviously no "rationale". Fact is that we don't exploit the > undefinedness explicitely but just as a side-effect of how CSE works in > DOM now. This means we don't propagate '1' as the only valid value of I > but just CSE the last three loads to the first as we know they are the > same (without knowing the actual value). How can K(1) and K(2*1) be the same without using undefinedness explicitely?
(In reply to Dominique d'Humieres from comment #65) > > There is obviously no "rationale". Fact is that we don't exploit the > > undefinedness explicitely but just as a side-effect of how CSE works in > > DOM now. This means we don't propagate '1' as the only valid value of I > > but just CSE the last three loads to the first as we know they are the > > same (without knowing the actual value). > > How can K(1) and K(2*1) be the same without using undefinedness explicitely? They can't, but why does that matter for undefined behavior? The CSE code in DOM doesn't try to analyze the array indices at all, it is enough for it to prove that in valid program they must be all same. Whether the compiler could with additional efforts prove they are different or not (it can't e.g. without LTO and without seeing caller that passes two different variables).
> > How can K(1) and K(2*1) be the same without using undefinedness explicitely? > They can't, but why does that matter for undefined behavior? > The CSE code in DOM doesn't try to analyze the array indices at all, it is > enough for it to prove that in valid program they must be all same. How can you "prove" that I and 2*I are the same in a valid program? > Whether the compiler could with additional efforts prove they are different or > not (it can't e.g. without LTO and without seeing caller that passes two > different variables). If you trust COMMON /BLK/ K(1) then FOO = K(I) + K(J) + K(2*I) + K(2*J) is invalid for any value of I and J: K(I) + K(J) is valid iff I=J=1, then the access to K(2) is out of bound, hence invalid.
The problem here is that there is a large existing codebase which depends on memory management which was - illegal from the start - the only game in town if you wanted to do any sort of dynamic memory management. because FORTRAN 66 and 77 explicitly forbade any sort of dynamic memory management. We can lament the fact fifty years later, but we cannot change it. The idea was to use COMMON /FOO/ A(1) in a subroutine (even a library), and use COMMON /FOO/ A(10000) or whatever 'dynamic' size you needed for your memory in the main program. This idiom appears to be common enough that, if we don't support it, or support it only with a severe performance penalty, we will simply push people away from gfortran.
On Mon, 22 Feb 2016, tkoenig at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > Thomas Koenig <tkoenig at gcc dot gnu.org> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > Severity|normal |major > > --- Comment #68 from Thomas Koenig <tkoenig at gcc dot gnu.org> --- > The problem here is that there is a large existing codebase > which depends on memory management which was > > - illegal from the start > > - the only game in town if you wanted to do any > sort of dynamic memory management. > > because FORTRAN 66 and 77 explicitly forbade any sort > of dynamic memory management. We can lament the fact > fifty years later, but we cannot change it. > > The idea was to use > > COMMON /FOO/ A(1) > > in a subroutine (even a library), and use > > COMMON /FOO/ A(10000) > > or whatever 'dynamic' size you needed for your memory > in the main program. > > This idiom appears to be common enough that, if we > don't support it, or support it only with a severe > performance penalty, we will simply push people away from > gfortran. Ok, so we can also use sth like the following to work around that unintended optimization side-effect of get_ref_base_and_extent. Untested, not sure if it catches all cases, it at least "fixes" some of the small undefined testcases in this PR (not tested on 416.gamess). I'm not sure about testsuite fallout (I might have decided to add a testcase for the surprising behavior). Index: gcc/tree-dfa.c =================================================================== --- gcc/tree-dfa.c (revision 233598) +++ gcc/tree-dfa.c (working copy) @@ -617,7 +617,19 @@ get_ref_base_and_extent (tree exp, HOST_ if (maxsize == -1 && DECL_SIZE (exp) && TREE_CODE (DECL_SIZE (exp)) == INTEGER_CST) - maxsize = wi::to_offset (DECL_SIZE (exp)) - bit_offset; + { + maxsize = wi::to_offset (DECL_SIZE (exp)) - bit_offset; + /* If we've seen a variable array ref and the above adjusted + maxsize to size make sure the caller doesn't mistake this + as a non-variable access by adjusting maxsize slightly. + ??? This is strictly pessimizing the case where a + one element array is accessed with a variable index + which should be a rare case in practice but hits legacy + fortran code - see PR69368 for example. */ + if (seen_variable_array_ref + && maxsize == bitsize) + maxsize *= 2; + } } else if (CONSTANT_CLASS_P (exp)) {
If you want to go this way, I'd at least key it off DECL_COMMON on the decl. And instead of multiplying max_size by 2 perhaps just add BITS_PER_UNIT? And only if it is equal to size?
On Tue, 23 Feb 2016, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > --- Comment #70 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > If you want to go this way, I'd at least key it off DECL_COMMON on the decl. > And instead of multiplying max_size by 2 perhaps just add BITS_PER_UNIT? Works for me. > And only if it is equal to size? That's already done.
On Tue, 23 Feb 2016, Richard Biener wrote: > On Tue, 23 Feb 2016, jakub at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > > > --- Comment #70 from Jakub Jelinek <jakub at gcc dot gnu.org> --- > > If you want to go this way, I'd at least key it off DECL_COMMON on the decl. > > And instead of multiplying max_size by 2 perhaps just add BITS_PER_UNIT? > > Works for me. > > > And only if it is equal to size? > > That's already done. Patch as posted passed bootstrap & regtest. Adjusted according to comments but not tested otherwise - please somebody throw at unpatched 416.gamess. Index: gcc/tree-dfa.c =================================================================== --- gcc/tree-dfa.c (revision 233620) +++ gcc/tree-dfa.c (working copy) @@ -617,7 +617,20 @@ get_ref_base_and_extent (tree exp, HOST_ if (maxsize == -1 && DECL_SIZE (exp) && TREE_CODE (DECL_SIZE (exp)) == INTEGER_CST) - maxsize = wi::to_offset (DECL_SIZE (exp)) - bit_offset; + { + maxsize = wi::to_offset (DECL_SIZE (exp)) - bit_offset; + /* If we've seen a variable array ref and the above adjusted + maxsize to size make sure the caller doesn't mistake this + as a non-variable access by adjusting maxsize slightly. + ??? This is strictly pessimizing the case where a + one element array is accessed with a variable index + which should be a rare case in practice but hits legacy + fortran code - see PR69368 for example. */ + if (seen_variable_array_ref + && maxsize == bitsize + && DECL_COMMON (exp)) + maxsize += BITS_PER_UNIT; + } } else if (CONSTANT_CLASS_P (exp)) {
> Patch as posted passed bootstrap & regtest. Adjusted according to > comments but not tested otherwise The patch "fixes" the test in comment 45, but not the one in comment 61.
On Tue, 23 Feb 2016, dominiq at lps dot ens.fr wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > --- Comment #73 from Dominique d'Humieres <dominiq at lps dot ens.fr> --- > > Patch as posted passed bootstrap & regtest. Adjusted according to > > comments but not tested otherwise > > The patch "fixes" the test in comment 45, but not the one in comment 61. It's expected that it doesn't fix the one w/o /COMMON/
(In reply to rguenther@suse.de from comment #74) > On Tue, 23 Feb 2016, dominiq at lps dot ens.fr wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > > > --- Comment #73 from Dominique d'Humieres <dominiq at lps dot ens.fr> --- > > > Patch as posted passed bootstrap & regtest. Adjusted according to > > > comments but not tested otherwise > > > > The patch "fixes" the test in comment 45, but not the one in comment 61. > > It's expected that it doesn't fix the one w/o /COMMON/ The #c61 testcase is with /COMMON/ too. But doesn't really use get_ref_base_and_extent for the optimization, but array_at_struct_end_p. Which is why I've suggested to Alan on gcc-patches to tweak those two places for -funknown-commons.
> The #c61 testcase is with /COMMON/ too. Well the test aborts for me on x86_64-apple-darwin15 (I still get 1025.0). AFAIR some default sizes are larger on darwin than on linux.
(In reply to rguenther@suse.de from comment #72) > > Patch as posted passed bootstrap & regtest. Adjusted according to > comments but not tested otherwise - please somebody throw at > unpatched 416.gamess. Still miscompares on aarch64, I'm afraid. (Both with and without -fno-aggressive-loop-optimizations.) Also where Jakub wrote: > If you want to go this way, I'd at least key it off DECL_COMMON on the decl. > And instead of multiplying max_size by 2 perhaps just add BITS_PER_UNIT? I wonder why you prefer setting such an arbitrary guess at max_size rather than going with -1 which is defined as "unknown" ?
On Tue, 23 Feb 2016, alalaw01 at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > --- Comment #77 from alalaw01 at gcc dot gnu.org --- > (In reply to rguenther@suse.de from comment #72) > > > > Patch as posted passed bootstrap & regtest. Adjusted according to > > comments but not tested otherwise - please somebody throw at > > unpatched 416.gamess. > > Still miscompares on aarch64, I'm afraid. (Both with and without > -fno-aggressive-loop-optimizations.) > > Also where Jakub wrote: > > If you want to go this way, I'd at least key it off DECL_COMMON on the decl. > > And instead of multiplying max_size by 2 perhaps just add BITS_PER_UNIT? > > I wonder why you prefer setting such an arbitrary guess at max_size rather than > going with -1 which is defined as "unknown" ? That would pessimize it too much IMHO.
(In reply to rguenther@suse.de from comment #78) > > That would pessimize it too much IMHO. I'm not sure how to evaluate the pessimization, given it's thought to be a widespread pseudo-FORTRAN construct; so I probably have to defer to your judgement here. However... Given maxsize of an array as two elements, say, would the compiler not be entitled to optimize an index selection down to, say, computing only the LSBit of the actual index? Whereas 'unknown' means, well, exactly what is the case. So I fear this is storing problems up for the future. Is the concern that we can't hide this behind an option, as that would "drive people away from gfortran" ? If that's the case, can we hide it behind an option that defaults to pessimization (?? at least for fortran)??
(In reply to alalaw01 from comment #79) > Is the concern that we can't hide this behind an option, as that would > "drive people away from gfortran" ? If that's the case, can we hide it > behind an option that defaults to pessimization (?? at least for fortran)?? The concern would be if we did not provide a reasonable way around this. I am not in favor of pessimizing valid code by default :-) My preference would be to hide it behind an option, include this option in -std=legacy, and warn from the gfortran front end (with -Wall) if the construct in question (or questionable construct) is encountered. The warning could tell the user which option to select in this case. A one-element array trailing a common block is legal, but its normal use should be quite rare, because it can be replaced without problems with a scalar variable, so I doubt we will generate many false positives. And even if we do, we now have the nice feature of displaying the option which issued the warning, so people should have no problem turning it off.
On February 23, 2016 4:20:48 PM GMT+01:00, "alalaw01 at gcc dot gnu.org" <gcc-bugzilla@gcc.gnu.org> wrote: >https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > >--- Comment #79 from alalaw01 at gcc dot gnu.org --- >(In reply to rguenther@suse.de from comment #78) >> >> That would pessimize it too much IMHO. > >I'm not sure how to evaluate the pessimization, given it's thought to >be a >widespread pseudo-FORTRAN construct; so I probably have to defer to >your >judgement here. However... > >Given maxsize of an array as two elements, say, would the compiler not >be >entitled to optimize an index selection down to, say, computing only >the LSBit >of the actual index? Whereas 'unknown' means, well, exactly what is >the case. >So I fear this is storing problems up for the future. It doesn't do that. >Is the concern that we can't hide this behind an option, as that would >"drive >people away from gfortran" ? If that's the case, can we hide it behind >an >option that defaults to pessimization (?? at least for fortran)?? I don't think it's necessary to hide it behind an option.
For those who haven't seen it, I've put forward this patch on the mailing list: https://gcc.gnu.org/ml/gcc-patches/2016-02/msg01746.html based on a suggestion from Jakub. (Unlike Richi's comment72 patch, this fixes 416.gamess on AArch64.)
I tried that patch and the 416.gamess test still fails on power.
Bah. Do you normally use -fno-aggressive-loop-optimizations? With -funknown-commons, did you try with/out aggressive loop opts? Powerpc{,64}{be,le} ? The unknown-commons testcase I included in that patch looks to pass on powerpc64le-unknown-linux-gnu. Does HJ Lu's spec source-patching work on powerpc following r232559? I am not a lawyer...but I don't think the SPEC2006 license allows me to upload onto the GCC Compile Farm and runspec. So if you could narrow down to an object file that's broken with a recent compiler and -funknown-commons, with the rest compiled with a gcc prior to r232508, that'd be very helpful - then I could see what assembly I'm changing (and what expressions equal_mem_array_ref is falsely declaring equivalent)...?
I just grabbed a fresh copy of the gcc source, applied the patch, built it, fixed up the options for 416.gamess, and when I ran it it worked! I should have done that yesterday.
I also tried it on a power8 BE machine and it worked fine there, too.
Great, many thanks for the tests, I was worried if we had hit another distinct issue. (Of course this would be better on gcc-patches!)
Can this now be closed, or should I leave open for possible Fortran FE warnings?
I'd say it can be closed now.
Fixed then.
This problem appears to have reappeared in GCC 7 trunk. Looking at the performance results posted at: http://gcc.opensuse.org/SPEC/CFP/sb-megrez-head-64-2006/recent.html suggests the problem reappeared somewhere around May 4th. 416.gamess also fails with latest gcc-7-20160515 snapshot despite adding the -funconstrained-commons flag.
Passes with snapshot: gcc-7-20160501 Fails with snapshot: gcc-7-20160508
On Fri, 20 May 2016, mike at vermeulen dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > Mike Vermeulen <mike at vermeulen dot com> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |mike at vermeulen dot com > > --- Comment #91 from Mike Vermeulen <mike at vermeulen dot com> --- > This problem appears to have reappeared in GCC 7 trunk. > > Looking at the performance results posted at: > http://gcc.opensuse.org/SPEC/CFP/sb-megrez-head-64-2006/recent.html suggests > the problem reappeared somewhere around May 4th. > > 416.gamess also fails with latest gcc-7-20160515 snapshot despite adding the > -funconstrained-commons flag. I don't think it is the same problem as the tester you cite above has the problem fixed in 416.gamess sources. It looks like we're big into "stage1" with the last run showing Error: 2x416.gamess 2x435.gromacs 2x436.cactusADM 1x437.leslie3d 1x447.dealII 2x454.calculix 2x465.tonto 2x481.wrf Note that gamess is a build error: dftexc.fppized.f:137:0: internal compiler error: in zero_one_operation, at tree-ssa-reassoc.c:1230 SUBROUTINE CALCEXC(ROA,ROB,FTOTWT,GRDAA,GRDBB,GRDAB, zheev.fppized.f: In function 'zher2k.constprop': zheev.fppized.f:7273:0: internal compiler error: in zero_one_operation, at tree-ssa-reassoc.c:1230 SUBROUTINE ZHER2K( UPLO, TRANS, N, K, ALPHA, A, LDA, B, LDB, 0xaa3596 zero_one_operation /gcc/spec/sb-czerny-head-64-2006/gcc/gcc/tree-ssa-reassoc.c:1229 0xaab013 undistribute_ops_list
We currently observe miscompare for x86 (32 and 64 bits) and bisect points to r235817. No source bisecting yet, however
On Mon, 23 May 2016, izamyatin at gmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69368 > > Igor Zamyatin <izamyatin at gmail dot com> changed: > > What |Removed |Added > ---------------------------------------------------------------------------- > CC| |izamyatin at gmail dot com > > --- Comment #94 from Igor Zamyatin <izamyatin at gmail dot com> --- > We currently observe miscompare for x86 (32 and 64 bits) and bisect points to > r235817. > > No source bisecting yet, however Yeah, the ICE is more recent than the start of gamess failing.
*** Bug 86101 has been marked as a duplicate of this bug. ***