Bug 70018 - [6 Regression] Possible issue around IPO and C++ comdats discovered as pure/const
Summary: [6 Regression] Possible issue around IPO and C++ comdats discovered as pure/c...
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: c++ (show other bugs)
Version: 5.3.0
: P2 normal
Target Milestone: 7.0
Assignee: Jan Hubicka
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2016-02-29 19:20 UTC by Sanjoy Das
Modified: 2018-10-26 10:46 UTC (History)
7 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.3.6
Known to fail: 4.7.4, 4.8.5, 5.3.0, 6.0
Last reconfirmed: 2016-03-01 00:00:00


Attachments
testcase as tar file (7.03 KB, application/octet-stream)
2016-03-01 10:09 UTC, Richard Biener
Details
Patch I am testing (7.23 KB, patch)
2016-04-05 23:05 UTC, Jan Hubicka
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Sanjoy Das 2016-02-29 19:20:01 UTC
I don't grok gcc internals so I cannot write a terribly well-informed
bug report, but GCC 5.3.0 seems to miscompile
https://github.com/sanjoy/comdat-ipo

This was discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-February/095833.html and
the thread contains a description of the underlying cause for
LLVM/Clang.

The TL;DR is that for C++ inline functions (and other functions with
similar linkage rules), you can override a more-refined function
implementation with a less-refined one at link time, and that can
retroactively invalidate earlier transforms, where "refined" in this
case means "undefined in fewer situations".

E.g. if we have (this is very similar to the reproducer above minus
some mechanical details):

inline void foo(int* ptr) {
  100 / ptr[0];
}

void bar(int* ptr) {
  *ptr = 40;
  foo(ptr):
  *ptr = 50;
}

=>

inline void foo(int* ptr) {
  // 100 / ptr[0]; removed, dead code
}

void bar(int* ptr) {
  *ptr = 40;
  foo(ptr):
  *ptr = 50;
}

=>

inline void foo(int* ptr) {
  // 100 / ptr[0]; removed, dead code
}

void bar(int* ptr) {
  // *ptr = 40; dead store, since foo does not read memory
  foo(ptr):
  *ptr = 50;
}

we've miscompiled if *ptr == 0 on entry to bar, and foo is replaced
with the original definition at link time.
Comment 1 Andrew Pinski 2016-02-29 19:37:39 UTC
In C++ code, the one definition rule says that all TU that contains an inline function, they need to have the same definition.  If they have different definition, then the code is undefined at runtime and no diagnostic is required.
Comment 2 Andrew Pinski 2016-02-29 19:44:09 UTC
Also one more point, in this example both inline functions have the same definition.  Both answers are valid answers for this case.  That is doing the division (causing the trap on x86_64) don't have to happen to get valid case.

Basically read up about ODR and the reason why both behaviors are valid.  If always_false was returning two different values for the TU, then you would have an ODR violation but since they return false in both TU removing or not removing the divide is both valid answers as division by 0 is undefined behavior at runtime.  That is both trap and not trapping is a valid behavior.


So in summary what you are seeing is two effects going into effect here: undefined behavior of division by 0 and ODR.
Comment 3 Sanjoy Das 2016-02-29 19:46:33 UTC
(In reply to Andrew Pinski from comment #1)
> In C++ code, the one definition rule says that all TU that contains an
> inline function, they need to have the same definition.  If they have
> different definition, then the code is undefined at runtime and no
> diagnostic is required.

Hi Andrew,

I believe that is an incorrect assessment of the test case:
https://github.com/sanjoy/comdat-ipo

At the C++ source level there is only one definition of maybe_divide
that is in header.hpp included by both source-a.cpp and source-b.cpp.
If this violates ODR then every C++ program with inline functions in
the header violates ODR. :)
Comment 4 Sanjoy Das 2016-02-29 19:47:29 UTC
(In reply to Andrew Pinski from comment #2)

> So in summary what you are seeing is two effects going into effect here:
> undefined behavior of division by 0 and ODR.

There is no division by zero (or any other UB that I'm aware of) in the
original program.
Comment 5 Andrew Pinski 2016-02-29 19:52:20 UTC
Oh I see pure/const behavior.

The problem is more complex, in that in one TU, the comdat function is figured out to be pure/const so we remove the store before the function call.  While in the other we don't.  Now we got a comdat function which is pure in one TU while not in the other case.  We use the non pure version of the comdat function.

I thought this was fixed in 6.
Comment 6 Sanjoy Das 2016-02-29 19:55:09 UTC
(In reply to Andrew Pinski from comment #5)
> Oh I see pure/const behavior.
> 
> The problem is more complex, in that in one TU, the comdat function is
> figured out to be pure/const so we remove the store before the function
> call.  While in the other we don't.  Now we got a comdat function which is
> pure in one TU while not in the other case.  We use the non pure version of
> the comdat function.
> 
> I thought this was fixed in 6.

Ah, okay.  I'm testing with 5.3.  If this is fixed in gcc 6, please feel free to close.
Comment 7 Richard Biener 2016-03-01 10:08:52 UTC
No, it's not fixed in GCC 6.  Basically it would mean that we have to disable
local pure/const discovery for all comdat functions which would be sad.  Only
with LTO when we can localize them we can do the IPA discovery (so we need
to optimistically do local analyze and throw it away if we can't localize).

Confirmed.
Comment 8 Richard Biener 2016-03-01 10:09:54 UTC
Created attachment 37829 [details]
testcase as tar file
Comment 9 Jan Hubicka 2016-03-02 17:15:26 UTC
To properly track possible loads/stores w/o any optimizations we probably need an info from Frontend, because we optimize out things prior gimplification. Jason, how feasible it is to do that?

For pure/const a workaround is -fno-ipa-pure-const. The bug also affects nothrow at least with -fnoncall-exceptions and nothrow discovery is not controled by a flag.

Fixing this correctly will need bit of work, because we really want to track nothrow/pure/const for the inline functions and make us of it when we know the call bids to current def. This can change during the optimization queue, so we probably want two sets of these flags.
Comment 10 Jason Merrill 2016-03-27 03:02:24 UTC
(In reply to Jan Hubicka from comment #9)
> To properly track possible loads/stores w/o any optimizations we probably
> need an info from Frontend, because we optimize out things prior
> gimplification. Jason, how feasible it is to do that?

The front end isn't tracking that information currently.  But I don't think it does any optimizations that would cause this kind of issue, either.
Comment 11 Jan Hubicka 2016-04-04 17:27:58 UTC
For

int *a;
bool t()
{
  return *a==*a;
}

we get

$ more t.C.*original

;; Function bool t() (null)
;; enabled by -tree-original


return <retval> = 1;

even at -O0 while I think very stupid compiler is allowed to keep the memory access.

In fact clang does precisely that:

_Z1tv:                                  # @_Z1tv
        .cfi_startproc
# BB#0:
        pushq   %rbp
.Ltmp0:
        .cfi_def_cfa_offset 16
.Ltmp1:
        .cfi_offset %rbp, -16
        movq    %rsp, %rbp
.Ltmp2:
        .cfi_def_cfa_register %rbp
        movq    a, %rax
        movl    (%rax), %ecx
        movq    a, %rax
        cmpl    (%rax), %ecx
        sete    %dl
        andb    $1, %dl
        movzbl  %dl, %eax
        popq    %rbp
        retq
.Ltmp3:

I am trying to make conservatively correct fix for this. I suppose we can still discover functions seen as const/pure as pure and incrementally teach optimizer that given function does possibly use global memory state but it doesn't depend on it. It should be still correct to CSE those functions.
For that we finally need to cleanup pure/const/looping flags into sane form (i.e. clobber memory/has side effects/is CSEable and uses no memory/is CSEable and uses memory)
Comment 12 Jan Hubicka 2016-04-05 23:05:48 UTC
Created attachment 38195 [details]
Patch I am testing

This patch handles const->pure transition for all functions detected const that are not necessarily binding to current def. I think this is best we can do without tracking down constness from the FE (i.e. before the folding).
The patch tries to be careful to optimize some cases (such as take into account that inlined functions will bind locally).

I will try to collect some data on code quality effect.
Comment 13 Jan Hubicka 2016-04-16 16:31:20 UTC
Author: hubicka
Date: Sat Apr 16 16:30:48 2016
New Revision: 235063

URL: https://gcc.gnu.org/viewcvs?rev=235063&root=gcc&view=rev
Log:

	PR ipa/70018
	* cgraph.c (cgraph_node::get_availability): Add REF parameter.
	(cgraph_node::function_symbol): Likewise.
	(cgraph_node::function_or_virtual_thunk_symbol): Likewise.
	* cgraph.h (symtab_node::get_availabbility): Add REF parameter.
	(symtab_node::ultimate_alias_target): Add REF parameter.
	(symtab_node::binds_to_current_def_p): Declare.
	(symtab_node;:ultimate_alias_target_1): Add REF parameter.
	(cgraph_node::function_symbol): Likewise.
	(cgraph_node::function_or_virtual_thunk_symbol): Likewise.
	(cgraph_node::get_availability): Likewise.
	(cgraph_edge::binds_to_current_def_p): New inline function.
	(varpool_node::get_availability): Add REF parameter.
	(varpool_node::ultimate_alias_target): Likewise.
	* symtab.c (symtab_node::ultimate_alias_target_1): Likewise.
	(symtab_node::binds_to_current_def_p): Likewise.
	* varpool.c (varpool_node::get_availability): Likewise.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cgraph.c
    trunk/gcc/cgraph.h
    trunk/gcc/symtab.c
    trunk/gcc/varpool.c
Comment 14 Jan Hubicka 2016-04-16 18:55:21 UTC
Author: hubicka
Date: Sat Apr 16 18:54:49 2016
New Revision: 235065

URL: https://gcc.gnu.org/viewcvs?rev=235065&root=gcc&view=rev
Log:

	PR ipa/70018
	* cgraph.c (cgraph_set_const_flag_1): Only set as pure if
	function does not bind to current def.
	* ipa-pure-const.c (worse_state): Add FROM and TO parameters;
	handle conservatively calls to functions that does not need to bind
	to current def.
	(check_call): Update call of worse_state.
	(ignore_edge_for_nothrow): Update.
	(ignore_edge_for_pure_const): Likewise.
	(propagate_pure_const): Update calls to worse_state.
	(skip_function_for_local_pure_const): Reformat comments.

	* g++.dg/ipa/pure-const-1.C: New testcase.
	* g++.dg/ipa/pure-const-2.C: New testcase.
	* g++.dg/ipa/pure-const-3.C: New testcase.

Added:
    trunk/gcc/testsuite/g++.dg/ipa/pure-const-1.C
    trunk/gcc/testsuite/g++.dg/ipa/pure-const-2.C
    trunk/gcc/testsuite/g++.dg/ipa/pure-const-3.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cgraph.c
    trunk/gcc/ipa-pure-const.c
    trunk/gcc/testsuite/ChangeLog
Comment 15 Jan Hubicka 2016-04-17 16:04:37 UTC
Author: hubicka
Date: Sun Apr 17 16:04:05 2016
New Revision: 235081

URL: https://gcc.gnu.org/viewcvs?rev=235081&root=gcc&view=rev
Log:

	PR ipa/70018
	* cgraph.h (cgraph_node::set_const_flag,
	cgraph_node::set_pure_flag): Update prototype to return bool;
	update comment.
	* cgraph.c (cgraph_node::call_for_symbol_thunks_and_aliases): Thunks
	of interposable symbol are interposable, too.
	(cgraph_set_const_flag_1): Rename to ...
	(set_const_flag_1): ... this one; change to self recursive function
	instead of call_for_symbol_thunks_and_aliases. Handle correctly
	clearnig the flag in all variants and also virtual thunks of const
	functions are pure; track if any change was done.
	(cgraph_node::set_const_flag): Update.
	(struct set_pure_flag_info): New struct.
	(cgraph_set_pure_flag_1): Rename to ...
	(set_pure_flag_1): ... this one; take set_pure_flag_info parameter
	rather than pointer encoded flags; track if any changes was done;
	handle correctly clearning flag and setting flag of aliases already
	declared const.
	(cgraph_node::set_pure_flag): Update.
	(cgraph_node::set_nothrow_flag): Handle correctly clearning the flag.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cgraph.c
    trunk/gcc/cgraph.h
Comment 16 Jan Hubicka 2016-04-21 09:05:39 UTC
Author: hubicka
Date: Thu Apr 21 09:05:07 2016
New Revision: 235318

URL: https://gcc.gnu.org/viewcvs?rev=235318&root=gcc&view=rev
Log:

	PR ipa/70018
	* cgraph.c (cgraph_set_nothrow_flag_1): Rename to ...
	(set_nothrow_flag_1): ... this; handle interposition correctly;
	recurse on aliases and thunks.
	(cgraph_node::set_nothrow_flag): New.
	* ipa-pure-const.c (ignore_edge_for_nothrow): Ignore calls to
	functions compiled with non-call exceptions that binds to current
	def.
	(propagate_nothrow): Be safe WRT interposition.
	* cgraph.h (set_nothrow_flag): Update prototype.

	* g++.dg/ipa/nothrow-1.C: New testcase.

Added:
    trunk/gcc/testsuite/g++.dg/ipa/nothrow-1.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cgraph.c
    trunk/gcc/cgraph.h
    trunk/gcc/ipa-pure-const.c
    trunk/gcc/testsuite/ChangeLog
Comment 17 Jan Hubicka 2016-05-13 15:56:25 UTC
Fixed on trunk so far.
Comment 18 Richard Biener 2016-08-03 11:53:47 UTC
GCC 4.9 branch is being closed
Comment 19 Jakub Jelinek 2017-10-10 13:27:28 UTC
GCC 5 branch is being closed
Comment 20 Jakub Jelinek 2018-10-26 10:46:12 UTC
GCC 6 branch is being closed, fixed in 7.x.