User account creation filtered due to spam.

Bug 58555 - [4.9 Regression] Floating point exception in want_inline_self_recursive_call_p
Summary: [4.9 Regression] Floating point exception in want_inline_self_recursive_call_p
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.9.0
: P1 normal
Target Milestone: 4.9.0
Assignee: Jan Hubicka
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-09-27 19:19 UTC by David Binderman
Modified: 2014-02-20 07:43 UTC (History)
7 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2013-09-30 00:00:00


Attachments
gzipped C++ source code (334.21 KB, application/x-gzip)
2013-09-27 19:19 UTC, David Binderman
Details
reduced testcase (676 bytes, text/plain)
2013-09-27 21:03 UTC, Markus Trippelsdorf
Details
reduced testcase (863 bytes, text/plain)
2013-11-05 16:21 UTC, Markus Trippelsdorf
Details

Note You need to log in before you can comment on or make changes to this bug.
Description David Binderman 2013-09-27 19:19:43 UTC
Created attachment 30919 [details]
gzipped C++ source code

I just tried to compile package flamerobin-0.9.3-4.20130401snap with
gcc 4.9 trunk dated 20130925. It said

./src/metadata/root.cpp:375:1: internal compiler error: Floating point exception
 }
 ^
0xafbfff crash_signal
    ../../src/trunk/gcc/toplev.c:335
0x50ed95 want_inline_self_recursive_call_p
    ../../src/trunk/gcc/ipa-inline.c:699
0xf72320 inline_small_functions
    ../../src/trunk/gcc/ipa-inline.c:1756
0xf72320 ipa_inline
    ../../src/trunk/gcc/ipa-inline.c:2009
0xf72320 execute
    ../../src/trunk/gcc/ipa-inline.c:2379
Please submit a full bug report,
with preprocessed source if appropriate.

Preprocessed source code attached. Flag -O3 required.

Checking the compiler source code, the offending line is

      if (!max_count
      && (edge->frequency * CGRAPH_FREQ_BASE / caller_freq
          >= max_prob))

I speculate that caller_freq == 0 and someone has missed out
a belt'n'braces check for zero before making the division.
Comment 1 Markus Trippelsdorf 2013-09-27 21:03:59 UTC
Created attachment 30920 [details]
reduced testcase
Comment 2 Paolo Carlini 2013-09-28 11:36:33 UTC
I think this is for Honza
Comment 3 Markus Trippelsdorf 2013-09-28 11:52:34 UTC
(In reply to Paolo Carlini from comment #2)
> I think this is for Honza

Yes. Started with r202185 .
Comment 4 Marek Polacek 2013-09-30 10:54:19 UTC
Thus confirmed.
Comment 5 David Binderman 2013-10-17 09:46:34 UTC
(In reply to Marek Polacek from comment #4)
> Thus confirmed.

Still broken in trunk dated 20131016.
Comment 6 Markus Trippelsdorf 2013-11-05 15:15:30 UTC
Program received signal SIGFPE, Arithmetic exception.
[Switching to process 21235]
0x000000000052531d in want_inline_self_recursive_call_p (edge=0x7ffff74412d8, outer_node=0x7ffff7442260, peeling=<optimized out>, depth=1) at ../../gcc/gcc/ipa-inline.c:702
702               && (edge->frequency * CGRAPH_FREQ_BASE / caller_freq
(gdb) bt
#0  0x000000000052531d in want_inline_self_recursive_call_p (edge=0x7ffff74412d8, outer_node=0x7ffff7442260, peeling=<optimized out>, depth=1)
    at ../../gcc/gcc/ipa-inline.c:702
#1  0x0000000000fca3bb in inline_small_functions () at ../../gcc/gcc/ipa-inline.c:1761
#2  ipa_inline () at ../../gcc/gcc/ipa-inline.c:2015
#3  (anonymous namespace)::pass_ipa_inline::execute (this=0x7ffff74412d8) at ../../gcc/gcc/ipa-inline.c:2385
#4  0x0000000000a6ad8a in execute_one_pass (pass=pass@entry=0x16c5520) at ../../gcc/gcc/passes.c:2215
#5  0x0000000000a6b61b in execute_ipa_pass_list (pass=0x16c5520) at ../../gcc/gcc/passes.c:2579
#6  0x000000000080e287 in ipa_passes () at ../../gcc/gcc/cgraphunit.c:2028
#7  compile () at ../../gcc/gcc/cgraphunit.c:2118
#8  0x000000000080ea95 in finalize_compilation_unit () at ../../gcc/gcc/cgraphunit.c:2272
#9  0x000000000061175f in cp_write_global_declarations () at ../../gcc/gcc/cp/decl2.c:4403
#10 0x0000000000b152dd in compile_file () at ../../gcc/gcc/toplev.c:559
#11 0x0000000000b17188 in do_compile () at ../../gcc/gcc/toplev.c:1894
#12 toplev_main (argc=13, argv=0x7fffffffe308) at ../../gcc/gcc/toplev.c:1970
#13 0x00007ffff75fba6e in __libc_start_main () from /lib/libc.so.6
#14 0x0000000000527081 in _start ()
(gdb) p caller_freq
$1 = 0
(gdb) p outer_node->global.inlined_to
$2 = (cgraph_node *) 0x7ffff73ff980
(gdb) p outer_node->callers->frequency
$3 = 0
Comment 7 Markus Trippelsdorf 2013-11-05 16:21:03 UTC
Created attachment 31162 [details]
reduced testcase

Here's a better reduced testcase. The first one was crappy.
Comment 8 Markus Trippelsdorf 2013-11-05 19:27:46 UTC
The following patch seems to fix the issue:

diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index f4cb72a9c2b0..b69d4d96176a 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -698,7 +698,7 @@ want_inline_self_recursive_call_p (struct cgraph_edge *edge,
          reason = "profile of recursive call is too large";
          want_inline = false;
        }
-      if (!max_count
+      if (!max_count && caller_freq
          && (edge->frequency * CGRAPH_FREQ_BASE / caller_freq
              >= max_prob))
        {
@@ -731,7 +731,7 @@ want_inline_self_recursive_call_p (struct cgraph_edge *edge,
          reason = "profile of recursive call is too small";
          want_inline = false;
        }
-      else if (!max_count
+      else if (!max_count && caller_freq
               && (edge->frequency * 100 / caller_freq
                   <= PARAM_VALUE (PARAM_MIN_INLINE_RECURSIVE_PROBABILITY)))
Comment 9 David Binderman 2013-11-13 14:05:50 UTC
(In reply to Markus Trippelsdorf from comment #8)
> The following patch seems to fix the issue:

Looks good to me. 

I think it needs to get into the compiler source code somehow.

Would posting the patch to gcc-patches be the correct way forward ?
Comment 10 Markus Trippelsdorf 2013-11-13 15:20:42 UTC
(In reply to David Binderman from comment #9)
> (In reply to Markus Trippelsdorf from comment #8)
> > The following patch seems to fix the issue:
> 
> Looks good to me. 
> 
> I think it needs to get into the compiler source code somehow.
> 
> Would posting the patch to gcc-patches be the correct way forward ?

Yes, but I'd prefer not to. Because without commit rights you'll
have to find someone to commit things for you. And the need to 
repeatedly ping even for trivial patches soon gets ridiculous.
Comment 11 Jakub Jelinek 2013-12-04 15:29:25 UTC
Can't reproduce this, with various snapshots from around the date of comments, or current trunk, neither on the larger nor shorter testcase.
Comment 12 Markus Trippelsdorf 2013-12-04 15:55:29 UTC
(In reply to Jakub Jelinek from comment #11)
> Can't reproduce this, with various snapshots from around the date of
> comments, or current trunk, neither on the larger nor shorter testcase.

Strange.
Just double checked and it still ICEs for me:
  % ../gcc/configure --disable-bootstrap --disable-libvtv --disable-libitm --disable-libcilkrts --disable-libssp --disable-libgomp --disable-werror --disable-multilib --enable-languages=c,c++
  % make -j4 && make DESTDIR=/var/tmp/gcc_test install
  % /var/tmp/gcc_test/usr/local/bin/g++ -O3 -c test.ii
test.ii:113:35: internal compiler error: Floating point exception
 void addServer() { I<int>(new P); }
                                   ^
0xb44e9f crash_signal
        ../../gcc/gcc/toplev.c:336
0x53c863 want_inline_self_recursive_call_p
        ../../gcc/gcc/ipa-inline.c:708
0x1023819 inline_small_functions
        ../../gcc/gcc/ipa-inline.c:1767
0x1023819 ipa_inline
        ../../gcc/gcc/ipa-inline.c:2020
0x1023819 execute
        ../../gcc/gcc/ipa-inline.c:2390
Please submit a full bug report,
with preprocessed source if appro
Comment 13 Jan Hubicka 2013-12-04 18:04:07 UTC
I see, we should just short citcuit case when caller_freq
is 0.  I will test patch.
Comment 14 David Binderman 2014-01-02 09:31:22 UTC
(In reply to Jan Hubicka from comment #13)
> I see, we should just short citcuit case when caller_freq
> is 0.  I will test patch.

Any news with the patch ?
Comment 15 David Binderman 2014-01-09 09:33:13 UTC
(In reply to David Binderman from comment #14)
> (In reply to Jan Hubicka from comment #13)
> > I see, we should just short citcuit case when caller_freq
> > is 0.  I will test patch.
> 
> Any news with the patch ?

I looked into this some more and there are four
places in the same function where it is assumed
that for X / Y, that Y is always != 0.

Belt'n'braces programming suggests it might be a
wise idea to sanity check all four places.

I am current using this diff.

Index: src/trunk/gcc/ipa-inline.c
===================================================================
--- src/trunk/gcc/ipa-inline.c	(revision 206420)
+++ src/trunk/gcc/ipa-inline.c	(working copy)
@@ -703,6 +703,7 @@
       for (i = 1; i < depth; i++)
 	max_prob = max_prob * max_prob / CGRAPH_FREQ_BASE;
       if (max_count
+          && outer_node->count != 0
 	  && (edge->count * CGRAPH_FREQ_BASE / outer_node->count
 	      >= max_prob))
 	{
@@ -710,6 +711,7 @@
 	  want_inline = false;
 	}
       if (!max_count
+          && caller_freq != 0
 	  && (edge->frequency * CGRAPH_FREQ_BASE / caller_freq
 	      >= max_prob))
 	{
@@ -736,6 +738,7 @@
   else
     {
       if (max_count
+          && outer_node->count != 0
 	  && (edge->count * 100 / outer_node->count
 	      <= PARAM_VALUE (PARAM_MIN_INLINE_RECURSIVE_PROBABILITY)))
 	{
@@ -743,6 +746,7 @@
 	  want_inline = false;
 	}
       else if (!max_count
+               && caller_freq != 0
 	       && (edge->frequency * 100 / caller_freq
 	           <= PARAM_VALUE (PARAM_MIN_INLINE_RECURSIVE_PROBABILITY)))
 	{
Comment 16 Jeffrey A. Law 2014-01-27 20:40:52 UTC
I haven't managed to reproduce this either.  Clearly Jakub and I are doing something different that's preventing us from reproducing this problem.

I don't want to just go forward with the patch that checks that the denominator is nonzero without knowing more about how/why it is zero.  Much like a NULL pointer dereference, this specific failure may well be just the manifestation of a problem elsewhere.

David, Markus -- can y'all reproduce on one of the gcc farm machines?  Or a VM you can give me access to?   Or at least a core file and corresponding svn revision for the GCC build you're using?
Comment 17 Jeffrey A. Law 2014-01-27 20:54:43 UTC
Nevermind, I've got it failing now.
Comment 18 Jan Hubicka 2014-01-27 21:23:11 UTC
When caller frequency is 0, we do not really want to inline recursively.  I will make a patch.
Comment 19 Jeffrey A. Law 2014-01-27 21:27:13 UTC
That part is pretty obvious.  What I wanted to look at is how we got there and did it make any sense to be in that code with a caller frequency of zero.
Comment 20 Jan Hubicka 2014-01-27 21:41:34 UTC
> That part is pretty obvious.  What I wanted to look at is how we got there and
> did it make any sense to be in that code with a caller frequency of zero.

Recursive inliner will happily call the predicate on every recursive call it is
given.  Usually we do not get there, because in those cases we get
!cgraph_maybe_hot_edge_p

That predicate however can be convinced to return true even for calls with
0 frequency if profile feedback says there are stil many counts or if user
drops hot attribute on the outer function. (Theresa made this code to depends
on edge historgram and in some cases it may behave in funy ways).
So check is needed.

Question remains why this reproduces on testcase without profile feedback & hot
attribute.  I assume that because roundoff error We have A inlined in B with
call frequency 0 while A still have callees with non-zero frequency because it
contains a loop.  It is apparently what happens here, will double check.
Comment 21 Jeffrey A. Law 2014-01-27 21:52:36 UTC
Right, I'd been speculating that we were calling want_inline_self_recursive_call_p for all self-recursive calls and that maybe the hot edge or other heuristic was doing something unexpected.

Anyway, I'll let you run with it.  You'll be far more efficient than I nailing this down.  I only started looking at it because it was one of the P1s with no activity for the longest time.

jeff
Comment 22 Jakub Jelinek 2014-02-18 06:37:49 UTC
So, shall we just apply #c15 here?
Comment 23 David Binderman 2014-02-18 08:26:24 UTC
(In reply to Jakub Jelinek from comment #22)
> So, shall we just apply #c15 here?

Diff works fine for me for over five weeks now, so I say yes.
Comment 24 Jan Hubicka 2014-02-20 01:32:54 UTC
> > So, shall we just apply #c15 here?
> 
> Diff works fine for me for over five weeks now, so I say yes.

It is easier to just return at beggining instead of duplicating the check.  Have patch for it, just for some reason
I wanted to look deper into why we inline here.  I forgot the reason, but will work it out again today.

Honza
Comment 25 Jan Hubicka 2014-02-20 02:52:28 UTC
> It is easier to just return at beggining instead of duplicating the check. 
> Have patch for it, just for some reason
> I wanted to look deper into why we inline here.  I forgot the reason, but will
> work it out again today.
OK, remember the reason, the inlining decisions did not make sense :)

It turned out to be very basic and apparently ages old profile updating but in
clone_inlined_nodes. This function is used when one inlines call a->b
and it produces clone of b to be inlined into a.  It updates frequencies in
the clone, so if the edge is, say, executing twice per invocation of a,
the frequencies in b are doubled.

Now it may happen that b has inlined functions to it. In that case the function
recurses and inlines further. It stil updates frequencies on the same basis:
i.e. it takes average number of executions of edge per invocation and multiply.
This is completely wrong.  If there is chain b->c->d->e all inlined and c is
executed at average 10 times per execution of b and c->d, the the frequenceis
of edge c->d->e are already scaled up 10fold.
The current logic sees edge c->d already multiplied and scale again causing
quite an explosion in frequencies.

Bootstrapping/regtesting x86_64-linux.

Index: ipa-inline.c
===================================================================
--- ipa-inline.c	(revision 207870)
+++ ipa-inline.c	(working copy)
@@ -708,6 +684,12 @@
   if (outer_node->global.inlined_to)
     caller_freq = outer_node->callers->frequency;
 
+  if (!caller_freq)
+    {
+      reason = "function is inlined and ulikely";
+      want_inline = false;
+    }
+
   if (!want_inline)
     ;
   /* Inlining of self recursive function into copy of itself within other function
Index: ipa-inline.h
===================================================================
--- ipa-inline.h	(revision 207870)
+++ ipa-inline.h	(working copy)
@@ -233,7 +234,8 @@
 /* In ipa-inline-transform.c  */
 bool inline_call (struct cgraph_edge *, bool, vec<cgraph_edge_p> *, int *, bool);
 unsigned int inline_transform (struct cgraph_node *);
-void clone_inlined_nodes (struct cgraph_edge *e, bool, bool, int *);
+void clone_inlined_nodes (struct cgraph_edge *e, bool, bool, int *,
+			  int freq_scale = -1);
 
 extern int ncalls_inlined;
 extern int nfunctions_inlined;
Index: ipa-inline-transform.c
===================================================================
--- ipa-inline-transform.c	(revision 207870)
+++ ipa-inline-transform.c	(working copy)
@@ -127,11 +127,16 @@
    the edge and redirect it to the new clone.
    DUPLICATE is used for bookkeeping on whether we are actually creating new
    clones or re-using node originally representing out-of-line function call.
-   */
+   By default the offline copy is removed, when it appers dead after inlining.
+   UPDATE_ORIGINAL prevents this transformation.
+   If OVERALL_SIZE is non-NULL, the size is updated to reflect the
+   transformation.
+   FREQ_SCALE is implicit parameter used for internal bookeeping when
+   recursively copying functions inlined into the clone.  */
 
 void
 clone_inlined_nodes (struct cgraph_edge *e, bool duplicate,
-		     bool update_original, int *overall_size)
+		     bool update_original, int *overall_size, int freq_scale)
 {
   struct cgraph_node *inlining_into;
   struct cgraph_edge *next;
@@ -171,12 +176,16 @@
 	  duplicate = false;
 	  e->callee->externally_visible = false;
           update_noncloned_frequencies (e->callee, e->frequency);
+	  gcc_assert (freq_scale == -1);
 	}
       else
 	{
 	  struct cgraph_node *n;
+
+	  if (freq_scale == -1)
+	    freq_scale = e->frequency;
 	  n = cgraph_clone_node (e->callee, e->callee->decl,
-				 e->count, e->frequency, update_original,
+				 e->count, freq_scale, update_original,
 				 vNULL, true, inlining_into);
 	  cgraph_redirect_edge_callee (e, n);
 	}
@@ -191,7 +200,7 @@
     {
       next = e->next_callee;
       if (!e->inline_failed)
-        clone_inlined_nodes (e, duplicate, update_original, overall_size);
+        clone_inlined_nodes (e, duplicate, update_original, overall_size, freq_scale);
       if (e->speculative && !speculation_useful_p (e, true))
 	{
 	  cgraph_resolve_speculation (e, NULL);
Comment 26 Jakub Jelinek 2014-02-20 06:15:05 UTC
Thanks, but please fix the typoes:
s/ulikely/unlikely/
s/appers/appears/
s/bookeeping/bookkeeping/
?
Comment 27 Markus Trippelsdorf 2014-02-20 06:22:34 UTC
(In reply to Jakub Jelinek from comment #26)
> Thanks, but please fix the typoes:

s/typoes/typos
Comment 28 Jakub Jelinek 2014-02-20 06:25:12 UTC
(In reply to Markus Trippelsdorf from comment #27)
> s/typoes/typos

;)
Comment 29 Jan Hubicka 2014-02-20 06:40:38 UTC
Author: hubicka
Date: Thu Feb 20 06:40:07 2014
New Revision: 207934

URL: http://gcc.gnu.org/viewcvs?rev=207934&root=gcc&view=rev
Log:

	PR ipa/58555
	* ipa-inline-transform.c (clone_inlined_nodes): Add freq_scale parameter
	specifying the scaling.
	(inline_call): Update.
	(want_inline_recursively): Guard division by zero.
	(recursive_inlining): Update.
	* ipa-inline.h (clone_inlined_nodes): Update.
	* testsuite/g++.dg/torture/pr58555.C: New testcase.

Added:
    trunk/gcc/testsuite/g++.dg/torture/pr58555.C
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ipa-inline-transform.c
    trunk/gcc/ipa-inline.c
    trunk/gcc/ipa-inline.h
    trunk/gcc/testsuite/ChangeLog
Comment 30 Markus Trippelsdorf 2014-02-20 07:43:03 UTC
Fixed. Thanks.