Bug 40761

Summary: IRA memory hog for insanely nested loops
Product: gcc Reporter: Paolo Bonzini <bonzini>
Component: rtl-optimizationAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: gcc-bugs, jakub, jeffreyalaw, tkoenig, vmakarov, vmakarov
Priority: P2 Keywords: compile-time-hog, ice-on-valid-code, memory-hog
Version: 4.5.0   
Target Milestone: ---   
Host: Target:
Build: Known to work: 4.3.0
Known to fail: 4.5.0 Last reconfirmed: 2012-01-04 00:00:00
Bug Depends on:    
Bug Blocks: 47344    

Description Paolo Bonzini 2009-07-15 06:14:54 UTC
#define ONE     while (b())
#define TEN     ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE
#define HUN     TEN TEN TEN TEN TEN TEN TEN TEN TEN TEN
#define THOU    HUN HUN HUN HUN HUN HUN HUN HUN HUN HUN

void foo()
{
  /* 11,000 nested whiles.  */
  THOU THOU THOU THOU THOU THOU THOU THOU THOU THOU THOU
  a();
}

shows IRA taking 82% of compile-time wall time (25 seconds here at -O0) and needing 1 GB of memory.  The reason why I'm reporting this, is that it seems to be quadratic or at least well superlinear.  Changing it to 5500 loops requires only 300 MB.

Not that it's the most important bug.
Comment 1 Steven Bosscher 2009-07-15 07:21:07 UTC
Ack.
Comment 2 Richard Biener 2009-07-15 09:56:14 UTC
Is it the nesting of loops or really the number of function calls that is important?
Comment 3 Paolo Bonzini 2009-07-15 11:11:52 UTC
do while does not have the same behavior, so the loop "shape" is important.  the following is as bad and does not have function calls.

#define ONE     while (x-- > y)
#define TEN     ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE
#define HUN     TEN TEN TEN TEN TEN TEN TEN TEN TEN TEN
#define THOU    HUN HUN HUN HUN HUN HUN HUN HUN HUN HUN

void foo(int x, int y)
{
  /* 11,000 nested whiles.  */
  THOU THOU THOU THOU THOU THOU THOU THOU THOU THOU THOU
  y++;
}

Comment 4 Thomas Koenig 2010-03-01 19:27:36 UTC
This now causes an ICE:

ig25@linux-fd1f:/tmp> cat haha.c
#define ONE     while (b())
#define TEN     ONE ONE ONE ONE ONE ONE ONE ONE ONE ONE
#define HUN     TEN TEN TEN TEN TEN TEN TEN TEN TEN TEN
#define THOU    HUN HUN HUN HUN HUN HUN HUN HUN HUN HUN

void foo()
{
  /* 1,000 nested whiles.  */
  THOU
  a();
}
ig25@linux-fd1f:/tmp> gcc -O3 haha.c
haha.c: In Funktion »foo«:
haha.c:6:6: interner Compiler-Fehler: in compute_antic, bei tree-ssa-pre.c:2557
Bitte senden Sie einen vollständigen Fehlerbericht auf Englisch ein;
bearbeiten Sie die Quellen zunächst mit einem Präprozessor, wenn es
dienlich ist.
Fehler in der deutschen Übersetzung sind an translation-team-de@lists.sourceforge.net zu melden.

Gehen Sie gemäß den Hinweisen in <http://gcc.gnu.org/bugs.html> vor.

Backtrace:

Breakpoint 1, fancy_abort (file=0xd791f0 "../../trunk/gcc/tree-ssa-pre.c", line=2557,
    function=0xd799ed "compute_antic") at ../../trunk/gcc/diagnostic.c:728
728     {
(gdb) up
#1  0x00000000008ab23c in compute_antic () at ../../trunk/gcc/tree-ssa-pre.c:2557
2557          gcc_assert (num_iterations < 500);

The gcc_assert appears to be triggered for this case was introduced
by jakub:

118169    dberlin       }
137036      jakub #ifdef ENABLE_CHECKING
118821    dberlin       /* Theoretically possible, but *highly* unlikely.  */
137036      jakub       gcc_assert (num_iterations < 500);
137036      jakub #endif
 81764   dnovillo     }


Resetting to P3 for new triage, as this is now ice-on-valid-code.
Comment 5 Richard Biener 2010-03-03 15:42:09 UTC
We shouldn't have deferred BB31 in the first iteration (with just TEN):

<bb 2>:
  goto <bb 13>;

...
<bb 31>:

<bb 13>:
  D.2759_1 = b ();
  if (D.2759_1 != 0)
    goto <bb 32>;
  else
    goto <bb 14>;

<bb 32>:
  goto <bb 12>;

<bb 14>:
  return;

Starting iteration 0
...
Block 31 was deferred for a future iteration.
ANTIC_OUT[14] := {  }
ANTIC_IN[14] := {  }
S[14] := {  }
ANTIC_OUT[13] := {  }
ANTIC_IN[13] := {  }
S[13] := {  }
ANTIC_OUT[2] := {  }
ANTIC_IN[2] := {  }
S[2] := {  }
Starting iteration 1

which means we are iterating in non-optimal order - which is postorder.
It looks like post_order_compute computes BS.
Comment 6 Richard Biener 2010-03-03 15:59:09 UTC
Mine.
Comment 7 Richard Biener 2010-03-04 13:25:43 UTC
Subject: Bug 40761

Author: rguenth
Date: Thu Mar  4 13:25:27 2010
New Revision: 157225

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=157225
Log:
2010-03-04  Richard Guenther  <rguenther@suse.de>

	PR tree-optimization/40761
	* tree-ssa-pre.c (compute_antic): Walk reverse postorder
	in reverse order.
	(my_rev_post_order_compute): New function.
	(init_pre): Call it.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-ssa-pre.c

Comment 8 Richard Biener 2010-03-04 13:25:52 UTC
PRE issue fixed for 4.5, still latent on the 4.4 branch.
Comment 9 Thomas Koenig 2010-03-04 19:40:06 UTC
Should we also commit the test case from #4 ?
Comment 10 Richard Biener 2010-03-05 10:17:27 UTC
(In reply to comment #9)
> Should we also commit the test case from #4 ?

No.  It's too slow.
Comment 11 Richard Biener 2010-04-06 11:20:29 UTC
GCC 4.5.0 is being released.  Deferring to 4.5.1.
Comment 12 Richard Biener 2010-07-31 09:29:34 UTC
GCC 4.5.1 is being released, adjusting target milestone.
Comment 13 Richard Biener 2010-12-16 13:03:40 UTC
GCC 4.5.2 is being released, adjusting target milestone.
Comment 14 Richard Biener 2011-04-28 14:51:50 UTC
GCC 4.5.3 is being released, adjusting target milestone.
Comment 15 Richard Biener 2012-01-16 14:22:49 UTC
Blocks the memory-hog/compile-time-hog regression.  Removing regression
markers.
Comment 16 Vladimir Makarov 2012-01-18 22:01:11 UTC
I'll work on it.
Comment 17 Vladimir Makarov 2012-01-19 20:42:57 UTC
  The problem was in building CFG loops which took the most of time.   CFG loops were built even if we don't use regional allocation as for -O0.

  I'll send a patch soon.  It is not small because IRA in any case uses one region with CFG loop representing the whole function.
Comment 18 Vladimir Makarov 2012-01-19 20:46:48 UTC
Author: vmakarov
Date: Thu Jan 19 20:46:31 2012
New Revision: 183312

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=183312
Log:
2012-01-19  Vladimir Makarov  <vmakarov@redhat.com>

	PR rtl-optimization/40761
	* ira-int.h (struct ira_loop_tree_node): Add comment for member
	loop.  Add new member loop_num.
	(IRA_LOOP_NODE_BY_INDEX): Modify the check.
	(ira_build): Remove the parameter.

	* ira.c (ira_print_disposition): Use loop_num instead of
	loop->num.
	(ira.c): Do not build CFG loops for one region allocation.  Remove
	argument from ira_build call.

	* ira-build.c (init_loop_tree_node): New function.
	(create_loop_tree_nodes): Use it.  Separate the case when CFG
	loops are not built.
	(more_one_region_p): Check current_loops.
	(finish_loop_tree_nodes): Separate the case when CFG loops are not
	built.
	(add_loop_to_tree): Process loop equal to NULL too.
	(form_loop_tree): Separate the case when CFG loops are not built.
	Use explicitly number for the root.
	(rebuild_regno_allocno_maps, create_loop_tree_node_allocnos): Add
	an assertion.
	(ira_print_expanded_allocno, loop_compare_func): Use loop_num
	instead of loop->num.
	(mark_loops_for_removal): Ditto.  Use loop_num instead of
	loop->num.
	(mark_all_loops_for_removal): Ditto.
	(remove_unnecessary_regions): Separate the case when CFG loops
	are not built.
	(ira_build): Remove the parameter.  Use explicit number of regions
	when CFG loops are not built.

	* ira-color.c (print_loop_title): Separate the case for the root
	node.  Use loop_num instead of loop->num.
	(move_spill_restore): Use loop_num instead of loop->num.

	* ira-emit.c (setup_entered_from_non_parent_p): Add an assertion.
	(change_loop): Ditto.
	(change_loop): Use loop_num instead of loop->num.

	* ira-lives.c (process_bb_node_lives): Ditto.

	* ira-costs.c (print_allocno_costs, find_costs_and_classes):
	Ditto.

	* ira-conflicts.c (print_allocno_conflicts): Ditto.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/ira-build.c
    trunk/gcc/ira-color.c
    trunk/gcc/ira-conflicts.c
    trunk/gcc/ira-costs.c
    trunk/gcc/ira-emit.c
    trunk/gcc/ira-int.h
    trunk/gcc/ira-lives.c
    trunk/gcc/ira.c
Comment 19 Jeffrey A. Law 2012-04-27 03:12:46 UTC
See c#18.  This issue has been resolved.