Bug 17503

Summary: quadratic behaviour in invalid_mode_change_p
Product: gcc Reporter: Daniel Jacobowitz <drow>
Component: middle-end Assignee: Richard Henderson <rth>
Status: RESOLVED FIXED    
Severity: normal CC: gcc-bugs, jozef.kruger
Priority: P2 Keywords: compile-time-hog, patch
Version: 4.0.0   
Target Milestone: 3.3.6   
Host: i686-pc-linux-gnu Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu Known to work:
Known to fail: Last reconfirmed: 2004-09-21 15:21:21
Attachments: possible patch

Description Daniel Jacobowitz 2004-09-15 16:50:51 UTC
Compiling insn-attrtab.i with compilers from a few weeks ago, this function is
way down the profile.  Now it's at the very top.

                               :bool
                               :invalid_mode_change_p (unsigned int regno, enum reg_class class,
                               :                       enum machine_mode from_mode)
   625  0.1102    55  0.5528   :{ /* invalid_mode_change_p total: 117807 20.7631 4747 47.7085 */
                               :  enum machine_mode to_mode;
                               :  int n;
   113  0.0199    17  0.1709   :  int start = regno * MAX_MACHINE_MODE;
                               :
116873 20.5985  4675 46.9849   :  EXECUTE_IF_SET_IN_BITMAP (&subregs_of_mode, start, n,
                               :    if (n >= MAX_MACHINE_MODE + start)
                               :      return 0;
                               :    to_mode = n - start;
                               :    if (CANNOT_CHANGE_MODE_CLASS (from_mode, to_mode, class))
                               :      return 1;
                               :  );
                               :  return 0;
   196  0.0345     0 0.0e+00   :}

I am not positive whether the function is being called more, or just spending
more time in the bitmap, but it looks like the latter - we're creating many more
entries in subregs_of_mode.  I selected a random call to invalid_mode_change_p,
and compared the size of the bitmap in both compilers.  In the older compiler,
the high entry in the bitmap was around 10,000 bits; in the newer, around 450,000.
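
To make the growth concrete, here is a minimal sketch of the indexing, not GCC code. TOY_MAX_MODE and TOY_CHUNK_BITS are assumed illustrative values (the real MAX_MACHINE_MODE is target-dependent), and the helper names are hypothetical. It assumes a sparse bitmap kept as a list of fixed-width chunks, where reaching a bit from the head of the list costs roughly one step per preceding chunk.

```c
/* Assumed stand-in for MAX_MACHINE_MODE; the real value depends on the target. */
enum { TOY_MAX_MODE = 40 };

/* Assumed chunk width of a toy list-based sparse bitmap. */
enum { TOY_CHUNK_BITS = 128 };

/* First bit of a register's window, mirroring `regno * MAX_MACHINE_MODE`. */
static unsigned long window_start(unsigned long regno)
{
    return regno * TOY_MAX_MODE;
}

/* Index of the chunk holding `bit`.  If every earlier chunk is populated
   and a lookup starts from the head, it passes this many chunks first. */
static unsigned long chunk_of(unsigned long bit)
{
    return bit / TOY_CHUNK_BITS;
}
```

With these assumed values, register 59 starts at bit 2360 (chunk 18) and register 11374 starts at bit 454960 (chunk 3554), so each lookup for a high-numbered pseudo has far more of the bitmap in front of it.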

The newer compiler was updated this morning.
Comment 1 Andrew Pinski 2004-09-15 17:49:33 UTC
Most likely caused by:
2004-09-14  Roger Sayle  <roger@eyesopen.com>

        PR rtl-optimization/9771
        * regclass.c (CALL_REALLY_USED_REGNO_P): New macro to eliminate
        conditional compilation in init_reg_sets_1.
        (init_reg_sets_1): Let global_regs[i] take priority over the frame
        (but not stack) pointer exceptions to regs_invalidated_by_call.
        (globalize_reg): Globalizing a fixed register may need to update
        regs_invalidated_by_call.
Comment 2 roger 2004-09-15 19:09:02 UTC
I find it extremely unlikely that the patch mentioned in comment #1 could have
any performance impact, even on code that declares register variables which
should be the only observable functionality change.

Daniel, I wonder whether you could re-run your profiles after reverting my
patch and confirm there's no performance change, i.e. that it's innocent?
Comment 3 Daniel Jacobowitz 2004-09-20 19:23:26 UTC
Yes, Roger's patch is innocent.

Richard, I haven't narrowed it down to one patch yet, but it's down to the
six-hour window in which you committed the patch for PR 9997.  That's my
suspect.  There is no substantive difference in the RTL diffs except that
numbers for pseudoregisters get larger:
-(insn 9 8 10 1 (set (reg:SI 59 [ T.665 ])
+(insn 9 8 10 1 (set (reg:SI 11374 [ T.665 ])
         (mem/s/j:SI (plus:SI (reg/f:SI 11377)

and stack offsets fluctuate a bit.

I'll verify that I've pegged the right patch.  Do we need to do this at -O0?
Comment 4 Daniel Jacobowitz 2004-09-20 19:54:15 UTC
Yes, definitely caused by that patch.
Comment 5 Richard Henderson 2004-09-20 21:41:00 UTC
"Huge"?  Ok, so the thing's now at the top of the profile, but it still only uses
four (4) seconds of cpu time, so fixing it isn't going to help *that* much.  I've
reduced the severity of the pr.

I suspect that you can only see this effect at -O0 with Truly Large synthetic
functions, as seen in insn-attrtab.c.  So I am caring only a little at the moment.

Note that the "regression" comes not from the stack slot sharing per se, but
from actually honoring use_register_for_decl, which is true for scalar temporaries.
Comment 6 Richard Henderson 2004-09-20 22:16:51 UTC
Created attachment 7182 [details]
possible patch

Given that you seem to care, Daniel, please try this patch on a number of
platforms.  At minimum, i386 and ia64 would be good.
Comment 7 Daniel Jacobowitz 2004-09-21 15:21:18 UTC
Thanks.  Patch bootstrapped on ia64-hpux, i386-linux, and powerpc-linux with no
regressions.  Clears up the performance spike for cc1-i-files at -O0 and
decreases page faults on some other testing by a (very) small amount.
Comment 8 GCC Commits 2004-09-24 19:47:16 UTC
Subject: Bug 17503

CVSROOT:	/cvs/gcc
Module name:	gcc
Changes by:	rth@gcc.gnu.org	2004-09-24 19:47:06

Modified files:
	gcc            : ChangeLog combine.c flow.c regclass.c regs.h 
	                 rtl.h 

Log message:
	PR rtl-opt/17503
	* regclass.c (subregs_of_mode): Turn into an htab.  Make static.
	(som_hash, som_eq): New.
	(init_subregs_of_mode, record_subregs_of_mode): New.
	(cannot_change_mode_set_regs): Rewrite for htab implementation.
	(invalid_mode_change_p): Likewise.
	* combine.c (gen_lowpart_for_combine): Use record_subregs_of_mode.
	* flow.c (mark_used_regs): Likewise.
	(life_analysis): Use init_subregs_of_mode.
	* regs.h (subregs_of_mode): Remove.
	* rtl.h (init_subregs_of_mode, record_subregs_of_mode): Declare.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&r1=2.5606&r2=2.5607
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/combine.c.diff?cvsroot=gcc&r1=1.454&r2=1.455
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/flow.c.diff?cvsroot=gcc&r1=1.597&r2=1.598
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/regclass.c.diff?cvsroot=gcc&r1=1.197&r2=1.198
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/regs.h.diff?cvsroot=gcc&r1=1.35&r2=1.36
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/rtl.h.diff?cvsroot=gcc&r1=1.509&r2=1.510
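
For readers without the diffs to hand, the general idea of swapping one large bitmap for a hash table keyed by register can be sketched as follows. This is an illustration under stated assumptions, not the committed code: every toy_* name, the bucket count, the mode count, and the stand-in predicate are hypothetical, and the real patch uses its own htab with som_hash and som_eq.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Assumed sizes, chosen only for illustration. */
enum { TOY_MAX_MODE = 40, TOY_BUCKETS = 1024 };

/* One node per register that has at least one recorded subreg mode. */
struct toy_node {
    unsigned regno;
    unsigned char modes[(TOY_MAX_MODE + 7) / 8];  /* one bit per mode */
    struct toy_node *next;                        /* bucket chain */
};

static struct toy_node *toy_table[TOY_BUCKETS];

/* Look up the node for `regno`; cost depends on chain length, not on regno. */
static struct toy_node *toy_find(unsigned regno)
{
    struct toy_node *n = toy_table[regno % TOY_BUCKETS];
    while (n != NULL && n->regno != regno)
        n = n->next;
    return n;
}

/* Record that `regno` was accessed in `mode`.  Returns false on a bad
   mode or allocation failure. */
static bool toy_record(unsigned regno, unsigned mode)
{
    if (mode >= TOY_MAX_MODE)
        return false;
    struct toy_node *n = toy_find(regno);
    if (n == NULL) {
        n = calloc(1, sizeof *n);
        if (n == NULL)
            return false;
        n->regno = regno;
        n->next = toy_table[regno % TOY_BUCKETS];
        toy_table[regno % TOY_BUCKETS] = n;
    }
    n->modes[mode / 8] |= (unsigned char)(1u << (mode % 8));
    return true;
}

/* Hypothetical stand-in for CANNOT_CHANGE_MODE_CLASS: the toy rule
   forbids changes between odd-numbered and even-numbered modes. */
static bool toy_cannot_change(unsigned from_mode, unsigned to_mode)
{
    return (from_mode % 2) != (to_mode % 2);
}

/* True if any mode recorded for `regno` conflicts with `from_mode`. */
static bool toy_invalid_mode_change_p(unsigned regno, unsigned from_mode)
{
    struct toy_node *n = toy_find(regno);
    if (n == NULL)
        return false;
    for (unsigned to = 0; to < TOY_MAX_MODE; to++)
        if ((n->modes[to / 8] & (1u << (to % 8)))
            && toy_cannot_change(from_mode, to))
            return true;
    return false;
}
```

The point of the design is that a query touches one bucket chain plus a fixed-size mode set, so its cost no longer grows with the register number the way a scan toward bit regno * MAX_MACHINE_MODE does.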

Comment 9 Richard Henderson 2004-09-24 20:00:06 UTC
Fixed.
Comment 10 GCC Commits 2004-10-12 23:35:48 UTC
Subject: Bug 17503

CVSROOT:	/cvs/gcc
Module name:	gcc
Branch: 	gcc-3_4-branch
Changes by:	rth@gcc.gnu.org	2004-10-12 23:35:39

Modified files:
	gcc            : ChangeLog combine.c flow.c regclass.c regs.h 
	                 rtl.h 

Log message:
	PR rtl-opt/17503
	* regclass.c (subregs_of_mode): Turn into an htab.  Make static.
	(som_hash, som_eq): New.
	(init_subregs_of_mode, record_subregs_of_mode): New.
	(cannot_change_mode_set_regs): Rewrite for htab implementation.
	(invalid_mode_change_p): Likewise.
	* combine.c (gen_lowpart_for_combine): Use record_subregs_of_mode.
	* flow.c (mark_used_regs): Likewise.
	(life_analysis): Use init_subregs_of_mode.
	* regs.h (subregs_of_mode): Remove.
	* rtl.h (init_subregs_of_mode, record_subregs_of_mode): Declare.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=2.2326.2.653&r2=2.2326.2.654
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/combine.c.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.400.4.10&r2=1.400.4.11
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/flow.c.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.572.4.2&r2=1.572.4.3
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/regclass.c.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.183.4.1&r2=1.183.4.2
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/regs.h.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.31&r2=1.31.4.1
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/rtl.h.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.448.4.6&r2=1.448.4.7

Comment 11 Richard Henderson 2004-10-13 16:59:23 UTC
*** Bug 16834 has been marked as a duplicate of this bug. ***
Comment 12 GCC Commits 2004-12-04 00:37:25 UTC
Subject: Bug 17503

CVSROOT:	/cvs/gcc
Module name:	gcc
Branch: 	gcc-3_3-branch
Changes by:	rth@gcc.gnu.org	2004-12-04 00:36:41

Modified files:
	gcc            : ChangeLog combine.c flow.c regclass.c regs.h 
	                 rtl.h 

Log message:
	PR rtl-opt/17503
	* regclass.c (subregs_of_mode): Turn into an htab.  Make static.
	(som_hash, som_eq): New.
	(init_subregs_of_mode, record_subregs_of_mode): New.
	(cannot_change_mode_set_regs): Rewrite for htab implementation.
	(invalid_mode_change_p): Likewise.
	* combine.c (gen_lowpart_for_combine): Use record_subregs_of_mode.
	* flow.c (mark_used_regs): Likewise.
	(life_analysis): Use init_subregs_of_mode.
	* regs.h (subregs_of_mode): Remove.
	* rtl.h (init_subregs_of_mode, record_subregs_of_mode): Declare.

Patches:
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.16114.2.1031&r2=1.16114.2.1032
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/combine.c.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.325.2.17&r2=1.325.2.18
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/flow.c.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.541.2.6&r2=1.541.2.7
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/regclass.c.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.160.4.7&r2=1.160.4.8
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/regs.h.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.26.4.2&r2=1.26.4.3
http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/gcc/rtl.h.diff?cvsroot=gcc&only_with_tag=gcc-3_3-branch&r1=1.375.2.8&r2=1.375.2.9