RFA: patch to fix IRA degradation on SPEC2006 calculix

H.J. Lu hjl.tools@gmail.com
Wed Nov 19 15:30:00 GMT 2008


On Tue, Nov 18, 2008 at 10:35:27PM -0800, H.J. Lu wrote:
> On Tue, Nov 18, 2008 at 10:30 PM, H.J. Lu <hjl.tools@gmail.com> wrote:
> > On Tue, Nov 18, 2008 at 7:45 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
> >> H.J. reported huge degradation of IRA on calculix on x86_64
> >>
> >> http://gcc.gnu.org/ml/gcc-patches/2008-11/msg00812.html
> >>
> >> Calculix has one hot function (about 80% of all program time) e_c3d
> >> with small most frequently executed loop.  One accidental spill in the
> >> loop results in 30% degradation.  The degradation occurred after
> >> submitting patch introducing bad spills processing.  I missed another
> >> checks of bad spills in allocno comparison when all previous high
> >> priority criteria are the same (see
> >> ira-color.c::push_allocnos_to_stack).
> >>
> >> I also found that the hot function in calculix has more 50 loops,
> >> therefore IRA decides to use one region allocation.  Instead of such
> >> approach, I change the code to make no more ira-max-loops-num regions
> >> (most frequenly used ones) instead of one region.  I also changed
> >> ira-max-loops-num value to 100 (there are still functions in calculix
> >> which contains more 100 loops) because it looks like fortran functions
> >> have many implicit loops (from vector operations).  It permits to
> >> improve calculix score a bit more.
> >>
> >> The rest of the patch is a code printing more info which I used for
> >> debugging.
> >>
> >> The patch was successfully bootstrapped on x86_64.
> >>
> >> Is it ok to commit?
> >
> > It doesn't bootstrap Linux/x86-64:
> >
> > [hjl@gnu-27 src]$ /export/build/gnu/gcc/build-x86_64-linux/./gcc/xgcc
> > -shared-libgcc -B/export/build/gnu/gcc/build-x86_64-linux/./gcc
> > -nostdinc++ -L/export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/libstdc++-v3/src
> > -L/export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/libstdc++-v3/src/.libs
> > -B/usr/gcc-4.4/x86_64-unknown-linux-gnu/bin/
> > -B/usr/gcc-4.4/x86_64-unknown-linux-gnu/lib/ -isystem
> > /usr/gcc-4.4/x86_64-unknown-linux-gnu/include -isystem
> > /usr/gcc-4.4/x86_64-unknown-linux-gnu/sys-include
> > -I/export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/libstdc++-v3/include/x86_64-unknown-linux-gnu
> > -I/export/build/gnu/gcc/build-x86_64-linux/x86_64-unknown-linux-gnu/libstdc++-v3/include
> > -I/net/gnu-13/export/gnu/src/gcc/gcc/libstdc++-v3/libsupc++
> > -fno-implicit-templates -Wall -Wextra -Wwrite-strings -Wcast-qual
> > -fdiagnostics-show-location=once -ffunction-sections -fdata-sections
> > -g -O2 -D_GNU_SOURCE -c
> > /net/gnu-13/export/gnu/src/gcc/gcc/libstdc++-v3/src/bitmap_allocator.cc
> >  -fPIC -DPIC -o .libs/bitmap_allocator.o
> > /net/gnu-13/export/gnu/src/gcc/gcc/libstdc++-v3/src/bitmap_allocator.cc:
> > In member function âsize_t* __gnu_cxx::free_list::_M_get(size_t)â:
> > /net/gnu-13/export/gnu/src/gcc/gcc/libstdc++-v3/src/bitmap_allocator.cc:103:
> > internal compiler error: Segmentation fault
> > Please submit a full bug report,
> > with preprocessed source if appropriate.
> > See <http://gcc.gnu.org/bugs.html> for instructions.
> > [hjl@gnu-27 src]$
> >
> > I used "--enable-clocale=gnu --with-system-zlib
> > --enable-checking=assert --with-demangler-in-ld --enable-shared
> > --enable-threads=posix --enable-haifa --prefix=/usr/gcc-4.4
> > --with-local-prefix=/usr/local" to configure gcc.
> >
> 
> Valgrind reports:
> 
> ==666== Conditional jump or move depends on uninitialised value(s)
> ==666==    at 0xA260CF: ira_build (ira-build.c:1697)
> ==666==    by 0xA21673: rest_of_handle_ira (ira.c:1788)
> ==666==    by 0x6387D8: execute_one_pass (passes.c:1279)
> ==666==    by 0x638A04: execute_pass_list (passes.c:1328)
> ==666==    by 0x638A1C: execute_pass_list (passes.c:1329)
> ==666==    by 0x708166: tree_rest_of_compilation (tree-optimize.c:418)
> ==666==    by 0x8260C3: cgraph_expand_function (cgraphunit.c:1047)
> ==666==    by 0x8277A4: cgraph_optimize (cgraphunit.c:1106)
> ==666==    by 0x450B1C: cp_write_global_declarations (decl2.c:3615)
> ==666==    by 0x6CE146: toplev_main (toplev.c:979)
> ==666==    by 0x50448B3: (below main) (in /lib64/libc-2.5.so)
> ==666==
> ==666== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 4 from 1)
> ==666== malloc/free: in use at exit: 993,009 bytes in 7,944 blocks.
> ==666== malloc/free: 373,081 allocs, 365,137 frees, 225,318,650 bytes allocated.
> ==666== For counts of detected errors, rerun with: -v
> ==666== searching for pointers to 7,944 not-freed blocks.
> ==666== checked 21,004,136 bytes.
> ==666==
> ==666== LEAK SUMMARY:
> ==666==    definitely lost: 144 bytes in 14 blocks.
> ==666==      possibly lost: 0 bytes in 0 blocks.
> ==666==    still reachable: 992,865 bytes in 7,930 blocks.
> ==666==         suppressed: 0 bytes in 0 blocks.
> 
> 

The patch has

 for (n = i = 0; VEC_iterate (loop_p, ira_loops.larray, i, loop); i++)
    {  
      if (ira_loop_nodes[i].parent == NULL)
        {  
          /* Don't remove the root.  */
          ira_loop_nodes[i].to_remove_p = false;
          continue;
        }
      sorted_loops[n++] = &ira_loop_nodes[i];
      ira_loop_nodes[i].to_remove_p
        = (low_pressure_loop_node_p (ira_loop_nodes[i].parent)
           && low_pressure_loop_node_p (&ira_loop_nodes[i]));
    }

But the "parent" field may not be initialized. This patch works
for me.

H.J.
----
Index: ChangeLog.ira
===================================================================
--- ChangeLog.ira	(revision 142002)
+++ ChangeLog.ira	(working copy)
@@ -1,3 +1,8 @@
+2008-11-19  H.J. Lu  <hongjiu.lu@intel.com>
+
+	* ira-build.c (create_loop_tree_nodes): Always initialize the
+	parent field.
+
 2008-11-18  H.J. Lu  <hongjiu.lu@intel.com>
 
 	* ira-color.c (pseudos_have_intersected_live_ranges_p): Define
Index: ira-build.c
===================================================================
--- ira-build.c	(revision 142002)
+++ ira-build.c	(working copy)
@@ -137,18 +137,22 @@ create_loop_tree_nodes (bool loops_p)
 		skip_p = true;
 		break;
 	      }
+	  if (!skip_p)
+	    {
+	      edges = get_loop_exit_edges (loop);
+	      for (j = 0; VEC_iterate (edge, edges, j, e); j++)
+		if ((e->flags & EDGE_ABNORMAL) && EDGE_CRITICAL_P (e))
+		  {
+		    skip_p = true;
+		    break;
+		  }
+	      VEC_free (edge, heap, edges);
+	    }
 	  if (skip_p)
-	    continue;
-	  edges = get_loop_exit_edges (loop);
-	  for (j = 0; VEC_iterate (edge, edges, j, e); j++)
-	    if ((e->flags & EDGE_ABNORMAL) && EDGE_CRITICAL_P (e))
-	      {
-		skip_p = true;
-		break;
-	      }
-	  VEC_free (edge, heap, edges);
-	  if (skip_p)
-	    continue;
+	    {
+	      ira_loop_nodes[i].parent = NULL;
+	      continue;
+	    }
 	}
       ira_loop_nodes[i].regno_allocno_map
 	= (ira_allocno_t *) ira_allocate (sizeof (ira_allocno_t) * max_regno);



More information about the Gcc-patches mailing list