[Bug tree-optimization/21734] [4.1 regression] ICE: -ftree-vectorize, segfault

dorit at il dot ibm dot com gcc-bugzilla@gcc.gnu.org
Wed Jun 1 13:22:00 GMT 2005


------- Additional Comments From dorit at il dot ibm dot com  2005-06-01 13:22 -------
> Note that if !new_ssa_name, we continue the loop without ever
> adding the PHI argument.  The net result being that we have a
> PHI where PHI_ARG_DEF for one of the PHI's incoming edges is null.
> 
> I'm pretty sure that this can only happen if the result of the
> PHI is not set anywhere in the loop.  

you're right, this is exactly what we have here:

before loop duplication we have:
   m2 = phi <init: m15, latch: m15>

The phi is dead, and has no defs in the loop, which, as you identified, results 
in the fact that it doesn't have a current_def set, and then in the duplicated 
loop we have:
   m24 = phi <init: m15, latch: NULL>

> In that case the PHI
> argument in question should be the same SSA_NAME as the PHI_RESULT
> [ ie, we ultimately end up generating a degenerate phi of the form
> 
> 
>   x_3 = PHI (x_3 (latch edge), x_2 (initial value from entry edge))
> 
> 

Indeed applying the following patch, which does exactly that, solves the 
problem:

Index: tree-vectorizer.c
===================================================================
RCS file: /cvs/gcc/gcc/gcc/tree-vectorizer.c,v
retrieving revision 2.91
diff -u -3 -p -r2.91 tree-vectorizer.c
--- tree-vectorizer.c   26 May 2005 18:14:48 -0000      2.91
+++ tree-vectorizer.c   1 Jun 2005 13:11:01 -0000
@@ -1,3 +1,4 @@
+
 /* Loop Vectorization
    Copyright (C) 2003, 2004, 2005 Free Software Foundation, Inc.
    Contributed by Dorit Naishlos <dorit@il.ibm.com>
@@ -321,8 +322,11 @@ slpeel_update_phis_for_duplicate_loop (s

       new_ssa_name = get_current_def (def);
       if (!new_ssa_name)
-        /* Something defined outside of the loop.  */
-        continue;
+       {
+          /* This only happens if there are no definitions
+            inside the loop. use the phi_result in this case.  */
+         new_ssa_name = PHI_RESULT (phi_new);
+       }

       /* An ordinary ssa name defined in the loop.  */
       add_phi_arg (phi_new, new_ssa_name, loop_latch_edge (new_loop));
@@ -566,7 +570,12 @@ slpeel_update_phi_nodes_for_guard1 (edge
       else
         {
           current_new_name = get_current_def (loop_arg);
-          gcc_assert (current_new_name);
+         /* current_def is not available only if the variable does not
+            change inside the loop, in which case we also don't care
+            about recording a current_def for it because we won't be
+            trying to create loop-exit-phis for it.  */
+         if (!current_new_name)
+           continue;
         }



> 
> What I don't know yet is if the problem is really that we haven't
> set up the current def properly (thus causing get_current_def to
> return NULL) or if we just need code to compensate for this
> situation in slpeel_update_phis_for_duplicate_loop.
> 
> Thoughts?
>

I don't know which current_def would make sense to set for this phi, if at all.

It originally was:
   m2 = phi <init: m15, latch: m16>
   m16 = <v_may_def m2>

after t41.alias4 it became:
   m2 = phi <init: m15, latch: m2>

and after t44.store_ccp it got to its current form:
   m2 = phi <init: m15, latch: m15>

The best thing would be to detect such redundant phis and clean them up, and in 
the vectorizer work under the assumption that they don't exit. The code to do 
peeling would be cleaner (not having to consider these special cases), and we 
would generate much less code (see below how many phis we end up generating 
when peeling before and after this loop). By the way, all these garbage phis do 
get eliminated later on, by dce (at t68.cd_dce). Calling dce just before loop 
optimizations or just before the vectorizer also solved the problem. We can 
actually also detect invariant/dead phis at the beginning of the vectorizer (it 
will be pretty much for free cause we examine all phis and their uses anyhow. 
might as well get rid of them). In the meantime, I'll test the patch above.

FYI, when applying the patch above, the resulting code that we generate is as 
shown below:

==========================================
>>>before:

orig_loop:
  m2 = phi<init: m15, latch: m15>


>>>after:

   if C1 goto new_prolog_loop
   else  goto bb1

new_prolog_loop (dup):
   m24 = phi<init: m15, latch: m24>
loop_exit:
   m34 = phi <m24>
   if C2 goto bb1
   else  goto bb3

bb1:
   m33 = phi <m15, m34>
   if C3 goto orig_loop
   else  goto bb2

orig_loop:
  m2 = phi<init: m33, latch: m15>
loop_exit:
  m54 = phi<m15>
  if C4 goto bb2 
  else  goto bb4

bb2:
  m53 = phi<m33, m54>

new_epilog_loop (dup):
  m44 = phi <init: m53, latch: m44>
loop exit:

bb4:

bb3:
==========================================


-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21734



More information about the Gcc-bugs mailing list