gfortran -c -v -O3 -ffast-math -funroll-loops -ftree-vectorize -march=native -ffree-form bug.F

crashes with

bug.F: In function ‘check_dnucint_ana’:
bug.F:1: internal compiler error: in vectorizable_assignment, at tree-vect-transform.c:3671

for

 /data03/vondele/gcc_trunk/build/libexec/gcc/x86_64-unknown-linux-gnu/4.4.0/f951 /tmp/cc7t33Xt.f -march=k8-sse3 -mcx16 -msahf --param l1-cache-size=64 --param l1-cache-line-size=64 --param l2-cache-size=1024 -mtune=k8 -quiet -dumpbase bug.F -auxbase bug -O3 -version -ffast-math -funroll-loops -ftree-vectorize -ffree-form -fpreprocessed -fintrinsic-modules-path /data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.4.0/finclude -o /tmp/ccwqJEaQ.s
GNU Fortran (GCC) version 4.4.0 20080503 (experimental) [trunk revision 134897] (x86_64-unknown-linux-gnu)

This blocks compilation of CVS CP2K. A reduced testcase is:

SUBROUTINE check_dnucint_ana (dcore)
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp=8
  REAL(dp), DIMENSION(10, 2), INTENT(IN),&
       OPTIONAL :: dcore
  INTEGER :: i, j
  REAL(dp) :: delta, nssss, od, rn, ssssm, &
       ssssp
  REAL(dp), DIMENSION(10, 2) :: corem, corep, ncore
  LOGICAL :: check_value

  delta = 1.0E-8_dp
  od = 0.5_dp/delta
  ncore = od * (corep - corem)
  nssss = od * (ssssp - ssssm)
  IF (PRESENT(dcore)) THEN
     DO i = 1, 2
        DO j = 1, 10
           IF (.NOT.check_value(ncore(j,i), dcore(j,i), delta, 0.1_dp)) THEN
           END IF
        END DO
     END DO
  END IF
END SUBROUTINE check_dnucint_ana
I can't reproduce the ICE with revision 134926 on x86_64-linux, using the same flags. The loop in line 14 gets vectorized.

Ira
I can reproduce this on i686 with -O3 -mfpmath=sse -msse2 (r134902):

#1  0x086d592d in vectorizable_assignment (stmt=0xb7c9dcb0, bsi=0xbff3b824,
    vec_stmt=0xbff3b7c0, slp_node=0xab9ca08)
    at /home/richard/src/trunk/gcc/tree-vect-transform.c:3671
3671      gcc_assert (ncopies >= 1);
(gdb) print ncopies
$1 = 0
(gdb) call debug_generic_expr (stmt)
D.989_84 = ((D.988_83))

Hmm, I guess I missed a place to teach the vectorizer that PAREN_EXPR is a vectorizable_assignment.

(gdb) print *stmt_info
$4 = {type = assignment_vec_info_type, stmt = 0xb7c9dcb0,
  loop_vinfo = 0xab97730, relevant = vect_used_in_loop, live = 0 '\0',
  vectype = 0xb7c71208, vectorized_stmt = 0x0, data_ref_info = 0x0,
  dr_base_address = 0x0, dr_init = 0x0, dr_offset = 0x0, dr_step = 0x0,
  dr_aligned_to = 0x0, in_pattern_p = 0 '\0', related_stmt = 0x0,
  same_align_refs = 0xab77820, def_type = vect_loop_def, first_dr = 0x0,
  next_dr = 0x0, size = 0, store_count = 0, gap = 0, same_dr_stmt = 0x0,
  read_write_dep = 0 '\0', cost = {outside_of_loop = 0, inside_of_loop = 1},
  slp_type = pure_slp}
(gdb) print *loop_vinfo
$5 = {loop = 0xb7ca5688, bbs = 0xab67b98, num_iters = 0xb7c9d914,
  num_iters_unchanged = 0xb7c9d914, min_profitable_iters = 0,
  vectorizable = 1 '\001', vectorization_factor = 1, unaligned_dr = 0x0,
  peeling_for_alignment = 0, ptr_mask = 15, datarefs = 0xab7d710,
  ddrs = 0xab996e0, may_alias_ddrs = 0xab77858,
  may_misalign_stmts = 0xab6c960, loop_line_number = 0,
  strided_stores = 0xab6c080, slp_instances = 0xab7bcf8,
  slp_unrolling_factor = 1}
In my dump this stmt is vectorized ok:

bug.F:14: note: ------>vectorizing statement: D.1055_23 = ((D.1054_22))
bug.F:14: note: transform statement.
bug.F:14: note: vect_is_simple_use: operand ((D.1054_22))
bug.F:14: note: non-associatable copy.
bug.F:14: note: def_stmt: D.1054_22 = D.1051_19 - D.1053_21
bug.F:14: note: type of def: 3.
bug.F:14: note: transform assignment.
bug.F:14: note: vect_get_vec_def_for_operand: ((D.1054_22))
bug.F:14: note: vect_is_simple_use: operand ((D.1054_22))
bug.F:14: note: non-associatable copy.
bug.F:14: note: def_stmt: D.1054_22 = D.1051_19 - D.1053_21
bug.F:14: note: type of def: 3.
bug.F:14: note: def = D.1054_22 def_stmt = D.1054_22 = D.1051_19 - D.1053_21
bug.F:14: note: add new stmt: vect_var_.63_162 = vect_var_.62_161

I also see that the vectorization factor is 1 in your dump. I think that this is the problem here, since ncopies = vf/nunits (where nunits is the number of elements in the vector of this type).

Could you please attach the vectorizer dump file?

Ira
If this is really an attempt to SLP, I think this patch will fix the ICE:

Index: tree-vect-transform.c
===================================================================
--- tree-vect-transform.c       (revision 134926)
+++ tree-vect-transform.c       (working copy)
@@ -3668,6 +3668,11 @@ vectorizable_assignment (tree stmt, bloc
   VEC(tree,heap) *vec_oprnds = NULL;
   tree vop;

+  /* FORNOW: SLP with multiple types is not supported. The SLP analysis verifies
+     this, so we can safely override NCOPIES with 1 here.  */
+  if (slp_node)
+    ncopies = 1;
+
   gcc_assert (ncopies >= 1);
   if (ncopies > 1)
     return false; /* FORNOW */
Created attachment 15574
vectorizer dump

Attached. The last line indeed hints at SLP:

t.f90:14: note: ------>vectorizing SLP node starting from: D.989_84 = ((D.988_83))
(In reply to comment #5)
> Created an attachment (id=15574)
> vectorizer dump
>
> Attached.

Thanks!

> The last line indeed hints at SLP:
>
> t.f90:14: note: ------>vectorizing SLP node starting from: D.989_84 =
> ((D.988_83))
>

I am pretty sure now that the patch in comment #4 fixes the ICE.
Could someone please verify this?

Ira
Subject: Re: [4.4 regression] internal compiler error: in vectorizable_assignment, at tree-vect-transform.c:3671

On Sun, 4 May 2008, irar at il dot ibm dot com wrote:

> ------- Comment #6 from irar at il dot ibm dot com 2008-05-04 11:49 -------
> (In reply to comment #5)
> > Created an attachment (id=15574)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=15574&action=view)
> > vectorizer dump
> >
> > Attached.
>
> Thanks!
>
> > The last line indeed hints at SLP:
> >
> > t.f90:14: note: ------>vectorizing SLP node starting from: D.989_84 =
> > ((D.988_83))
> >
>
> I am pretty sure now that the patch in comment #4 fixes the ICE.
> Could someone please verify this?

It does - and the loop is vectorized. But it looks like a hack ;)

Richard.
(In reply to comment #7)
> It does - and the loop is vectorized. But it looks like a hack ;)

It is not. We actually do this in every vectorizable_...() function that supports SLP.
SLP currently does not support multiple types (I am working on this right now), so in the analysis phase we check that there is only one type in the loop before we try to SLP it. In loop-based vectorization of loops with multiple types, we generate "copies" of the stmts of the bigger type, and the number of copies is vf/nunits. In SLP this expression is meaningless; therefore we overwrite NCOPIES with 1, which is the correct number of copies when there is only one type in the loop.

Ira

>
> Richard.
>
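To make the arithmetic above concrete, here is a minimal standalone sketch in C. It is not GCC code: ncopies_for_stmt and ncopies_is_valid are hypothetical helpers invented for illustration. It assumes nunits = 2 for this testcase (two doubles in a 128-bit SSE2 vector), which the thread does not state explicitly; vf = 1 is taken from the loop_vinfo dump above.

```c
#include <stdbool.h>

/* Hypothetical helper, not a GCC function: the number of copies of a
   statement to emit, following the vf/nunits rule described above.  */
static int
ncopies_for_stmt (int vectorization_factor, int nunits, bool is_slp)
{
  /* Loop-based vectorization: integer division, so vf < nunits gives 0.  */
  int ncopies = vectorization_factor / nunits;

  /* The override from the patch: the SLP analysis has already rejected
     loops with multiple types, so exactly one copy is needed.  */
  if (is_slp)
    ncopies = 1;

  return ncopies;
}

/* Mirrors the condition of the assertion that fired in the report,
   gcc_assert (ncopies >= 1).  */
static bool
ncopies_is_valid (int ncopies)
{
  return ncopies >= 1;
}
```

Under these assumptions, ncopies_for_stmt (1, 2, false) evaluates to 0, matching the value printed in gdb, while passing true for is_slp gives 1 and satisfies the check.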
Subject: Re: [4.4 regression] internal compiler error: in vectorizable_assignment, at tree-vect-transform.c:3671

On Sun, 4 May 2008, irar at il dot ibm dot com wrote:

> ------- Comment #8 from irar at il dot ibm dot com 2008-05-04 12:07 -------
> (In reply to comment #7)
> > It does - and the loop is vectorized. But it looks like a hack ;)
>
> It is not. We actually do this in all vectorizable_...() that support SLP.
> SLP currently does not support multiple types (I am working on this right now).
> So in the analysis phase we check that there is only one type in the loop
> before we try to SLP it. In loop-based vectorization of loops with multiple
> types we generate "copies" of stmts of the bigger type, and the number of
> copies is vf/nunits. In SLP this expression is meaningless, therefore, we
> overwrite NCOPIES with 1 (which is the correct number of copies in case there
> is only one type in the loop).

Ah, I see. Can you give the patch bootstrap & test? I'll pre-approve
it here.

Thanks,
Richard.
(In reply to comment #9)
> Can you give the patch bootstrap & test? I'll pre-approve
> it here.

Sure, for both trunk and 4.3.1, I guess.

Ira

>
> Thanks,
> Richard.
>
Subject: Re: [4.4 regression] internal compiler error: in vectorizable_assignment, at tree-vect-transform.c:3671

On Sun, 4 May 2008, irar at il dot ibm dot com wrote:

> ------- Comment #10 from irar at il dot ibm dot com 2008-05-04 12:26 -------
> (In reply to comment #9)
> > Can you give the patch bootstrap & test? I'll pre-approve
> > it here.
>
> Sure, for both trunk and 4.3.1, I guess.

Yes. Thanks.

Richard.
Subject: Bug 36119

Author: irar
Date: Mon May 5 07:47:49 2008
New Revision: 134944

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=134944
Log:
PR tree-optimization/36119
* tree-vect-transform.c (vectorizable_assignment): Set NCOPIES to 1 in
case of SLP.

Added:
    trunk/gcc/testsuite/gfortran.dg/vect/pr36119.f
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-vect-transform.c
Subject: Bug 36119

Author: irar
Date: Mon May 5 07:48:58 2008
New Revision: 134945

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=134945
Log:
PR tree-optimization/36119
* tree-vect-transform.c (vectorizable_assignment): Set NCOPIES to 1 in
case of SLP.

Added:
    branches/gcc-4_3-branch/gcc/testsuite/gfortran.dg/vect/pr36119.f
Modified:
    branches/gcc-4_3-branch/gcc/ChangeLog
    branches/gcc-4_3-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_3-branch/gcc/tree-vect-transform.c
Fixed.