After r266171 this test case is hanging on powerpc64 both be and le when compiled with -O3. If I run it via make check make -k check-fortran RUNTESTFLAGS=dg.exp=gfortran.dg/transfer_intrinsic_3.f90 it normally finishes in a few minutes but with the run I started a bit ago the executable has been running over 25 minutes: 43548 pts/1 R+ 25:34 ./transfer_intrinsic_3.exe Executing on host: /home/seurer/gcc/build/gcc-test2/gcc/testsuite/gfortran/../../gfortran -B/home/seurer/gcc/build/gcc-test2/gcc/testsuite/gfortran/../../ -B/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/ /home/seurer/gcc/gcc-test2/gcc/testsuite/gfortran.dg/transfer_intrinsic_3.f90 -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions -pedantic-errors -B/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libatomic/.libs -B/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs -lm -o ./transfer_intrinsic_3.exe (timeout = 300) spawn -ignore SIGHUP /home/seurer/gcc/build/gcc-test2/gcc/testsuite/gfortran/../../gfortran -B/home/seurer/gcc/build/gcc-test2/gcc/testsuite/gfortran/../../ -B/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/ /home/seurer/gcc/gcc-test2/gcc/testsuite/gfortran.dg/transfer_intrinsic_3.f90 -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions -pedantic-errors 
-B/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libatomic/.libs -B/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs -lm -o ./transfer_intrinsic_3.exe PASS: gfortran.dg/transfer_intrinsic_3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) Setting LD_LIBRARY_PATH to .:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libatomic/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs:/home/seurer/gcc/build/gcc-test2/gcc:.:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libatomic/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs:/home/seurer/gcc/build/gcc-test2/gcc:/home/seurer/gcc/build/gcc-test2/./gmp/.libs:/home/seurer/gcc/build/gcc-test2/./prev-gmp/.libs:/home/seurer/gcc/build/gcc-test2/./mpfr/src/.libs:/home/seurer/gcc/build/gcc-test2/./prev-mpfr/src/.libs:/home/seurer/gcc/build/gcc-test2/./mpc/src/.libs:/home/seurer/gcc/build/gcc-test2/./prev-mpc/src/.libs:/home/seurer/gcc/build/g
cc-test2/./isl/.libs:/home/seurer/gcc/build/gcc-test2/./prev-isl/.libs:.:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgomp/.libs:/home/seurer/gcc/build/gcc-test2/gcc:.:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgomp/.libs:/home/seurer/gcc/build/gcc-test2/gcc:/home/seurer/gcc/build/gcc-test2/./gmp/.libs:/home/seurer/gcc/build/gcc-test2/./prev-gmp/.libs:/home/seurer/gcc/build/gcc-test2/./mpfr/src/.libs:/home/seurer/gcc/build/gcc-test2/./prev-mpfr/src/.libs:/home/seurer/gcc/build/gcc-test2/./mpc/src/.libs:/home/seurer/gcc/build/gcc-test2/./prev-mpc/src/.libs:/home/seurer/gcc/build/gcc-test2/./isl/.libs:/home/seurer/gcc/build/gcc-test2/./prev-isl/.libs:/home/seurer/gcc/install/gcc-7.2.0/lib64 Execution timeout is: 300 spawn [open ...] WARNING: program timed out. got a INT signal, interrupted by user (I hit ^c there to stop it after about 30 minutes of running)
(In reply to seurer from comment #0)
> After r266171 this test case is hanging on powerpc64 both be and le when
> compiled with -O3.
> [...]

Sorry for the breakage, I will have a look. Thanks
The testcase hangs also on S/390.
This test now fails in some arm configs:
FAIL: gfortran.dg/transfer_intrinsic_3.f90 -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions execution test
FAIL: gfortran.dg/transfer_intrinsic_3.f90 -O3 -g execution test
on arm-none-linux-gnueabihf --with-cpu cortex-a5 --with-fpu vfpv3-d16-fp16.
cortex-a9+neon-fp16, cortex-a15+neon-vfpv4 and cortex-a57+crypto-neon-fp-armv8 are still OK.
Any progress on this? It really slows down test runs as it hangs twice and has to wait for the timeout to occur to continue.
(In reply to seurer from comment #4) > Any progress on this? It really slows down test runs as it hangs twice and > has to wait for the timeout to occur to continue. Sorry for being slow. I am still not very sure where the issue is. Look at the dump before ivcanon (at ivcanon the code diverges w/o the patch): ;; basic block 7, loop depth 1, count 8656061039 (estimated locally), maybe hot ;; prev block 6, next block 24, flags: (NEW, REACHABLE, VISITED) ;; pred: 6 [always] count:1073312328 (estimated locally) (FALLTHRU,EXECUTABLE) ;; 23 [always] count:7582748748 (estimated locally) (FALLTHRU,DFS_BACK) # .MEM_95 = PHI <.MEM_62(6), .MEM_94(23)> # RANGE [0, 4] NONZERO 7 # n_63 = PHI <0(6), _28(23)> # RANGE [-1, 2] _19 = n_63 + -1; # RANGE [-1, 2] _20 = (integer(kind=8)D.4) _19; # RANGE [0, 2] NONZERO 3 _22 = MAX_EXPR <_20, 0>; # RANGE [0, 2] NONZERO 3 _25 = (sizetype) _22; # RANGE [1, 2] NONZERO 3 _26 = MAX_EXPR <_25, 1>; # .MEM_74 = VDEF <.MEM_95> # PT = null { D.2402 } # ALIGN = 8, MISALIGN = 0 # USE = nonlocal null # CLB = nonlocal null _27 = mallocD.235 (_26); # .MEM_75 = VDEF <.MEM_74> D.2389.spanD.2142 = 1; # .MEM_76 = VDEF <.MEM_75> MEM[(struct dtype_type *)&D.2389 + 24B] = {}; # .MEM_77 = VDEF <.MEM_76> D.2389.dtypeD.2141.elem_lenD.2079 = 1; # .MEM_78 = VDEF <.MEM_77> D.2389.dtypeD.2141.rankD.2081 = 1; # .MEM_79 = VDEF <.MEM_78> D.2389.dtypeD.2141.typeD.2082 = 6; # .MEM_80 = VDEF <.MEM_79> D.2389.dimD.2143[0].lboundD.2102 = 1; # .MEM_81 = VDEF <.MEM_80> D.2389.dimD.2143[0].uboundD.2103 = _20; # .MEM_82 = VDEF <.MEM_81> D.2389.dimD.2143[0].strideD.2101 = 1; # .MEM_83 = VDEF <.MEM_82> D.2389.dataD.2139 = pretmp_65; # .MEM_84 = VDEF <.MEM_83> D.2389.offsetD.2140 = -1; # .MEM_85 = VDEF <.MEM_84> # PT = nonlocal escaped null { D.2389 D.2401 } # USE = nonlocal escaped null { D.2389 D.2401 } # CLB = nonlocal escaped _47 = _gfortran_internal_packD.1827 (&D.2389); if (_20 >= _22) goto <bb 24>; [67.00%] else goto <bb 8>; [33.00%] ;; succ: 24 [67.0% (guessed)] 
count:5799560912 (estimated locally) (TRUE_VALUE,EXECUTABLE) ;; 8 [33.0% (guessed)] count:2856500127 (estimated locally) (FALSE_VALUE,EXECUTABLE) ;; basic block 24, loop depth 1, count 5799560912 (estimated locally), maybe hot ;; prev block 7, next block 8, flags: (NEW) ;; pred: 7 [67.0% (guessed)] count:5799560912 (estimated locally) (TRUE_VALUE,EXECUTABLE) goto <bb 9>; [100.00%] ;; succ: 9 [always] count:5799560912 (estimated locally) (FALLTHRU) ;; basic block 8, loop depth 1, count 2856500143 (estimated locally), maybe hot ;; prev block 24, next block 9, flags: (NEW, REACHABLE, VISITED) ;; pred: 7 [33.0% (guessed)] count:2856500127 (estimated locally) (FALSE_VALUE,EXECUTABLE) # .MEM_86 = VDEF <.MEM_85> # PT = null { D.2403 } # ALIGN = 8, MISALIGN = 0 # USE = nonlocal null # CLB = nonlocal null _49 = mallocD.235 (_26); # .MEM_87 = VDEF <.MEM_86> # USE = nonlocal null # CLB = nonlocal null memcpyD.588 (_49, _47, _25); ;; succ: 9 [always (adjusted)] count:2856500143 (estimated locally) (FALLTHRU,EXECUTABLE) ;; basic block 9, loop depth 1, count 8656061039 (estimated locally), maybe hot ;; prev block 8, next block 25, flags: (NEW, REACHABLE, VISITED) ;; pred: 8 [always (adjusted)] count:2856500143 (estimated locally) (FALLTHRU,EXECUTABLE) ;; 24 [always] count:5799560912 (estimated locally) (FALLTHRU) # PT = nonlocal escaped null { D.2389 D.2401 D.2403 } # transfer.5_53 = PHI <_49(8), _47(24)> # .MEM_56 = PHI <.MEM_87(8), .MEM_85(24)> if (_20 > 0) goto <bb 10>; [41.48%] else goto <bb 25>; [58.52%] ;; succ: 10 [41.5% (guessed)] count:3590534146 (estimated locally) (TRUE_VALUE,EXECUTABLE) ;; 25 [58.5% (guessed)] count:5065526893 (estimated locally) (FALSE_VALUE,EXECUTABLE) ;; basic block 25, loop depth 1, count 5065526893 (estimated locally), maybe hot ;; prev block 9, next block 10, flags: (NEW) ;; pred: 9 [58.5% (guessed)] count:5065526893 (estimated locally) (FALSE_VALUE,EXECUTABLE) goto <bb 11>; [100.00%] ;; succ: 11 [always] count:5065526893 (estimated locally) 
(FALLTHRU) ;; basic block 10, loop depth 1, count 3590534111 (estimated locally), maybe hot ;; prev block 25, next block 11, flags: (NEW, REACHABLE, VISITED) ;; pred: 9 [41.5% (guessed)] count:3590534146 (estimated locally) (TRUE_VALUE,EXECUTABLE) # .MEM_88 = VDEF <.MEM_56> # USE = nonlocal null # CLB = nonlocal null memcpyD.588 (_27, transfer.5_53, _25); ;; succ: 11 [always (adjusted)] count:3590534111 (estimated locally) (FALLTHRU,EXECUTABLE) ;; basic block 11, loop depth 1, count 8656061039 (estimated locally), maybe hot ;; prev block 10, next block 26, flags: (NEW, REACHABLE, VISITED) ;; pred: 10 [always (adjusted)] count:3590534111 (estimated locally) (FALLTHRU,EXECUTABLE) ;; 25 [always] count:5065526893 (estimated locally) (FALLTHRU) # .MEM_57 = PHI <.MEM_88(10), .MEM_56(25)> if (_47 != pretmp_65) goto <bb 12>; [53.47%] else goto <bb 26>; [46.53%] ;; succ: 12 [53.5% (guessed)] count:4628395827 (estimated locally) (TRUE_VALUE,EXECUTABLE) ;; 26 [46.5% (guessed)] count:4027665212 (estimated locally) (FALSE_VALUE,EXECUTABLE) ;; basic block 26, loop depth 1, count 4027665212 (estimated locally), maybe hot ;; prev block 11, next block 12, flags: (NEW) ;; pred: 11 [46.5% (guessed)] count:4027665212 (estimated locally) (FALSE_VALUE,EXECUTABLE) goto <bb 13>; [100.00%] ;; succ: 13 [always] count:4027665212 (estimated locally) (FALLTHRU) ;; basic block 12, loop depth 1, count 4628395839 (estimated locally), maybe hot ;; prev block 26, next block 13, flags: (NEW, REACHABLE, VISITED) ;; pred: 11 [53.5% (guessed)] count:4628395827 (estimated locally) (TRUE_VALUE,EXECUTABLE) # .MEM_89 = VDEF <.MEM_57> # USE = nonlocal null # CLB = nonlocal null freeD.234 (_47); ;; succ: 13 [always (adjusted)] count:4628395839 (estimated locally) (FALLTHRU,EXECUTABLE) ;; basic block 13, loop depth 1, count 8656061039 (estimated locally), maybe hot ;; prev block 12, next block 27, flags: (NEW, REACHABLE, VISITED) ;; pred: 12 [always (adjusted)] count:4628395839 (estimated locally) 
(FALLTHRU,EXECUTABLE) ;; 26 [always] count:4027665212 (estimated locally) (FALLTHRU) # .MEM_58 = PHI <.MEM_89(12), .MEM_57(26)> if (_20 < _22) goto <bb 14>; [33.00%] else goto <bb 27>; [67.00%] ;; succ: 14 [33.0% (guessed)] count:2856500127 (estimated locally) (TRUE_VALUE,EXECUTABLE) ;; 27 [67.0% (guessed)] count:5799560912 (estimated locally) (FALSE_VALUE,EXECUTABLE) ;; basic block 27, loop depth 1, count 5799560912 (estimated locally), maybe hot ;; prev block 13, next block 14, flags: (NEW) ;; pred: 13 [67.0% (guessed)] count:5799560912 (estimated locally) (FALSE_VALUE,EXECUTABLE) goto <bb 15>; [100.00%] ;; succ: 15 [always] count:5799560912 (estimated locally) (FALLTHRU) ;; basic block 14, loop depth 1, count 2856500143 (estimated locally), maybe hot ;; prev block 27, next block 15, flags: (NEW, REACHABLE, VISITED) ;; pred: 13 [33.0% (guessed)] count:2856500127 (estimated locally) (TRUE_VALUE,EXECUTABLE) # .MEM_90 = VDEF <.MEM_58> # USE = nonlocal null # CLB = nonlocal null freeD.234 (transfer.5_53); ;; succ: 15 [always (adjusted)] count:2856500143 (estimated locally) (FALLTHRU,EXECUTABLE) ;; basic block 15, loop depth 1, count 8656061039 (estimated locally), maybe hot ;; prev block 14, next block 16, flags: (NEW, REACHABLE, VISITED) ;; pred: 14 [always (adjusted)] count:2856500143 (estimated locally) (FALLTHRU,EXECUTABLE) ;; 27 [always] count:5799560912 (estimated locally) (FALLTHRU) # .MEM_59 = PHI <.MEM_90(14), .MEM_58(27)> # .MEM_91 = VDEF <.MEM_59> D.2389 ={v} {CLOBBER}; if (n_63 <= 1) goto <bb 16>; [41.00%] else goto <bb 18>; [59.00%] ;; succ: 16 [41.0% (guessed)] count:3548984995 (estimated locally) (TRUE_VALUE,EXECUTABLE) ;; 18 [59.0% (guessed)] count:5107076044 (estimated locally) (FALSE_VALUE,EXECUTABLE) ;; basic block 16, loop depth 1, count 3548985018 (estimated locally), maybe hot ;; prev block 15, next block 17, flags: (NEW, REACHABLE, VISITED) ;; pred: 15 [41.0% (guessed)] count:3548984995 (estimated locally) (TRUE_VALUE,EXECUTABLE) if (_19 > 0) 
goto <bb 17>; [0.04%] else goto <bb 28>; [99.96%] ;; succ: 17 [0.0% (guessed)] count:1419592 (estimated locally) (TRUE_VALUE,EXECUTABLE) ;; 28 [100.0% (guessed)] count:3547565426 (estimated locally) (FALSE_VALUE,EXECUTABLE) ;; basic block 17, loop depth 0, count 1419591 (estimated locally) ;; prev block 16, next block 18, flags: (NEW, REACHABLE, VISITED) ;; pred: 16 [0.0% (guessed)] count:1419592 (estimated locally) (TRUE_VALUE,EXECUTABLE) # .MEM_96 = VDEF <.MEM_91> # USE = nonlocal null # CLB = nonlocal null _gfortran_stop_numericD.1808 (1, 0); ;; succ: ;; basic block 18, loop depth 1, count 5106238449 (estimated locally), maybe hot ;; prev block 17, next block 29, flags: (NEW, REACHABLE, VISITED) ;; pred: 15 [59.0% (guessed)] count:5107076044 (estimated locally) (FALSE_VALUE,EXECUTABLE) if (_19 < 0) goto <bb 19>; [0.04%] else goto <bb 29>; [99.96%] ;; succ: 19 [0.0% (guessed)] count:2042492 (estimated locally) (TRUE_VALUE,EXECUTABLE) ;; 29 [100.0% (guessed)] count:5104195957 (estimated locally) (FALSE_VALUE,EXECUTABLE) ;; basic block 29, loop depth 1, count 5104195957 (estimated locally), maybe hot ;; prev block 18, next block 19, flags: (NEW) ;; pred: 18 [100.0% (guessed)] count:5104195957 (estimated locally) (FALSE_VALUE,EXECUTABLE) goto <bb 20>; [100.00%] ;; succ: 20 [always] count:5104195957 (estimated locally) (FALLTHRU) ;; basic block 19, loop depth 0, count 2042498 (estimated locally) ;; prev block 29, next block 28, flags: (NEW, REACHABLE, VISITED) ;; pred: 18 [0.0% (guessed)] count:2042492 (estimated locally) (TRUE_VALUE,EXECUTABLE) # .MEM_92 = VDEF <.MEM_91> # USE = nonlocal null # CLB = nonlocal null _gfortran_stop_numericD.1808 (2, 0); ;; succ: ;; basic block 28, loop depth 1, count 3547565426 (estimated locally), maybe hot ;; prev block 19, next block 20, flags: (NEW) ;; pred: 16 [100.0% (guessed)] count:3547565426 (estimated locally) (FALSE_VALUE,EXECUTABLE) ;; succ: 20 [always] count:3547565426 (estimated locally) (FALLTHRU) ;; basic block 20, loop 
depth 1, count 8652598961 (estimated locally), maybe hot ;; prev block 28, next block 23, flags: (NEW, REACHABLE, VISITED) ;; pred: 28 [always] count:3547565426 (estimated locally) (FALLTHRU) ;; 29 [always] count:5104195957 (estimated locally) (FALLTHRU) # .MEM_93 = VDEF <.MEM_91> # USE = nonlocal null # CLB = nonlocal null freeD.234 (_27); # .MEM_94 = VDEF <.MEM_93> D.2365 ={v} {CLOBBER}; # RANGE [1, 4] NONZERO 7 _28 = n_63 + 1; if (_28 == 4) goto <bb 21>; [12.36%] else goto <bb 23>; [87.64%] ;; succ: 21 [12.4% (guessed)] count:1069850213 (estimated locally) (TRUE_VALUE,EXECUTABLE) ;; 23 [87.6% (guessed)] count:7582748748 (estimated locally) (FALSE_VALUE,EXECUTABLE) ;; basic block 23, loop depth 1, count 7582748748 (estimated locally), maybe hot ;; prev block 20, next block 21, flags: (NEW) ;; pred: 20 [87.6% (guessed)] count:7582748748 (estimated locally) (FALSE_VALUE,EXECUTABLE) goto <bb 7>; [100.00%] ;; succ: 7 [always] count:7582748748 (estimated locally) (FALLTHRU,DFS_BACK) ;; basic block 21, loop depth 0, count 1069422300 (estimated locally), maybe hot ;; prev block 23, next block 1, flags: (NEW, REACHABLE, VISITED) ;; pred: 20 [12.4% (guessed)] count:1069850213 (estimated locally) (TRUE_VALUE,EXECUTABLE) # .MEM_98 = VDEF <.MEM_94> # USE = nonlocal null # CLB = nonlocal null freeD.234 (_7); # VUSE <.MEM_98> return 0; ;; succ: EXIT [always (adjusted)] count:1069422300 (estimated locally) (EXECUTABLE) Note basic blocks 16 and 18. 
_19 is computed as {-1, 1}, so one of the checks in bb16/bb18 must exit on the first iteration by calling _gfortran_stop_numeric (1/2, 0). So these two exit edges give smaller niters information than the exit in basic block 20, and the conditional check in bb20 is turned into a goto, as in the dump from optimized:

;; basic block 18, loop depth 1, count 8652598961 (estimated locally), maybe hot
;; prev block 17, next block 1, flags: (NEW, REACHABLE, VISITED)
;; pred: 14 [100.0% (guessed)] count:3547565426 (estimated locally) (FALSE_VALUE,EXECUTABLE)
;; 16 [100.0% (guessed)] count:5104195957 (estimated locally) (FALSE_VALUE,EXECUTABLE)
# .MEM_93 = VDEF <.MEM_91>
# USE = nonlocal null
# CLB = nonlocal null
freeD.234 (_27);
ivtmp.50_34 = ivtmp.50_36 + 1;
goto <bb 5>; [100.00%]
;; succ: 5 [always] count:8652598961 (estimated locally) (FALLTHRU,DFS_BACK,EXECUTABLE)

This change is caused by code re-arrangement in the patch:

  /* If the loop exits immediately, there is nothing to do. */
  tree tem = fold_binary (code, boolean_type_node, iv0->base, iv1->base);
  if (tem && integer_zerop (tem))
    {
      niter->niter = build_int_cst (unsigned_type_for (type), 0);
      niter->max = 0;
      return true;
    }

  /* Handle special case loops: while (i-- < 10) and while (10 < i++) by
     adjusting iv0, iv1 and code. */
  if (code != NE_EXPR
      && (tree_int_cst_sign_bit (iv0->step)
          || (!integer_zerop (iv1->step)
              && !tree_int_cst_sign_bit (iv1->step)))
      && !adjust_cond_for_loop_until_wrap (type, iv0, &code, iv1))
    return false;

Actually, the first if-statement was bypassed previously; now the second if-statement has been modified and moved after the first one, so the first one is no longer bypassed. But the transformation looks good to me, unless _gfortran_stop_numeric hangs?

Thanks,
bin
Created attachment 45236 [details] arm-none-linux-gnueabihf with cpu and fpu options
I can confirm this test fails on arm-none-linux-gnueabihf when invoking with "-mcpu=cortex-a5 -mfpu=vfpv3-d16-fp16", as Christophe wrote. Please see the attached log.
I looked at where the code is hanging, and it looks like it is stuck in a loop that keeps calling memcpy with a length that increments by 1. I set a breakpoint at the start of memcpy to break if the length was greater than 9000, and when the breakpoint was hit I just kept hitting continue. It was called with length 9001, 9002, 9003, ... There is nothing in the code that does anything like this as far as I can tell, and this doesn't happen with the previous revision.

Is the call

s = transfer(vs, s)

expanded into an infinite loop? I don't know Fortran, so I have no idea what that is supposed to do.

#0 .__memcpy_power7 () at ../sysdeps/powerpc/powerpc64/power7/memcpy.S:34
#1 0x0000000010000a24 in str_vs (_vs=1, _vs=1, vs=..., .__result=9002, __result=<optimized out>) at /home/seurer/gcc/gcc-test2/gcc/testsuite/gfortran.dg/transfer_intrinsic_3.f90:13
#2 MAIN__ () at /home/seurer/gcc/gcc-test2/gcc/testsuite/gfortran.dg/transfer_intrinsic_3.f90:34
#3 main (argc=<optimized out>, argv=<optimized out>) at /home/seurer/gcc/gcc-test2/gcc/testsuite/gfortran.dg/transfer_intrinsic_3.f90:26
#4 0x00003fffb79c7a6c in generic_start_main (main=@0x1001fec0: 0x100008a0 <main>, argc=<optimized out>, argv=0x3fffffffe888, auxvec=0x3fffffffea00, init=<optimized out>, rtld_fini=<optimized out>, stack_end=<optimized out>, fini=<optimized out>) at ../csu/libc-start.c:266
#5 0x00003fffb79c7c94 in __libc_start_main (argc=<optimized out>, argv=<optimized out>, ev=<optimized out>, auxvec=<optimized out>, rtld_fini=<optimized out>, stinfo=<optimized out>, stack_on_entry=<optimized out>) at ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:81
#6 0x0000000000000000 in ?? ()
Could you please replace (1) 'do n = 0, 3' with 'do n = 2, 3', and (2) 'do n = 0, 3' with 'do n = 0, 0'? I am not 100% confident about what 's = transfer(vs, s)' is supposed to do for zero-sized arrays/strings. In any case the length should never be greater than 2.
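For reference, the bound of 2 can be read off the GIMPLE dump posted earlier in this thread, where the length is computed as _19 = n_63 + -1 followed by MAX_EXPR <_20, 0>, with n running from 0 to 3. A minimal C sketch of that arithmetic follows; the function names transfer_len and max_transfer_len are hypothetical and only mirror the dump, they are not part of the test case or of libgfortran.

```c
/* Hypothetical helper mirroring "_19 = n_63 + -1; _22 = MAX_EXPR <_20, 0>"
   from the GIMPLE dump: the string length used for loop index n.  */
long
transfer_len (int n)
{
  long len = (long) n - 1;
  return len > 0 ? len : 0;
}

/* Largest length seen for n in [lo, hi], modelling 'do n = lo, hi'.  */
long
max_transfer_len (int lo, int hi)
{
  long max = 0;
  for (int n = lo; n <= hi; n++)
    {
      long len = transfer_len (n);
      if (len > max)
        max = len;
    }
  return max;
}
```

Under this model, 'do n = 0, 3' yields lengths 0, 0, 1, 2, so a memcpy length in the thousands can only come from the loop running past its intended bound.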
I tried both (1) and (2) and the test case does not hang.
> I tried both (1) and (2) and the test case does not hang. Could you please try '0, 1', '1, 2', and '0, 2'?
None of those hang, either. I also experimented with the options a bit. The as-is options affecting optimization are: -O3 -funroll-loops -fpeel-loops -finline-functions. Changing to -O1 gives no hang. Dropping the other options, or using -O2, still hangs.
Between r266170 and r266171 the difference was in veclower21 dump with -O3:

--- transfer_intrinsic_3.f90.168t.veclower21_	2019-01-18 16:34:47.478873237 +0100
+++ transfer_intrinsic_3.f90.168t.veclower21	2019-01-18 16:35:09.503515118 +0100
@@ -370,14 +370,7 @@ main (integer(kind=4) argc, character(ki
   __builtin_free (_27);
   parm.10 ={v} {CLOBBER};
   ivtmp.52_75 = ivtmp.52_82 + 1;
-  if (ivtmp.52_75 == 3)
-    goto <bb 19>; [12.36%]
-  else
-    goto <bb 5>; [87.64%]
-
-  <bb 19> [local count: 1069422300]:
-  __builtin_free (_7);
-  return 0;
+  goto <bb 5>; [100.00%]
 
 }

and, if I revert the r266171 change on current trunk, the difference between f951 with the patch reverted and vanilla trunk is (again -O3, powerpc64le-linux):

--- transfer_intrinsic_3.f90.161t.cunroll_	2019-01-18 17:14:06.625536698 +0100
+++ transfer_intrinsic_3.f90.161t.cunroll	2019-01-18 17:14:24.992238353 +0100
@@ -55,8 +55,10 @@
 Number of blocks in CFG: 46
 Number of blocks to update: 11 ( 24%)
 
+Removing basic block 21
 Removing basic block 32
 Removing basic block 41
+Merging blocks 20 and 23
 Merging blocks 30 and 33
 Removing basic block 34
 Removing basic block 36
@@ -178,8 +180,8 @@ main (integer(kind=4) argc, character(ki
   pretmp_65 = &MEM[(character(kind=1)[0:][1:1] *)_7][0];
 
   <bb 7> [local count: 8656061039]:
-  # n_63 = PHI <0(30), _28(23)>
-  # ivtmp_13 = PHI <4(30), ivtmp_31(23)>
+  # n_63 = PHI <0(30), _28(20)>
+  # ivtmp_13 = PHI <4(30), ivtmp_31(20)>
   _19 = n_63 + -1;
   _20 = (integer(kind=8)) _19;
   _22 = MAX_EXPR <_20, 0>;
@@ -264,18 +266,8 @@ main (integer(kind=4) argc, character(ki
   parm.10 ={v} {CLOBBER};
   _28 = n_63 + 1;
   ivtmp_31 = ivtmp_13 - 1;
-  if (ivtmp_31 == 0)
-    goto <bb 21>; [12.36%]
-  else
-    goto <bb 23>; [87.64%]
-
-  <bb 23> [local count: 7582748748]:
   goto <bb 7>; [100.00%]
 
-  <bb 21> [local count: 1069422300]:
-  __builtin_free (_7);
-  return 0;
-
 }

so in both cases, the loop condition is optimized out.
I've put logging into tree-ssa-loop-niter.c, looking for cases where the code before/after r266171 would make a difference in the returned value. The only case it triggers on is (all types integer(kind=4), i.e. signed 32-bit integer):

code LE_EXPR
iv0->base 0
iv0->step 0
iv1->base -1
iv1->step 1
every_iteration false

The loop starts with:

  <bb 7> [local count: 8656061039]:
  # n_63 = PHI <0(6), _28(23)>
  _19 = n_63 + -1;

and ends with

  _28 = n_63 + 1;
  if (_28 == 4)
    goto <bb 21>; [12.36%]
  else
    goto <bb 23>; [87.64%]

  <bb 23> [local count: 7582748748]:
  goto <bb 7>; [100.00%]

and, besides the exit at the end, it also has:

  <bb 16> [local count: 3548985018]:
  if (_19 > 0)
    goto <bb 17>; [0.04%]
  else
    goto <bb 28>; [99.96%]

  <bb 17> [local count: 1419591]:
  _gfortran_stop_numeric (1, 0);

  <bb 18> [local count: 5106238449]:
  if (_19 < 0)
    goto <bb 19>; [0.04%]
  else
    goto <bb 29>; [99.96%]

  <bb 29> [local count: 5104195957]:
  goto <bb 20>; [100.00%]

  <bb 19> [local count: 2042498]:
  _gfortran_stop_numeric (2, 0);

in the middle, so two other loop exits. But neither bb16 nor bb18 is executed in every iteration; if they were, then because _19 is -1 in the first iteration, the program would always stop 2 and not iterate further. We have:

  /* If the test is not executed every iteration, wrapping may make the test
     to pass again.
     TODO: the overflow case can be still used as unreliable estimate of upper
     bound. But we have no API to pass it down to number of iterations code
     and, at present, it will not use it anyway. */
  if (!every_iteration
      && (!iv0->no_overflow || !iv1->no_overflow
          || code == NE_EXPR || code == EQ_EXPR))
    return false;

at the start, but that doesn't trigger here, because code is not an equality comparison and no_overflow is set on both IVs. If there were an overflow, then maybe it would be right to derive the number of iterations from that. But the condition that returns true is that iv0->base code iv1->base is false; if that test isn't executed in every iteration, it means nothing for the number of iterations analysis.
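The point about every_iteration can be illustrated with a small C model of the loop shape shown in the dump. This is only a sketch under stated assumptions: run_loop and its flag are hypothetical names, the negative return values stand in for the two _gfortran_stop_numeric calls, and the control flow copies the bb15/bb16/bb18 structure from the dump rather than the Fortran source.

```c
#include <stdbool.h>

/* Model of the loop: n runs 0..3 (exit when n + 1 == 4), and the two STOP
   exits sit behind the "n <= 1" branch from bb15, so neither is tested in
   every iteration.  Returns the number of iterations executed, or a
   negative value if a STOP exit is taken (-1 for STOP 1, -2 for STOP 2).
   If assume_every_iteration is true, the "_19 < 0" test from bb18 is
   (wrongly) treated as if it were reached in every iteration.  */
int
run_loop (bool assume_every_iteration)
{
  int iterations = 0;
  for (int n = 0; n != 4; n++)
    {
      int i19 = n - 1;              /* _19 = n_63 + -1 */
      iterations++;
      if (n <= 1 && !assume_every_iteration)
        {
          if (i19 > 0)              /* bb16 */
            return -1;
        }
      else if (i19 < 0)             /* bb18 */
        return -2;
    }
  return iterations;
}
```

In this model, run_loop (false) completes all four iterations, while run_loop (true) leaves through the STOP 2 exit in the first iteration. Treating the bb18 test as if it ran every iteration is what leads to the conclusion that the loop never iterates again.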
The following patch works for me:

2019-01-18  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/88044
	* tree-ssa-loop-niter.c (number_of_iterations_cond): If condition
	is false in the first iteration, but !every_iteration, return false
	instead of true with niter->niter zero.

--- gcc/tree-ssa-loop-niter.c.jj	2019-01-10 11:43:02.254577008 +0100
+++ gcc/tree-ssa-loop-niter.c	2019-01-18 19:51:00.245504728 +0100
@@ -1824,6 +1824,8 @@ number_of_iterations_cond (struct loop *
   tree tem = fold_binary (code, boolean_type_node, iv0->base, iv1->base);
   if (tem && integer_zerop (tem))
     {
+      if (!every_iteration)
+	return false;
       niter->niter = build_int_cst (unsigned_type_for (type), 0);
       niter->max = 0;
       return true;
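The decision the patch changes can be summarised in a standalone C sketch. The enum and function below are hypothetical illustrations, not GCC code: they only model the three outcomes of the "loop exits immediately" check in number_of_iterations_cond once the patch is applied.

```c
#include <stdbool.h>

/* Hypothetical outcomes of the early check; not GCC types.  */
enum niter_result
{
  NITER_UNKNOWN,            /* corresponds to "return false" */
  NITER_ZERO,               /* niter->niter = 0 and "return true" */
  NITER_CONTINUE_ANALYSIS   /* fall through to the rest of the function */
};

/* cond_false_on_entry models "tem && integer_zerop (tem)", i.e. the
   comparison of iv0->base and iv1->base folds to false.  */
enum niter_result
first_iteration_check (bool cond_false_on_entry, bool every_iteration)
{
  if (cond_false_on_entry)
    {
      if (!every_iteration)
        return NITER_UNKNOWN;   /* the two lines added by the patch */
      return NITER_ZERO;
    }
  return NITER_CONTINUE_ANALYSIS;
}
```

For the case logged in the analysis above (the comparison folds to false, every_iteration is false), the sketch returns NITER_UNKNOWN, so the exit at the end of the loop is the one that determines the iteration count.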
Author: jakub
Date: Tue Jan 22 09:58:23 2019
New Revision: 268143

URL: https://gcc.gnu.org/viewcvs?rev=268143&root=gcc&view=rev
Log:
	PR tree-optimization/88044
	* tree-ssa-loop-niter.c (number_of_iterations_cond): If condition
	is false in the first iteration, but !every_iteration, return false
	instead of true with niter->niter zero.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-ssa-loop-niter.c
Fixed.
(In reply to Jakub Jelinek from comment #16) > Fixed. I confirm the problem I mentioned in #c3 is now fixed. Thanks!