Bug 88044 - [9 regression] gfortran.dg/transfer_intrinsic_3.f90 hangs after r266171
Summary: [9 regression] gfortran.dg/transfer_intrinsic_3.f90 hangs after r266171
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 9.0
Importance: P1 normal
Target Milestone: 9.0
Assignee: Jakub Jelinek
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2018-11-15 16:40 UTC by seurer
Modified: 2019-01-22 14:00 UTC
CC List: 6 users

See Also:
Host: powerpc64*-*-*, s390x-*-*
Target: powerpc64*-*-*, s390x-*-*, arm
Build: powerpc64*-*-*, s390x-*-*
Known to work:
Known to fail:
Last reconfirmed: 2019-01-11 00:00:00


Attachments
arm-none-linux-gnueabihf with cpu and fpu options (1.41 KB, text/plain)
2018-12-14 13:51 UTC, Sam Tebbs
Description seurer 2018-11-15 16:40:21 UTC
After r266171 this test case is hanging on powerpc64 both be and le when compiled with -O3.  If I run it via make check

make -k check-fortran RUNTESTFLAGS=dg.exp=gfortran.dg/transfer_intrinsic_3.f90

it normally finishes in a few minutes but with the run I started a bit ago the executable has been running over 25 minutes:

 43548 pts/1    R+    25:34 ./transfer_intrinsic_3.exe


Executing on host: /home/seurer/gcc/build/gcc-test2/gcc/testsuite/gfortran/../../gfortran -B/home/seurer/gcc/build/gcc-test2/gcc/testsuite/gfortran/../../ -B/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/ /home/seurer/gcc/gcc-test2/gcc/testsuite/gfortran.dg/transfer_intrinsic_3.f90    -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never    -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions   -pedantic-errors  -B/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libatomic/.libs -B/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs  -lm  -o ./transfer_intrinsic_3.exe    (timeout = 300)
spawn -ignore SIGHUP /home/seurer/gcc/build/gcc-test2/gcc/testsuite/gfortran/../../gfortran -B/home/seurer/gcc/build/gcc-test2/gcc/testsuite/gfortran/../../ -B/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/ /home/seurer/gcc/gcc-test2/gcc/testsuite/gfortran.dg/transfer_intrinsic_3.f90 -fno-diagnostics-show-caret -fno-diagnostics-show-line-numbers -fdiagnostics-color=never -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions -pedantic-errors -B/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libatomic/.libs -B/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs -L/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs -lm -o ./transfer_intrinsic_3.exe
PASS: gfortran.dg/transfer_intrinsic_3.f90   -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions  (test for excess errors)
Setting LD_LIBRARY_PATH to .:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libatomic/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs:/home/seurer/gcc/build/gcc-test2/gcc:.:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgfortran/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libatomic/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libquadmath/.libs:/home/seurer/gcc/build/gcc-test2/gcc:/home/seurer/gcc/build/gcc-test2/./gmp/.libs:/home/seurer/gcc/build/gcc-test2/./prev-gmp/.libs:/home/seurer/gcc/build/gcc-test2/./mpfr/src/.libs:/home/seurer/gcc/build/gcc-test2/./prev-mpfr/src/.libs:/home/seurer/gcc/build/gcc-test2/./mpc/src/.libs:/home/seurer/gcc/build/gcc-test2/./prev-mpc/src/.libs:/home/seurer/gcc/build/gcc-test2/./isl/.libs:/home/seurer/gcc/build/gcc-test2/./prev-isl/.libs:.:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgomp/.libs:/home/seurer/gcc/build/gcc-test2/gcc:.:/home/seurer/gcc/build/gcc-test2/powerpc64-unknown-linux-gnu/./libgomp/.libs:/home/seurer/gcc/build/gcc-test2/gcc:/home/seurer/gcc/build/gcc-test2/./gmp/.libs:/home/seurer/gcc/build/gcc-test2/./prev-gmp/.libs:/home/seurer/gcc/build/gcc-test2/./mpfr/src/.libs:/home/seurer/gcc/build/gcc-test2/./prev-mpfr/src/.libs:/home/seurer/gcc/build/gcc-test2/./mpc/src/.libs:/home/seurer/gcc/build/gcc-test2/./prev-mpc/src/.libs:/home/seurer/gcc/build/gcc-test2/./isl/.libs:/home/seurer/gcc/build/gcc-test2/./prev-isl/.libs:/home/seurer/gcc/install/gcc-7.2.0/lib64
Execution timeout is: 300
spawn [open ...]
WARNING: program timed out.
got a INT signal, interrupted by user 

(I hit ^c there to stop it after about 30 minutes of running)
Comment 1 bin cheng 2018-11-16 00:58:05 UTC
(In reply to seurer from comment #0)

Sorry for the breakage, I will have a look.

Thanks
Comment 2 Andreas Krebbel 2018-11-17 07:38:10 UTC
The testcase hangs also on S/390.
Comment 3 Christophe Lyon 2018-11-20 15:19:19 UTC
This test now fails in some arm configs:

FAIL: gfortran.dg/transfer_intrinsic_3.f90   -O3 -fomit-frame-pointer
-funroll-loops -fpeel-loops -ftracer -finline-functions  execution
test
FAIL: gfortran.dg/transfer_intrinsic_3.f90   -O3 -g  execution test
on arm-none-linux-gnueabihf
--with-cpu cortex-a5
--with-fpu vfpv3-d16-fp16

cortex-a9+neon-fp16, cortex-a15+neon-vfpv4 and
cortex-a57+crypto-neon-fp-armv8 are still OK.
Comment 4 seurer 2018-11-30 16:00:34 UTC
Any progress on this?  It really slows down test runs as it hangs twice and has to wait for the timeout to occur to continue.
Comment 5 bin cheng 2018-12-04 06:36:20 UTC
(In reply to seurer from comment #4)
> Any progress on this?  It really slows down test runs as it hangs twice and
> has to wait for the timeout to occur to continue.

Sorry for being slow.  I am still not very sure where the issue is.
Look at the dump before ivcanon (at ivcanon the code diverges w/o the patch):
;;   basic block 7, loop depth 1, count 8656061039 (estimated locally), maybe hot
;;    prev block 6, next block 24, flags: (NEW, REACHABLE, VISITED)
;;    pred:       6 [always]  count:1073312328 (estimated locally) (FALLTHRU,EXECUTABLE)
;;                23 [always]  count:7582748748 (estimated locally) (FALLTHRU,DFS_BACK)
  # .MEM_95 = PHI <.MEM_62(6), .MEM_94(23)>
  # RANGE [0, 4] NONZERO 7
  # n_63 = PHI <0(6), _28(23)>
  # RANGE [-1, 2]
  _19 = n_63 + -1;
  # RANGE [-1, 2]
  _20 = (integer(kind=8)D.4) _19;
  # RANGE [0, 2] NONZERO 3
  _22 = MAX_EXPR <_20, 0>;
  # RANGE [0, 2] NONZERO 3
  _25 = (sizetype) _22;
  # RANGE [1, 2] NONZERO 3
  _26 = MAX_EXPR <_25, 1>;
  # .MEM_74 = VDEF <.MEM_95>
  # PT = null { D.2402 }
  # ALIGN = 8, MISALIGN = 0
  # USE = nonlocal null 
  # CLB = nonlocal null 
  _27 = mallocD.235 (_26);
  # .MEM_75 = VDEF <.MEM_74>
  D.2389.spanD.2142 = 1;
  # .MEM_76 = VDEF <.MEM_75>
  MEM[(struct dtype_type *)&D.2389 + 24B] = {};
  # .MEM_77 = VDEF <.MEM_76>
  D.2389.dtypeD.2141.elem_lenD.2079 = 1;
  # .MEM_78 = VDEF <.MEM_77>
  D.2389.dtypeD.2141.rankD.2081 = 1;
  # .MEM_79 = VDEF <.MEM_78>
  D.2389.dtypeD.2141.typeD.2082 = 6;
  # .MEM_80 = VDEF <.MEM_79>
  D.2389.dimD.2143[0].lboundD.2102 = 1;
  # .MEM_81 = VDEF <.MEM_80>
  D.2389.dimD.2143[0].uboundD.2103 = _20;
  # .MEM_82 = VDEF <.MEM_81>
  D.2389.dimD.2143[0].strideD.2101 = 1;
  # .MEM_83 = VDEF <.MEM_82>
  D.2389.dataD.2139 = pretmp_65;
  # .MEM_84 = VDEF <.MEM_83>
  D.2389.offsetD.2140 = -1;
  # .MEM_85 = VDEF <.MEM_84>
  # PT = nonlocal escaped null { D.2389 D.2401 }
  # USE = nonlocal escaped null { D.2389 D.2401 }
  # CLB = nonlocal escaped 
  _47 = _gfortran_internal_packD.1827 (&D.2389);
  if (_20 >= _22)
    goto <bb 24>; [67.00%]
  else
    goto <bb 8>; [33.00%]
;;    succ:       24 [67.0% (guessed)]  count:5799560912 (estimated locally) (TRUE_VALUE,EXECUTABLE)
;;                8 [33.0% (guessed)]  count:2856500127 (estimated locally) (FALSE_VALUE,EXECUTABLE)

;;   basic block 24, loop depth 1, count 5799560912 (estimated locally), maybe hot
;;    prev block 7, next block 8, flags: (NEW)
;;    pred:       7 [67.0% (guessed)]  count:5799560912 (estimated locally) (TRUE_VALUE,EXECUTABLE)
  goto <bb 9>; [100.00%]
;;    succ:       9 [always]  count:5799560912 (estimated locally) (FALLTHRU)

;;   basic block 8, loop depth 1, count 2856500143 (estimated locally), maybe hot
;;    prev block 24, next block 9, flags: (NEW, REACHABLE, VISITED)
;;    pred:       7 [33.0% (guessed)]  count:2856500127 (estimated locally) (FALSE_VALUE,EXECUTABLE)
  # .MEM_86 = VDEF <.MEM_85>
  # PT = null { D.2403 }
  # ALIGN = 8, MISALIGN = 0
  # USE = nonlocal null 
  # CLB = nonlocal null 
  _49 = mallocD.235 (_26);
  # .MEM_87 = VDEF <.MEM_86>
  # USE = nonlocal null 
  # CLB = nonlocal null 
  memcpyD.588 (_49, _47, _25);
;;    succ:       9 [always (adjusted)]  count:2856500143 (estimated locally) (FALLTHRU,EXECUTABLE)

;;   basic block 9, loop depth 1, count 8656061039 (estimated locally), maybe hot
;;    prev block 8, next block 25, flags: (NEW, REACHABLE, VISITED)
;;    pred:       8 [always (adjusted)]  count:2856500143 (estimated locally) (FALLTHRU,EXECUTABLE)
;;                24 [always]  count:5799560912 (estimated locally) (FALLTHRU)
  # PT = nonlocal escaped null { D.2389 D.2401 D.2403 }
  # transfer.5_53 = PHI <_49(8), _47(24)>
  # .MEM_56 = PHI <.MEM_87(8), .MEM_85(24)>
  if (_20 > 0)
    goto <bb 10>; [41.48%]
  else
    goto <bb 25>; [58.52%]
;;    succ:       10 [41.5% (guessed)]  count:3590534146 (estimated locally) (TRUE_VALUE,EXECUTABLE)
;;                25 [58.5% (guessed)]  count:5065526893 (estimated locally) (FALSE_VALUE,EXECUTABLE)

;;   basic block 25, loop depth 1, count 5065526893 (estimated locally), maybe hot
;;    prev block 9, next block 10, flags: (NEW)
;;    pred:       9 [58.5% (guessed)]  count:5065526893 (estimated locally) (FALSE_VALUE,EXECUTABLE)
  goto <bb 11>; [100.00%]
;;    succ:       11 [always]  count:5065526893 (estimated locally) (FALLTHRU)

;;   basic block 10, loop depth 1, count 3590534111 (estimated locally), maybe hot
;;    prev block 25, next block 11, flags: (NEW, REACHABLE, VISITED)
;;    pred:       9 [41.5% (guessed)]  count:3590534146 (estimated locally) (TRUE_VALUE,EXECUTABLE)
  # .MEM_88 = VDEF <.MEM_56>
  # USE = nonlocal null 
  # CLB = nonlocal null 
  memcpyD.588 (_27, transfer.5_53, _25);
;;    succ:       11 [always (adjusted)]  count:3590534111 (estimated locally) (FALLTHRU,EXECUTABLE)

;;   basic block 11, loop depth 1, count 8656061039 (estimated locally), maybe hot
;;    prev block 10, next block 26, flags: (NEW, REACHABLE, VISITED)
;;    pred:       10 [always (adjusted)]  count:3590534111 (estimated locally) (FALLTHRU,EXECUTABLE)
;;                25 [always]  count:5065526893 (estimated locally) (FALLTHRU)
  # .MEM_57 = PHI <.MEM_88(10), .MEM_56(25)>
  if (_47 != pretmp_65)
    goto <bb 12>; [53.47%]
  else
    goto <bb 26>; [46.53%]
;;    succ:       12 [53.5% (guessed)]  count:4628395827 (estimated locally) (TRUE_VALUE,EXECUTABLE)
;;                26 [46.5% (guessed)]  count:4027665212 (estimated locally) (FALSE_VALUE,EXECUTABLE)

;;   basic block 26, loop depth 1, count 4027665212 (estimated locally), maybe hot
;;    prev block 11, next block 12, flags: (NEW)
;;    pred:       11 [46.5% (guessed)]  count:4027665212 (estimated locally) (FALSE_VALUE,EXECUTABLE)
  goto <bb 13>; [100.00%]
;;    succ:       13 [always]  count:4027665212 (estimated locally) (FALLTHRU)

;;   basic block 12, loop depth 1, count 4628395839 (estimated locally), maybe hot
;;    prev block 26, next block 13, flags: (NEW, REACHABLE, VISITED)
;;    pred:       11 [53.5% (guessed)]  count:4628395827 (estimated locally) (TRUE_VALUE,EXECUTABLE)
  # .MEM_89 = VDEF <.MEM_57>
  # USE = nonlocal null 
  # CLB = nonlocal null 
  freeD.234 (_47);
;;    succ:       13 [always (adjusted)]  count:4628395839 (estimated locally) (FALLTHRU,EXECUTABLE)

;;   basic block 13, loop depth 1, count 8656061039 (estimated locally), maybe hot
;;    prev block 12, next block 27, flags: (NEW, REACHABLE, VISITED)
;;    pred:       12 [always (adjusted)]  count:4628395839 (estimated locally) (FALLTHRU,EXECUTABLE)
;;                26 [always]  count:4027665212 (estimated locally) (FALLTHRU)
  # .MEM_58 = PHI <.MEM_89(12), .MEM_57(26)>
  if (_20 < _22)
    goto <bb 14>; [33.00%]
  else
    goto <bb 27>; [67.00%]
;;    succ:       14 [33.0% (guessed)]  count:2856500127 (estimated locally) (TRUE_VALUE,EXECUTABLE)
;;                27 [67.0% (guessed)]  count:5799560912 (estimated locally) (FALSE_VALUE,EXECUTABLE)

;;   basic block 27, loop depth 1, count 5799560912 (estimated locally), maybe hot
;;    prev block 13, next block 14, flags: (NEW)
;;    pred:       13 [67.0% (guessed)]  count:5799560912 (estimated locally) (FALSE_VALUE,EXECUTABLE)
  goto <bb 15>; [100.00%]
;;    succ:       15 [always]  count:5799560912 (estimated locally) (FALLTHRU)

;;   basic block 14, loop depth 1, count 2856500143 (estimated locally), maybe hot
;;    prev block 27, next block 15, flags: (NEW, REACHABLE, VISITED)
;;    pred:       13 [33.0% (guessed)]  count:2856500127 (estimated locally) (TRUE_VALUE,EXECUTABLE)
  # .MEM_90 = VDEF <.MEM_58>
  # USE = nonlocal null 
  # CLB = nonlocal null 
  freeD.234 (transfer.5_53);
;;    succ:       15 [always (adjusted)]  count:2856500143 (estimated locally) (FALLTHRU,EXECUTABLE)

;;   basic block 15, loop depth 1, count 8656061039 (estimated locally), maybe hot
;;    prev block 14, next block 16, flags: (NEW, REACHABLE, VISITED)
;;    pred:       14 [always (adjusted)]  count:2856500143 (estimated locally) (FALLTHRU,EXECUTABLE)
;;                27 [always]  count:5799560912 (estimated locally) (FALLTHRU)
  # .MEM_59 = PHI <.MEM_90(14), .MEM_58(27)>
  # .MEM_91 = VDEF <.MEM_59>
  D.2389 ={v} {CLOBBER};
  if (n_63 <= 1)
    goto <bb 16>; [41.00%]
  else
    goto <bb 18>; [59.00%]
;;    succ:       16 [41.0% (guessed)]  count:3548984995 (estimated locally) (TRUE_VALUE,EXECUTABLE)
;;                18 [59.0% (guessed)]  count:5107076044 (estimated locally) (FALSE_VALUE,EXECUTABLE)

;;   basic block 16, loop depth 1, count 3548985018 (estimated locally), maybe hot
;;    prev block 15, next block 17, flags: (NEW, REACHABLE, VISITED)
;;    pred:       15 [41.0% (guessed)]  count:3548984995 (estimated locally) (TRUE_VALUE,EXECUTABLE)
  if (_19 > 0)
    goto <bb 17>; [0.04%]
  else
    goto <bb 28>; [99.96%]
;;    succ:       17 [0.0% (guessed)]  count:1419592 (estimated locally) (TRUE_VALUE,EXECUTABLE)
;;                28 [100.0% (guessed)]  count:3547565426 (estimated locally) (FALSE_VALUE,EXECUTABLE)

;;   basic block 17, loop depth 0, count 1419591 (estimated locally)
;;    prev block 16, next block 18, flags: (NEW, REACHABLE, VISITED)
;;    pred:       16 [0.0% (guessed)]  count:1419592 (estimated locally) (TRUE_VALUE,EXECUTABLE)
  # .MEM_96 = VDEF <.MEM_91>
  # USE = nonlocal null 
  # CLB = nonlocal null 
  _gfortran_stop_numericD.1808 (1, 0);
;;    succ:      

;;   basic block 18, loop depth 1, count 5106238449 (estimated locally), maybe hot
;;    prev block 17, next block 29, flags: (NEW, REACHABLE, VISITED)
;;    pred:       15 [59.0% (guessed)]  count:5107076044 (estimated locally) (FALSE_VALUE,EXECUTABLE)
  if (_19 < 0)
    goto <bb 19>; [0.04%]
  else
    goto <bb 29>; [99.96%]
;;    succ:       19 [0.0% (guessed)]  count:2042492 (estimated locally) (TRUE_VALUE,EXECUTABLE)
;;                29 [100.0% (guessed)]  count:5104195957 (estimated locally) (FALSE_VALUE,EXECUTABLE)

;;   basic block 29, loop depth 1, count 5104195957 (estimated locally), maybe hot
;;    prev block 18, next block 19, flags: (NEW)
;;    pred:       18 [100.0% (guessed)]  count:5104195957 (estimated locally) (FALSE_VALUE,EXECUTABLE)
  goto <bb 20>; [100.00%]
;;    succ:       20 [always]  count:5104195957 (estimated locally) (FALLTHRU)

;;   basic block 19, loop depth 0, count 2042498 (estimated locally)
;;    prev block 29, next block 28, flags: (NEW, REACHABLE, VISITED)
;;    pred:       18 [0.0% (guessed)]  count:2042492 (estimated locally) (TRUE_VALUE,EXECUTABLE)
  # .MEM_92 = VDEF <.MEM_91>
  # USE = nonlocal null 
  # CLB = nonlocal null 
  _gfortran_stop_numericD.1808 (2, 0);
;;    succ:      

;;   basic block 28, loop depth 1, count 3547565426 (estimated locally), maybe hot
;;    prev block 19, next block 20, flags: (NEW)
;;    pred:       16 [100.0% (guessed)]  count:3547565426 (estimated locally) (FALSE_VALUE,EXECUTABLE)
;;    succ:       20 [always]  count:3547565426 (estimated locally) (FALLTHRU)

;;   basic block 20, loop depth 1, count 8652598961 (estimated locally), maybe hot
;;    prev block 28, next block 23, flags: (NEW, REACHABLE, VISITED)
;;    pred:       28 [always]  count:3547565426 (estimated locally) (FALLTHRU)
;;                29 [always]  count:5104195957 (estimated locally) (FALLTHRU)
  # .MEM_93 = VDEF <.MEM_91>
  # USE = nonlocal null 
  # CLB = nonlocal null 
  freeD.234 (_27);
  # .MEM_94 = VDEF <.MEM_93>
  D.2365 ={v} {CLOBBER};
  # RANGE [1, 4] NONZERO 7
  _28 = n_63 + 1;
  if (_28 == 4)
    goto <bb 21>; [12.36%]
  else
    goto <bb 23>; [87.64%]
;;    succ:       21 [12.4% (guessed)]  count:1069850213 (estimated locally) (TRUE_VALUE,EXECUTABLE)
;;                23 [87.6% (guessed)]  count:7582748748 (estimated locally) (FALSE_VALUE,EXECUTABLE)

;;   basic block 23, loop depth 1, count 7582748748 (estimated locally), maybe hot
;;    prev block 20, next block 21, flags: (NEW)
;;    pred:       20 [87.6% (guessed)]  count:7582748748 (estimated locally) (FALSE_VALUE,EXECUTABLE)
  goto <bb 7>; [100.00%]
;;    succ:       7 [always]  count:7582748748 (estimated locally) (FALLTHRU,DFS_BACK)

;;   basic block 21, loop depth 0, count 1069422300 (estimated locally), maybe hot
;;    prev block 23, next block 1, flags: (NEW, REACHABLE, VISITED)
;;    pred:       20 [12.4% (guessed)]  count:1069850213 (estimated locally) (TRUE_VALUE,EXECUTABLE)
  # .MEM_98 = VDEF <.MEM_94>
  # USE = nonlocal null 
  # CLB = nonlocal null 
  freeD.234 (_7);
  # VUSE <.MEM_98>
  return 0;
;;    succ:       EXIT [always (adjusted)]  count:1069422300 (estimated locally) (EXECUTABLE)

Note basic blocks 16 and 18.  _19 is computed as {-1, 1}, so one of the checks in bb16/bb18 must exit on the first iteration by calling _gfortran_stop_numeric(1/2, 0).

So these two exit edges give a smaller niters estimate than the exit in basic block 20, and the conditional check in bb20 is turned into a goto, as in the optimized dump:
;;   basic block 18, loop depth 1, count 8652598961 (estimated locally), maybe hot
;;    prev block 17, next block 1, flags: (NEW, REACHABLE, VISITED)
;;    pred:       14 [100.0% (guessed)]  count:3547565426 (estimated locally) (FALSE_VALUE,EXECUTABLE)
;;                16 [100.0% (guessed)]  count:5104195957 (estimated locally) (FALSE_VALUE,EXECUTABLE)
  # .MEM_93 = VDEF <.MEM_91>
  # USE = nonlocal null 
  # CLB = nonlocal null 
  freeD.234 (_27);
  ivtmp.50_34 = ivtmp.50_36 + 1;
  goto <bb 5>; [100.00%]
;;    succ:       5 [always]  count:8652598961 (estimated locally) (FALLTHRU,DFS_BACK,EXECUTABLE)

This change is caused by code re-arrangement in the patch:

  /* If the loop exits immediately, there is nothing to do.  */
  tree tem = fold_binary (code, boolean_type_node, iv0->base, iv1->base);
  if (tem && integer_zerop (tem))
    {
      niter->niter = build_int_cst (unsigned_type_for (type), 0);
      niter->max = 0;
      return true;
    }

  /* Handle special case loops: while (i-- < 10) and while (10 < i++) by
     adjusting iv0, iv1 and code.  */
  if (code != NE_EXPR
      && (tree_int_cst_sign_bit (iv0->step)
	  || (!integer_zerop (iv1->step)
	      && !tree_int_cst_sign_bit (iv1->step)))
      && !adjust_cond_for_loop_until_wrap (type, iv0, &code, iv1))
    return false;

Actually, the first if-statement was bypassed previously; with the patch, the second if-statement has been modified and moved after the first one, so the first is no longer bypassed.

But the transformation looks good to me, unless _gfortran_stop_numeric hangs?

Thanks,
bin
Comment 6 Sam Tebbs 2018-12-14 13:51:57 UTC
Created attachment 45236
arm-none-linux-gnueabihf with cpu and fpu options
Comment 7 Sam Tebbs 2018-12-14 13:54:04 UTC
I can confirm this test fails on arm-none-linux-gnueabihf when invoking with "-mcpu=cortex-a5 -mfpu=vfpv3-d16-fp16", as Christophe wrote. Please see the attached log.
Comment 8 seurer 2019-01-10 23:31:30 UTC
I looked at where the code is hanging, and it appears to be stuck in a loop that keeps calling memcpy with a length that increases by 1 on each call.

I set a breakpoint at the start of memcpy to break if the length was greater than 9000, and when the breakpoint was hit I just kept hitting continue.  It was called with length 9001, 9002, 9003, ...  There is nothing in the code that does anything like this as far as I can tell, and this doesn't happen with the previous revision.

Is the call

    s = transfer(vs, s)

expanded into an infinite loop?  I don't know fortran so I have no idea what that is supposed to do.

#0  .__memcpy_power7 () at ../sysdeps/powerpc/powerpc64/power7/memcpy.S:34
#1  0x0000000010000a24 in str_vs (_vs=1, _vs=1, vs=..., .__result=9002, __result=<optimized out>)
    at /home/seurer/gcc/gcc-test2/gcc/testsuite/gfortran.dg/transfer_intrinsic_3.f90:13
#2  MAIN__ () at /home/seurer/gcc/gcc-test2/gcc/testsuite/gfortran.dg/transfer_intrinsic_3.f90:34
#3  main (argc=<optimized out>, argv=<optimized out>) at /home/seurer/gcc/gcc-test2/gcc/testsuite/gfortran.dg/transfer_intrinsic_3.f90:26
#4  0x00003fffb79c7a6c in generic_start_main (main=@0x1001fec0: 0x100008a0 <main>, argc=<optimized out>, argv=0x3fffffffe888, auxvec=0x3fffffffea00, init=<optimized out>, 
    rtld_fini=<optimized out>, stack_end=<optimized out>, fini=<optimized out>) at ../csu/libc-start.c:266
#5  0x00003fffb79c7c94 in __libc_start_main (argc=<optimized out>, argv=<optimized out>, ev=<optimized out>, auxvec=<optimized out>, rtld_fini=<optimized out>, 
    stinfo=<optimized out>, stack_on_entry=<optimized out>) at ../sysdeps/unix/sysv/linux/powerpc/libc-start.c:81
#6  0x0000000000000000 in ?? ()
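
One way to relate these lengths to the dump in comment 5, assuming the memcpy calls observed here are the ones whose length argument is _25: that value is MAX_EXPR <n - 1, 0> converted to sizetype, so it grows by 1 whenever the loop counter does. The sketch below is only an illustration in C; `copy_length` is a hypothetical name, not something from GCC or the test case.

```c
#include <stddef.h>

/* Hypothetical helper, transcribed by hand from the dump:
   _20 = n - 1;  _22 = MAX_EXPR <_20, 0>;  _25 = (sizetype) _22.
   Returns the length that would be passed to memcpy for loop counter n.  */
size_t copy_length(long n)
{
    long v20 = n - 1;                     /* _20 */
    return (size_t)(v20 > 0 ? v20 : 0);   /* _25 */
}
```

For the intended range n = 0..3 this never exceeds 2; a length of 9001 corresponds to a counter of 9002 under this reading, which would mean the loop has run far past its bound.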
Comment 9 Dominique d'Humieres 2019-01-11 10:11:59 UTC
Could you please replace

(1) 'do n = 0, 3' with 'do n = 2, 3', and
(2) 'do n = 0, 3' with 'do n = 0, 0'?


I am not 100% confident about what 's = transfer(vs, s)' is supposed to do for zero-sized arrays/strings. In any case the length should never be greater than 2.
Comment 10 seurer 2019-01-11 15:00:50 UTC
I tried both (1) and (2) and the test case does not hang.
Comment 11 Dominique d'Humieres 2019-01-11 18:20:21 UTC
> I tried both (1) and (2) and the test case does not hang.

Could you please try '0, 1', '1, 2', and '0, 2'?
Comment 12 seurer 2019-01-11 18:50:43 UTC
None of those hang, either.  

I also experimented with the options a bit.  The as-is options affecting optimization are:  -O3 -funroll-loops -fpeel-loops -finline-functions

Changing to -O1 makes the hang go away.  Dropping the other options or using -O2 still hangs.
Comment 13 Jakub Jelinek 2019-01-18 16:45:25 UTC
Between r266170 and r266171 the difference was in veclower21 dump with -O3:
--- transfer_intrinsic_3.f90.168t.veclower21_	2019-01-18 16:34:47.478873237 +0100
+++ transfer_intrinsic_3.f90.168t.veclower21	2019-01-18 16:35:09.503515118 +0100
@@ -370,14 +370,7 @@ main (integer(kind=4) argc, character(ki
   __builtin_free (_27);
   parm.10 ={v} {CLOBBER};
   ivtmp.52_75 = ivtmp.52_82 + 1;
-  if (ivtmp.52_75 == 3)
-    goto <bb 19>; [12.36%]
-  else
-    goto <bb 5>; [87.64%]
-
-  <bb 19> [local count: 1069422300]:
-  __builtin_free (_7);
-  return 0;
+  goto <bb 5>; [100.00%]
 
 }
 
and, if I revert the r266171 change on current trunk, the difference between f951 with the patch reverted and vanilla trunk is (again -O3, powerpc64le-linux):
--- transfer_intrinsic_3.f90.161t.cunroll_	2019-01-18 17:14:06.625536698 +0100
+++ transfer_intrinsic_3.f90.161t.cunroll	2019-01-18 17:14:24.992238353 +0100
@@ -55,8 +55,10 @@ Number of blocks in CFG: 46
 Number of blocks to update: 11 ( 24%)
 
 
+Removing basic block 21
 Removing basic block 32
 Removing basic block 41
+Merging blocks 20 and 23
 Merging blocks 30 and 33
 Removing basic block 34
 Removing basic block 36
@@ -178,8 +180,8 @@ main (integer(kind=4) argc, character(ki
   pretmp_65 = &MEM[(character(kind=1)[0:][1:1] *)_7][0];
 
   <bb 7> [local count: 8656061039]:
-  # n_63 = PHI <0(30), _28(23)>
-  # ivtmp_13 = PHI <4(30), ivtmp_31(23)>
+  # n_63 = PHI <0(30), _28(20)>
+  # ivtmp_13 = PHI <4(30), ivtmp_31(20)>
   _19 = n_63 + -1;
   _20 = (integer(kind=8)) _19;
   _22 = MAX_EXPR <_20, 0>;
@@ -264,18 +266,8 @@ main (integer(kind=4) argc, character(ki
   parm.10 ={v} {CLOBBER};
   _28 = n_63 + 1;
   ivtmp_31 = ivtmp_13 - 1;
-  if (ivtmp_31 == 0)
-    goto <bb 21>; [12.36%]
-  else
-    goto <bb 23>; [87.64%]
-
-  <bb 23> [local count: 7582748748]:
   goto <bb 7>; [100.00%]
 
-  <bb 21> [local count: 1069422300]:
-  __builtin_free (_7);
-  return 0;
-
 }
 
 
so in both cases, the loop condition is optimized out.
Comment 14 Jakub Jelinek 2019-01-18 18:55:46 UTC
I've put logging into tree-ssa-loop-niter.c, looking for cases where the code before and after r266171 would return a different value; the only case it triggers on is (all types integer(kind=4), i.e. signed 32-bit integer):
code LE_EXPR
iv0->base 0
iv0->step 0
iv1->base -1
iv1->step 1
every_iteration false
The loop starts with:
  <bb 7> [local count: 8656061039]:
  # n_63 = PHI <0(6), _28(23)>
  _19 = n_63 + -1;
and ends with
  _28 = n_63 + 1;
  if (_28 == 4)
    goto <bb 21>; [12.36%]
  else
    goto <bb 23>; [87.64%]

  <bb 23> [local count: 7582748748]:
  goto <bb 7>; [100.00%]
and besides the exit at the end has also:
  <bb 16> [local count: 3548985018]:
  if (_19 > 0)
    goto <bb 17>; [0.04%]
  else
    goto <bb 28>; [99.96%]
  
  <bb 17> [local count: 1419591]:
  _gfortran_stop_numeric (1, 0);
  
  <bb 18> [local count: 5106238449]:
  if (_19 < 0)
    goto <bb 19>; [0.04%]
  else
    goto <bb 29>; [99.96%]
  
  <bb 29> [local count: 5104195957]:
  goto <bb 20>; [100.00%]
  
  <bb 19> [local count: 2042498]:
  _gfortran_stop_numeric (2, 0);
in the middle, so there are two other loop exits.  But neither bb16 nor bb18 is executed on every iteration; if they were, then because _19 is -1 in the first iteration, the program would always stop with code 2 and not iterate further.
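
The control flow described above can be checked with a small model. This is a hand-transcribed C sketch of the branches in the dump, not code from GCC or from the test case; `model_loop`, `keep_bb20_exit` and `max_iters` are hypothetical names introduced only for illustration.

```c
#include <stdbool.h>

/* Hypothetical model of the loop's control flow, transcribed by hand from the
   dump in this report.
   Returns 0 when the loop leaves through the bb20 test, 1 or 2 when it leaves
   through a STOP in bb17/bb19, and -1 when max_iters bodies ran without any
   exit being taken.  *iters receives the number of loop bodies executed.  */
int model_loop(bool keep_bb20_exit, int max_iters, int *iters)
{
    int n = 0;                        /* n_63 starts at 0 */
    *iters = 0;
    while (*iters < max_iters) {
        int v19 = n - 1;              /* _19 = n_63 + -1 */
        ++*iters;
        if (n <= 1) {                 /* bb15: only then is bb16 reached */
            if (v19 > 0)
                return 1;             /* bb17: _gfortran_stop_numeric (1, 0) */
        } else {                      /* otherwise bb18 is reached */
            if (v19 < 0)
                return 2;             /* bb19: _gfortran_stop_numeric (2, 0) */
        }
        n = n + 1;                    /* _28 = n_63 + 1 */
        if (keep_bb20_exit && n == 4)
            return 0;                 /* bb20 -> bb21: normal loop exit */
    }
    return -1;                        /* no exit taken within max_iters */
}
```

With the bb20 test kept, the model leaves the loop after four bodies without reaching either STOP. With it dropped, no exit is ever taken, which is consistent with the observed hang.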

We have:
  /* If the test is not executed every iteration, wrapping may make the test
     to pass again.
     TODO: the overflow case can be still used as unreliable estimate of upper
     bound.  But we have no API to pass it down to number of iterations code
     and, at present, it will not use it anyway.  */
  if (!every_iteration
      && (!iv0->no_overflow || !iv1->no_overflow
          || code == NE_EXPR || code == EQ_EXPR))
    return false;
at the start, but that doesn't trigger here, because code is not an equality comparison and no_overflow is set on both IVs.  If there were an overflow, then maybe it would be right to derive the number of iterations from that.
But the condition under which the function returns true is that iv0->base code iv1->base folds to false; if that test isn't performed in every iteration, it means nothing for the number-of-iterations analysis.

The following patch works for me:
2019-01-18  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/88044
	* tree-ssa-loop-niter.c (number_of_iterations_cond): If condition
	is false in the first iteration, but !every_iteration, return false
	instead of true with niter->niter zero.

--- gcc/tree-ssa-loop-niter.c.jj	2019-01-10 11:43:02.254577008 +0100
+++ gcc/tree-ssa-loop-niter.c	2019-01-18 19:51:00.245504728 +0100
@@ -1824,6 +1824,8 @@ number_of_iterations_cond (struct loop *
   tree tem = fold_binary (code, boolean_type_node, iv0->base, iv1->base);
   if (tem && integer_zerop (tem))
     {
+      if (!every_iteration)
+	return false;
       niter->niter = build_int_cst (unsigned_type_for (type), 0);
       niter->max = 0;
       return true;
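
The effect of the two added lines can be sketched outside GCC. The following is a simplified, hypothetical C model of the early-exit check only: `early_exit_check`, `enum niter_result` and the boolean parameters are illustrative names, not part of tree-ssa-loop-niter.c, and the real function works on trees and IV descriptors rather than booleans.

```c
#include <stdbool.h>

/* Outcome of the simplified early-exit check (hypothetical names).  */
enum niter_result {
    NITER_UNKNOWN,    /* models "return false": nothing can be concluded */
    NITER_ZERO,       /* models "niter->niter = 0; return true" */
    NITER_CONTINUE    /* models falling through to the rest of the analysis */
};

/* cond_false_at_base: "iv0->base code iv1->base" folded to false.
   every_iteration:    the exit test is executed on every iteration.  */
enum niter_result early_exit_check(bool cond_false_at_base, bool every_iteration)
{
    if (cond_false_at_base) {
        if (!every_iteration)
            return NITER_UNKNOWN;     /* the two lines added by the patch */
        return NITER_ZERO;
    }
    return NITER_CONTINUE;
}
```

Under this model, the case logged above (condition false for the base values, every_iteration false) yields NITER_UNKNOWN, so no zero iteration count is derived from that exit.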
Comment 15 Jakub Jelinek 2019-01-22 09:58:55 UTC
Author: jakub
Date: Tue Jan 22 09:58:23 2019
New Revision: 268143

URL: https://gcc.gnu.org/viewcvs?rev=268143&root=gcc&view=rev
Log:
	PR tree-optimization/88044
	* tree-ssa-loop-niter.c (number_of_iterations_cond): If condition
	is false in the first iteration, but !every_iteration, return false
	instead of true with niter->niter zero.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/tree-ssa-loop-niter.c
Comment 16 Jakub Jelinek 2019-01-22 09:59:42 UTC
Fixed.
Comment 17 Christophe Lyon 2019-01-22 14:00:58 UTC
(In reply to Jakub Jelinek from comment #16)
> Fixed.

I confirm the problem I mentioned in #c3 is now fixed. Thanks!