Test program by James Van Buskirk, from c.l.f: $ cat > size.f90 program sum_f95 implicit none integer i, j integer, parameter :: n = 152 integer, parameter :: iv1 = size([([('',i=1,j**3)],j=1,n)]) write(*,*) n, iv1 end program sum_f95 $ time gfortran size.f90 real 4m30.253s user 4m21.508s sys 0m0.092s Timing profile: Execution times (seconds) parser : 262.05 (100%) usr 0.00 ( 0%) sys 262.05 (100%) wall 116 kB (51%) ggc final : 0.00 ( 0%) usr 0.01 (100%) sys 0.01 ( 0%) wall 0 kB ( 0%) ggc TOTAL : 262.05 0.01 262.07 227 kB
Isn't this the same as PR 20923 ?
For reference: iv1 = 135210384 A typical backtrace: #0 0xb7f0c82e in __gmpz_sub_ui () from /usr/lib/libgmp.so.3 #1 0x08050dce in expand_constructor (c=0x89b9280) at ../../../gcc/gcc/fortran/array.c:1330 #2 0x08050bca in expand_constructor (c=0x89b9360) at ../../../gcc/gcc/fortran/array.c:1372 #3 0x08050e24 in expand_constructor (c=0x89b9740) at ../../../gcc/gcc/fortran/array.c:1256 #4 0x08050ed1 in gfc_get_array_element (array=0x89b97b0, element=65535) at ../../../gcc/gcc/fortran/array.c:1696 #5 0x08052027 in gfc_expand_constructor (e=0x89b97b0) at ../../../gcc/gcc/fortran/array.c:1404 #6 0x080a7e63 in gfc_resolve_expr (e=0x89b97b0) at ../../../gcc/gcc/fortran/resolve.c:4265
On a Core2 2.1Ghz, the original code gives [ibook-dhum] f90/bug% time gfc pr34554.f90 193.792u 0.273s 3:14.46 99.7% 0+0k 0+27io 0pf+0w [ibook-dhum] f90/bug% time a.out 152 135210384 0.000u 0.001s 0:00.00 0.0% 0+0k 0+2io 0pf+0w The following code where the PARAMETERs have been removed program sum_f95 implicit none integer i, j integer n, iv1 n = 152 iv1 = size([([('',i=1,j**3)],j=1,n)]) write(*,*) n, iv1 end program sum_f95 gives [ibook-dhum] f90/bug% time gfc pr34554_2_db.f90 0.027u 0.022s 0:00.08 50.0% 0+0k 0+15io 0pf+0w [ibook-dhum] f90/bug% time a.out 152 135210384 0.745u 0.637s 0:01.38 99.2% 0+0k 0+0io 0pf+0w So using parameters give a runtime speedup at the expense of a very long compilation time. However is it reasonable that the computation of size([([('',i=1,j**3)],j=1,n)]) be almost 200 times slower when done by the compiler? For the original test, the compile times for different n are n=52 2.736u 0.024s 0:02.76 99.6% 0+0k 0+26io 0pf+0w n=102 39.522u 0.079s 0:39.79 99.4% 0+0k 0+12io 0pf+0w n=152 193.792u 0.273s 3:14.46 99.7% 0+0k 0+27io 0pf+0w So the compilation time is O(n^4). For the second test, the runtimes are n=52 0.013u 0.011s 0:00.02 100.0% 0+0k 0+0io 0pf+0w n=102 0.155u 0.139s 0:00.29 96.5% 0+0k 0+0io 0pf+0w n=152 0.745u 0.637s 0:01.38 99.2% 0+0k 0+0io 0pf+0w So the execution time is also O(n^4), but the prefactor is two order of magnitude larger at compile time (it means that you have to run the same code 200 times to have a real benefit on the total compile+execution time).
array.c (gfc_get_array_element) states: "Access is not efficient at all, but this is another place where things do not have to be particularly fast." As get_array_element shows up in #2, this might be a place to start.
The fortran-exp branch has a performance regression with these test cases. Trunk is slow to compile it, but does succeed. The branch can not even get close to it. I will start looking at this. Daniel, any ideas?
(In reply to comment #5) > I will start looking at this. Daniel, any ideas? I'd think that the fortran-exp branch tries to unroll the whole thing, which then doesn't fit into memory any more at some point. Hence the failure. One needs to reintroduce the max-constructor-size flag and stop whenever that is hit and set up the initializer on runtime. Before doing this, I intended to get rid of (or rewrite) the expand-stack as I promised at Christmas that I'd do. I'm still busy like hell (including being side-tracked by other projects) and my development machine died four week ago - still broken :( Jerry, if you want to have a go in the meantime, be my guest. I'll join in as soon as I can manage!
I have an idea.
Restoring the size check to gfortran-exp branch I get the following results with the test case in comment #1. Current trunk (4.6): $ time gfc pr34554.f90 real 8m26.965s user 8m21.252s sys 0m2.477s $ ./a.out 152 135210384 fortran-exp branch patched: $ time gfcx pr34554.f90 real 4m44.514s user 4m43.951s sys 0m0.282s $ ./a.out 152 135210384 This is a halving of compilation time. Patch to follow.
Created attachment 20360 [details] patch for fortran-exp branch Please test on fortran-exp branch.
With the patch in comment #9 applied to the fortran-exp branch, the timing for the original test is slightly slower than trunk ~250s compared to ~240s. Note that the procedure node_copy_and_append should be deleted otherwise you get the error: cc1: warnings being treated as errors ../../cons_clean/gcc/fortran/constructor.c:65:1: error: 'node_copy_and_append' defined but not used make[3]: *** [fortran/constructor.o] Error 1 make[3]: *** Waiting for unfinished jobs.... make[2]: *** [all-stage2-gcc] Error 2 make[1]: *** [stage2-bubble] Error 2 make: *** [all] Error 2 Thanks for the patch.
> + /* If we can successfully get an array element at the max array size then s/can/cannot/ ?
Dominique, thanks for testing. You are not getting near the same speedup I am. It must be related to cache size. I will submit the patch for approval some time in the next few days.
I think this can be closed.