Bug 34554 (ls) - time compiling complicated size initialization expression
Summary: time compiling complicated size initialization expression
Status: RESOLVED FIXED
Alias: ls
Product: gcc
Classification: Unclassified
Component: fortran (show other bugs)
Version: 4.3.0
: P3 normal
Target Milestone: ---
Assignee: Jerry DeLisle
URL:
Keywords: compile-time-hog
Depends on:
Blocks:
 
Reported: 2007-12-22 10:56 UTC by Thomas Koenig
Modified: 2010-04-15 00:35 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-04-11 01:49:35


Attachments
patch for fortran-exp branch (1.70 KB, patch)
2010-04-11 06:44 UTC, Jerry DeLisle
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Thomas Koenig 2007-12-22 10:56:21 UTC
Test program by James Van Buskirk, from c.l.f:

$ cat > size.f90
program sum_f95
   implicit none
   integer i, j
   integer, parameter :: n = 152
   integer, parameter :: iv1 = size([([('',i=1,j**3)],j=1,n)])

   write(*,*) n, iv1
end program sum_f95
$ time gfortran size.f90 

real	4m30.253s
user	4m21.508s
sys	0m0.092s

Timing profile:

Execution times (seconds)
 parser                : 262.05 (100%) usr   0.00 ( 0%) sys 262.05 (100%) wall     116 kB (51%) ggc
 final                 :   0.00 ( 0%) usr   0.01 (100%) sys   0.01 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                 : 262.05             0.01           262.07                227 kB
Comment 1 Andrew Pinski 2007-12-22 11:44:03 UTC
Isn't this the same as PR 20923 ?
Comment 2 Daniel Franke 2007-12-22 12:18:49 UTC
For reference: iv1 = 135210384

A typical backtrace:

#0  0xb7f0c82e in __gmpz_sub_ui () from /usr/lib/libgmp.so.3
#1  0x08050dce in expand_constructor (c=0x89b9280)
    at ../../../gcc/gcc/fortran/array.c:1330
#2  0x08050bca in expand_constructor (c=0x89b9360)
    at ../../../gcc/gcc/fortran/array.c:1372
#3  0x08050e24 in expand_constructor (c=0x89b9740)
    at ../../../gcc/gcc/fortran/array.c:1256
#4  0x08050ed1 in gfc_get_array_element (array=0x89b97b0, element=65535)
    at ../../../gcc/gcc/fortran/array.c:1696
#5  0x08052027 in gfc_expand_constructor (e=0x89b97b0)
    at ../../../gcc/gcc/fortran/array.c:1404
#6  0x080a7e63 in gfc_resolve_expr (e=0x89b97b0)
    at ../../../gcc/gcc/fortran/resolve.c:4265
Comment 3 Dominique d'Humieres 2008-09-14 19:09:52 UTC
On a Core2 2.1Ghz, the original code gives

[ibook-dhum] f90/bug% time gfc pr34554.f90
193.792u 0.273s 3:14.46 99.7%   0+0k 0+27io 0pf+0w
[ibook-dhum] f90/bug% time a.out
         152   135210384
0.000u 0.001s 0:00.00 0.0%      0+0k 0+2io 0pf+0w

The following code where the PARAMETERs have been removed 

program sum_f95
   implicit none
   integer i, j
   integer n, iv1

   n = 152
   iv1 = size([([('',i=1,j**3)],j=1,n)])
   write(*,*) n, iv1
end program sum_f95

gives

[ibook-dhum] f90/bug% time gfc pr34554_2_db.f90
0.027u 0.022s 0:00.08 50.0%     0+0k 0+15io 0pf+0w
[ibook-dhum] f90/bug% time a.out
         152   135210384
0.745u 0.637s 0:01.38 99.2%     0+0k 0+0io 0pf+0w

So using parameters give a runtime speedup at the expense of a very long compilation time. However is it reasonable that the computation of size([([('',i=1,j**3)],j=1,n)]) be almost 200 times slower when done by the compiler?

For the original test, the compile times for different n are
n=52
2.736u 0.024s 0:02.76 99.6%     0+0k 0+26io 0pf+0w
n=102
39.522u 0.079s 0:39.79 99.4%    0+0k 0+12io 0pf+0w
n=152
193.792u 0.273s 3:14.46 99.7%   0+0k 0+27io 0pf+0w
So the compilation time is O(n^4). 

For the second test, the runtimes are
n=52
0.013u 0.011s 0:00.02 100.0%    0+0k 0+0io 0pf+0w
n=102
0.155u 0.139s 0:00.29 96.5%     0+0k 0+0io 0pf+0w
n=152
0.745u 0.637s 0:01.38 99.2%     0+0k 0+0io 0pf+0w
So the execution time is also O(n^4), but the prefactor is two order of magnitude larger at compile time (it means that you have to run the same code 200 times to have a real benefit on the total compile+execution time). 

Comment 4 Daniel Franke 2009-01-03 22:11:45 UTC
array.c (gfc_get_array_element) states:
"Access is not efficient at all, but this is another place where things do not have to be particularly fast."

As get_array_element shows up in #2, this might be a place to start.
Comment 5 Jerry DeLisle 2010-04-10 15:41:35 UTC
The fortran-exp branch has a performance regression with these test cases. Trunk is slow to compile it, but does succeed. The branch can not even get close to it.

I will start looking at this. Daniel, any ideas?
Comment 6 Daniel Franke 2010-04-10 15:50:32 UTC
(In reply to comment #5)
> I will start looking at this. Daniel, any ideas?

I'd think that the fortran-exp branch tries to unroll the whole thing, which then doesn't fit into memory any more at some point. Hence the failure.

One needs to reintroduce the max-constructor-size flag and stop whenever that is hit and set up the initializer on runtime. Before doing this, I intended to get rid of (or rewrite) the expand-stack as I promised at Christmas that I'd do. I'm still busy like hell (including being side-tracked by other projects) and my development machine died four week ago - still broken :(

Jerry, if you want to have a go in the meantime, be my guest. I'll join in as soon as I can manage!
Comment 7 Jerry DeLisle 2010-04-11 01:49:35 UTC
I have an idea.
Comment 8 Jerry DeLisle 2010-04-11 06:38:54 UTC
Restoring the size check to gfortran-exp branch I get the following results with the test case in comment #1.

Current trunk (4.6):

$ time gfc pr34554.f90

real	8m26.965s
user	8m21.252s
sys	0m2.477s
$ ./a.out 
         152   135210384

fortran-exp branch patched:

$ time gfcx pr34554.f90

real	4m44.514s
user	4m43.951s
sys	0m0.282s
$ ./a.out 
         152   135210384

This is a halving of compilation time.  Patch to follow.
Comment 9 Jerry DeLisle 2010-04-11 06:44:04 UTC
Created attachment 20360 [details]
patch for fortran-exp branch

Please test on fortran-exp branch.
Comment 10 Dominique d'Humieres 2010-04-11 08:11:00 UTC
With the patch in comment #9 applied to the fortran-exp branch, the timing for the original test is slightly slower than trunk ~250s compared to ~240s.

Note that the procedure node_copy_and_append should be deleted otherwise you get the error:

cc1: warnings being treated as errors
../../cons_clean/gcc/fortran/constructor.c:65:1: error: 'node_copy_and_append' defined but not used
make[3]: *** [fortran/constructor.o] Error 1
make[3]: *** Waiting for unfinished jobs....
make[2]: *** [all-stage2-gcc] Error 2
make[1]: *** [stage2-bubble] Error 2
make: *** [all] Error 2

Thanks for the patch.
Comment 11 Dominique d'Humieres 2010-04-11 08:26:18 UTC
> +  /* If we can successfully get an array element at the max array size then

s/can/cannot/ ?
Comment 12 Jerry DeLisle 2010-04-11 13:07:48 UTC
Dominique, thanks for testing.  You are not getting near the same speedup I am.  It must be related to cache size.  I will submit the patch for approval some time in the next few days.
Comment 13 Jerry DeLisle 2010-04-15 00:35:17 UTC
I think this can be closed.