Currently we create a looping structure which is executed to traverse an array and transfer it one element at a time. This works, but is not too efficient. We could improve this by converting the implied do loops into the appropriate array descriptor and call transfer_array. The two cases are: real, dimension(10) :: a write(10,'(10f8.3)') a write(10,'(10f8.3)') (a(i), i=1,10) Gives: _gfortran_st_write (&dt_parm.3); { struct array1_real(kind=4) parm.4; parm.4.dtype = 281; parm.4.dim[0].lbound = 1; parm.4.dim[0].ubound = 10; parm.4.dim[0].stride = 1; parm.4.data = (void *) &a[0]; parm.4.offset = -1; _gfortran_transfer_array (&dt_parm.3, &parm.4, 4, 0); } _gfortran_st_write_done (&dt_parm.3); and _gfortran_st_write (&dt_parm.3); i = 1; if (i <= 10) { while (1) { { logical(kind=4) D.958; _gfortran_transfer_real (&dt_parm.3, &a[(integer(kind=8)) i + -1], 4); L.2:; D.958 = i == 10; i = i + 1; if (D.958) goto L.3; } } } L.3:; _gfortran_st_write_done (&dt_parm.3); The former is needed to simplify asynchronous I/O where we need to be able to convey to the I/O thread the task to be done and not the actual code to do the looping. Putting it another way, if the implied loop has 10000 count. We need to pass that count to the I/O routines, so they can take it and run with it.
(In reply to comment #0) > real, dimension(10) :: a > write(10,'(10f8.3)') a > write(10,'(10f8.3)') (a(i), i=1,10) The problem is, this isn't possible in the most generic case: write(10,'(10f8.3)') (a(i**-2*i+13), i=1,10)
OK, I see your point. Would it be possible to create a hidden iterator function that could be called internally by the I/O library to return the index into the array? In the meantime, I am thinking through a different approach for aio that avoids the issue here.
> In the meantime, I am thinking through a different approach for aio that avoids > the issue here. > Yes it would - use gfc_conv_expr_descriptor to convert the expression and pass the resulting array descriptor. For the cases that you are concerned with, this would be a temporary. However, gfortran_transfer array would be used and the io itself taken out of the loop. Obviously, this would only work for a WRITE operation and gfc_conv_subref_array_arg would have to be used for a READ - as in the existing code in gfc_trans_transfer. In fact, I believe that the exsting code in gfc_trans_transfer would be able to handle iterator expressions, were they passed to it. However, the frontend does this scalarization; see io.c(match_io_element):2396 onwards. This builds up a DO loop and a call to EXEC_TRANSFER for each element. Where there is only one element, you will get what you want by turning the iterator expression into an EXPR_ARRAY and writing the gfc_code to pass that to EXEC_TRANSFER. I think that it should be a very straightforward job. Best of luck! Paul
I have a method figured out for async I/O that will handle things as they are now, However, it would greatly improve the situation if we fix this implied do loop business.
Depending how much ground we want to cover, this can be tricky. Off my head, I can think of the following examples: (a(i), i=1,10) --> a(1:10) (a(i), i=1,10,2) --> a(1:10:2) (a(2*i), i=1,10) --> a(2:20:2) ((a(i,j), i=1,10), j=1,2) --> a(1:10,1:20) ((a(i,j), j=1,2), i=1,10) --> no equivalent ((a(i,j), j=1,2) --> a(i,1:2) (a(i,2*i), i=1,10) --> no equivalent (a(i**2), i=1,10) --> no equivalent
Created attachment 41419 [details] Early patch for simplifying impl do loops Attached is a patch that simplifies implied do loops in the way suggested by Thomas. There are, however, still a few problems with this patch that I have a hard time understanding. For example: gcc@dcm-linux:~/pr/35339> cat z1.f90 program main implicit none integer:: i integer, dimension(2,2):: a = reshape([1, 2, 3, 4], shape(a)) write (*,*) (a(i, 1), i=1,2) end program gcc@dcm-linux:~/pr/35339> gfortran -O z1.f90 z1.f90:5:0: write (*,*) (a(i, 1), i=1,4) internal compiler error: Segmentation fault 0xbe58af crash_signal ../../trunk/gcc/toplev.c:337 0x6bea8d gfc_conv_scalarized_array_ref ../../trunk/gcc/fortran/trans-array.c:3228 0x6bfb1c gfc_conv_array_ref(gfc_se*, gfc_array_ref*, gfc_expr*, locus*) ../../trunk/gcc/fortran/trans-array.c:3382 0x6f95cd gfc_conv_variable ../../trunk/gcc/fortran/trans-expr.c:2680 0x6f529a gfc_conv_expr(gfc_se*, gfc_expr*) ../../trunk/gcc/fortran/trans-expr.c:7815 0x6fd41e gfc_conv_expr_reference(gfc_se*, gfc_expr*) ../../trunk/gcc/fortran/trans-expr.c:7915 0x724862 gfc_trans_transfer(gfc_code*) ../../trunk/gcc/fortran/trans-io.c:2539 0x6b3d37 trans_code ../../trunk/gcc/fortran/trans.c:2017 0x7214f2 build_dt ../../trunk/gcc/fortran/trans-io.c:2017 0x6b3d57 trans_code ../../trunk/gcc/fortran/trans.c:1989 0x6e5498 gfc_generate_function_code(gfc_namespace*) ../../trunk/gcc/fortran/trans-decl.c:6332 0x66c576 translate_all_program_units ../../trunk/gcc/fortran/parse.c:6074 0x66c576 gfc_parse_file() ../../trunk/gcc/fortran/parse.c:6274 0x6afe9f gfc_be_parse_file ../../trunk/gcc/fortran/f95-lang.c:204 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions. probably because the new 'transfer'-statement isn't properly inserted in the code tree.
Created attachment 41420 [details] Early patch for simplifying impl do loops - 2 Sorry, wrong patch _and_ wrong testcase... Still fails for mysterious reasons, though. gcc@dcm-linux:~/pr/35339> cat z3.f90 program main implicit none integer:: i integer, parameter:: k = 2 integer, dimension(4):: a = [(i, i=1,4)] write (*,*) (a(k), i=1,4) end program gcc@dcm-linux:~/pr/35339> gfortran -O z3.f90 z3.f90:6:0: write (*,*) (a(k), i=1,4) internal compiler error: in gfc_conv_expr_descriptor, at fortran/trans-array.c:6792 0x73175d gfc_conv_expr_descriptor(gfc_se*, gfc_expr*) ../../trunk/gcc/fortran/trans-array.c:6792 0x7a811b gfc_trans_transfer(gfc_code*) ../../trunk/gcc/fortran/trans-io.c:2581 0x71dffb trans_code ../../trunk/gcc/fortran/trans.c:2017 0x71e164 gfc_trans_code_cond(gfc_code*, tree_node*) ../../trunk/gcc/fortran/trans.c:2120 0x7c4fa2 gfc_trans_simple_do ../../trunk/gcc/fortran/trans-stmt.c:1942 0x7c5473 gfc_trans_do(gfc_code*, tree_node*) ../../trunk/gcc/fortran/trans-stmt.c:2075 0x71ddf4 trans_code ../../trunk/gcc/fortran/trans.c:1917 0x71e164 gfc_trans_code_cond(gfc_code*, tree_node*) ../../trunk/gcc/fortran/trans.c:2120 0x7a6bad build_dt ../../trunk/gcc/fortran/trans-io.c:2017 0x7a6c88 gfc_trans_write(gfc_code*) ../../trunk/gcc/fortran/trans-io.c:2056 0x71df71 trans_code ../../trunk/gcc/fortran/trans.c:1989 0x71e183 gfc_trans_code(gfc_code*) ../../trunk/gcc/fortran/trans.c:2128 0x7590ed gfc_generate_function_code(gfc_namespace*) ../../trunk/gcc/fortran/trans-decl.c:6332 0x71e1c7 gfc_generate_code(gfc_namespace*) ../../trunk/gcc/fortran/trans.c:2145 0x6b0646 translate_all_program_units ../../trunk/gcc/fortran/parse.c:6074 0x6b0c56 gfc_parse_file() ../../trunk/gcc/fortran/parse.c:6274 0x706d3f gfc_be_parse_file ../../trunk/gcc/fortran/f95-lang.c:204 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions.
(In reply to Nicolas Koenig from comment #7) > Created attachment 41420 [details] > Early patch for simplifying impl do loops - 2 > > Sorry, wrong patch _and_ wrong testcase... Still fails for mysterious > reasons, though. I was thinking the whole change would be in trans-io.c. This was before frontend-passes.c existed.
Created attachment 41430 [details] Proposed patch for simplifying impl do loops Fixed the bugs, it works now. I will submit it later this day. @Jerry: This isn't the entire solution, I fear there will have to be changes in trans-io.c to completely fix this PR ;)
As I understand this patch applies to read and write. How does this optimization behave regarding my pet issue (short array reads)? I.e. program internalread integer m parameter(m=10000000) character value*10 integer i,j,intvalues(m) DO j=1,100 write(value,'(i3,a5)') j," 5 69" CC read(value,*,end=20) (intvalues(i),i=1,m) read(value,*,end=20) intvalues 20 print*,"20" write(*,*) j,i ENDDO end program internalread On my box this is 2.8s vs. 31s, i.e. the array version is more than 10 times slower than the implied do loop.
(In reply to Manfred Schwarb from comment #10) > As I understand this patch applies to read and write. > How does this optimization behave regarding my pet issue (short array reads)? > > I.e. > program internalread > integer m > parameter(m=10000000) > character value*10 > integer i,j,intvalues(m) > > DO j=1,100 > write(value,'(i3,a5)') j," 5 69" > CC read(value,*,end=20) (intvalues(i),i=1,m) > read(value,*,end=20) intvalues > 20 print*,"20" > write(*,*) j,i > ENDDO > end program internalread > > On my box this is 2.8s vs. 31s, i.e. the array version is more > than 10 times slower than the implied do loop. The patch to 53029 fixes this one. And I think Nicolas patch will improve further the implied do loop version by converting to an array section. Both patches would be needed for that. We are still testing of course.
Author: koenigni Date: Mon Jun 5 12:35:11 2017 New Revision: 248877 URL: https://gcc.gnu.org/viewcvs?rev=248877&root=gcc&view=rev Log: 2017-06-05 Nicolas Koenig <koenigni@student.ethz.ch> PR fortran/35339 * frontend-passes.c (traverse_io_block): New function. (simplify_io_impl_do): New function. (optimize_namespace): Invoke gfc_code_walker with simplify_io_impl_do. 2017-06-05 Nicolas Koenig <koenigni@student.ethz.ch> PR fortran/35339 * gfortran.dg/implied_do_io_1.f90: New Test. * gfortran.dg/implied_do_io_2.f90: New Test. Added: trunk/gcc/testsuite/gfortran.dg/implied_do_io_1.f90 trunk/gcc/testsuite/gfortran.dg/implied_do_io_2.f90 Modified: trunk/gcc/fortran/ChangeLog trunk/gcc/fortran/frontend-passes.c trunk/gcc/testsuite/ChangeLog
Author: tkoenig Date: Mon Feb 19 18:21:45 2018 New Revision: 257814 URL: https://gcc.gnu.org/viewcvs?rev=257814&root=gcc&view=rev Log: 2018-02-19 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/35339 * frontend-passes.c (traverse_io_block): Remove workaround for PR 80945. 2018-02-19 Thomas Koenig <tkoenig@gcc.gnu.org> PR fortran/35339 * gfortran.dg/implied_do_io_4.f90: New test. Added: trunk/gcc/testsuite/gfortran.dg/implied_do_io_4.f90 Modified: trunk/gcc/fortran/ChangeLog trunk/gcc/fortran/frontend-passes.c trunk/gcc/testsuite/ChangeLog
Fixed (finally). Closing.
(In reply to Thomas Koenig from comment #14) > Fixed (finally). > > Closing. Thanks Thomas!