This is the mail archive of the mailing list for the GNU Fortran project.


Re: First Patch

On 06/09/2011 06:19 PM, Tobias Burnus wrote:
Remember that "mpi.c" is not automatically compiled - only single.c is
(->  libcaf_single.a). Thus, for mpi.c you need to remember to
run "mpicc -c mpi.c" -- or better "mpicc -c -g mpi.c", which gives
you debug symbols, so the backtrace should have a line number for
the _gfortran_caf_sync_all failure.

I assume that's one of your problems. It would also help if you
compiled the Fortran program with -g - currently, it also doesn't show
the line number for sync_1.f90.

Ok, I got mpi.o compiled with "-g", copied to gfortran.dg/coarray and recompiled the test with "-g" again. I can now confirm that the problem is on this line:

  if (stat)
    *stat = ierr;   <----- Segmentation Fault Here

Which makes perfect sense in light of the bug you found:

There is another bug - this time of mine: I passed errmsg wrongly, but
as sync always succeeded, I didn't see it. For "stat" we now repeat the
error - and thus it fails.

+	  tmp = build_call_expr_loc (input_location, gfor_fndecl_caf_sync_all,
+				     3, stat, errmsg, errmsglen);

This passes the variable "stat" and "errmsg". However, it should pass
"&stat" and "&errmsg". Thus, "stat" has to be replaced by
   gfc_build_addr_expr (NULL, stat)
Ditto for "errmsg". (errmsglen is OK as it is passed by value.)

I replaced "stat", "errmsg" and "tmp_stat" with the gfc_build_addr_expr call and now the "sync all" runs without problems.
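
To illustrate why passing the value instead of the address crashes at "*stat = ierr", here is a minimal standalone C sketch. It is not the libcaf code: sync_all_sketch and simulated_ierr are hypothetical names, and the three-argument shape (stat pointer, errmsg buffer, errmsg length by value) is assumed from the call discussed above.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical knob standing in for the result of the real barrier call.  */
static int simulated_ierr = 0;

/* Hypothetical stand-in for the library routine.  It reports its status
   through pointers, so the caller has to supply addresses.  */
static void
sync_all_sketch (int *stat, char *errmsg, int errmsg_len)
{
  int ierr = simulated_ierr;

  assert (errmsg_len >= 0);

  /* This write goes through the pointer.  If the caller passed the value
     of "stat" rather than its address, this dereferences a bogus address.  */
  if (stat)
    *stat = ierr;

  /* On failure, copy as much of the message as fits; the buffer is not
     NUL-terminated here, as with a Fortran character variable.  */
  if (ierr != 0 && errmsg && errmsg_len > 0)
    {
      const char msg[] = "SYNC ALL failed";
      size_t len = sizeof msg - 1;
      if (len > (size_t) errmsg_len)
        len = (size_t) errmsg_len;
      memcpy (errmsg, msg, len);
    }
}
```

A caller would declare "int stat; char errmsg[32];" and invoke sync_all_sketch (&stat, errmsg, (int) sizeof errmsg): the addresses go in for the two outputs, while the length is passed by value, which mirrors the gfc_build_addr_expr fix on the compiler side.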

The next problem I had was with sync images, but thanks to the "-g" flag I found and fixed the error. The call to build_call_expr_loc had the parameters in the wrong order: I still had "stat" as the first function parameter.

With that fixed, there is still one issue left. When you run "sync images (1)" you get the error:

MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
COARRAY ERROR: SYNC IMAGES not yet implemented-----------------------

Of course, this comes from here:

  /* FIXME: SYNC IMAGES with a nontrivial argument cannot easily be
     mapped to MPI communicators. Thus, exit early with an error message.  */
  if (count > 0)
    {
      fprintf (stderr, "COARRAY ERROR: SYNC IMAGES not yet implemented");
      error_stop (1);
    }

It looks like I should either remove this particular test or somehow mark it as an "expected failure".

I saw that I made another mistake in the _gfortran_caf_sync_images
stub implementation:

   if (count == 0 || (count == 1 && images[0] == caf_this_image))
     return 0;

The problem is: I never set "*stat" in that case.

Yeah, I saw that. I fixed it yesterday.
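
For reference, the kind of fix described above can be sketched as follows. This is an assumption-laden illustration, not the committed code: sync_images_sketch, this_image_sketch, and the return convention (0 for nothing to do, 1 for unsupported) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: a single-image run, so this image is always number 1.  */
static const int this_image_sketch = 1;

/* Hypothetical stand-in for the stub.  Returns 0 when there is nothing
   to synchronize and 1 when the request is not supported.  */
static int
sync_images_sketch (int count, const int images[], int *stat)
{
  assert (count >= 0);

  if (count == 0 || (count == 1 && images[0] == this_image_sketch))
    {
      /* The step that was missing: report success through "stat"
         before taking the early return.  */
      if (stat)
        *stat = 0;
      return 0;
    }

  /* Anything else is unsupported in this sketch.  */
  if (stat)
    *stat = 1;
  return 1;
}
```

With this shape, a caller that initialized "stat" to some garbage value sees it reset to 0 even on the trivial path, so a later "if (stat /= 0)" check in the Fortran program does not misfire.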

Cheers,
Daniel.
--
I'm not overweight, I'm undertall.