This is the mail archive of the mailing list for the GNU Fortran project.

Re: First Patch

Hi Daniel,

On Thu, Jun 09, 2011 at 03:57:48PM +0200, Daniel Carrera wrote:
> Aha... we're getting there. I fixed my LD_LIBRARY_PATH and now all the  
> tests pass except for the new one I just added. So now the issue is in  
> either my patch (most likely) or my test (less likely).

Or both ...

> I tried to trace the problem, and I suspect that I'm not running the  
> latest version of my patch

Remember that "mpi.c" is not compiled automatically - only single.c is
(-> libcaf_single.a). Thus, for mpi.c you need to remember to
run "mpicc -c mpi.c" -- or better "mpicc -c -g mpi.c", which gives
you debug symbols, so the backtrace should have a line number for
the _gfortran_caf_sync_all failure.

I assume that's one of your problems. It would also help if you
compiled the Fortran program with -g - currently, the backtrace doesn't
show the line number for sync_1.f90 either.

There is another bug - this time of mine: I passed errmsg wrongly, but
as sync always succeeded, I didn't see it. The patch now repeats the
same mistake for "stat" - and thus it fails.

+	  tmp = build_call_expr_loc (input_location, gfor_fndecl_caf_sync_all,
+				     3, stat, errmsg, errmsglen);

This passes the variable "stat" and "errmsg". However, it should pass
"&stat" and "&errmsg". Thus, "stat" has to be replaced by
  gfc_build_addr_expr (NULL, stat)
Ditto for "errmsg". (errmsglen is OK as it is passed by value.)
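
To illustrate why the address is needed, here is a minimal C sketch. The
function report_status, its signature and the message text are hypothetical
and only stand in for a runtime routine that fills in STAT=/ERRMSG=; this is
not the libcaf code. The callee can only update the caller's variables if it
receives their addresses, while the length can be passed by value.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callee: reports its result through pointer arguments,
   the way a STAT=/ERRMSG= pair would be filled in by a runtime.  */
void
report_status (int code, int *stat, char *errmsg, int errmsg_len)
{
  assert (errmsg_len >= 0);

  if (stat)
    *stat = code;   /* needs the caller's address, not a copy of the value */

  if (code != 0 && errmsg && errmsg_len > 0)
    {
      const char *msg = "sync failed";   /* placeholder message text */
      size_t len = strlen (msg);
      size_t n = len < (size_t) errmsg_len ? len : (size_t) errmsg_len;

      memcpy (errmsg, msg, n);                            /* copy what fits */
      memset (errmsg + n, ' ', (size_t) errmsg_len - n);  /* blank-pad */
    }
}
```

A caller would write report_status (0, &n, buf, sizeof buf); passing n
itself would hand the callee a copy (or a bogus pointer), which matches the
failure described above.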

> !
> !
> sync images (*)
> sync images (1)
> sync images (1, errmsg=str)
> sync images ([1])

The last three SYNC statements should lead to a deadlock,
unless one has num_images() == 1.

It has to be changed to:

 sync images (*)
 if (this_image() == 1) then
   sync images (1)
   sync images (1, errmsg=str)
   sync images ([1])
 end if

The first one acts as sync all, which is fine.
But "sync images(1)" means that all images wait for image 1,
while image 1 only waits for itself - which won't work. For
the test, "if (this_image() == 1)" makes sure that only image 1
waits for image 1 - I think that version has also been
implemented in mpi.c. Another possibility would be:
 if (this_image() == 1) then
   sync images(*)
 else
   sync images(1)
 end if
where image 1 waits for all other images, and the other
images wait for image 1 only. However, that definitely does
not work with the current MPI version (mpi.c).
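
The matching rule behind this can be sketched in C. This is a simplified
model, assuming each image executes at most one SYNC IMAGES statement; the
helper sync_sets_match and the MAX_IMAGES limit are made up for illustration
and are not part of libcaf. The pattern is free of deadlock only if every
image that names another image is named back by it.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_IMAGES 8

/* wants[i][j] is true when image i+1 names image j+1 in its SYNC IMAGES
   image set.  Hypothetical helper, for illustration only.  */
bool
sync_sets_match (int num_images, bool wants[MAX_IMAGES][MAX_IMAGES])
{
  assert (num_images >= 1 && num_images <= MAX_IMAGES);

  for (int i = 0; i < num_images; i++)
    for (int j = 0; j < num_images; j++)
      if (wants[i][j] && !wants[j][i])
        return false;   /* image i+1 waits for j+1, which never answers */
  return true;
}
```

With three images all executing "sync images (1)", image 2 names image 1
but image 1 names only itself, so the check fails; with a single image it
passes, in line with the num_images() == 1 remark above.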

I saw that I made another mistake in the _gfortran_caf_sync_images
stub implementation:

  if (count == 0 || (count == 1 && images[0] == caf_this_image))
    return 0;

The problem is: I never set "*stat" in that case.
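
One possible shape of the fix, as a rough C sketch. The function name, the
this_image parameter (standing in for caf_this_image) and the nonzero
failure code are assumptions made for this example, not the actual stub.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the single-image stub.  */
int
sync_images_sketch (int this_image, int count, const int images[], int *stat)
{
  assert (count == 0 || images != NULL);

  if (count == 0 || (count == 1 && images[0] == this_image))
    {
      if (stat)
        *stat = 0;   /* report success instead of leaving *stat untouched */
      return 0;
    }

  if (stat)
    *stat = 1;       /* placeholder failure code */
  return 1;
}
```

With this, a caller that sets n = 5 before the call sees n == 0 afterwards
on the early-return path, which is what the stat=n tests rely on.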

> n = 5
> sync images (*, stat=n)
> if (n /= 0) call abort()

That one should work.

> n = 5
> sync images ([1],errmsg=str,stat=n)
> if (n /= 0) call abort()

This one, however, also needs an "if (this_image() == 1)".

