Bug 40737 - Pointer references sometimes fail to define "span" symbols
Status: NEW
Product: gcc
Classification: Unclassified
Component: fortran
Version: 4.4.0
Importance: P3 normal
Target Milestone: ---
Assigned To: Not yet assigned to anyone
Keywords: wrong-code
Depends on:
Blocks: 56818
Reported: 2009-07-13 21:36 UTC by David Hough
Modified: 2013-04-02 18:52 UTC

See Also:
Host:
Target:
Build:
Known to work:
Known to fail: 4.3.2, 4.4.0, 4.5.0
Last reconfirmed: 2009-07-22 07:10:56


Attachments
module definition (4.10 KB, text/plain)
2009-07-13 21:39 UTC, David Hough
module use file for bug report (1.65 KB, text/plain)
2009-07-13 21:40 UTC, David Hough
Potential patch to fix pr40737 (671 bytes, patch)
2009-10-06 22:03 UTC, Peter Bergner

Description David Hough 2009-07-13 21:36:54 UTC
This bug appears in gfortran 4.4.0 on sparc-solaris, x86-solaris, and x86-linux.
The attached test case is extracted from SPECmpi 2007 129.tera_tf

The two test files testmod.F90 and testuse.F90 define and use pointer types.
With -DBIGMOD certain variables are defined in the module; with -UBIGMOD they
are defined in the file that uses the module. Correct results are obtained with
-UBIGMOD: the code generated for the lines

         Ro => Hydro_vars( first_cell:last_cell, j, k)%cell_var( ro_var)
         Ets => Hydro_vars( first_cell:last_cell, j, k)%cell_var( ets_var)
         Um => Hydro_vars( first_cell:last_cell, j, k)%cell_var( u_var)
         Um_p1 => Hydro_vars( first_cell:last_cell, j, k)%cell_var( up1_var)
         Um_p2 => Hydro_vars( first_cell:last_cell, j, k)%cell_var( up2_var)

both uses and defines various symbols like

testuse.s:      sethi   %h44(span.1.696), %g1
testuse.s:      or      %g1, %m44(span.1.696), %g1
testuse.s:      or      %g1, %l44(span.1.696), %g1

testuse.s:      .local  span.1.696
testuse.s:      .common span.1.696,8,8


but with -DBIGMOD, the symbols are referred to but nowhere defined.

Compile e.g.

gfortran -S testmod.F90 testuse.F90 -UBIGMOD -m64
or
gfortran -S testmod.F90 testuse.F90 -DBIGMOD -m64

Same results for -m32
Comment 1 David Hough 2009-07-13 21:39:17 UTC
Created attachment 18189
module definition

This is the module definition file for the bug report.
Comment 2 David Hough 2009-07-13 21:40:25 UTC
Created attachment 18190
module use file for bug report

Compile this module-use file together with the attached module definition.
Comment 3 Tobias Burnus 2009-07-14 07:50:17 UTC
Confirmed (kind of) with GCC 4.3.2 on i686-linux. With -DBIGMOD one gets:

/tmp/ccmoM1rS.o: In function `tf_ad_splitting_driver_plane_':
t.F90:(.text+0xad): undefined reference to `span.1'
t.F90:(.text+0x15c): undefined reference to `span.0'
t.F90:(.text+0x20b): undefined reference to `span.2'
t.F90:(.text+0x2ba): undefined reference to `span.3'
t.F90:(.text+0x369): undefined reference to `span.4'
Comment 4 David Hough 2009-07-21 15:19:04 UTC
(In reply to comment #0)

In the original SPECmpi source code, I was able to make the compile-time bug
go away with this source workaround:

change e.g.

         Ro => Hydro_vars( first_cell:last_cell, j, k)%cell_var( ro_var)

to

         tRo => Hydro_vars( first_cell:last_cell, j, k)%cell_var( ro_var)
         call copy_pointer(Ro, tRo)

with earlier declarations for tRo and copy_pointer

< subroutine copy_pointer(p, q)
<    
<       use TF_NUMBER_KIND
< 
<       implicit none
<       real(r8), pointer, dimension( :) :: p, q
< 
<       p => q
<       return
< 
< end subroutine

<    use TF_NUMBER_KIND
< 
<    implicit none
< 
<    real(r8), pointer, dimension( :),save ::        tRo


However, execution was unsuccessful, though not necessarily because of a problem
in this part of the code. Compilation and execution of this modified code were
successful on the same linux system using Sun Studio 12u1, but not using
gfortran 4.4.0 on a sparc solaris system, so the root cause of the failure is
unknown.
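
For readers without the SPECmpi sources, the workaround above can be sketched in self-contained form. This is only an illustration under stated assumptions, not the tera_tf code: the definition of r8, the module names hydro_sketch and driver_sketch, the type name cell_t, the one-dimensional Hydro_vars and the bind_ro wrapper are all hypothetical. copy_pointer is placed in a module here so that its pointer dummy arguments get the explicit interface that Fortran requires.

```fortran
! Assumed stand-in for TF_NUMBER_KIND; the real definition of r8 is not
! shown in this report.
module tf_number_kind
   implicit none
   integer, parameter :: r8 = kind(1.0d0)
end module tf_number_kind

! Hypothetical module owning the data and the pointer Ro.
module hydro_sketch
   use tf_number_kind
   implicit none
   type cell_t
      real(r8) :: cell_var(3)
   end type cell_t
   type(cell_t), pointer, dimension(:) :: Hydro_vars => null()
   real(r8), pointer, dimension(:) :: Ro => null()
end module hydro_sketch

! Hypothetical driver: Ro reaches this module by use association.
module driver_sketch
   use hydro_sketch
   implicit none
   ! Temporary pointer declared next to the code that assigns it.
   real(r8), pointer, dimension(:), save :: tRo => null()
contains
   ! Plain pointer-to-pointer assignment, as in the workaround.
   subroutine copy_pointer(p, q)
      real(r8), pointer, dimension(:) :: p, q
      p => q
   end subroutine copy_pointer

   ! Associate the temporary with the component section, then copy it to Ro.
   subroutine bind_ro(first_cell, last_cell, ro_var)
      integer, intent(in) :: first_cell, last_cell, ro_var
      tRo => Hydro_vars(first_cell:last_cell)%cell_var(ro_var)
      call copy_pointer(Ro, tRo)
   end subroutine bind_ro
end module driver_sketch
```

By the language rules, after Hydro_vars is allocated and bind_ro(2, 3, 2) is called, Ro should refer to cell_var(2) of cells 2 and 3. Whether this pattern also behaves correctly at run time on the affected gfortran versions is not established in this report.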
Comment 5 Tobias Burnus 2009-07-22 07:10:56 UTC
Reduced test case below. The crucial parts are the span ("1:2") in the pointer assignment and the fact that "Ro" is use-associated.


Dump:

tf_ad_splitting_driver_plane ()
{  [...]
  extern integer(kind=8) span.0 = 0;
   [...]
    span.0 = 4;


module testmod
  implicit none
  type VARIABLES_MAILLE
      real :: cell_var
  end type VARIABLES_MAILLE
  type (VARIABLES_MAILLE), pointer, dimension( :) :: Hydro_vars
  real, pointer, dimension(:) :: Ro
end module testmod

program TF_AD_SPLITTING_DRIVER_PLANE
  use testmod
  implicit none
  Ro => Hydro_vars(1:2)%cell_var
end program
Comment 6 Tobias Burnus 2009-07-22 07:12:53 UTC
Paul, do you immediately see what goes wrong here? If not, I can also dig into it myself.
Comment 7 Tobias Burnus 2009-07-22 09:20:03 UTC
My current understanding is that "span" is only created (in gfc_get_symbol_decl)
  if (sym->attr.subref_array_pointer)
is true - and is then assumed to live at the same place as the symbol (array descriptor) itself. But this fails for use association (and maybe also for host association).

Solution 1: Always create that variable if the symbol is a pointer to an array.
Solution 2: Defer it until we have the proper array descriptor, which handles this.
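
To make the role of the extra information concrete, here is a variant of the reduced test case from comment 5 in which the component section is genuinely strided in memory. It is a sketch under assumptions: the second component "other", the module span_demo and the subroutine scale_cell_var are hypothetical additions. With two components per element, consecutive elements of Ro are separated by the size of the derived type rather than the size of a real, which is presumably the kind of information the "span" variable has to carry.

```fortran
module testmod
  implicit none
  type VARIABLES_MAILLE
      real :: cell_var
      real :: other      ! hypothetical second component, added for illustration
  end type VARIABLES_MAILLE
  type (VARIABLES_MAILLE), pointer, dimension(:) :: Hydro_vars => null()
  real, pointer, dimension(:) :: Ro => null()
end module testmod

module span_demo          ! hypothetical driver module
  use testmod
  implicit none
contains
  ! Ro is use-associated here, as in the failing case.
  subroutine scale_cell_var(lo, hi, factor)
    integer, intent(in) :: lo, hi
    real, intent(in) :: factor
    ! Point Ro at one component of a section of derived-type elements.
    Ro => Hydro_vars(lo:hi)%cell_var
    ! Writing through Ro must touch cell_var only and skip "other".
    Ro = factor * Ro
  end subroutine scale_cell_var
end module span_demo
```

A caller would allocate and fill Hydro_vars, call scale_cell_var, and then check that only the cell_var components inside lo:hi have changed.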
Comment 8 Peter Bergner 2009-10-06 22:03:56 UTC
Created attachment 18732
Potential patch to fix pr40737

Here is a patch from Adhemerval Zanella from our IBM LTC Performance team, that "fixes" the problem for me and bootstraps (powerpc64-linux) and regtests with no regressions.  Can someone else give this a try on their system?
Comment 9 kargl 2009-10-06 23:10:34 UTC
(In reply to comment #8)
> Created an attachment (id=18732)
> Potential patch to fix pr40737
> 
> Here is a patch from Adhemerval Zanella from our IBM LTC Performance team, that
> "fixes" the problem for me and bootstraps (powerpc64-linux) and regtests with
> no regressions.  Can someone else give this a try on their system?
> 

With the patch installed, running either of the following commands:

gfortran -S testmod.F90 testuse.F90 -UBIGMOD -m64
gfortran -S testmod.F90 testuse.F90 -DBIGMOD -m64

shows no instances of 'span' in the resulting .s files on x86_64-*-freebsd.
I don't know whether this is the result that you are looking for.
Comment 10 Tobias Burnus 2009-10-07 08:20:38 UTC
(In reply to comment #8)
> Created an attachment (id=18732)
> Potential patch to fix pr40737
> 
> Here is a patch from Adhemerval Zanella from our IBM LTC Performance team,
> that "fixes" the problem for me and bootstraps (powerpc64-linux) and
> regtests with no regressions.  Can someone else give this a try on their
> system?

I think it only papers over the problem. The span information needs to be available in all places where the pointer is available, including when the module is in a separate file from the one where the assignment is done, which in turn can be separate from the place where the pointer is used.

Thus, as written, I only see two solutions:

Solution 1: Always create that variable if the symbol is a pointer to an
array.
Solution 2: Defer it until we have the proper array descriptor, which handles
this.

I think in 4.6 we will finally go for solution 2.

Nevertheless, one should check whether the patch improves the situation for 4.5 and should thus be applied as an interim solution.
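
The three-place scenario described above can be sketched as follows, based on the reduced test case from comment 5. The split into three files, the external procedures assign_ro and sum_ro, and their arguments are hypothetical; they only illustrate that the declaration, the pointer assignment and the later use can each live in a different translation unit, all of which need consistent span information.

```fortran
! --- file 1 (hypothetical): declares the pointers ---
module testmod
  implicit none
  type VARIABLES_MAILLE
      real :: cell_var
  end type VARIABLES_MAILLE
  type (VARIABLES_MAILLE), pointer, dimension(:) :: Hydro_vars => null()
  real, pointer, dimension(:) :: Ro => null()
end module testmod

! --- file 2 (hypothetical): performs the pointer assignment ---
subroutine assign_ro(lo, hi)
  use testmod
  implicit none
  integer, intent(in) :: lo, hi
  Ro => Hydro_vars(lo:hi)%cell_var
end subroutine assign_ro

! --- file 3 (hypothetical): only reads through the pointer ---
function sum_ro() result(total)
  use testmod
  implicit none
  real :: total
  total = sum(Ro)
end function sum_ro
```

File 3 never sees the pointer assignment, so whatever describes the layout of Ro has to travel with Ro itself, which is the motivation for the two solutions listed above.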
Comment 11 David Hough 2009-10-07 22:42:35 UTC
(In reply to comment #10)

> > Here is a patch from Adhemerval Zanella from our IBM LTC Performance team,
> > that "fixes" the problem for me and bootstraps (powerpc64-linux) and
> > regtests with no regressions.  Can someone else give this a try on their
> > system?

I can now compile the original SPECmpi 129.tera_tf source,
but it still does not execute correctly. The variables Ro and Ets become
undefined somehow. Has anybody used any version of gfortran to compile
and correctly execute all of SPECmpi?

Comment 12 Daniel Franke 2010-12-28 22:30:16 UTC
Isn't this the same as PR34640?
Comment 13 Dominique d'Humieres 2011-01-22 22:37:05 UTC
(In reply to comment #12)
> Isn't this the same as PR34640?

I think so, see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=46339#c11 .
Comment 14 Kenneth Hoste 2011-06-23 07:36:41 UTC
Seems like this issue is still present in the GCC 4.6 branch, at least in GCC 4.6.0 and a checkout on 20110617 of the 4.6 branch.

I can confirm that patching the tera_tf source as suggested by David fixes the issue, but the runtime still fails: the benchmark seems stuck in an infinite loop or something when compiled with "-Ofast -march=native -mtune=native -floop-strip-mine -floop-interchange -floop-block" at least. Not sure if the failing runtime is caused by this issue though.

Is the patch that has been proposed insufficient?

Does anyone know whether anyone has been able to build the whole of SPEC MPI2007 using (a recent) GCC?
Comment 15 Tobias Burnus 2013-01-06 14:56:34 UTC
For another test case, using CLASS(*), see PR 55763 comment 0 (last example); see also the analysis of that example in PR 55763 comment 6.