GCC Bugzilla will be upgraded from version 4.4.9 to 5.0rc3 on Saturday, April 25, starting around 17:00 UTC. The upgrade process should only last a few minutes. Check bug 64968 for details.

Bug 49103

Summary: [4.6/4.7 Regression] local variables exchange values / wrong code with -O3
Product: gcc Reporter: Thomas Orgis <thomas.orgis>
Component: fortranAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: burnus, dimhen, jakub
Priority: P2 Keywords: wrong-code
Version: 4.6.0   
Target Milestone: 4.6.1   
Host: x86_64-unknown-linux-gnu Target: x86_64-unknown-linux-gnu
Build: x86_64-unknown-linux-gnu Known to work:
Known to fail: Last reconfirmed: 2011-05-21 18:08:15
Attachments: test program exposing the bug
reduced test
gcc46-pr49103.patch

Description Thomas Orgis 2011-05-21 17:13:28 UTC
Created attachment 24321 [details]
test program exposing the bug

After upgrading to gfortran 4.6.0, I noticed crashes in my codebase. This seems to boil down to some local messup of variable values. I observe this with -O3, but not with -O2 plus all available flags that are included in -O3. So, this must be one of the internal optimizations that are not accessible via individual flags.

The example is ripped out of a larger codebase and it shows. I spent over a whole day diagnosing and trimming down already. Hopefully, you will mainly have to look at the lower 100 lines. The routine grid_faces is the point of interest, with a call to base_new triggering the issue.

The bad example:

shell$ gfortran -O3 gcc46gridbug.f90
shell$ ./a.out 
 LOG: Coordinate mapping:           1           2           3
 LOG:       back mapping:           1           2           3
 LOG: Base point:   0.0000000000000000        0.0000000000000000        0.0000000000000000     
 generating base functions
 grid: faces
 Here is the bug. The grades should be (2 2):           2           2
 Compare the base grades:           1           1
 Here is the bug. The grades should be (2 2):           1           1
 Compare the base grades:           1           1
 grades again           2           2
 Here is the bug. The grades should be (2 2):           2           2
 Compare the base grades:           1           1
 Here is the bug. The grades should be (2 2):           1           1
 Compare the base grades:           1           1
 grades again           2           2
 Here is the bug. The grades should be (2 2):           2           2
 Compare the base grades:           1           1
 Here is the bug. The grades should be (2 2):           1           1
 Compare the base grades:           1           1
 grades again           2           2
 grid finished
 PASS


And the good example:

shell$ gfortran -O2 gcc46gridbug.f90 
shell$ ./a.out 
 LOG: Coordinate mapping:           1           2           3
 LOG:       back mapping:           1           2           3
 LOG: Base point:   0.0000000000000000        0.0000000000000000        0.0000000000000000     
 generating base functions
 grid: faces
 Here is the bug. The grades should be (2 2):           2           2
 Compare the base grades:           1           1
 Here is the bug. The grades should be (2 2):           2           2
 Compare the base grades:           1           1
 grades again           2           2
 Here is the bug. The grades should be (2 2):           2           2
 Compare the base grades:           1           1
 Here is the bug. The grades should be (2 2):           2           2
 Compare the base grades:           1           1
 grades again           2           2
 Here is the bug. The grades should be (2 2):           2           2
 Compare the base grades:           1           1
 Here is the bug. The grades should be (2 2):           2           2
 Compare the base grades:           1           1
 grades again           2           2
 grid finished
 PASS

The code worked with gcc 4.3.3 and 4.5.1, as well as the intel compiler 11.1. I hesitate to count in open64 and the sun/oracle compilers as, while they might not show this bug, they have enough others that stopped me from using them:-/

If this still should be an error on my side, I'd appreciate to be enlightened.
Comment 1 Dominique d'Humieres 2011-05-21 18:08:15 UTC
Confirmed on x86_64-apple-darwin10. Revision 169049 is OK, revision 169161 is bugged.
Comment 2 Dominique d'Humieres 2011-05-21 21:19:47 UTC
This pr is due to revision 169083.
Comment 3 Dominique d'Humieres 2011-05-21 21:21:09 UTC
Created attachment 24322 [details]
reduced test
Comment 4 Dominique d'Humieres 2011-05-21 23:11:01 UTC
Reverting revision 169083 on trunk fixes this pr (see http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01442.html for its motivation). Further reduced test:

!----------------------------------
module dg_grid

implicit none

integer :: par_world_dims = 3

type world_t
   integer :: dims
   real(kind=8), dimension(:), pointer :: lengths
end type

contains

subroutine grid_new(grid)
   type(world_t), intent(inout) :: grid

   grid%dims = par_world_dims
   call grid_faces(grid)
      
end subroutine

! Start looking here: This is the routine that gives trouble.
subroutine grid_faces(grid)
   type(world_t), intent(inout) :: grid

   integer :: dimi, bordi
   integer, dimension(par_world_dims-1) :: fgrades
   integer, dimension(par_world_dims-1) :: fbasegrades

   do dimi=1,grid%dims-2

      ! The bug makes those two arrays exchange values temporarily.
      fbasegrades = (/ 1, 1 /)
      fgrades = (/ 2, 2 /)
      do bordi=1,2
         write(*,*) 'Here is the bug. The grades should be (2 2):', fgrades
         write(*,*) 'Compare the base grades:', fbasegrades
      end do
      write(*,*) 'grades again', fgrades
      
   end do
end subroutine

end module

program test_gcc46gridbug

use dg_grid

implicit none

type(world_t) :: grid

call grid_new(grid)

end program
!----------------------------------
Comment 5 Richard Biener 2011-05-22 10:30:58 UTC
I would say look if -fno-whole-file works (or -fno-inline) or alternatively watch out for not properly merged DECLs.  ISTR some issues with module globals.
Comment 6 Dominique d'Humieres 2011-05-22 11:07:19 UTC
> I would say look if -fno-whole-file works (or -fno-inline) or alternatively
> watch out for not properly merged DECLs.  ISTR some issues with module globals.

None of these flags fix the PR. Note that if I comment the line

         write(*,*) 'Compare the base grades:', fbasegrades

in the test in comment #4, fgrades is (2, 2).
Comment 7 Jakub Jelinek 2011-05-23 07:35:48 UTC
Ugh, the problem is that first cunrolli unrolls the loop, so we get among other things parm.9 initialized for printing fgrades_35, then _gfortran_transfer_array_write from parm.9, then parm.11 initialized for printing fbasegrades_19, then _gfortran_transfer_array_write from parm.11, then
again parm.9 initialized to the same values again, _gfortran_transfer_array_write from it, parm.11 to the same values and
_gfortran_transfer_array_write from it.
Additionally, _gfortran_transfer_array_write uses ".wr" "fn spec" attribute.
Next comes fre (2nd), which sees the ".wr" fn spec and removes the second identical initialization of parm.9 and parm.11, so parm.9 is no longer live in two separate ranges, but also across the parm.11 initialization and _gfortran_transfer_array_write call from it.  And during expansion, we figure out that parm.9 and parm.11 are in different, non-overlapping blocks and thus decide to share the stack slot for both.

Appart from the ".wr" "fn spec" attribute I think there is nothing fortran specific and with
struct S { largish struct };
for (int i = 0; i < 2; i++)
  {
    {
      struct S a;
      initialize a;
      use a;
    }
    {
      struct S b;
      initialize b;
      use b;
    }
  }
we could achieve similar effect (though, if e.g. use is a pure function, the second call would be deleted).  So, I'm not convinced reverting my patch is the right fix.  Not sure how to allow expansion to notice that unrolling made it impossible to share the stack slots though.  Ideas?
Comment 8 Dominique d'Humieres 2011-05-28 11:08:17 UTC
(In reply to comment #7)

> Ugh, the problem is that first cunrolli unrolls the loop, so we get among
> other things parm.9 initialized for printing fgrades_35, then
> _gfortran_transfer_array_write from parm.9, then parm.11 initialized for
> printing fbasegrades_19, then _gfortran_transfer_array_write from
> parm.11, then again parm.9 initialized to the same values again,
> _gfortran_transfer_array_write from it, parm.11 to the same values and
> _gfortran_transfer_array_write from it.  Additionally,
> _gfortran_transfer_array_write uses ".wr" "fn spec" attribute.  Next
> comes fre (2nd), which sees the ".wr" fn spec and removes the second
> identical initialization of parm.9 and parm.11, so parm.9 is no longer
> live in two separate ranges, but also across the parm.11 initialization
> and _gfortran_transfer_array_write call from it.  And during expansion,
> we figure out that parm.9 and parm.11 are in different, non-overlapping
> blocks and thus decide to share the stack slot for both.
> 
> Appart from the ".wr" "fn spec" attribute I think there is nothing
> fortran specific and with
> struct S { largish struct };
> for (int i = 0; i < 2; i++)
>   {
>     {
>       struct S a;
>       initialize a;
>       use a;
>     }
>     {
>       struct S b;
>       initialize b;
>       use b;
>     }
>   }
> we could achieve similar effect (though, if e.g. use is a pure function,
> the second call would be deleted).  So, I'm not convinced reverting my
> patch is the right fix.  Not sure how to allow expansion to notice that
> unrolling made it impossible to share the stack slots though.  Ideas?

I am a little bit confused by the answer:

(1) Is the ".wr" "fn spec" attribute in _gfortran_transfer_array_write a gfortran mistake?
If yes, what should be the fix?
(2) Will the C example also give a wrog code with revision 169083 reverted?

Note that I have never claimed that reverting revision 169083 is "the right fix", but only that reverting it fixed the PR!-) Now from my limited understanding of http://gcc.gnu.org/ml/gcc-patches/2011-01/msg01442.html , r169083 looks like an optimization leading to wrong code, which is bad.
Comment 9 Tobias Burnus 2011-05-28 11:27:55 UTC
(In reply to comment #8)
> I am a little bit confused by the answer:

My understanding is: It's a middle-end bug, which got exposed by Rev. 169083, but which can occur also with other C, C++ or Fortran code.

A proper fix will presumably 4.7 only as it should be rather invasive; cf. Michael's patch at http://gcc.gnu.org/ml/gcc-patches/2011-05/msg02029.html which also should fix PR 39604?

For 4.6 one probably needs a different, less invasive patch.
Comment 10 Dominique d'Humieres 2011-05-28 17:56:56 UTC
(In reply to comment #9)
> My understanding is: It's a middle-end bug, which got exposed by Rev. 169083,
> but which can occur also with other C, C++ or Fortran code.

> A proper fix will presumably 4.7 only as it should be rather invasive; cf.
> Michael's patch at http://gcc.gnu.org/ml/gcc-patches/2011-05/msg02029.html
> which also should fix PR 39604?

I have reinstalled revision 169083 along with Michael's patch at http://gcc.gnu.org/ml/gcc-patches/2011-05/msg02029.html and this indeed fix this PR. Further testing in progress.
Comment 11 Jakub Jelinek 2011-05-31 11:54:23 UTC
Smaller, self-contained testcase:

! PR fortran/49103
! { dg-do run }
  integer :: a(2), b(2), i, j
  open (10, status='scratch')
  do j = 1, 2
  a = (/ 0, 0 /)
  b = (/ 1, 1 /)
  do i = 1, 2
    write (10, *) a
    write (10, *) b
  end do
  end do
  rewind (10)
  do i = 0, 7
    read (10, *) a
    if (any (a .ne. mod (i, 2))) call abort
  end do
  close (10)
end
Comment 12 Jakub Jelinek 2011-05-31 13:28:17 UTC
Created attachment 24402 [details]
gcc46-pr49103.patch

While micha's patch is the way to go for the trunk, it most likely isn't backportable.
This untested hack is an attempt to avoid reverting my patch if the vars aren't stored in bbs with significant code motion.  If anyone has better place to
set the bit, I'd appreciate it.
Comment 13 Tobias Burnus 2011-06-13 08:41:35 UTC
(In reply to comment #12)
> This untested hack is an attempt to avoid reverting my patch

Submitted version of the workaround patch 4.6:
http://gcc.gnu.org/ml/gcc-patches/2011-06/msg00480.html
Comment 14 Jakub Jelinek 2011-06-14 15:27:28 UTC
Author: jakub
Date: Tue Jun 14 15:27:24 2011
New Revision: 175028

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=175028
Log:
	PR fortran/49103
	* tree.h (DECL_NONSHAREABLE): Define.
	(struct tree_decl_common): Change decl_common_unused to
	decl_nonshareable_flag.
	* cfgexpand.c (expand_used_vars_for_block, clear_tree_used):
	Ignore vars with DECL_NONSHAREABLE bit set.
	* tree-cfg.c (gimple_duplicate_bb): Set DECL_NONSHAREABLE
	on stores to automatic aggregate vars.

	* gfortran.dg/pr49103.f90: New test.

Added:
    trunk/gcc/testsuite/gfortran.dg/pr49103.f90
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/cfgexpand.c
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-cfg.c
    trunk/gcc/tree.h
Comment 15 Jakub Jelinek 2011-06-14 15:28:27 UTC
Author: jakub
Date: Tue Jun 14 15:28:21 2011
New Revision: 175029

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=175029
Log:
	PR fortran/49103
	* tree.h (DECL_NONSHAREABLE): Define.
	(struct tree_decl_common): Change decl_common_unused to
	decl_nonshareable_flag.
	* cfgexpand.c (expand_used_vars_for_block, clear_tree_used):
	Ignore vars with DECL_NONSHAREABLE bit set.
	* tree-cfg.c (gimple_duplicate_bb): Set DECL_NONSHAREABLE
	on stores to automatic aggregate vars.

	* gfortran.dg/pr49103.f90: New test.

Added:
    branches/gcc-4_6-branch/gcc/testsuite/gfortran.dg/pr49103.f90
Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/cfgexpand.c
    branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_6-branch/gcc/tree-cfg.c
    branches/gcc-4_6-branch/gcc/tree.h
Comment 16 Jakub Jelinek 2011-06-14 22:16:54 UTC
Worked around for now.