Bug 32604

Summary: [4.3 regression] miscompilation at -O2
Product: gcc Reporter: Joost VandeVondele <Joost.VandeVondele>
Component: tree-optimizationAssignee: Not yet assigned to anyone <unassigned>
Status: RESOLVED FIXED    
Severity: normal CC: dberlin, gcc-bugs, pinskia
Priority: P3 Keywords: wrong-code
Version: 4.3.0   
Target Milestone: 4.3.0   
Host: Target:
Build: Known to work: 4.2.1
Known to fail: 4.3.0 Last reconfirmed: 2007-07-03 08:31:53
Bug Depends on:    
Bug Blocks: 29975    

Description Joost VandeVondele 2007-07-03 07:20:40 UTC
as mentioned in the CP2K PR 29975, current trunk miscompiles CP2K at -O2. the following illustrates the issue:

MODULE TEST
  IMPLICIT NONE
  INTEGER, PARAMETER :: dp=KIND(0.0D0)
  TYPE mulliken_restraint_type
    INTEGER                         :: ref_count
    REAL(KIND = dp)                 :: strength
    REAL(KIND = dp)                 :: TARGET
    INTEGER                         :: natoms
    INTEGER, POINTER, DIMENSION(:)  :: atoms
  END TYPE mulliken_restraint_type
CONTAINS
  SUBROUTINE INIT(mulliken)
   TYPE(mulliken_restraint_type), INTENT(INOUT) :: mulliken
   ALLOCATE(mulliken%atoms(1))
   mulliken%atoms(1)=1
   mulliken%natoms=1
   mulliken%target=0
   mulliken%strength=0
  END SUBROUTINE INIT
  SUBROUTINE restraint_functional(mulliken_restraint_control,charges, &
                                charges_deriv,energy,order_p)
    TYPE(mulliken_restraint_type), &
      INTENT(IN)                             :: mulliken_restraint_control
    REAL(KIND=dp), DIMENSION(:, :), POINTER  :: charges, charges_deriv
    REAL(KIND=dp), INTENT(OUT)               :: energy, order_p

    INTEGER                                  :: I
    REAL(KIND=dp)                            :: dum

    charges_deriv=0.0_dp
    order_p=0.0_dp

    DO I=1,mulliken_restraint_control%natoms
       order_p=order_p+charges(mulliken_restraint_control%atoms(I),1) &
                      -charges(mulliken_restraint_control%atoms(I),2)
    ENDDO
    energy=mulliken_restraint_control%strength*(order_p-mulliken_restraint_control%target)**2
    dum=2*mulliken_restraint_control%strength*(order_p-mulliken_restraint_control%target)
    DO I=1,mulliken_restraint_control%natoms
       charges_deriv(mulliken_restraint_control%atoms(I),1)=  dum
       charges_deriv(mulliken_restraint_control%atoms(I),2)= -dum
    ENDDO
END SUBROUTINE restraint_functional

END MODULE

    USE TEST
    IMPLICIT NONE
    TYPE(mulliken_restraint_type) :: mulliken
    REAL(KIND=dp), DIMENSION(:, :), POINTER  :: charges, charges_deriv
    REAL(KIND=dp) :: energy,order_p
    ALLOCATE(charges(1,2),charges_deriv(1,2))
    charges(1,1)=2.0_dp
    charges(1,2)=1.0_dp
    CALL INIT(mulliken)
    CALL restraint_functional(mulliken,charges,charges_deriv,energy,order_p)
    write(6,*) order_p
END

> gfortran -O2 test.f90
> ./a.out
   0.00000000000000
> gfortran -O1 test.f90
> ./a.out
   1.00000000000000

this is for:

Driving: gfortran -v -O2 test.f90 -lgfortranbegin -lgfortran -lm -shared-libgcc
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: /data03/vondele/gcc_trunk/gcc/configure --prefix=/data03/vondele/gcc_trunk/build --with-gmp=/data03/vondele/ --with-mpfr=/data03/vondele/ --enable-languages=c,fortran
Thread model: posix
gcc version 4.3.0 20070702 (experimental)
 /data03/vondele/gcc_trunk/build/libexec/gcc/x86_64-unknown-linux-gnu/4.3.0/f951 test.f90 -quiet -dumpbase test.f90 -mtune=generic -auxbase test -O2 -version -fintrinsic-modules-path /data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/finclude -o /tmp/ccaqj3g7.s
GNU F95 version 4.3.0 20070702 (experimental) (x86_64-unknown-linux-gnu)
        compiled by GNU C version 4.3.0 20070702 (experimental), GMP version 4.2.1, MPFR version 2.2.1.
GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
 as -V -Qy -o /tmp/cccJhUbb.o /tmp/ccaqj3g7.s
GNU assembler version 2.16.91.0.5 (x86_64-suse-linux) using BFD version 2.16.91.0.5 20051219 (SUSE Linux)
 /data03/vondele/gcc_trunk/build/libexec/gcc/x86_64-unknown-linux-gnu/4.3.0/collect2 --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/../lib64/crt1.o /usr/lib/../lib64/crti.o /data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/crtbegin.o -L/data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.3.0 -L/data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/../../.. /tmp/cccJhUbb.o -lgfortranbegin -lgfortran -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /data03/vondele/gcc_trunk/build/lib/gcc/x86_64-unknown-linux-gnu/4.3.0/crtend.o /usr/lib/../lib64/crtn.o
Comment 1 Uroš Bizjak 2007-07-03 08:31:53 UTC
Confirmed, introduced by revision 129146:

Author: dberlin
Date: Sat Jun 30 14:15:26 2007
New Revision: 126149

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=126149
Log:
2007-06-30  Daniel Berlin  <dberlin@dberlin.org>
	
	Fix PR tree-optimization/32540
	Fix PR tree-optimization/31651
	* tree-ssa-sccvn.c: New file.

	* tree-ssa-sccvn.h: Ditto.
	
	* tree-vn.c: Include tree-ssa-sccvn.h
	...
Comment 2 Andrew Pinski 2007-07-03 09:12:52 UTC
I don't see anything obvious in the diff between before FRE and after, likewise for PRE. 
Comment 3 Uroš Bizjak 2007-07-03 10:26:57 UTC
(In reply to comment #2)
> I don't see anything obvious in the diff between before FRE and after, likewise
> for PRE. 

gfortran -O1 -fno-tree-fre pr32604.f90
./a.out
   1.00000000000000     

gfortran -O2 pr32604.f90
./a.out
   0.00000000000000     

gfortran -O2 -fno-tree-pre pr32604.f90
./a.out
   1.00000000000000     

gfortran -O2 -fno-tree-fre pr32604.f90
./a.out
   0.00000000000000     
Comment 4 Joost VandeVondele 2007-07-03 12:09:27 UTC
possibly related to
http://gcc.gnu.org/ml/gcc/2007-07/msg00037.html
Comment 5 Jakub Jelinek 2007-07-03 14:18:08 UTC
The *.pre dump is clearly wrong:
  pretmp.53_53 = *order_p_25(D);

<bb 7>:
  # prephitmp.45_125 = PHI <storetmp.41_94(20), storetmp.41_92(21)>
...
  D.1445_69 = pretmp.53_53;
  storetmp.41_92 = D.1445_69;
  *order_p_25(D) = D.1445_69;
  i_71 = i_2 + 1;
  if (i_2 == D.1401_27)
    goto <bb 11>;
  else
    goto <bb 21>;

<bb 21>:
  goto <bb 7>;

i.e. *order_p won't ever change in the loop.
Before *.pre it was changes correctly:
...
<bb 7>:
  # i_2 = PHI <1(20), i_71(21)>
  D.1421_29 = *order_p_25(D);
...
  D.1439_48 = (*D.1423_32)[D.1438_47];
  D.1440_49 = D.1421_29 + D.1439_48;
  D.1441_64 = D.1435_44 * 2;
  D.1442_65 = D.1441_64 + D.1437_46;
  D.1443_67 = D.1442_65 + D.1434_43;
  D.1444_68 = (*D.1423_32)[D.1443_67];
  D.1445_69 = D.1440_49 - D.1444_68;
  *order_p_25(D) = D.1445_69;
  i_71 = i_2 + 1;
  if (i_2 == D.1401_27)
    goto <bb 11>;
  else
    goto <bb 21>;

<bb 21>:
  goto <bb 7>;
Comment 6 Daniel Berlin 2007-07-03 15:02:53 UTC
Subject: Re:  [4.3 regression] miscompilation at -O2

>   D.1445_69 = pretmp.53_53;
>   storetmp.41_92 = D.1445_69;
>   *order_p_25(D) = D.1445_69;
>   i_71 = i_2 + 1;
>   if (i_2 == D.1401_27)
>     goto <bb 11>;
>   else
>     goto <bb 21>;
>
> <bb 21>:
>   goto <bb 7>;
>
> i.e. *order_p won't ever change in the loop.

Actually, what it has really done is prove the load and store that
happens have the same value.
It happens to also decide that this value (D.1445_69) is invariant.

This is the broken part, and it is because of antic_safe_loads not
checking whether the operands of the load change.
Comment 7 Daniel Berlin 2007-07-04 22:11:27 UTC
Subject: Bug 32604

Author: dberlin
Date: Wed Jul  4 22:11:14 2007
New Revision: 126338

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=126338
Log:
2007-07-04  Daniel Berlin  <dberlin@dberlin.org>

	PR tree-optimization/32604
	PR tree-optimization/32606
	
	* tree-ssa-pre.c (bb_bitmap_sets): Removed antic_safe_loads.
	(compute_antic_safe): Removed.
	(ANTIC_SAFE_LOADS): Ditto.
	(compute_antic_aux): Don't print ANTIC_SAFE_LOADS.
	(execute_pre): Don't call compute_antic_safe.
	(vuse_equiv): New function.
	(make_values_for_stmt): Use it
	* tree-ssa-sccvn.c (set_ssa_val_to): Remove assert, since it is
	not always true.


Added:
    trunk/gcc/testsuite/gcc.c-torture/compile/pr32606.c
    trunk/gcc/testsuite/gfortran.fortran-torture/execute/pr32604.f90
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-pre.c
    trunk/gcc/tree-ssa-sccvn.c

Comment 8 Uroš Bizjak 2007-07-05 13:45:08 UTC
Fixed.