Bug 45159 - Unnecessary temporaries
Summary: Unnecessary temporaries
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: fortran (show other bugs)
Version: 4.6.0
: P3 normal
Target Milestone: ---
Assignee: Thomas Koenig
URL:
Keywords: missed-optimization
Depends on:
Blocks: 36854
  Show dependency treegraph
 
Reported: 2010-08-01 16:32 UTC by Tobias Burnus
Modified: 2013-03-28 23:28 UTC (History)
4 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-08-02 19:00:32


Attachments
Patch for some more improvements (1.21 KB, patch)
2010-11-27 13:16 UTC, Thomas Koenig
Details | Diff
patch which implements comment #27 (1.60 KB, patch)
2013-02-02 21:31 UTC, Thomas Koenig
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Tobias Burnus 2010-08-01 16:32:18 UTC
Using today's trunk with http://gcc.gnu.org/ml/fortran/2010-08/msg00003.html applied.

The following example were extracted from FLEUR (www.flapw.de), where -Warray-temporaries showed that a temporary has been generated.

!-- ONE ----------------------
      subroutine t1(n1,n2, gfft, ufft)
        implicit none
        integer :: n1, n2, i
        real :: gfft(n1,n2), ufft(n2)
        DO i=1, n1
          gfft(i,:)=gfft(i,:)*ufft(i)  ! No temporary needed
        END DO
      end subroutine

!-- TWO ----------------------
   we(1:noccbd) = we(n_start:n_end)

If "1" is the (known) lower bound, no temporary is needed (if one copies from the lower to the upper bound). Using memmove also no temporary is needed (as it allows for this) - using memcpy this is not possible (memory needs to be disjunct). For this case of having on the RHS only a variable, one should probably do a middle-end assignment with ARRAY_RANGE_REF. Cf. PR 40598.


!-- THREE ----------------------

   wannint(pw1,pw2,pw3,:)=wannint(pw1,pw2,pw3,:)+wannz2(j,:)

gets a temporary generated (no pointer - and no target with argument aliasing involved.)
Comment 1 Tobias Burnus 2010-08-01 16:52:26 UTC
! --- FOUR ----  (also occurs for Octopus)

      subroutine t2(b,c)
        implicit none
        REAL :: b(3,3),c(3)
        c = matmul(b, c)
      end subroutine

That's a pattern I see quite often: LHS variable on the RHS as second (or first) argument to MATMUL. However, no temporary is needed - but one probably needs to handle this carefully and it depends how one generates the MATMUL internally. While most implementation ways should work, one probably can program it such that avoiding a temporary will lead to wrong results ... Additionally, one should confirm that it works with-fexternal-blas

As a variant: The same but with one or two items TRANSPOSEd - or CONJugated.


! --- FIVE --- (Found in octopus, www.tddft.org [GPL])

  if(any(shape(a).ne.shape(b))) then

That generates two temporaries - one for each SHAPE.
Comment 2 Tobias Burnus 2010-08-01 17:28:15 UTC
(In reply to comment #1)
> ! --- FOUR ----  (also occurs for Octopus)
>         c = matmul(b, c)

As pointed out by Dominique, one needs to be careful. I think one can optimize:

  c = matmul(b, transpose(c))
  c = matmul(c,b)
where c is a rank-2 matrix. I think one cannot optimize it if c is just a vector. (Add "conj" as you wish.) And check whether I have not confused columns and rows

Side remark:
  sum(array*scalar) = sum(array)*scalar
Comment 3 Thomas Koenig 2010-08-01 17:37:25 UTC
Confirmed for the test cases in comment #0.

You were right that my patch at

http://gcc.gnu.org/ml/fortran/2010-08/msg00003.html

doesn't fix this.  I'll have a look...
Comment 4 Thomas Koenig 2010-08-01 18:12:39 UTC
This piece of code

                 /* If no intention of reversing or reversing is explicitly
                     inhibited, convert backward dependence to overlap.  */
                  if ((reverse == NULL && this_dep == GFC_DEP_BACKWARD)
                        || (reverse && reverse[n] == GFC_CANNOT_REVERSE))
                    this_dep = GFC_DEP_OVERLAP;

looks more fishy.  It sets this_dep to GFC_DEP_OVERLAP even if we don't want to reverse at all.

Shouldn't this all be conditional on

if (this_dep == GFC_DEP_BACKWARD)

Paul, do you have any ideas?
Comment 5 Dominique d'Humieres 2010-08-02 11:41:25 UTC
> As pointed out by Dominique, one needs to be careful. I think one can optimize:
>
>   c = matmul(b, transpose(c))
>   c = matmul(c,b)
> where c is a rank-2 matrix. I think one cannot optimize it if c is just a
> vector. (Add "conj" as you wish.) And check whether I have not confused columns
> and rows

So far I have never seen a product matrix*matrix done "in place" and I seriously doubt it can be done.
The best I can see is to have a temporary of rank 1 only instead of the full matrix. In addition roughly speaking the matrix product is O(n**3) while the cost of the temporary is O(n**2), probably negligible for large matrices (for small matrices calling the present matmul is much slower than any other possible algorithm).

Note also that the polyhedron test nf.f90 is compiled with unnecessary temporaries:

[macbook] lin/test% gfc -Warray-temporaries nf.f90
nf.f90:293.30:

   if ( i>i1 ) x(i:i+nxy-1) = x(i:i+nxy-1) - au3(i-nxy:i-1)*x(i-nxy:i-1)
                              1
Warning: Creating array temporary at (1)
nf.f90:280.21:

      g(i:i+nxy-1) = g(i:i+nxy-1) - au3(i-nxy:i-1)*g(i-nxy:i-1)
                     1
Warning: Creating array temporary at (1)
nf.f90:261.29:

   if ( i>i1 ) x(i:i+nx-1) = x(i:i+nx-1) - au2(i-nx:i-1)*x(i-nx:i-1)
                             1
Warning: Creating array temporary at (1)
nf.f90:248.20:

      g(i:i+nx-1) = g(i:i+nx-1) - au2(i-nx:i-1)*g(i-nx:i-1)
                    1
Warning: Creating array temporary at (1)

i.e., disjoint sections are not spotted.
Comment 6 Tobias Burnus 2010-08-02 12:41:53 UTC
(In reply to comment #5)
> > As pointed out by Dominique, one needs to be careful. I think one can optimize:
> >   c = matmul(b, transpose(c))
> >   c = matmul(c,b)
> > where c is a rank-2 matrix. I think one cannot optimize it if c is just a
> > vector. (Add "conj" as you wish.) And check whether I have not confused columns
> > and rows
> 
> So far I have never seen a product matrix*matrix done "in place" and I
> seriously doubt it can be done.
> The best I can see is to have a temporary of rank 1

I fail to see why a scalar temporary is not enough:

do j = 1, N
  do i = 1, N
    tmp = 0.0
    do k = 1, N
      tmp = a(i,k)*b(k,j)
    end do
    a(i,j) = tmp
  end do
end do

Note: it won't work with a scalar if one reverses the i and the j loop.
[It would work with a rank-1 "B" argument, but then "A = matmul(B,A)" won't work as the result is a rank-1 array ...]

> Note also that the polyhedron test nf.f90 is compiled with unnecessary
> temporaries:

My feeling is that ONE, THREE (of comment 0), FIVE (of comment 1), and your example (comment 5) point all to the same dependency-analysis problem.
Comment 7 Dominique d'Humieres 2010-08-02 13:39:15 UTC
> I fail to see why a scalar temporary is not enough:
>
> do j = 1, N
>   do i = 1, N
>     tmp = 0.0
>     do k = 1, N
 >      tmp = a(i,k)*b(k,j)
>     end do
>     a(i,j) = tmp
>   end do
> end do
>
> Note: it won't work with a scalar if one reverses the i and the j loop.

Well, try:


real :: a(3,3), b(3,3), c(3,3), tmp
integer :: i, j, k, N=3
a = reshape([(i,i=1,9)],[3,3])
b = reshape([(10-i,i=1,9)],[3,3])
c = matmul(a, b)
do j = 1, N
  do i = 1, N
    tmp = 0.0
    do k = 1, N
      tmp = tmp + a(i,k)*b(k,j)
    end do
    a(i,j) = tmp
  end do
end do
print *, a
print *, c
if(any(a/=c)) call abort()
print *, 'OK'
end

The problem is that in order to compute a(1,2) you need a(1,1)*b(1,2), so you cannot have already written a(1,1). Note that the effective way to compute the matrix product on modern CPUs is to exchange the i and k loops and use a rank 1 tmp:

real :: a(3,3), b(3,3), c(3,3), tmp(3)
integer :: i, j, k, N=3
a = reshape([(i,i=1,9)],[3,3])
b = reshape([(10-i,i=1,9)],[3,3])
c = matmul(a, b)
do j = 1, N
  tmp = 0.0
  do k = 1, N
    do i = 1, N
      tmp(i) = tmp(i) + a(i,k)*b(k,j)
    end do
  end do
  b(:,j) = tmp
end do
if(any(b/=c)) call abort()
print *, 'OK'
end

and this work with a rank 1 temporary.

If one want to enter this kind of reduced temporaries, another candidate is a=transpose(a) which requires a scalar temporary only.
Comment 8 Tobias Burnus 2010-08-02 14:10:19 UTC
(In reply to comment #7)
> > I fail to see why a scalar temporary is not enough:
> Well, try:

I concede defeat.

> If one want to enter this kind of reduced temporaries, another candidate is
> a=transpose(a) which requires a scalar temporary only.

Well, with an array descriptor it one only changes the strides/bounds ...
Comment 9 Dominique d'Humieres 2010-08-02 14:36:12 UTC
> Well, with an array descriptor it one only changes the strides/bounds ...

From polyhedron test induct.f90


[macbook] lin/test% gfc -Warray-temporaries induct.f90
induct.f90:1629.20:

      rotate_quad = transpose(rotate_quad)
                    1
Warning: Creating array temporary at (1)
induct.f90:1741.20:

      rotate_quad = transpose(rotate_quad)
                    1
Warning: Creating array temporary at (1)
induct.f90:2016.20:

      rotate_quad = transpose(rotate_quad)
                    1
Warning: Creating array temporary at (1)
induct.f90:2159.20:

      rotate_quad = transpose(rotate_quad)
                    1
Warning: Creating array temporary at (1)
induct.f90:2509.24:

              rotate1 = transpose(rotate1)
                        1
Warning: Creating array temporary at (1)
induct.f90:2560.24:

              rotate2 = transpose(rotate2)
                        1
Warning: Creating array temporary at (1)
induct.f90:2702.24:

              rotate2 = transpose(rotate2)
                        1
Warning: Creating array temporary at (1)
induct.f90:2845.24:

              rotate1 = transpose(rotate1)
                        1
Warning: Creating array temporary at (1)
induct.f90:2896.24:

              rotate2 = transpose(rotate2)
                        1
Warning: Creating array temporary at (1)
induct.f90:3038.24:

              rotate2 = transpose(rotate2)
                        1
Warning: Creating array temporary at (1)

The same trick can probably used for RESHAPE, from capacita.f90

...

Warning: Creating array temporary at (1)
capacita.f90:137.30:

      maxval(abs( reshape(Ar1,(/Ng1,Ng2/)) - Y ))
                              1
Warning: Creating array temporary at (1)
capacita.f90:137.30:

      maxval(abs( reshape(Ar1,(/Ng1,Ng2/)) - Y ))
                              1
Warning: Creating array temporary at (1)
capacita.f90:137.18:

      maxval(abs( reshape(Ar1,(/Ng1,Ng2/)) - Y ))
                  1
Warning: Creating array temporary at (1)
Comment 10 Thomas Koenig 2010-08-02 19:00:32 UTC
This fixes the test case from comment #0, and looks much more sane.

Let's see if this survives regression-testing...

Index: dependency.c
===================================================================
--- dependency.c        (Revision 162824)
+++ dependency.c        (Arbeitskopie)
@@ -1716,8 +1716,8 @@ gfc_dep_resolver (gfc_ref *lref, gfc_ref *rref, gf

                  /* If no intention of reversing or reversing is explicitly
                     inhibited, convert backward dependence to overlap.  */
-                 if ((reverse == NULL && this_dep == GFC_DEP_BACKWARD)
-                       || (reverse && reverse[n] == GFC_CANNOT_REVERSE))
+                 if (this_dep == GFC_DEP_BACKWARD
+                     && (reverse == NULL || reverse[n] == GFC_CANNOT_REVERSE))
                    this_dep = GFC_DEP_OVERLAP;
                }
Comment 11 Thomas Koenig 2010-08-02 22:04:50 UTC
Subject: Bug 45159

Author: tkoenig
Date: Mon Aug  2 22:04:36 2010
New Revision: 162829

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162829
Log:

2010-08-02  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/45159
	* depencency.c (gfc_dep_resolver):  Fix logic for when a loop
	can be reversed.

2010-08-02  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/45159

	* gfortran.dg/dependency_29.f90:  New test.


Added:
    trunk/gcc/testsuite/gfortran.dg/dependency_29.f90
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/dependency.c
    trunk/gcc/testsuite/ChangeLog

Comment 12 Thomas Koenig 2010-08-02 22:16:56 UTC
(In reply to comment #5)

> [macbook] lin/test% gfc -Warray-temporaries nf.f90
> nf.f90:293.30:
> 
>    if ( i>i1 ) x(i:i+nxy-1) = x(i:i+nxy-1) - au3(i-nxy:i-1)*x(i-nxy:i-1)
>                               1
> Warning: Creating array temporary at (1)
> nf.f90:280.21:
> 
>       g(i:i+nxy-1) = g(i:i+nxy-1) - au3(i-nxy:i-1)*g(i-nxy:i-1)
>                      1
> Warning: Creating array temporary at (1)
> nf.f90:261.29:
> 
>    if ( i>i1 ) x(i:i+nx-1) = x(i:i+nx-1) - au2(i-nx:i-1)*x(i-nx:i-1)
>                              1
> Warning: Creating array temporary at (1)
> nf.f90:248.20:
> 
>       g(i:i+nx-1) = g(i:i+nx-1) - au2(i-nx:i-1)*g(i-nx:i-1)
>                     1
> Warning: Creating array temporary at (1)
> 
> i.e., disjoint sections are not spotted.

This vanishes with -m32.

The reason is that we, for some strange reason, convert the upper
bound to the pointer width on a particular platform.  For 64-bit, this
results in

subroutine foo(a,b,i,j,k,n)                           
  implicit none                                       
  integer, intent(in) :: i, j, k, n                   
  real, dimension(n) :: a,b                           
  a(k:i-1) = a(i:j)                                                      
end subroutine foo        

ending up with -fdump-parse-treee as

        ASSIGN foo:a(foo:k:__convert_i4_i8[[(((- foo:i 1)))]]) foo:a(foo:i:__convert_i4_i8[[((foo:j))]])

which confuses gfc_dep_compare_expr, which is too stupid to see through the type conversion.

I know where to start, but it is late in the evening ;-)
Comment 13 Thomas Koenig 2010-08-03 22:02:46 UTC
Subject: Bug 45159

Author: tkoenig
Date: Tue Aug  3 22:02:30 2010
New Revision: 162848

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162848
Log:
2010-08-03  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/45159
	* dependency.c (gfc_deb_compare_expr):  Remove any integer
	conversion functions to larger types from both arguments.
	Remove handling these functions futher down.

2010-08-03  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/45159
	* gfortran.dg/dependency_30.f90:  New test.


Added:
    trunk/gcc/testsuite/gfortran.dg/dependency_30.f90
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/dependency.c
    trunk/gcc/testsuite/ChangeLog

Comment 14 Thomas Koenig 2010-08-06 22:33:56 UTC
Subject: Bug 45159

Author: tkoenig
Date: Fri Aug  6 22:33:37 2010
New Revision: 162966

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=162966
Log:
2010-08-06  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/45159
	* dependency.c (check_section_vs_section):  Handle cases where
	the start expression coincides with the lower or upper
	bound of the array.

2010-08-06  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/45159
	* gfortran.dg/dependency_31.f90:  New test.


Added:
    trunk/gcc/testsuite/gfortran.dg/dependency_31.f90
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/dependency.c
    trunk/gcc/testsuite/ChangeLog

Comment 15 Thomas Koenig 2010-08-09 21:54:18 UTC
Here's another case where we generate a temporary:

program main
  integer a(100)
  a(10:16) = a(11:17:1)
end program main
ig25@linux-fd1f:~/Krempel/Dep-9> gfortran -Warray-temporaries d1.f90
d1.f90:3.13:

  a(10:16) = a(11:17:1)
             1
Warnung: Creating array temporary at (1)
Comment 16 Thomas Koenig 2010-08-09 22:56:28 UTC
(In reply to comment #15)
> Here's another case where we generate a temporary:
> 
> program main
>   integer a(100)
>   a(10:16) = a(11:17:1)
> end program main

Here's a tentative patch:

Index: dependency.c                                                         
===================================================================         
--- dependency.c        (Revision 163040)                                   
+++ dependency.c        (Arbeitskopie)                                      
@@ -1023,6 +1023,7 @@ check_section_vs_section (gfc_array_ref *l_ar, gfc    
   gfc_expr *r_lower;                                                       
   gfc_expr *r_upper;                                                       
   int r_dir;                                                               
+  bool identical_strides;                                                  
                                                                            
   /* If they are the same range, return without more ado.  */              
   if (gfc_is_same_range (l_ar, r_ar, n, 0))                                
@@ -1076,6 +1077,23 @@ check_section_vs_section (gfc_array_ref *l_ar, gfc   
   if (l_dir == 0 || r_dir == 0)                                            
     return GFC_DEP_OVERLAP;                                                
                                                                            
+  /* Determine if the strides are equal.  */                               
+                                                                           
+  if (l_stride)                                                            
+    {                                                                      
+      if (r_stride)                                                        
+       identical_strides = gfc_dep_compare_expr (l_stride, r_stride) == 1; 
+      else                                                                 
+       identical_strides = gfc_expr_is_one (l_stride, 0) == 0;             
+    }                                                                      
+  else                                                                     
+    {                                                                      
+      if (r_stride)                                                        
+       identical_strides = gfc_expr_is_one (r_stride, 0) == 1;             
+      else                                                                 
+       identical_strides = true;                                           
+    }                                                                      
+                                                                           
   /* Determine LHS upper and lower bounds.  */                             
   if (l_dir == 1)                                                          
     {                                                                      
@@ -1175,12 +1193,8 @@ check_section_vs_section (gfc_array_ref *l_ar, gfc   
       && l_start && r_start && gfc_dep_compare_expr (l_start, r_start) == -1
       && l_end && r_end && gfc_dep_compare_expr (l_end, r_end) == -1)       
     {                                                                       
-      /* Check that the strides are the same.  */
-      if (!l_stride && !r_stride)
+      if (identical_strides)
        return GFC_DEP_FORWARD;
-      if (l_stride && r_stride
-         && gfc_dep_compare_expr (l_stride, r_stride) == 0)
-       return GFC_DEP_FORWARD;
     }

   /* Check for forward dependencies x:y:-1 vs. x-1:z:-1.  */
@@ -1188,20 +1202,12 @@ check_section_vs_section (gfc_array_ref *l_ar, gfc
       && l_start && r_start && gfc_dep_compare_expr (l_start, r_start) == 1
       && l_end && r_end && gfc_dep_compare_expr (l_end, r_end) == 1)
     {
-      /* Check that the strides are the same.  */
-      if (!l_stride && !r_stride)
+      if (identical_strides)
        return GFC_DEP_FORWARD;
-      if (l_stride && r_stride
-         && gfc_dep_compare_expr (l_stride, r_stride) == 0)
-       return GFC_DEP_FORWARD;
     }


-  /*  Are the strides the same?  */
-  if ((!l_stride && !r_stride)
-       ||
-      (l_stride && r_stride
-       && gfc_dep_compare_expr (l_stride, r_stride) == 0))
+  if (identical_strides)
     {

       if (l_start && IS_ARRAY_EXPLICIT (l_ar->as))
Comment 17 Dominique d'Humieres 2010-08-10 08:45:42 UTC
With the patch in comment#16, there is no temporary created for the code in comment #15, but one is created for

  a(10:16:1) = a(11:17)

This seems to be fixed if I replace

+      if (r_stride)                                                        
+       identical_strides = gfc_dep_compare_expr (l_stride, r_stride) == 1; 
+      else                                                                 
+       identical_strides = gfc_expr_is_one (l_stride, 0) == 0;             

with

+      if (r_stride)                                                        
+       identical_strides = gfc_dep_compare_expr (l_stride, r_stride) == 1; 
+      else                                                                 
+       identical_strides = gfc_expr_is_one (l_stride, 0) == 1;             

I think that

+       identical_strides = gfc_dep_compare_expr (l_stride, r_stride) == 1; 

should also be replaced with

+       identical_strides = gfc_dep_compare_expr (l_stride, r_stride) == 0; 
Comment 18 tkoenig@netcologne.de 2010-08-10 09:19:07 UTC
Subject: Re:  Unnecessary temporaries

Am Dienstag, den 10.08.2010, 08:45 +0000 schrieb dominiq at lps dot ens
dot fr:
> I think that
> 
> +       identical_strides = gfc_dep_compare_expr (l_stride, r_stride)
> == 1; 
> 
> should also be replaced with
> 
> +       identical_strides = gfc_dep_compare_expr (l_stride, r_stride)
> == 0; 


Correct.

Although it is probably better set the stride during resolution.

Here's a tentative patch for that.  There are probably very many places
that NULL checks can be elided with this.

Index: resolve.c
===================================================================
--- resolve.c   (Revision 163041)
+++ resolve.c   (Arbeitskopie)
@@ -4378,12 +4378,19 @@
            return FAILURE;
          }

+      /* Always fill in a stride of one.  */
+
+      if (ar->dimen_type[i] == DIMEN_RANGE
+         && ar->stride[i] == NULL)
+       ar->stride[i] = gfc_get_int_expr (gfc_index_integer_kind,
+                                         &ar->where, 1);
+
       /* Fill in the upper bound, which may be lower than the
         specified one for something like a(2:10:5), which is
         identical to a(2:7:5).  Only relevant for strides not equal
         to one.  */
       if (ar->dimen_type[i] == DIMEN_RANGE
-         && ar->stride[i] != NULL && ar->stride[i]->expr_type ==
EXPR_CONSTANT
+         && ar->stride[i]->expr_type == EXPR_CONSTANT
          && mpz_cmp_si (ar->stride[i]->value.integer, 1L) != 0)
        {
          mpz_t size, end;


Comment 19 Dominique d'Humieres 2010-08-10 12:00:50 UTC
(In reply to comment #18)
> Although it is probably better set the stride during resolution.

Probably.

A few comments before leaving for ten days without access to the net.

(1) I think all the changes done recently to avoid unneeded temporaries can be extended by reversing the loops.

(2) The following code generates an unneeded temporary:

integer, parameter :: n = 10
integer :: i
integer :: a(0:n+1) = (/(i, i =0, n), 0/)
a(:6:2) = a(::3)
print *, a
end

It is in the class abs(r_stride)>max(0,abs(l_stride)) for which there is no need for temporary if
l_start<r_start+r_stride for r_stride>max(0,l_stride)
or
l_start>r_start+r_stride for r_stride<min(0,l_stride)

If I am not mistaken, this is the last case for which one can conclude for any size of the sections.
(I have done the proof on the back of an envelope that I cannot find right now.) For all the remaining cases it exists some set of triplets for which there is a dependency even if for some others there is none (depending upon the dependent indices are inside or not of the actual range). Although this can be implemented, I seriously doubt it is worth the pain.

(3) I think the coverage for missing temporaries is rather limited right now and I am planning to write some torture test(s) to improve this situation.
Comment 20 Thomas Koenig 2010-08-27 12:09:13 UTC
Subject: Bug 45159

Author: tkoenig
Date: Fri Aug 27 12:08:47 2010
New Revision: 163584

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=163584
Log:
2010-08-27  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/45159
	* dependency.c (check_section_vs_section):  Single test for
	identical strides which takes into account that only one
	of the strides may be NULL.

2010-08-27  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/45159
	* gfortran.dg/dependency_33.f90:  New test.


Added:
    trunk/gcc/testsuite/gfortran.dg/dependency_33.f90
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/dependency.c
    trunk/gcc/testsuite/ChangeLog

Comment 21 Thomas Koenig 2010-09-03 16:17:02 UTC
Subject: Bug 45159

Author: tkoenig
Date: Fri Sep  3 16:16:34 2010
New Revision: 163834

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=163834
Log:
2010-09-03  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/45159
	* dependency.c (gfc_deb_compare_expr):  Compare equal for equal
	arglists for pure user functions, or for those intrinsic
	functions which are also pure.
	* intrinsics.c (add_conv):  Mark conversion functions as pure.
	(add_char_conversions):  Likewise.

2010-09-03  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/45159
	* gfortran.dg/dependency_34.f90:  New test.


Added:
    trunk/gcc/testsuite/gfortran.dg/dependency_34.f90
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/dependency.c
    trunk/gcc/fortran/intrinsic.c
    trunk/gcc/testsuite/ChangeLog

Comment 22 Thomas Koenig 2010-11-27 13:16:54 UTC
Created attachment 22546 [details]
Patch for some more improvements

Here's a patch for some more improvements.
Comment 23 Thomas Koenig 2010-12-03 10:35:16 UTC
Author: tkoenig
Date: Fri Dec  3 10:35:12 2010
New Revision: 167413

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=167413
Log:
2010-12-02  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/45159
	* dependency.c (check_section_vs_section):  Pre-calculate
	the relationship between the strides and the relationship
	between the start values.  Use an integer constant one for
	that purpose.
	Forward dependencies for positive strides apply for where
	the lhs start <= rhs start and lhs stride <= rhs stride
	and vice versa for negative stride.  No need to compare
	end expressions in either case (assume no bounds violation).

2010-12-02  Thomas Koenig  <tkoenig@gcc.gnu.org>

	PR fortran/45159
	* gfortran.dg/dependency_38.f90:  New test.


Added:
    trunk/gcc/testsuite/gfortran.dg/dependency_38.f90
Modified:
    trunk/gcc/fortran/ChangeLog
    trunk/gcc/fortran/dependency.c
    trunk/gcc/testsuite/ChangeLog
Comment 24 Mikael Morin 2012-06-29 17:21:13 UTC
Anything left to be done?
Comment 25 Dominique d'Humieres 2012-06-29 21:44:56 UTC
> Anything left to be done?

I see

[macbook] f90/bug% gfc -Warray-temporaries pr45159_4_red.f90
pr45159_4_red.f90:7.15:

  a(-3:9:3) =  a(-6:18:6)
               1
Warning: Creating array temporary at (1)

which is an instance of

l_start<r_start+r_stride for r_stride>max(0,l_stride)

(see comment #19) for which there is no dependency.
Comment 26 Thomas Koenig 2012-06-30 17:20:10 UTC
(In reply to comment #19)

Hi Dominique,

> (I have done the proof on the back of an envelope that I cannot find right
> now.)

Can you find that particular envelope again?  I tried to do the
logic for this myself, but couldn't find a proof.
Comment 27 Thomas Koenig 2013-02-01 18:16:30 UTC
To allow expressions like

  a(n:2*n:2) = a(n+1:2*n+1:2)

to be optimized, I will try to write a function which calculates the difference between two gfc_expr() for easy cases.  This could also help in other cases,
such as determining string lenghts for expressions like c(n:n+2).
Comment 28 Thomas Koenig 2013-02-02 21:31:37 UTC
Created attachment 29340 [details]
patch which implements comment #27

Still have to verify that this one is correct in all cases.
Comment 29 Thomas Koenig 2013-03-28 23:28:01 UTC
I think we can close this bug with

http://gcc.gnu.org/ml/gcc-cvs/2013-03/msg00846.html

It's been around for some time, if somebody finds anything
else, let us open a fresh one :-)