[Bug fortran/102510] Function call has unnecessary stride check

anlauf at gcc dot gnu.org gcc-bugzilla@gcc.gnu.org
Tue Sep 28 19:03:22 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102510

--- Comment #3 from anlauf at gcc dot gnu.org ---
It helps to look at the (Fortran) context.  As written, the subroutine version
is declared with explicit-shape dummy arrays, which the callee may assume to
be contiguous.  If the caller passes a non-contiguous (strided) result array,
it needs to pack/unpack it around the call.  For the function version - as
is - we might need a temporary for the result to handle different situations.
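
To make the pack/unpack step concrete, here is a minimal hand-written sketch
of what passing a strided section to an explicit-shape dummy amounts to.  The
module and routine names (pack_sketch, add_vecs, call_direct, call_by_hand)
are made up for illustration, and the code gfortran actually generates may
differ, e.g. in whether it performs the copy-in at all:

```fortran
! Sketch only: a hand-written equivalent of the copy-in/copy-out that a
! compiler may arrange for a strided actual argument.  All names here are
! hypothetical and not taken from the bug report.
module pack_sketch
  use iso_fortran_env, only: r32 => real32
  implicit none
contains
  ! Explicit-shape dummies: the callee may assume contiguous storage.
  subroutine add_vecs(a, b, c)
    real(r32), dimension(8), intent(in)  :: a, b
    real(r32), dimension(8), intent(out) :: c
    c = a + b
  end subroutine add_vecs

  ! Strided section passed directly; the compiler handles the temporary.
  subroutine call_direct(a, b, d)
    real(r32), intent(in)    :: a(8), b(8)
    real(r32), intent(inout) :: d(16)
    call add_vecs(a, b, d(1:16:2))
  end subroutine call_direct

  ! The same call with the temporary written out by hand.
  subroutine call_by_hand(a, b, d)
    real(r32), intent(in)    :: a(8), b(8)
    real(r32), intent(inout) :: d(16)
    real(r32) :: tmp(8)
    tmp = d(1:16:2)            ! pack (copy-in); overwritten by the callee
    call add_vecs(a, b, tmp)   ! callee sees a contiguous array
    d(1:16:2) = tmp            ! unpack (copy-out) into the strided section
  end subroutine call_by_hand
end module pack_sketch
```

Both wrappers should leave the even-indexed elements of d untouched and store
a + b in the odd-indexed ones; whether the optimizer removes the explicit
temporary in the hand-written variant is something to check in the assembly.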

However, if you give the compiler the chance to inline the calls, and let the
optimizer inline the packing as well, you may get better code than you think.

Compile this example with -O3 -mavx:

module p
  use iso_fortran_env, only: r32 => real32
  real(r32), dimension(8)  :: a,b
  real(r32), dimension(8)  :: c1, c2
  real(r32), dimension(16) :: d1, d2
contains
  subroutine add2vecs1(a,b,c)
    real(r32), dimension(8), intent(in) :: a,b
    real(r32), dimension(8), intent(out) :: c
    c = a + b
  end subroutine add2vecs1
  function add2vecs2(a,b)
    real(r32), dimension(8), intent(in) :: a,b
    real(r32), dimension(8) :: add2vecs2
    add2vecs2 = a + b
  end function add2vecs2
  !-
  subroutine s1 ()
    call add2vecs1 (a, b, c1)
  end subroutine s1
  !-
  subroutine s2 ()
    c2         = add2vecs2 (a, b)
  end subroutine s2
  !-
  subroutine s3 ()
    call add2vecs1 (a, b, d1(1:16:2))
  end subroutine s3
  !-
  subroutine s4 ()
    d2(1:16:2) = add2vecs2 (a, b)
  end subroutine s4
end

You'll find that s1 and s2 compile to the same code, and so do the strided
versions s3 and s4 (at least this is my reading of the assembly, but correct
me if I am wrong).

Is there really more to expect?

