[Bug fortran/102510] New: Function call has unnecessary aliasing check

dwwork at gmail dot com gcc-bugzilla@gcc.gnu.org
Tue Sep 28 02:17:15 GMT 2021


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102510

            Bug ID: 102510
           Summary: Function call has unnecessary aliasing check
           Product: gcc
           Version: 11.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
          Assignee: unassigned at gcc dot gnu.org
          Reporter: dwwork at gmail dot com
  Target Milestone: ---

The following two procedures are semantically equivalent: each adds two
fixed-size arrays and stores the sum in a third. When compiled with "-O3 -mavx"
for x86_64, I expect each to compile to a single AVX add instruction. The first
version does, while the second has what looks like an aliasing check that
selects between a vectorized branch and a scalar branch (I think). The code
generated for the second version is suboptimal: it should be similar to the
vectorized assembly of the first, since Fortran does not allow function
arguments to alias. I could be wrong of course, but that is my understanding.

subroutine add2vecs1(a,b,c)
    use iso_fortran_env, only: r32 => real32
    real(r32), dimension(8), intent(in) :: a,b
    real(r32), dimension(8), intent(out) :: c
    c = a + b
end subroutine

Output Assembly (from godbolt.org, https://godbolt.org/z/aedEe7rGM):

add2vecs1_:
        vmovups ymm0, YMMWORD PTR [rdi]
        vaddps  ymm0, ymm0, YMMWORD PTR [rsi]
        vmovups YMMWORD PTR [rdx], ymm0
        vzeroupper
        ret

function add2vecs2(a,b)
    use iso_fortran_env, only: r32 => real32
    real(r32), dimension(8), intent(in) :: a,b
    real(r32), dimension(8) :: add2vecs2
    add2vecs2 = a + b
end function

Output Assembly:

add2vecs2_:
        mov     rax, QWORD PTR [rdi+40]
        mov     rcx, QWORD PTR [rdi]
        test    rax, rax
        je      .L5
        cmp     rax, 1
        jne     .L11
.L5:
        vmovups ymm0, YMMWORD PTR [rdx]
        vaddps  ymm0, ymm0, YMMWORD PTR [rsi]
        vmovups YMMWORD PTR [rcx], ymm0
        vzeroupper
        ret
.L11:
        vmovups xmm1, XMMWORD PTR [rdx]
        vaddps  xmm0, xmm1, XMMWORD PTR [rsi]
        lea     rdi, [rcx+rax*8]
        mov     r8, rax
        sal     r8, 4
        vmovss  DWORD PTR [rcx], xmm0
        vextractps      DWORD PTR [rcx+rax*4], xmm0, 1
        vextractps      DWORD PTR [rcx+rax*8], xmm0, 2
        vextractps      DWORD PTR [rdi+rax*4], xmm0, 3
        vmovups xmm0, XMMWORD PTR [rdx+16]
        vaddps  xmm0, xmm0, XMMWORD PTR [rsi+16]
        lea     rdi, [rcx+r8]
        lea     rdx, [rdi+rax*8]
        vmovss  DWORD PTR [rcx+r8], xmm0
        vextractps      DWORD PTR [rdi+rax*4], xmm0, 1
        vextractps      DWORD PTR [rdi+rax*8], xmm0, 2
        vextractps      DWORD PTR [rdx+rax*4], xmm0, 3
        ret
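
For anyone who wants to confirm that the two versions compute the same result,
here is a minimal driver sketch. It is not the code the assembly above was
generated from: the module name "vecadd" and the program name
"check_equivalence" are made up for illustration, and the procedures are placed
in a module only so that the array-valued function has an explicit interface.
Code generated for module procedures may differ from the external procedures
shown above.

```fortran
module vecadd  ! hypothetical module name, not part of the original report
    use iso_fortran_env, only: r32 => real32
    implicit none
contains
    ! Subroutine form: the result is returned through an intent(out) argument.
    subroutine add2vecs1(a, b, c)
        real(r32), dimension(8), intent(in)  :: a, b
        real(r32), dimension(8), intent(out) :: c
        c = a + b
    end subroutine

    ! Function form: the result is returned as an array-valued function result.
    function add2vecs2(a, b)
        real(r32), dimension(8), intent(in) :: a, b
        real(r32), dimension(8) :: add2vecs2
        add2vecs2 = a + b
    end function
end module vecadd

program check_equivalence  ! hypothetical driver
    use vecadd
    implicit none
    real(r32), dimension(8) :: a, b, c1, c2
    integer :: i

    ! Small integer values, so every sum is exactly representable in real32.
    a = [(real(i, r32), i = 1, 8)]
    b = [(real(10 * i, r32), i = 1, 8)]

    call add2vecs1(a, b, c1)
    c2 = add2vecs2(a, b)

    if (all(c1 == c2)) then
        print '(a)', 'results match'
    else
        print '(a)', 'results differ'
        error stop 1
    end if
end program check_equivalence
```

Building this with "gfortran -O3 -mavx" and running it should report matching
results; the report is only about the difference in generated code, not about
the values computed.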

