This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug tree-optimization/33244] New: Missed opportunities for vectorization due to PRE
- From: "spop at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 30 Aug 2007 02:55:16 -0000
- Subject: [Bug tree-optimization/33244] New: Missed opportunities for vectorization due to PRE
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
The following loop showing up in the top time users in capacita.f90 is
not vectorized because the loop latch block is non empty:
./capacita.f90:51: note: ===== analyze_loop_nest =====
./capacita.f90:51: note: === vect_analyze_loop_form ===
./capacita.f90:51: note: not vectorized: unexpected loop form.
./capacita.f90:51: note: bad loop form.
./capacita.f90:9: note: vectorized 0 loops in function.
This block contains the following code that comes from the
partial redundancy elimination pass:
bb_14 (preds = {bb_13 }, succs = {bb_13 })
{
<bb 14>:
# VUSE <SFT.109_593> { SFT.109 }
pretmp.166_821 = g.dim[1].stride;
goto <bb 13>;
}
Now, if I disable the PRE with -fno-tree-pre, I get another problem on
the data dependence analysis:
base_address: &d1
offset from base address: 0
constant offset from base address: 0
step: 0
aligned to: 128
base_object: d1
symbol tag: d1
FAILED as dr address is invariant
/home/seb/ex/capacita.f90:46: note: not vectorized: unhandled data-ref
/home/seb/ex/capacita.f90:46: note: bad data references.
/home/seb/ex/capacita.f90:4: note: vectorized 0 loops in function.
This fail corresponds to the following code in tree-data-ref.c
/* FIXME -- data dependence analysis does not work correctly for objects
with
invariant addresses. Let us fail here until the problem is fixed. */
if (dr_address_invariant_p (dr))
{
free_data_ref (dr);
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "\tFAILED as dr address is invariant\n");
ret = false;
break;
}
Due to the following statement:
# VUSE <d1_143> { d1 }
d1.33_86 = d1;
So here the data reference is for d1 that is a read with the following tree:
arg 1 <var_decl 0xb7be01cc d1 type <real_type 0xb7b4eaf8 real4>
addressable used public static SF file /home/seb/ex/capacita.f90 line
11 size <integer_cst 0xb7b4163c 32> unit size <integer_cst 0xb7b41428 4>
align 32
chain <var_decl 0xb7be0170 d2 type <real_type 0xb7b4eaf8 real4>
addressable used public static SF file /home/seb/ex/capacita.f90
line 11 size <integer_cst 0xb7b4163c 32> unit size <integer_cst 0xb7b41428 4>
align 32 chain <var_decl 0xb7be0114 eps0>>>
I don't really know how this could be handled as a data reference,
because that statement has a VUSE but the type of d1 is scalar.
A reduced testcase is like this:
module solv_cap
implicit none
public :: init_solve
integer, parameter, public :: dp = selected_real_kind(5)
real(kind=dp), private :: Pi, Mu0, c0, eps0
logical, private :: UseFFT, UsePreco
real(kind=dp), private :: D1, D2
integer, private, save :: Ng1=0, Ng2=0
integer, private, pointer, dimension(:,:) :: Grid
real(kind=dp), private, allocatable, dimension(:,:) :: G
contains
subroutine init_solve(Grid_in, GrSize1, GrSize2, UseFFT_in, UsePreco_in)
integer, intent(in), target, dimension(:,:) :: Grid_in
real(kind=dp), intent(in) :: GrSize1, GrSize2
logical, intent(in) :: UseFFT_in, UsePreco_in
integer :: i, j
Pi = acos(-1.0_dp)
Mu0 = 4e-7_dp * Pi
c0 = 299792458
eps0 = 1 / (Mu0 * c0**2)
UseFFT = UseFFT_in
UsePreco = UsePreco_in
if(Ng1 /= 0 .and. allocated(G) ) then
deallocate( G )
end if
Grid => Grid_in
Ng1 = size(Grid, 1)
Ng2 = size(Grid, 2)
D1 = GrSize1/Ng1
D2 = GrSize2/Ng2
allocate( G(0:Ng1,0:Ng2) )
write(unit=*, fmt=*) "Calculating G"
do i=0,Ng1
do j=0,Ng2
G(i,j) = Ginteg( -D1/2,-D2/2, D1/2,D2/2, i*D1,j*D2 )
end do
end do
if(UseFFT) then
write(unit=*, fmt=*) "Transforming G"
call FourirG(G,1)
end if
return
end subroutine init_solve
function Ginteg(xq1,yq1, xq2,yq2, xp,yp) result(G)
real(kind=dp), intent(in) :: xq1,yq1, xq2,yq2, xp,yp
real(kind=dp) :: G
real(kind=dp) :: x1,x2,y1,y2,t
x1 = xq1-xp
x2 = xq2-xp
y1 = yq1-yp
y2 = yq2-yp
if (x1+x2 < 0) then
t = -x1
x1 = -x2
x2 = t
end if
if (y1+y2 < 0) then
t = -y1
y1 = -y2
y2 = t
end if
G = Vprim(x2,y2)-Vprim(x1,y2)-Vprim(x2,y1)+Vprim(x1,y1)
return
end function Ginteg
function Vprim(x,y) result(VP)
real(kind=dp), intent(in) :: x,y
real(kind=dp) :: VP
real(kind=dp) :: r
r = sqrt(x**2+y**2)
VP = y*log(x+r) + x*log(y+r)
return
end function Vprim
end module solv_cap
--
Summary: Missed opportunities for vectorization due to PRE
Product: gcc
Version: 4.3.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: spop at gcc dot gnu dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33244