This is the mail archive of the gcc-patches
mailing list for the GCC project.
RE: [Patch for suggestions]: How do we know a loop is the peeled version?
- From: "Fang, Changpeng" <Changpeng dot Fang at amd dot com>
- To: "Fang, Changpeng" <Changpeng dot Fang at amd dot com>, Richard Guenther <richard dot guenther at gmail dot com>, Sebastian Pop <sebpop at gmail dot com>
- Cc: Zdenek Dvorak <rakdver at kam dot mff dot cuni dot cz>, Christian Borntraeger <borntraeger at de dot ibm dot com>, "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, "uweigand at de dot ibm dot com" <uweigand at de dot ibm dot com>
- Date: Fri, 2 Jul 2010 11:16:32 -0500
- Subject: RE: [Patch for suggestions]: How do we know a loop is the peeled version?
- References: <D4C76825A6780047854A11E93CDE84D02F7763@SAUSEXMBP01.amd.com>
An additional problem is that after a loop is completely unrolled, the loop structure
does not seem to be destroyed. As a result, later optimization passes may still be
performed on these non-existent loops.
From: Fang, Changpeng
Sent: Thursday, July 01, 2010 1:41 PM
To: Richard Guenther; Sebastian Pop
Cc: Zdenek Dvorak; Christian Borntraeger; gcc-patches at gcc dot gnu dot org; uweigand at de dot ibm dot com
Subject: [Patch for suggestions]: How do we know a loop is the peeled version?
I just found that many optimizations (prefetching, loop unrolling) are performed on the peeled loops.
This increases code size and compilation time without any benefit.
MODULE kinds
   INTEGER, PARAMETER :: RK8 = SELECTED_REAL_KIND(15, 300)
END MODULE kinds

PROGRAM TEST_FPU  ! A number-crunching benchmark using matrix inversion.
USE kinds         ! Implemented by:    David Frank  Dave_Frank@hotmail.com
IMPLICIT NONE     ! Gauss  routine by: Tim Prince   N8TM@aol.com
                  ! Crout  routine by: James Van Buskirk  email@example.com
                  ! Lapack routine by: Jos Bergervoet bergervo@IAEhv.nl

REAL(RK8) :: pool(101, 101,1000), a(101, 101)
INTEGER :: i

DO i = 1,1000
   a = pool(:,:,i)         ! get next matrix to invert
END DO

END PROGRAM TEST_FPU
For this example (compiled with -O3 -fprefetch-loop-arrays -funroll-loops), the vectorizer peels the loop,
and prefetching and loop unrolling are then performed on the peeled loops as well.
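
To make "peeled" concrete, here is a rough hand-written C sketch of the shape a vectorized copy loop typically takes: a scalar preloop that runs until the destination is aligned, a main loop that handles VF elements per iteration, and a scalar postloop for the remainder. This is only an illustration, not compiler output. The names copy_peeled, peel_counts and VF are made up for this sketch, and VF = 2 (two doubles per 16-byte vector) is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed vectorization factor: two doubles per 16-byte vector. */
#define VF 2

/* Hypothetical helper type: how many iterations each loop executed. */
struct peel_counts { size_t pre, main_iters, post; };

struct peel_counts copy_peeled(double *dst, const double *src, size_t n)
{
    struct peel_counts c = {0, 0, 0};
    size_t i = 0;

    /* Preloop (peeled): scalar iterations until dst + i is vector-aligned.
       For naturally aligned doubles this runs at most VF - 1 times. */
    while (i < n && ((uintptr_t)(dst + i) % (VF * sizeof(double))) != 0) {
        dst[i] = src[i];
        i++;
        c.pre++;
    }

    /* Main loop: stands in for the vectorized body, VF elements per trip. */
    for (; i + VF <= n; i += VF) {
        for (size_t j = 0; j < VF; j++)
            dst[i + j] = src[i + j];
        c.main_iters++;
    }

    /* Postloop (peeled): scalar remainder, at most VF - 1 iterations. */
    for (; i < n; i++) {
        dst[i] = src[i];
        c.post++;
    }

    /* Every element is copied exactly once. */
    assert(c.pre + VF * c.main_iters + c.post == n);
    return c;
}
```

The preloop and postloop each execute only a handful of iterations, which is why unrolling them or inserting prefetches into them costs code size without a matching speedup.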
In the attached patch, the vectorizer marks the loop as peeled, and the prefetching pass
gives up on it. However, the RTL unroller cannot get this information and still unrolls the peeled
loop.
I need suggestions: how can the optimizer recognize that a loop is the peeled version (preloop or postloop)?
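
As a starting point for discussion, the idea of marking a loop and letting later passes test the mark could look roughly like the sketch below. Everything here is hypothetical and standalone: loop_info, peel_kind and worth_optimizing are names invented for this illustration. They are not GCC's loop structure or any existing GCC interface, and the sketch does not address how the mark would survive from the tree level down to RTL.

```c
#include <stdbool.h>

/* Hypothetical tag recording how a loop was created. */
enum peel_kind {
    LOOP_NOT_PEELED,   /* an ordinary loop or the main vectorized loop */
    LOOP_PRELOOP,      /* peeled iterations before the main loop */
    LOOP_POSTLOOP      /* peeled iterations after the main loop */
};

/* Hypothetical per-loop record; a real compiler keeps far more state. */
struct loop_info {
    enum peel_kind peel;
};

/* A later pass (prefetching, unrolling) would ask this before doing work:
   peeled loops run only a few iterations, so they are skipped. */
bool worth_optimizing(const struct loop_info *loop)
{
    return loop->peel == LOOP_NOT_PEELED;
}
```

With such a mark, the pass that peels would set the tag on the preloop and postloop, and each later loop pass would return early when worth_optimizing reports false.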