This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.



Re: induct regression in gcc 4.3.1


Tim,
   I am told that this performance regression doesn't
occur in gcc 4.4 and is due to the lack of a proper cost model
for this case on Intel Core 2. If so, might it be possible to
identify the cost-model change in gcc trunk and backport it
to gcc 4.3.2?
             Jack

On Sun, Jun 22, 2008 at 02:10:49PM +0900, Tim Prince wrote:
> Jack Howarth wrote:
>>      While generating .s files for induct.f90, I discovered that
>> when I compile...
>>
>> gfortran -fassociative-math -fno-signed-zeros -O3 induct.f90
>>
>> or
>>
>> gfortran -fassociative-math -fno-trapping-math -O3 induct.f90
>>
>> I get the warning...
>>
>> f951: warning: -fassociative-math disabled; other options take precedence
>>
>> Only...
>>
>> gfortran -fassociative-math -fno-signed-zeros -fno-trapping-math -O3 induct.f90
>>
>> inhibits this message. This strongly suggests that
>> -fassociative-math (which is only active together with both
>> -fno-signed-zeros and -fno-trapping-math) is the cause of the
>> performance regression in the code generated for induct.f90
>> by gcc 4.3.1 on Macintel.
>
> So you see the same complaint about -fassociative-math as I do.
> In the spirit of flogging a dead horse, I'll remind you again that
> -ffast-math and -fassociative-math are supposed to enable dot_product
> vectorization. So, how does your vectorization report for the regression
> case look?
> tim@tim-t61:~/src/ph> gfortran -O3 -ftree-vectorizer-verbose=1 -ffast-math induct.f90
>
> induct.f90:5761: note:
> induct.f90:5142: note:
> induct.f90:777: note:
> induct.f90:5062: note: LOOP VECTORIZED.
> induct.f90:5061: note: LOOP VECTORIZED.
> induct.f90:5060: note: LOOP VECTORIZED.
> induct.f90:5059: note: LOOP VECTORIZED.
> induct.f90:5058: note: LOOP VECTORIZED.
> induct.f90:5057: note: LOOP VECTORIZED.
> induct.f90:4893: note: vectorizing stmts using SLP.
> induct.f90:4893: note: LOOP VECTORIZED.
> induct.f90:4840: note: vectorized 7 loops in function.
>
> induct.f90:3566: note:
> induct.f90:3566: note:
> induct.f90:3289: note:
> induct.f90:3141: note:
> induct.f90:2220: note: LOOP VECTORIZED.
> induct.f90:2159: note: vectorizing stmts using SLP.
> induct.f90:2159: note: LOOP VECTORIZED.
> induct.f90:2077: note: LOOP VECTORIZED.
> induct.f90:2016: note: vectorizing stmts using SLP.
> induct.f90:2016: note: LOOP VECTORIZED.
> induct.f90:1824: note: vectorized 4 loops in function.
>
> induct.f90:1772: note: LOOP VECTORIZED.
> induct.f90:1741: note: vectorizing stmts using SLP.
> induct.f90:1741: note: LOOP VECTORIZED.
> induct.f90:1660: note: LOOP VECTORIZED.
> induct.f90:1629: note: vectorizing stmts using SLP.
> induct.f90:1629: note: LOOP VECTORIZED.
> induct.f90:1441: note: vectorized 4 loops in function.
>
> induct.f90:1264: note:
> induct.f90:1000: note:
> induct.f90:556: note:
> induct.f90:3060: note: LOOP VECTORIZED.
> induct.f90:3038: note: vectorizing stmts using SLP.
> induct.f90:3038: note: LOOP VECTORIZED.
> induct.f90:2918: note: LOOP VECTORIZED.
> induct.f90:2896: note: vectorizing stmts using SLP.
> induct.f90:2896: note: LOOP VECTORIZED.
> induct.f90:2845: note: created 1 versioning for alias checks.
>
> induct.f90:2845: note: vectorizing stmts using SLP.
> induct.f90:2845: note: LOOP VECTORIZED.
> induct.f90:2724: note: LOOP VECTORIZED.
> induct.f90:2702: note: vectorizing stmts using SLP.
> induct.f90:2702: note: LOOP VECTORIZED.
> induct.f90:2582: note: LOOP VECTORIZED.
> induct.f90:2560: note: vectorizing stmts using SLP.
> induct.f90:2560: note: LOOP VECTORIZED.
> induct.f90:2509: note: created 1 versioning for alias checks.
>
> induct.f90:2509: note: vectorizing stmts using SLP.
> induct.f90:2509: note: LOOP VECTORIZED.
> induct.f90:2273: note: vectorized 10 loops in function.
> ....
> cpu time to define res-q can/coil mutual inductances =  37.183
> ....
> run on a Core 2 Duo laptop at 2.001 GHz
> Target: x86_64-unknown-linux-gnu
> Configured with: ../configure --prefix=/usr/local/gcc44  
> --enable-languages='c c++ fortran'
> Thread model: posix
> gcc version 4.4.0 20080509 (experimental) (GCC)
>
> Does anyone care about -ffast-math performance of induct.f90 for gfortran 
> 4.3.x, when it never approached the performance of gfortran 4.4?
>
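
Why dot_product vectorization depends on -fassociative-math: a vectorized
reduction keeps several partial sums and combines them at the end, which
reorders the floating-point additions. Below is a minimal sketch in Python,
chosen only for illustration. The function names and the input data are
made up and are not taken from induct.f90. It shows that the reordered sum
can differ from the strict left-to-right sum, which is why the compiler
needs explicit permission to reassociate.

```python
def dot_sequential(a, b):
    """Accumulate products strictly left to right, as the loop is written."""
    total = 0.0
    for x, y in zip(a, b):
        total += x * y
    return total


def dot_two_lanes(a, b):
    """Accumulate even and odd elements separately, then combine.

    This mimics the summation order of a 2-wide SIMD reduction.
    Assumes len(a) == len(b) and that the length is even.
    """
    lane0 = 0.0
    lane1 = 0.0
    for i in range(0, len(a), 2):
        lane0 += a[i] * b[i]
        lane1 += a[i + 1] * b[i + 1]
    return lane0 + lane1


# Hypothetical data picked so that the small terms are absorbed by the
# large one in the sequential order but survive in the two-lane order.
a = [1e20, 1.0, -1e20, 1.0]
b = [1.0, 1.0, 1.0, 1.0]

print(dot_sequential(a, b))  # 1.0
print(dot_two_lanes(a, b))   # 2.0
```

Python floats are IEEE doubles, so this behaves like double precision
arithmetic compiled without -ffast-math. The two orders give different
answers for the same data, so a compiler that must preserve the written
order cannot use the two-lane form.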

