Bug 36841 - Eliminate gfortran_sum_r8 call for calculation involving multidimensional array multiplication followed by a sum along first dimension
Status: RESOLVED DUPLICATE of bug 43829
Alias: None
Product: gcc
Classification: Unclassified
Component: fortran
Version: 4.7.0
Importance: P3 normal
Target Milestone: ---
Assignee: Mikael Morin
Keywords: missed-optimization
Depends on: 43829
Blocks: 36854
Reported: 2008-07-15 18:09 UTC by Rajiv Adhikary
Modified: 2012-03-04 19:13 UTC

Last reconfirmed: 2010-09-13 21:35:52


Description Rajiv Adhikary 2008-07-15 18:09:12 UTC
For a calculation that multiplies two multidimensional arrays element by element and then sums the result along the first dimension,
    GCC performs the two steps separately: the element-by-element multiplication is completed first,
    and the function gfortran_sum_r8 is then called to calculate the sum.
    A better approach would be to keep an accumulator updated while the element-by-element multiplication
    is carried out. This has the following benefits:
    i. The gfortran_sum_r8 call is eliminated.
    ii. A temporary array is no longer needed to hold the multiplication result.

    subroutine sum_test(Rx,Ry,Rz,nx,ny)
    implicit none
      integer(kind=kind(1)), intent(in) :: nx,ny
      real(kind=kind(1.0d0)), dimension(nx,ny), intent(in) :: Rx,Ry
      real(kind=kind(1.0d0)), dimension(ny), intent(out) :: Rz

      Rz = sum(Rx * Ry, 1)
    end subroutine sum_test
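
For illustration only, the accumulator approach described above could be written by hand roughly as follows. This is a sketch of the intended result, not the code GCC actually generates; the names sum_test_fused_mod, sum_test_fused, acc, i and j are hypothetical and do not appear in the test case.

```fortran
module sum_test_fused_mod
  implicit none
contains
  ! Hypothetical hand-fused equivalent of: Rz = sum(Rx * Ry, 1)
  subroutine sum_test_fused(Rx, Ry, Rz, nx, ny)
    integer(kind=kind(1)), intent(in) :: nx, ny
    real(kind=kind(1.0d0)), dimension(nx,ny), intent(in) :: Rx, Ry
    real(kind=kind(1.0d0)), dimension(ny), intent(out) :: Rz
    real(kind=kind(1.0d0)) :: acc   ! running sum for one column
    integer :: i, j

    do j = 1, ny
      acc = 0.0d0
      ! Multiply and accumulate in one pass over the first dimension,
      ! so no temporary array and no library call are needed.
      do i = 1, nx
        acc = acc + Rx(i,j) * Ry(i,j)
      end do
      Rz(j) = acc
    end do
  end subroutine sum_test_fused
end module sum_test_fused_mod
```

The inner loop runs over the first (contiguous) dimension, so each column of Rx and Ry is read once with unit stride. Up to floating-point rounding from any reordering of the additions, Rz should match the result of the original sum(Rx * Ry, 1) statement.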


Other relevant information:
1. Compile flags: -O3 -ffast-math -m64 -march=amdfam10

2. gfortran version: gfortran -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: /tmp/src/gcc-4.3.0/configure --prefix=/opt/amd/gcc-4.3.0 --enable-languages=c,c++,fortran --enable-stage1-checking --with-as=/opt/amd/gcc-4.3.0/bin/as --with-ld=/opt/amd/gcc-4.3.0/bin/ld --with-mpfr=/tmp/install/mpfr-2.3.0 --with-gmp=/tmp/install/gmp-4.2.2
Thread model: posix
gcc version 4.3.1 20080312 (prerelease) (GCC)

3. model name: AMD Phenom(tm) 8650 Triple-Core Processor
4. flags     : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow constant_tsc pni cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy altmovcr8 abm sse4a misalignsse 3dnowprefetch osvw
Comment 1 Richard Biener 2008-07-23 09:40:10 UTC
Confirmed.  The middle-end array work will address this in a generic way.
Comment 2 Jerry DeLisle 2010-09-12 16:14:38 UTC
Is this something for the FE to do?
Comment 3 Steven Bosscher 2010-09-12 17:14:48 UTC
This is not a job for the FE.
Comment 4 Dominique d'Humieres 2010-09-12 19:56:36 UTC
> This is not a job for the FE.

How could the middle-end do the job if __gfortran_sum_r8 is not inlined/scalarized (see pr43829)?
Comment 5 Steven Bosscher 2010-09-12 21:24:01 UTC
OK, I thought you meant that this would be something for a separate Fortran front end optimization pass.  Expanding SUM differently is a job for the FE, yes.
Comment 6 Jakub Jelinek 2010-09-13 10:18:32 UTC
I believe just gfc_conv_intrinsic_arith needs to be adjusted so that it also handles se->ss case, at least for optimize && !optimize_size.  Currently it just handles the case where those intrinsics return a scalar.
Comment 7 Mikael Morin 2010-09-13 17:14:37 UTC
(In reply to comment #4)
> (see pr43829)
> 

I think it is a duplicate of (or close to) pr43829. 
Marked as depending on it so that I don't forget it. 
Comment 8 Jakub Jelinek 2010-09-13 18:50:30 UTC
So, are you going to take care of this?
Comment 9 Mikael Morin 2010-09-13 21:35:52 UTC
(In reply to comment #8)
> So, are you going to take care of this?
> 

Sure.
Comment 10 Mikael Morin 2012-03-04 19:13:53 UTC
(In reply to comment #7)
> (In reply to comment #4)
> > (see pr43829)
> > 
> 
> I think it is a duplicate of (or close to) pr43829. 
> Marked as depending on it so that I don't forget it. 

This is fixed for the 4.7.0 version.
Closing.

*** This bug has been marked as a duplicate of bug 43829 ***