This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Re: PATCH to hoist loads/stores out of loops

To: toon at moene dot indiv dot nluug dot nl
Subject: Re: PATCH to hoist loads/stores out of loops
From: Mark Mitchell <mark at markmitchell dot com>
Date: Mon, 20 Jul 1998 17:19:59 -0700
CC: egcs-patches at cygnus dot com
References: <199807200704.AAA31187@smtp.earthlink.net> <9807202003.AA01538@moene.indiv.nluug.nl>
Reply-to: mark at markmitchell dot com


You wrote:

    [ This is beyond my wildest dreams of the possible outcomes of
      the egcs project - having a C++ guru solving Fortran
      optimisation shortcomings :-) :-) ]

Well, I did take a scientific computation course using Fortran90 on a
CM-2 at one point.  You asked:

    Could you also pull the same trick for the following code:

	   subroutine gemm(a, b, c, m, n, k)
	   integer i,m,n,k,l,j
	   dimension a(k,m),  b(n,k),  c(n,m)
	   do i=1,m     ! poor for illustration only
	     do j=1,n
	       do l=1,k
		 c(j,i) = c(j,i) + a(l,i)*b(j,l)
	       end do
	     end do
	   end do
	   end

Actually, the code I submitted does do this.  What, you say, it didn't
when you tried it?  Oops, I forgot to tell you that I haven't yet
implemented C's `restrict'.  (I'd like to do that; all I need is a
client to pay for it (or some free time).)  Basically, what I'm
saying is that if you could convince GCC that a, b, and c don't alias,
you'd be all set.  Right now, it's afraid that the stores into c(j,i)
might alter a(l,i) or b(j,l).
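
To make the aliasing point concrete, here is a minimal sketch of what
the non-aliasing promise could look like in source form, using the
`restrict` qualifier as it was later standardized in C99.  The name
gemm_restrict and the flat row-major layout are assumptions made for
illustration, not part of the patch, and whether a given compiler then
hoists the load and store of c is up to that compiler.

```c
#include <stddef.h>

/* Hypothetical restrict-qualified variant of gemm (illustration only).
   Layout assumption: a is k x m, b is n x k, c is n x m, each stored
   row-major in a flat array.  The restrict qualifiers are a promise by
   the caller that the three arrays do not overlap. */
void gemm_restrict(const double *restrict a, const double *restrict b,
                   double *restrict c, int m, int n, int k)
{
    for (int i = 0; i < m; ++i)
        for (int j = 0; j < n; ++j)
            for (int l = 0; l < k; ++l)
                /* c[j][i] += a[l][i] * b[j][l] */
                c[(size_t)j * m + i] +=
                    a[(size_t)l * m + i] * b[(size_t)j * k + l];
}
```

No such qualifier is needed when the arrays are distinct global
objects, since the compiler can then see for itself that they do not
overlap.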

For example, the following C variant:

  double a[10][10];
  double b[10][10];
  double c[10][10];

  void gemm(int m, int n, int k)
  {
    int i;
    int j;
    int l;
    for (i = 0; i < m; ++i)
      for (j = 0; j < n; ++j)
	for (l = 0; l < k; ++l)
	  c[j][i] += a[l][i] * b[j][l];
  }

is optimized as you suggest, with the patches I submitted.  The inner
loop looks like:

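	  fldl c(%ebx)         # Load c(j,i) once, before the loop
	  .p2align 4,,7
  .L13:
	  fldl (%ecx)          # Load a(l,i)
	  fmull b(%edx)        # Multiply by b(j,l)
	  addl $80,%ecx        # Advance the a pointer by one row (10 doubles)
	  addl $8,%edx         # Advance the b offset by one double
	  incl %eax            # Increment l
	  faddp %st,%st(1)     # Add the product to the running c(j,i)
	  cmpl 16(%ebp),%eax   # Compare l with k
	  jl .L13              # Loop while l < k
	  fstpl c(%ebx)        # Store c(j,i) once, after the loop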
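
Written back at the source level, the effect on the inner loop is
roughly the following.  This is a hand-written sketch for illustration,
not compiler output; the name gemm_hoisted and the local acc are
hypothetical.

```c
double a[10][10];
double b[10][10];
double c[10][10];

/* Hand-hoisted equivalent of gemm: c[j][i] is read into a local before
   the inner loop and written back after it, so the inner loop itself
   only reads a and b. */
void gemm_hoisted(int m, int n, int k)
{
    int i;
    int j;
    int l;
    for (i = 0; i < m; ++i)
        for (j = 0; j < n; ++j) {
            double acc = c[j][i];          /* load c(j,i) once */
            for (l = 0; l < k; ++l)
                acc += a[l][i] * b[j][l];  /* accumulate in a local */
            c[j][i] = acc;                 /* store c(j,i) once */
        }
}
```

The load and store of c[j][i] now happen once per (i, j) pair instead
of once per inner iteration.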

So, we're one (relatively small) step away.

-- 
Mark Mitchell 			mark@markmitchell.com
Mark Mitchell Consulting	http://www.markmitchell.com
