This is the mail archive of the fortran@gcc.gnu.org mailing list for the GNU Fortran project.



Re: patches for increased performance of matmul, dotprod, transpose


Hi Tim,

Tim Prince wrote:
> I ran the complete build and test on i686-pc-cygwin gcc 4.0.2, also 
> tested with 4.1.0 and on ia64- and x86-64 linux.  As the other work 
> mentioned on this list no doubt will have to be completed, followed by 
> determining whether any of these are still relevant, I expect any 
> changelog entries would be obsolete before these could be considered.  I 
> haven't heard of any checks on status of my paperwork for several years.
> Additional unrolling is intended only to be enough to approach full 
> pipeline on the simpler x86 CPUs, which can issue more than 1 floating 
> point multiply instruction within the latency of addition.  This 
> reasoning doesn't apply to integer add.  For matmul, memory accesses 
> should be cut nearly by 2.   Problems which are large with respect to 
> cache size aren't addressed, except that the boundary of "large" is 
> pushed up somewhat.  Likewise, with transpose taken in an order more 
> favorable to machines with write combine buffering.

If the manual loop unrolling is really helpful, I think an optimizer bug
should be filed.  On IRC our optimizer guys told me that at least the
modification to matmul should already be done automatically, so I'm
wondering whether you have any benchmark numbers supporting this
modification.  Can't the same effect be obtained by building libgfortran
with -funroll-loops?
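
To make the question concrete, here is a minimal sketch of the kind of
transformation under discussion.  It is not taken from the patch: the
function names dot_simple and dot_unrolled are hypothetical, and the
unroll factor of two is only an assumption for illustration.  The
unrolled version keeps two independent accumulators so that a second
multiply can be issued while the first addition is still in flight.

```c
#include <stddef.h>

/* Straightforward dot product: one accumulator, so every addition
   depends on the previous one.  */
double
dot_simple (const double *a, const double *b, size_t n)
{
  double sum = 0.0;
  size_t i;

  for (i = 0; i < n; i++)
    sum += a[i] * b[i];
  return sum;
}

/* Hypothetical hand-unrolled variant: two independent accumulators,
   so the two multiply-add chains can overlap in the pipeline.  */
double
dot_unrolled (const double *a, const double *b, size_t n)
{
  double sum0 = 0.0, sum1 = 0.0;
  size_t i;

  for (i = 0; i + 1 < n; i += 2)
    {
      sum0 += a[i] * b[i];
      sum1 += a[i + 1] * b[i + 1];
    }
  /* Pick up the last element when N is odd.  */
  if (i < n)
    sum0 += a[i] * b[i];
  return sum0 + sum1;
}
```

Note that the second version adds the products in a different order, so
for floating-point data the result can differ from the simple loop in
the last bits.  Timing both variants at a few array sizes would give the
kind of benchmark numbers asked for above.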

A procedural point: your patch doesn't conform to the GNU coding style
(comments should begin with a capital letter and end in "punctuation + two
spaces + */", always put blanks before and after operators, indent comments
to match the code).  Also, if you find you really need to hand-optimize
code, please add a comment explaining why, adding "FIXME" and a PR number if
you think you're working around an optimizer bug.
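
As an illustration of those style points, a sketch along these lines is
what I have in mind.  The function scale and its contents are made up
for this example and do not come from libgfortran, and "PR NNNNN" is a
placeholder for whatever report would actually be filed.

```c
#include <stddef.h>

/* Scale the N elements of X by FACTOR.  */
void
scale (double *x, size_t n, double factor)
{
  size_t i;

  /* FIXME: Unrolled by hand on the assumption that the optimizer
     does not do it here; see PR NNNNN (placeholder number).  */
  for (i = 0; i + 1 < n; i += 2)
    {
      x[i] = x[i] * factor;
      x[i + 1] = x[i + 1] * factor;
    }
  /* Handle the last element when N is odd.  */
  if (i < n)
    x[i] = x[i] * factor;
}
```

Each comment starts with a capital letter, ends with a period followed
by two spaces before the closing "*/", and is indented to the code it
describes; operators have a blank on either side.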

WRT your mail:  paragraphs make reading and also responding to emails much
easier.  ChangeLogs are also useful for referencing parts when discussing the
patch.

- Tobi

