This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [Patch, fortran] PR24518 and PR24520 - Improvements to MOD and DOT_PRODUCT


Janne,

> BTW, does BLAS do anything fancy for dot product that might help for
> big vectors? I mean, is it worth thinking about inlining only for
> small vectors? What happens when the vectors won't fit into cache?
> I'm not saying this as a criticism of your patch, just idle
> wondering...


I'll check to see what happens with big vectors. Of BLAS, I can only claim ignorance.

> For some further idle speculation, for big vectors would it pay off to
> insert prefetch hints for archs that support it?


How do I do that? Does the backend not do it for me?
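
For reference, a minimal C sketch of what a hand-written prefetch hint looks like with GCC's `__builtin_prefetch`. The function name `dot_prefetch` and the distance of 64 elements are illustrative assumptions, not tuned values and not anything from the patch. GCC can also try to insert such hints itself when given `-fprefetch-loop-arrays`.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: how many elements ahead to prefetch; untuned, illustrative. */
#define PREFETCH_DISTANCE 64

/* Hypothetical helper: dot product with explicit prefetch hints. */
double dot_prefetch(const double *x, const double *y, size_t n)
{
    assert(n == 0 || (x != NULL && y != NULL));

    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        /* Only hint at addresses that are still inside the arrays. */
        if (i + PREFETCH_DISTANCE < n) {
            /* Second argument 0 = read access, third 0 = no temporal locality. */
            __builtin_prefetch(&x[i + PREFETCH_DISTANCE], 0, 0);
            __builtin_prefetch(&y[i + PREFETCH_DISTANCE], 0, 0);
        }
        sum += x[i] * y[i];
    }
    return sum;
}
```

Whether this helps at all depends on the target and on the hardware prefetcher, so it would need measuring on vectors that do not fit into cache.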

> Seems like pathscale is even more demanding. I had to go to O1 to get
> it to execute the loops. Same for dot_demo.f90.


It goes with the performance, I think.

> I'm not entirely comfortable with this. This might give the user a
> false sense of security. At least a huge value is obviously wrong.


I was wondering about this. How about NaN instead?

Another way to circumvent this would be to use the builtin fmod in the case where precision is falling by the wayside.

> WOW!!! That's great!

I cannot recall if I remarked anywhere that the code this DOT_PRODUCT produces is the same as that produced by z(:) = SUM (x(:) * y(:)). This is similarly nifty.

Best regards

Paul

