This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.


[Bug fortran/68600] Inlined MATMUL is too slow.


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=68600

--- Comment #8 from Jerry DeLisle <jvdelisle at gcc dot gnu.org> ---
Created attachment 36887
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=36887&action=edit
A faster version

I took the example code found at
http://wiki.cs.utexas.edu/rvdg/HowToOptimizeGemm/, where the register-based
vector computations are written explicitly against the SSE registers, and
converted it to use the built-in GCC vector extensions.  I had to experiment a
little to find equivalents for some of the operations in the original code.
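
For readers unfamiliar with the vector extensions, here is a minimal sketch of
the general idea. It is not the code from the attachment. The type name v2df,
the function name kernel_2x2, and the 2x2 blocking are assumptions chosen for
illustration; matrices are assumed column-major as in the HowToOptimizeGemm
examples. It accumulates a 2x2 block of C using plain arithmetic operators on
vector types instead of SSE intrinsics.

```c
#include <assert.h>
#include <string.h>

/* Two doubles packed in one 16-byte vector (GCC vector extension). */
typedef double v2df __attribute__((vector_size(16)));

/* Hypothetical micro-kernel: C(0:1,0:1) += A(0:1,0:k-1) * B(0:k-1,0:1).
   All matrices are column-major with leading dimensions lda, ldb, ldc. */
static void kernel_2x2(int k, const double *a, int lda,
                       const double *b, int ldb,
                       double *c, int ldc)
{
    assert(k >= 0);

    v2df c0 = {0.0, 0.0};   /* accumulates column 0 of the C block */
    v2df c1 = {0.0, 0.0};   /* accumulates column 1 of the C block */

    for (int p = 0; p < k; p++) {
        v2df av;
        /* Load A(0,p) and A(1,p); memcpy avoids any alignment assumption. */
        memcpy(&av, &a[p * lda], sizeof av);

        /* Broadcast B(p,0) and B(p,1) into both lanes. */
        v2df b0 = {b[p], b[p]};
        v2df b1 = {b[p + ldb], b[p + ldb]};

        /* Element-wise multiply and add, written with ordinary operators. */
        c0 += av * b0;
        c1 += av * b1;
    }

    /* Store the accumulated block back, lane by lane. */
    c[0]       += c0[0];
    c[1]       += c0[1];
    c[ldc]     += c1[0];
    c[ldc + 1] += c1[1];
}
```

The compiler is free to map v2df operations onto whatever vector registers the
target provides, which is why -march=native matters for the results.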

With only -O2 and -march=native I am getting good results.  I still need to
roll this into the other test program to confirm that the Gflops are being
computed correctly.  The Diff value compares against the reference naive
results to check that the computation is correct.
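
As a rough sketch of what the two reported columns mean, assuming the usual
convention of 2*m*n*k floating-point operations per matrix multiply and a
maximum absolute element-wise difference for Diff. The function names
gflops_rate and max_abs_diff are hypothetical and are not taken from the test
program.

```c
/* Gflops for an m x k by k x n multiply that took `seconds` of wall time.
   Assumes 2*m*n*k operations (one multiply and one add per inner step). */
static double gflops_rate(int m, int n, int k, double seconds)
{
    return (2.0 * m * n * k) / (seconds * 1.0e9);
}

/* Largest absolute element-wise difference between two m x n column-major
   matrices x and y with leading dimensions ldx and ldy. */
static double max_abs_diff(int m, int n,
                           const double *x, int ldx,
                           const double *y, int ldy)
{
    double diff = 0.0;
    for (int j = 0; j < n; j++) {
        for (int i = 0; i < m; i++) {
            double d = x[i + j * ldx] - y[i + j * ldy];
            if (d < 0.0)
                d = -d;
            if (d > diff)
                diff = d;
        }
    }
    return diff;
}
```

Under that convention, a Diff that stays close to machine precision for double
and grows slowly with the matrix size is consistent with rounding differences
from a changed summation order rather than a wrong result.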

MY_MMult = [
Size: 40, Gflops: 1.828571e+00, Diff: 2.664535e-15 
Size: 80, Gflops: 3.696751e+00, Diff: 7.105427e-15 
Size: 120, Gflops: 4.051583e+00, Diff: 1.065814e-14 
Size: 160, Gflops: 4.015686e+00, Diff: 1.421085e-14 
Size: 200, Gflops: 4.029212e+00, Diff: 2.131628e-14 
Size: 240, Gflops: 3.972414e+00, Diff: 2.486900e-14 
Size: 280, Gflops: 3.881188e+00, Diff: 2.842171e-14 
Size: 320, Gflops: 3.872371e+00, Diff: 3.552714e-14 
Size: 360, Gflops: 3.887676e+00, Diff: 4.973799e-14 
Size: 400, Gflops: 3.862052e+00, Diff: 4.973799e-14 
Size: 440, Gflops: 3.886575e+00, Diff: 4.973799e-14 
Size: 480, Gflops: 3.910124e+00, Diff: 6.039613e-14 
Size: 520, Gflops: 3.863706e+00, Diff: 6.394885e-14 
Size: 560, Gflops: 3.976947e+00, Diff: 6.750156e-14 
Size: 600, Gflops: 4.002631e+00, Diff: 7.460699e-14 
Size: 640, Gflops: 3.992507e+00, Diff: 8.171241e-14 
Size: 680, Gflops: 3.964570e+00, Diff: 9.237056e-14 
Size: 720, Gflops: 3.973661e+00, Diff: 1.101341e-13 
Size: 760, Gflops: 3.982346e+00, Diff: 1.065814e-13 
Size: 800, Gflops: 3.869291e+00, Diff: 8.881784e-14 
Size: 840, Gflops: 3.936271e+00, Diff: 1.065814e-13 
Size: 880, Gflops: 3.931259e+00, Diff: 1.030287e-13 
Size: 920, Gflops: 3.912907e+00, Diff: 1.207923e-13 
Size: 960, Gflops: 3.938391e+00, Diff: 1.278977e-13 
Size: 1000, Gflops: 3.945754e+00, Diff: 1.421085e-13
