This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
On Tue, Feb 28, 2006 at 12:09:09AM +0100, Paul Thomas wrote:
> Jakub,
>
> >Is GCC 4.2 libgfortran.so expected to be ABI incompatible with GCC 4.1
> >compiled fortran code?
> >If not, then you shouldn't be removing any exported functions from
> >libgfortran.
> >
>
> That's a good question - I believe the compatibility has already been
> compromised but I am not sure.  In many ways, I would be happier not
> removing the existing library functions; at least for the time being.

IMHO 4.1 is still an experimental release, so just remove dot_product
from trunk/libgfortran. Not from 4.1.1 though. Perhaps for 4.2 we're
ready for a "real" release following the usual regression-fixes-only
etc. stuff, and at that point we should be more careful about API/ABI
breaking. Perhaps at that point the time is ripe to introduce symbol
versioning too for 4.2.

As for your patch itself, it's ok for trunk and 4.1 once it opens,
with a small nitpick fix: Remove the

    f->value.function.name =
      gfc_get_string (PREFIX("dot_product_%c%d"),
                      gfc_type_letter (f->ts.type), f->ts.kind);

thing from gfc_resolve_dot_product, since it's not needed anymore and
lest somebody gets confused.

I did some benchmarking as well, and it turns out that with the
correct compile options, performance for large arrays is not that much
worse than ddot from GOTO BLAS.

"default" options:

phi:~/src/gfortran/test/pr26025-blas-dot-matmul% make gdef
gfortran -O2 -o gdef dot-bench.f90 -lgoto -lpthread
phi:~/src/gfortran/test/pr26025-blas-dot-matmul% ./gdef
 DOT_PRODUCT test, results in Gflop/s
 array length    BLAS    DOT_PROD    inline
            4    0.15        0.57      0.80
            8    0.42        0.94      0.94
           16    0.49        0.62      0.62
           32    0.78        0.73      0.73
           64    1.21        0.80      0.80
          128    1.64        0.84      0.84
          256    1.91        0.86      0.86
          512    2.18        0.87      0.87
         1024    2.35        0.87      0.87

x87 vs. sse2 doesn't really make any difference on K8:

phi:~/src/gfortran/test/pr26025-blas-dot-matmul% gfortran -O3 -ffast-math -funroll-loops -march=k8 -std=f95 -Wall dot-bench.f90 -lgoto -lpthread
phi:~/src/gfortran/test/pr26025-blas-dot-matmul% ./a.out
 DOT_PRODUCT test, results in Gflop/s
 array length    BLAS    DOT_PROD    inline
            4    0.18        0.90      0.83
            8    0.42        1.03      1.02
           16    0.49        1.13      1.12
           32    0.77        1.08      1.07
           64    1.20        0.97      0.96
          128    1.63        0.84      0.83
          256    1.91        0.86      0.85
          512    2.18        0.87      0.87
         1024    2.35        0.87      0.87

phi:~/src/gfortran/test/pr26025-blas-dot-matmul% make gopt
gfortran -O3 -ffast-math -funroll-loops -mfpmath=sse -msse2 -march=k8 -std=f95 -Wall -o gopt dot-bench.f90 -lgoto -lpthread
phi:~/src/gfortran/test/pr26025-blas-dot-matmul% ./gopt
 DOT_PRODUCT test, results in Gflop/s
 array length    BLAS    DOT_PROD    inline
            4    0.19        0.81      0.80
            8    0.41        1.01      0.97
           16    0.49        1.11      1.12
           32    0.77        1.07      1.04
           64    1.20        0.96      0.95
          128    1.63        0.84      0.83
          256    1.91        0.86      0.86
          512    2.19        0.87      0.87
         1024    2.34        0.88      0.87

And finally, with vectorization, performance for small arrays is
reduced, while for bigger ones it goes much faster:

phi:~/src/gfortran/test/pr26025-blas-dot-matmul% make gvect
gfortran -O3 -ffast-math -funroll-loops -mfpmath=sse -msse2 -march=k8 -ftree-vectorize -std=f95 -Wall -o gvect dot-bench.f90 -lgoto -lpthread
phi:~/src/gfortran/test/pr26025-blas-dot-matmul% ./gvect
 DOT_PRODUCT test, results in Gflop/s
 array length    BLAS    DOT_PROD    inline
            4    0.14        0.41      0.49
            8    0.40        0.89      0.86
           16    0.86        1.25      1.20
           32    1.30        1.46      1.41
           64    1.72        1.59      1.56
          128    2.05        1.66      1.65
          256    2.18        1.63      1.64
          512    2.34        1.69      1.69
         1024    2.43        1.72      1.72

-- 
Janne Blomqvist
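
The message reports Gflop/s figures but does not show dot-bench.f90 or say how the rate is derived. A minimal sketch follows, assuming the common convention of counting 2n floating-point operations (n multiplies, n adds) for a length-n dot product. It is written in C rather than Fortran, and every name in it (dot, dot_flops, gflops, bench_dot) is hypothetical, not taken from the original benchmark.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Plain loop dot product: a C analogue of the "inline" column. */
static double dot(const double *x, const double *y, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i] * y[i];
    return sum;
}

/* Assumed flop count: n multiplies plus n adds per dot product. */
static double dot_flops(size_t n)
{
    return 2.0 * (double)n;
}

/* Convert "reps calls took `seconds`" into Gflop/s. */
static double gflops(size_t n, long reps, double seconds)
{
    assert(seconds > 0.0);
    return dot_flops(n) * (double)reps / seconds / 1e9;
}

/* Hypothetical timing harness: runs dot() reps times using CPU time.
   The volatile sink keeps the results observable, but an aggressive
   optimizer may still hoist the call out of the loop, so a real
   benchmark needs more care than this sketch takes. */
static double bench_dot(const double *x, const double *y, size_t n, long reps)
{
    volatile double sink = 0.0;
    clock_t t0 = clock();
    for (long r = 0; r < reps; r++)
        sink += dot(x, y, n);
    clock_t t1 = clock();
    (void)sink;

    double seconds = (double)(t1 - t0) / CLOCKS_PER_SEC;
    /* Return 0 when the run was too short for the clock to resolve. */
    return seconds > 0.0 ? gflops(n, reps, seconds) : 0.0;
}
```

Under this convention, one million calls on length-1024 vectors completing in one second would correspond to 2.048 Gflop/s, which is the same order as the largest-array rows in the tables. For short vectors the repetition count has to be large enough that the elapsed time is well above the clock resolution.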