[patch, libfortran] Speed up cshift for dim > 1
Thomas Koenig
tkoenig@netcologne.de
Wed Jun 14 19:42:00 GMT 2017
Hello world,
the attached patch implements a blocked algorithm for
improving the speed of cshift for dim > 1.
It uses the fact that
integer, dimension (n1,n2,n3) :: a, b
b = cshift(a,shift,3)
is identical, as far as the memory locations are concerned, to
integer, dimension (n1*n2*n3) :: c, d
d = cshift(c, shift*n1*n2, 1)
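To make the idea concrete, here is a minimal C sketch of how such a
blocked shift could look. It is not the code from the patch: the
function names, the restriction to int, and the inner/len/outer
parameters are assumptions made for illustration, and the array is
assumed to be contiguous and column-major. Each block of inner*len
elements is moved with two bulk copies instead of element by element.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the code from the patch.  Circular shift of a
   contiguous column-major int array along one dimension, where
     inner = product of the extents below the shifted dimension,
     len   = extent of the shifted dimension,
     outer = product of the extents above it.
   dst and src must not overlap.  */
void
cshift_blocked (int *dst, const int *src, size_t inner, size_t len,
                size_t outer, long shift)
{
  if (inner == 0 || len == 0)
    return;

  /* Reduce the shift to 0 <= s < len, since CSHIFT is circular.  */
  long s = shift % (long) len;
  if (s < 0)
    s += (long) len;

  size_t head = (size_t) s * inner;  /* elements that wrap to the end */
  size_t tail = inner * len - head;  /* elements that move to the front */

  for (size_t o = 0; o < outer; o++)
    {
      const int *sp = src + o * inner * len;
      int *dp = dst + o * inner * len;
      memcpy (dp, sp + head, tail * sizeof (int));
      memcpy (dp + tail, sp, head * sizeof (int));
    }
}

/* Element-by-element reference with the same semantics, for comparison.  */
void
cshift_naive (int *dst, const int *src, size_t inner, size_t len,
              size_t outer, long shift)
{
  if (inner == 0 || len == 0)
    return;

  long s = shift % (long) len;
  if (s < 0)
    s += (long) len;

  for (size_t o = 0; o < outer; o++)
    for (size_t j = 0; j < len; j++)
      for (size_t i = 0; i < inner; i++)
        dst[i + inner * (j + len * o)]
          = src[i + inner * ((j + (size_t) s) % len + len * o)];
}
```

For an array of shape (2,3,4), cshift_blocked (b, a, 6, 4, 1, shift)
corresponds to dim = 3 and cshift_blocked (b, a, 2, 3, 4, shift) to
dim = 2. The dim = 3 call gives the same result as the flat call
cshift_blocked (b, a, 1, 24, 1, shift*6), which is the equivalence
stated above.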
The speedup is quite large: in the benchmark below, built-in
cshift for dim = 3 drops from about 31 s to about 2.2 s, and
dim > 1 is now even faster than dim = 1.
Below are some comparisons for the attached benchmark,
do-1.f90. gfortran-7 uses the old library version, gfortran
the patched one. Interestingly, the patched library version is
also much faster than an implementation with straight DO loops
(about 2.2 s versus 5.4 s for dim > 1).
Regression-tested. OK for trunk?
Regards
Thomas
$ gfortran-7 -static-libgfortran -O3 do-1.f90 && ./a.out
Testing explicit DO loops
Dim = 1 Elapsed CPU time = 5.71363878
Dim = 2 Elapsed CPU time = 5.40494061
Dim = 3 Elapsed CPU time = 5.40769291
Testing built-in cshift
Dim = 1 Elapsed CPU time = 3.43479729
Dim = 2 Elapsed CPU time = 11.7110386
Dim = 3 Elapsed CPU time = 31.0966301
$ gfortran -static-libgfortran -O3 do-1.f90 && ./a.out
Testing explicit DO loops
Dim = 1 Elapsed CPU time = 5.73881340
Dim = 2 Elapsed CPU time = 5.38435745
Dim = 3 Elapsed CPU time = 5.38971329
Testing built-in cshift
Dim = 1 Elapsed CPU time = 3.42018127
Dim = 2 Elapsed CPU time = 2.24075317
Dim = 3 Elapsed CPU time = 2.23136330
2017-06-14 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/52473
* m4/cshift0.m4: For arrays that are contiguous up to
shift, implement blocked algorithm for cshift.
* generated/cshift0_c10.c: Regenerated.
* generated/cshift0_c16.c: Regenerated.
* generated/cshift0_c4.c: Regenerated.
* generated/cshift0_c8.c: Regenerated.
* generated/cshift0_i1.c: Regenerated.
* generated/cshift0_i16.c: Regenerated.
* generated/cshift0_i2.c: Regenerated.
* generated/cshift0_i4.c: Regenerated.
* generated/cshift0_i8.c: Regenerated.
* generated/cshift0_r10.c: Regenerated.
* generated/cshift0_r16.c: Regenerated.
* generated/cshift0_r4.c: Regenerated.
* generated/cshift0_r8.c: Regenerated.
2017-06-14 Thomas Koenig <tkoenig@gcc.gnu.org>
PR fortran/52473
* gfortran.dg/cshift_1.f90: New test.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: p6.diff
Type: text/x-patch
Size: 48036 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20170614/a3f07b86/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cshift_1.f90
Type: text/x-fortran
Size: 3148 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20170614/a3f07b86/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: do-1.f90
Type: text/x-fortran
Size: 2681 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20170614/a3f07b86/attachment-0002.bin>