This is the mail archive of the
`gcc-patches@gcc.gnu.org`
mailing list for the GCC project.


- *From*: roger at eyesopen dot com
- *To*: gcc-patches at gcc dot gnu dot org, fortran at gcc dot gnu dot org
- *Date*: Mon, 18 Dec 2006 10:36:24 -0700 (MST)
- *Subject*: [fortran PATCH] Implement a(:,:) = 0.0 using memset

The following patch makes use of the recently added gfc_full_array_ref_p function to provide the optimization of using memset when assigning an entire array to zero. For the source code below:

    integer :: a(20)
    a(:) = 0

we currently generate the following with -fdump-tree-original:

    int8 S.0;

    S.0 = 1;
    while (1)
      {
        if (S.0 > 20) goto L.1; else (void) 0;
        (*a)[NON_LVALUE_EXPR <S.0> + -1] = 0;
        S.0 = S.0 + 1;
      }
    L.1:;

With the patch below, we now generate this instead:

    (void) __builtin_memset ((void *) a, 0, 80);

This can then take advantage of GCC's intrinsic expansion machinery, including Jan's recent improvements for x86.

I'm keen to hear if there are any corner cases that I've overlooked and that aren't covered by the gfortran testsuite. Perhaps someone could run NIST, polyhedron and the usual suspects to confirm there are no issues.

Once this is in the tree, and there are no major issues, there are some obvious extensions and improvements that can be made as follow-up patches:

[1] Avoid using memset for small array sizes, such that the tree-ssa optimizers would unroll the loop and reveal the assignments via SRA.

[2] Allow reverse order initialization, such as a(20:1:-1) = 0.

[3] Extend the infrastructure to support sequentially consecutive assignments that don't cover the entire array, such as a(20:40) = 0.0.

[4] Extend the infrastructure for arbitrary (run-time) length expressions, such as a(1:n) = 0.0.

[5] Generalize this optimization to use memcpy (or memmove?) for array assignments, a(:) = b(:).

Whilst it's true that we could hope for the tree-ssa optimizers to start recognizing memset and memcpy idioms as an optimization pass, it makes sense that the f90 array lowering/scalarizing machinery recognize these easy cases itself. I'd like to hope this idiom is common enough that this change improves some common benchmarks, but I've not done actual timings myself.
The following patch has been tested on x86_64-unknown-linux-gnu with a full "make bootstrap", including gfortran, and regression tested with a top-level "make -k check" with no new failures.

Ok for mainline?

    2006-12-18  Roger Sayle  <roger@eyesopen.com>

        * trans-expr.c (is_zero_initializer_p): Determine whether a given
        constant expression is a zero initializer.
        (gfc_trans_zero_assign): New function to attempt to optimize
        "a(:) = 0.0" as a call to __builtin_memset (a, 0, sizeof(a));
        (gfc_trans_assignment): Special case array assignments to
        zero initializer constants, using gfc_trans_zero_assign.

        * gfortran.dg/array_memset_1.f90: New test case.

Roger
--

**Attachment: patche.txt**

**Attachment: array_memset_1.f90**

**Follow-Ups**:

- **Re: [fortran PATCH] Implement a(:,:) = 0.0 using memset** *From:* Tobias Schlüter
- **Re: [fortran PATCH] Implement a(:,:) = 0.0 using memset** *From:* Andrew Pinski
- **Re: [fortran PATCH] Implement a(:,:) = 0.0 using memset** *From:* Tim Prince
- **Re: [fortran PATCH] Implement a(:,:) = 0.0 using memset** *From:* Steve Kargl
