This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: marxin <mliska at suse dot cz>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Jan Hubicka <hubicka at ucw dot cz>, Dominique Dhumieres <dominiq at lps dot ens dot fr>, williamclodius at gmail dot com
- Date: Thu, 7 Jul 2016 16:00:18 +0200
- Subject: Re: [PATCH 0/2, fortran] Better code generation for DO loops with +-1 step
- References: <cover.1467883947.git.mliska@suse.cz>
On Thu, Jul 7, 2016 at 11:32 AM, marxin <mliska@suse.cz> wrote:
> Hello.
>
> As discussed in [1], I would like to change code emission from:
>
> D.3428 = (*array)[0];
> D.3429 = (*array)[1];
> i = D.3428;
> if (i <= D.3429)
>   {
>     while (1)
>       {
>         {
>           logical(kind=4) D.3432;
>
>           (*block)[(integer(kind=8)) i + -1] = (*block)[(integer(kind=8)) i + -1] + 10;
>           L.1:;
>           D.3432 = i == D.3429;
>           i = i + 1;
>           if (D.3432) goto L.2;
>         }
>       }
>   }
> L.2:;
>
> to:
>
> D.3428 = (*array)[0];
> D.3429 = (*array)[1];
> i = D.3428;
> while (1)
>   {
>     {
>       logical(kind=4) D.3432;
>
>       D.3432 = i > D.3429;
>       if (D.3432) goto L.2;
>       (*block)[(integer(kind=8)) i + -1] = (*block)[(integer(kind=8)) i + -1] + 10;
>       L.1:;
>       i = i + 1;
>     }
>   }
> L.2:;
>
> The following changes quite significantly improve the exchange_2 benchmark (part of CPUv6),
> which runs 6% faster. The patchset consists of 2 patches:
>
> a) Add PRED_FORTRAN_LOOP_PREHEADER to DO loops with a step other than +-1.
>
> 1) I converted the predict-[12].f90 tests to use a step other than 1.
> 2) I noticed that the generic DO loop code emission lacks the PRED_FORTRAN_LOOP_PREHEADER expect, so I added it.
>
> b) Optimize Fortran loops with +-1 step.
>
> 1) We generate a C-style loop.
> 2) A new warning, -Wundefined-do-loop, is added.
> 3) A couple of tests that hit the undefined behavior are removed.
> 4) New tests that cover the undefined behavior are introduced.
Why is the behavior undefined only for step 1 when the IV increment of
the last iteration overflows?
Doesn't this apply to all step values?
Richard.
> The patchset bootstraps and survives regression tests on x86_64-linux-gnu, and
> I've been running the CPU2006 benchmarks to check for possible speed-ups or regressions.
>
> Martin
>
> [1] https://gcc.gnu.org/ml/fortran/2016-06/msg00122.html
>
> gcc/fortran/lang.opt | 4 +
> gcc/fortran/resolve.c | 23 +++++
> gcc/fortran/trans-stmt.c | 123 ++++++++++++++-------------
> gcc/testsuite/gfortran.dg/do_1.f90 | 6 --
> gcc/testsuite/gfortran.dg/do_3.F90 | 2 -
> gcc/testsuite/gfortran.dg/do_check_11.f90 | 12 +++
> gcc/testsuite/gfortran.dg/do_check_12.f90 | 12 +++
> gcc/testsuite/gfortran.dg/do_corner_warn.f90 | 22 +++++
> gcc/testsuite/gfortran.dg/ldist-1.f90 | 2 +-
> gcc/testsuite/gfortran.dg/pr48636.f90 | 2 +-
> gcc/testsuite/gfortran.dg/predict-1.f90 | 9 +-
> gcc/testsuite/gfortran.dg/predict-2.f90 | 6 +-
> 12 files changed, 150 insertions(+), 73 deletions(-)
> create mode 100644 gcc/testsuite/gfortran.dg/do_check_11.f90
> create mode 100644 gcc/testsuite/gfortran.dg/do_check_12.f90
> create mode 100644 gcc/testsuite/gfortran.dg/do_corner_warn.f90
>
> --
> 2.8.4
>
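
For readers following the overflow question, here is a minimal C sketch of the two loop shapes quoted above for step +1. It is not taken from the patch. The names trips_old_shape, trips_new_shape, wrap_inc and the cap parameter are hypothetical, and the explicit wraparound in wrap_inc is only an assumed model of what an overflowing increment might do; the real behavior is undefined.

```c
#include <limits.h>

/* Hypothetical helper: increment with explicit wraparound, so that the
   sketch itself never executes a signed overflow (undefined in C). */
static int wrap_inc(int i)
{
    return i == INT_MAX ? INT_MIN : i + 1;
}

/* Old shape: one guard before the loop; the exit test "i == end" is
   evaluated before the increment.  Returns the trip count, or -1 if
   more than cap iterations would be needed. */
static long trips_old_shape(int start, int end, long cap)
{
    long trips = 0;
    int i = start;
    if (i <= end) {
        for (;;) {
            if (trips == cap)
                return -1;
            trips++;                /* the loop body would run here */
            int last = (i == end);
            i = wrap_inc(i);
            if (last)
                break;
        }
    }
    return trips;
}

/* New shape: C-style loop with the exit test "i > end" at the top. */
static long trips_new_shape(int start, int end, long cap)
{
    long trips = 0;
    int i = start;
    for (;;) {
        if (i > end)
            break;
        if (trips == cap)
            return -1;
        trips++;                    /* the loop body would run here */
        i = wrap_inc(i);
    }
    return trips;
}
```

With ordinary bounds such as 1..10 both functions report the same trip count. With an upper bound of INT_MAX the old shape still stops after the last iteration, because its exit test runs before the increment, while under the assumed wraparound model the new shape never sees i > end and hits the cap.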