This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
RE: [PATCH, ping 1] Move insns without introducing new temporaries in loop2_invariant
- From: "Thomas Preud'homme" <thomas dot preudhomme at arm dot com>
- To: "'Steven Bosscher'" <stevenb dot gcc at gmail dot com>
- Cc: "GCC Patches" <gcc-patches at gcc dot gnu dot org>, "Eric Botcazou" <ebotcazou at libertysurf dot fr>
- Date: Wed, 6 May 2015 17:47:38 +0800
- Subject: RE: [PATCH, ping 1] Move insns without introducing new temporaries in loop2_invariant
- Authentication-results: sourceware.org; auth=none
Ping?
Best regards,
Thomas
> -----Original Message-----
> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches-
> owner@gcc.gnu.org] On Behalf Of Thomas Preud'homme
> Sent: Monday, March 16, 2015 8:39 PM
> To: 'Steven Bosscher'
> Cc: GCC Patches; Eric Botcazou
> Subject: RE: [PATCH, stage1] Move insns without introducing new
> temporaries in loop2_invariant
>
> > From: Steven Bosscher [mailto:stevenb.gcc@gmail.com]
> > Sent: Monday, March 09, 2015 7:48 PM
> > To: Thomas Preud'homme
> > Cc: GCC Patches; Eric Botcazou
> > Subject: Re: [PATCH, stage1] Move insns without introducing new
> > temporaries in loop2_invariant
>
> New patch below.
>
> >
> > It looks like this would run for all candidate loop invariants, right?
> >
> > If so, you're creating run time of O(n_invariants*n_bbs_in_loop), a
> > potential compile time hog for large loops.
> >
> > But why compute this at all? Perhaps I'm missing something, but you
> > already have inv->always_executed available, no?
>
> Indeed. I didn't realize the information was already there.
>
> >
> >
> > > + basic_block use_bb;
> > > +
> > > + ref = DF_REF_INSN (use);
> > > + use_bb = BLOCK_FOR_INSN (ref);
> >
> > You can use DF_REF_BB.
>
> Since I need use_insn here I kept BLOCK_FOR_INSN but I used
> DF_REF_BB for the def below.
>
>
> So here are the new ChangeLog entries:
>
> *** gcc/ChangeLog ***
>
> 2015-03-11 Thomas Preud'homme <thomas.preudhomme@arm.com>
>
> * loop-invariant.c (can_move_invariant_reg): New.
> (move_invariant_reg): Call above new function to decide whether
> instruction can just be moved, skipping creation of temporary
> register.
>
> *** gcc/testsuite/ChangeLog ***
>
> 2015-03-12 Thomas Preud'homme <thomas.preudhomme@arm.com>
>
> * gcc.dg/loop-8.c: New test.
> * gcc.dg/loop-9.c: New test.
>
>
> diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
> index f79b497..8217d62 100644
> --- a/gcc/loop-invariant.c
> +++ b/gcc/loop-invariant.c
> @@ -1512,6 +1512,79 @@ replace_uses (struct invariant *inv, rtx reg,
> bool in_group)
> return 1;
> }
>
> And the new patch:
>
> +/* Whether invariant INV setting REG can be moved out of LOOP, at the
> end of
> + the block preceding its header. */
> +
> +static bool
> +can_move_invariant_reg (struct loop *loop, struct invariant *inv, rtx
> reg)
> +{
> + df_ref def, use;
> + unsigned int dest_regno, defs_in_loop_count = 0;
> + rtx_insn *insn = inv->insn;
> + basic_block bb = BLOCK_FOR_INSN (inv->insn);
> +
> + /* We ignore hard register and memory access for cost and complexity
> reasons.
> + Hard register are few at this stage and expensive to consider as they
> + require building a separate data flow. Memory access would require
> using
> + df_simulate_* and can_move_insns_across functions and is more
> complex. */
> + if (!REG_P (reg) || HARD_REGISTER_P (reg))
> + return false;
> +
> + /* Check whether the set is always executed. We could omit this
> condition if
> + we know that the register is unused outside of the loop, but it does
> not
> + seem worth finding out. */
> + if (!inv->always_executed)
> + return false;
> +
> + /* Check that all uses reached by the def in insn would still be reached
> + it. */
> + dest_regno = REGNO (reg);
> + for (use = DF_REG_USE_CHAIN (dest_regno); use; use =
> DF_REF_NEXT_REG (use))
> + {
> + rtx_insn *use_insn;
> + basic_block use_bb;
> +
> + use_insn = DF_REF_INSN (use);
> + use_bb = BLOCK_FOR_INSN (use_insn);
> +
> + /* Ignore instruction considered for moving. */
> + if (use_insn == insn)
> + continue;
> +
> + /* Don't consider uses outside loop. */
> + if (!flow_bb_inside_loop_p (loop, use_bb))
> + continue;
> +
> + /* Don't move if a use is not dominated by def in insn. */
> + if (use_bb == bb && DF_INSN_LUID (insn) >= DF_INSN_LUID
> (use_insn))
> + return false;
> + if (!dominated_by_p (CDI_DOMINATORS, use_bb, bb))
> + return false;
> + }
> +
> + /* Check for other defs. Any other def in the loop might reach a use
> + currently reached by the def in insn. */
> + for (def = DF_REG_DEF_CHAIN (dest_regno); def; def =
> DF_REF_NEXT_REG (def))
> + {
> + basic_block def_bb = DF_REF_BB (def);
> +
> + /* Defs in exit block cannot reach a use they weren't already. */
> + if (single_succ_p (def_bb))
> + {
> + basic_block def_bb_succ;
> +
> + def_bb_succ = single_succ (def_bb);
> + if (!flow_bb_inside_loop_p (loop, def_bb_succ))
> + continue;
> + }
> +
> + if (++defs_in_loop_count > 1)
> + return false;
> + }
> +
> + return true;
> +}
> +
> /* Move invariant INVNO out of the LOOP. Returns true if this succeeds,
> false
> otherwise. */
>
> @@ -1545,11 +1618,8 @@ move_invariant_reg (struct loop *loop,
> unsigned invno)
> }
> }
>
> - /* Move the set out of the loop. If the set is always executed (we
> could
> - omit this condition if we know that the register is unused
> outside of
> - the loop, but it does not seem worth finding out) and it has no
> uses
> - that would not be dominated by it, we may just move it (TODO).
> - Otherwise we need to create a temporary register. */
> + /* If possible, just move the set out of the loop. Otherwise, we
> + need to create a temporary register. */
> set = single_set (inv->insn);
> reg = dest = SET_DEST (set);
> if (GET_CODE (reg) == SUBREG)
> @@ -1557,19 +1627,25 @@ move_invariant_reg (struct loop *loop,
> unsigned invno)
> if (REG_P (reg))
> regno = REGNO (reg);
>
> - reg = gen_reg_rtx_and_attrs (dest);
> + if (!can_move_invariant_reg (loop, inv, reg))
> + {
> + reg = gen_reg_rtx_and_attrs (dest);
>
> - /* Try replacing the destination by a new pseudoregister. */
> - validate_change (inv->insn, &SET_DEST (set), reg, true);
> + /* Try replacing the destination by a new pseudoregister. */
> + validate_change (inv->insn, &SET_DEST (set), reg, true);
>
> - /* As well as all the dominated uses. */
> - replace_uses (inv, reg, true);
> + /* As well as all the dominated uses. */
> + replace_uses (inv, reg, true);
>
> - /* And validate all the changes. */
> - if (!apply_change_group ())
> - goto fail;
> + /* And validate all the changes. */
> + if (!apply_change_group ())
> + goto fail;
>
> - emit_insn_after (gen_move_insn (dest, reg), inv->insn);
> + emit_insn_after (gen_move_insn (dest, reg), inv->insn);
> + }
> + else if (dump_file)
> + fprintf (dump_file, "Invariant %d moved without introducing a
> new "
> + "temporary register\n", invno);
> reorder_insns (inv->insn, inv->insn, BB_END (preheader));
>
> /* If there is a REG_EQUAL note on the insn we just moved, and the
> diff --git a/gcc/testsuite/gcc.dg/loop-8.c b/gcc/testsuite/gcc.dg/loop-8.c
> new file mode 100644
> index 0000000..592e54c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/loop-8.c
> @@ -0,0 +1,24 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -fdump-rtl-loop2_invariant" } */
> +
> +void
> +f (int *a, int *b)
> +{
> + int i;
> +
> + for (i = 0; i < 100; i++)
> + {
> + int d = 42;
> +
> + a[i] = d;
> + if (i % 2)
> + d = i;
> + b[i] = d;
> + }
> +}
> +
> +/* Load of 42 is moved out of the loop, introducing a new pseudo
> register. */
> +/* { dg-final { scan-rtl-dump-times "Decided" 1 "loop2_invariant" } } */
> +/* { dg-final { scan-rtl-dump-not "without introducing a new temporary
> register" "loop2_invariant" } } */
> +/* { dg-final { cleanup-rtl-dump "loop2_invariant" } } */
> +
> diff --git a/gcc/testsuite/gcc.dg/loop-9.c b/gcc/testsuite/gcc.dg/loop-9.c
> new file mode 100644
> index 0000000..96412ed
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/loop-9.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O1 -fdump-rtl-loop2_invariant" } */
> +
> +void
> +f (double *a)
> +{
> + int i;
> + for (i = 0; i < 100; i++)
> + a[i] = 18.4242;
> +}
> +
> +/* Load of x is moved out of the loop. */
> +/* { dg-final { scan-rtl-dump "Decided" "loop2_invariant" } } */
> +/* { dg-final { scan-rtl-dump "without introducing a new temporary
> register" "loop2_invariant" } } */
> +/* { dg-final { cleanup-rtl-dump "loop2_invariant" } } */
> +
>
> * testsuite was run in QEMU when compiled by an arm-none-eabi GCC
> cross-compiler targeting Cortex-M3 and a bootstrapped x86_64 native
> GCC without any regression.
> * New tests were checked to run correctly on aarch64-unknown-linux-
> gnu GCC cross-compiler
>
> Is this ok for stage1?
>
> Best regards,
>
> Thomas
>
>
>