This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]
Re: [patch] Merge cfo-branch, RTL sequence abstraction (part 1)

From: Ian Lance Taylor <ian at airs dot com>
To: Gabor Loki <loki at gcc dot gnu dot org>
Cc: gcc-patches at gcc dot gnu dot org, bernds_cb1 at t-online dot de
Date: 15 Dec 2005 11:31:43 -0800
Subject: Re: [patch] Merge cfo-branch, RTL sequence abstraction (part 1)
References: <439EADF8.7030405@gcc.gnu.org>
Gabor Loki <loki@gcc.gnu.org> writes:

> Index: gcc/fact-common.h
> ===================================================================
> --- gcc/fact-common.h	(revision 0)
> +++ gcc/fact-common.h	(revision 0)
> @@ -0,0 +1,39 @@
> +/* Local factoring (code hoisting/sinking) on SSA trees.
> +   Copyright (C) 2004 Free Software Foundation, Inc.

This comment seems wrong.  And the copyright date should be 2005.

> +#ifndef GCC_FACT_COMMON
> +#define GCC_FACT_COMMON
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "tm.h"
> +#include "rtl.h"
> +#include "basic-block.h"
> +#include "resource.h"
> +#include "ggc.h"
> +#include "tree.h"
> +#include "tree-flow.h"
> +
> +extern bool is_seqabstr (void);
> +
> +/* Main CFO functions */
> +extern void rtl_seqabstr (void);
> +
> +#endif

There is no reason to include all those header files if all you are
going to do is declare two functions.

It seems to me that the header file should have the same name as the
.c file, unless you are doing this because there is more code coming.

> Index: gcc/rtl-factoring.c
> ===================================================================
> --- gcc/rtl-factoring.c	(revision 0)
> +++ gcc/rtl-factoring.c	(revision 0)
> @@ -0,0 +1,1075 @@
> +/* RTL factoring (code hoisting/sinking and sequence abstraction).
> +   Copyright (C) 2004 Free Software Foundation, Inc.

Copyright 2005 (well, 2006 soon).

> +/* TODO:
> +   Sequence abstraction:
> +   - Use length attribute of insns to calculate gain. (Count insns only if
> +     length is not available.)
> +   - Use REG_ALLOC_ORDER when choosing link register.
> +   - Handle JUMP_INSNs. Also handle volatile function calls (handle them
> +     simmilar to unconditional jumps.)
> +   - Test command line option -fpic.
> +*/

You need a lengthy comment here describing what this file does, and
what algorithm it uses.

> +/* Predicate yielding nonzero iff X is a call insn.  */
> +#define CALL_P(X) (GET_CODE (X) == CALL_INSN)

This is defined in rtl.h; don't define it again.

> +
> +#ifndef REGNO_MODE_OK_FOR_BASE_P
> +/* Using REGNO_OK_FOR_BASE_P if target machine does not define this macro.  */
> +#define REGNO_MODE_OK_FOR_BASE_P(REGNO, MODE) REGNO_OK_FOR_BASE_P (REGNO)
> +#endif

This is defined in defaults.h; don't define it again.

> +/* Cost of a call sequence for seq.abstr.
> +   FIXME: Should be 2 but on ARM reorg transforms a symbol_ref into a load from
> +   the constant pool so compensate it here.  */
> +#define CALL_COST 3

I think this needs to be machine dependent.  At the very least there
needs to be a reasonable and documented way for the tm.h file to
override it.

> +/* Cost of a return insn.  */
> +#define RETURN_COST 1

Same here.

> +  /* Try to match every abstractable insn in every block with every other
> +     insn.  */
> +  FOR_EACH_BB (pbb)
> +  {
> +    rtx pinsn;
> +
> +    /* Skip that bb when its first succ. bb has more incoming edge than
> +       PARAM_MAX_CROSSJUMP_EDGES. The same checks as in crossjump.  */
> +    if (EDGE_COUNT (pbb->succs)
> +        && (EDGE_SUCC (pbb, 0))->dest
> +        && EDGE_COUNT ((EDGE_SUCC (pbb, 0))->dest->preds) >
> +        (unsigned long) PARAM_VALUE (PARAM_MAX_CROSSJUMP_EDGES))
> +      continue;
> +
> +    for (pinsn = BB_HEAD (pbb);; pinsn = NEXT_INSN (pinsn))
> +      {
> +        if (ABSTRACTABLE_INSN_P (pinsn)
> +#ifdef STACK_REGS
> +            && !bitmap_bit_p (&stack_reg_live, INSN_UID (pinsn))
> +#endif
> +           )
> +          {
> +
> +            FOR_EACH_BB (mbb)
> +            {
> +              rtx minsn;
> +
> +              for (minsn = BB_END (mbb);; minsn = PREV_INSN (minsn))
> +                {
> +                  if (ABSTRACTABLE_INSN_P (minsn)
> +#ifdef STACK_REGS
> +                      && !bitmap_bit_p (&stack_reg_live, INSN_UID (minsn))
> +#endif
> +                     )
> +                    match_seqs (pinsn, minsn);
> +
> +                  if (minsn == BB_HEAD (mbb))
> +                    break;

This algorithm is N^2.  You are essentially comparing every insn
against every other insn.  There has to be a faster way to do this.
I'll bet there is some literature you can read about this.

Or, off the top of my head, you could stuff all the instructions in a
hash table, and use that to just start matching sequences at identical
instructions.  That way you would only walk the entire instruction
stream once, and would at least restrict the N^2 behaviour to matching
instructions.

By the way, use FOR_BB_INSNS and FOR_BB_INSNS_REVERSE in the above.


> +  prev = NULL_RTX;
> +  for (mseq = pseq->matching_seqs; mseq; mseq = mseq->next_matching_seq)
> +    {
> +      /* Determine ABSTRACTED_LENGTH.  */
> +      x = mseq->insn;
> +      for (i = 0; (i < mseq->matching_length) && (x != prev); i++)
> +        x = prev_insn_in_block (x);
> +      mseq->abstracted_length = i;
> +
> +      /* If ABSTRACTED_LENGTH is big enough registers live in this matching
> +         sequence should not be used as a link register. Also set
> +         ABSTRACTED_LENGTH of PSEQ.  */
> +      if (mseq->abstracted_length > CALL_COST)
> +        {
> +          clear_regs_live_in_seq (&linkregs, mseq->insn,
> +                                  mseq->abstracted_length);
> +          if (mseq->abstracted_length > pseq->abstracted_length)
> +            pseq->abstracted_length = mseq->abstracted_length;
> +
> +          prev = mseq->insn;
> +        }
> +    }
> +
> +  /* Modify ABSTRACTED_LENGTH of PSEQ if pattern sequence overlaps with one
> +     of the matching sequences.  */
> +  for (mseq = pseq->matching_seqs; mseq; mseq = mseq->next_matching_seq)
> +    {
> +      x = pseq->insn;
> +      for (i = 0; (i < pseq->abstracted_length) && (x != mseq->insn); i++)
> +        x = prev_insn_in_block (x);
> +      pseq->abstracted_length = i;
> +    }
> +  /* No gain if ABSTRACTED_LENGTH is too small.  */
> +  if (pseq->abstracted_length <= CALL_COST)
> +    return;

You seem to be measuring cost purely on the basis of number of
instructions.  I think you should use the existing cost routines.
With -Os, that should normally be the same as the number of
instructions.


> +/* Finds equivalent insn sequences in the current function and retains only o> +  /* Build an initial set of pattern sequences from the current function.  */
> +  collect_pattern_seqs ();
> +  dump_pattern_seqs ();
> +
> +  /* Iterate until there are sequences to abstract.  */

I think you mean until there are no sequences to abstract.

> +static void
> +rest_of_rtl_seqabstr (void)
> +{
> +  life_analysis (dump_file, PROP_DEATH_NOTES |
> +                            PROP_SCAN_DEAD_CODE |
> +                            PROP_KILL_DEAD_CODE);
> +
> +  cleanup_cfg (CLEANUP_EXPENSIVE |
> +               CLEANUP_UPDATE_LIFE |
> +               (flag_crossjumping ? CLEANUP_CROSSJUMP : 0));
> +
> +  /* Abstract out common insn sequences. */
> +  rtl_seqabstr ();
> +
> +  /* Update notes.  */
> +  count_or_remove_death_notes (NULL, 1);
> +    
> +  life_analysis (dump_file, PROP_DEATH_NOTES |
> +                            PROP_SCAN_DEAD_CODE |
> +                            PROP_KILL_DEAD_CODE);
> +
> +  /* Extra cleanup.  */
> +  cleanup_cfg (CLEANUP_EXPENSIVE |
> +               CLEANUP_UPDATE_LIFE |
> +               (flag_crossjumping ? CLEANUP_CROSSJUMP : 0));
> +}

It does not make sense to run life_analysis and cleanup_cfg both
before and after rtl_seqabstr.  It's not clear to me why you need to
run them before at all.  And at the very least you should only run
them after if the pass actually changed something.


You also need user level documentation.  I think we can probably come
up with a better option name than -frtl-seqabstr.  How about simply
-fabstract-sequences?

Hope this helps.

Ian
References:
- [patch] Merge cfo-branch, RTL sequence abstraction (part 1)
  - From: Gabor Loki
Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]