This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: Repost: RFA [4.1]: improvement to if-conversion and cross-jumping (PR20070)


When I mentioned on the mailing list that I was going to help review the CFO branch since we (Analog Devices) are interested in code size optimizations, people on IRC pointed me at this patch as a potentially better starting point, so I'll try to review it first. Here's my first pass of comments. More to come, probably, once I've played with the patch some more and understand it better.

>       * ifcvt.c (noce_try_complex_cmove): New function.
>       (noce_process_if_block): Call it.
>       For flag_expensive_optimizations, update cond_exec_changed_p on
>       success.
>       (rest_of_handle_if_conversion): For flag_expensive_optimizations,
>       provide if_convert with register lifeness info.

Can we postpone the ifcvt changes? They look quite straightforward, but it seems like they are independent of the rest. Once we've done the heavy lifting to get the first part in, we can revisit this.

One general comment: You're matching entire pieces of RTL. Wouldn't it be possible/simpler to match insn_codes, extract the operands, and then just look at those? I have in mind something that looks a bit like parts of regrename.

I was wondering about the use of the find_reg_equal_equiv_note - lots of code seems to use that function, but do we really want to mess with REG_EQUIV notes?

Like Stephen B., I'd like to see this moved to a separate file; then you can also lose the struct_equiv_ prefix from most function names and make them more meaningful.

> +/* In cfgcleanup.c */

Needs updating when moving the code.

> +/* A struct equiv_info is used to pass information to struct_equiv and
> +   to gather state while two basic blocks are checked for structural
> +   equivalence.

Like Stephen, I'd like to see the rest of this comment distributed so that each structure member has its own comment in front of it. See also below - I'd like to see a struct_equiv_checkpoint used inside this monster struct to keep track of current state. Further improvements might be to split off those members which track state, and others which are just inputs set by the callers of the struct_equiv pass ("struct struct_equiv_params"?). Such a split could either result in two structures, or just two clearly separated areas inside the same structure. There should be some kind of logical grouping instead of just a collection of members.

> + X_BLOCK and Y_BLOCK are the two block being compared.

"blocks"

> +   NEED_FULL_BLOCK is set when the entire basic blocks have to match, as
> +   in if-conversion.  It is cleared for cross-jumping.

Might be better as an arg to struct_equiv_block_eq or part of a params struct, since it's only needed in the toplevel function. Anything to get this struct here down in size...

> +  bool local_rvalue[STRUCT_EQUIV_MAX_LOCAL];
> +  rtx x_local[STRUCT_EQUIV_MAX_LOCAL], y_local[STRUCT_EQUIV_MAX_LOCAL];

Might want to turn this into an array of structs. Then we could also make STRUCT_EQUIV_MAX_LOCAL a runtime param more easily.

> +static int
> +assign_reg_reg_set (regset set, rtx reg, int value)
> +{
> + unsigned regno = REGNO (reg);
> + int nregs, i, old;
> +
> + if (regno >= FIRST_PSEUDO_REGISTER)
> + {
> + if (reload_completed)
> + regno = reg_renumber[regno];
> + else
> + {
> + if ((value != 0) == REGNO_REG_SET_P (set, regno))
> + return 0;
> + if (value)
> + SET_REGNO_REG_SET (set, regno);
> + else
> + CLEAR_REGNO_REG_SET (set, regno);
> + return value ? 1 : -1;
> + }
> + }
> + old = 0;
> + for (i = nregs = hard_regno_nregs[regno][GET_MODE (reg)]; --i >= 0; regno++)
> + {
> + old += REGNO_REG_SET_P (set, regno);
> + if (value)
> + SET_REGNO_REG_SET (set, regno);
> + else
> + CLEAR_REGNO_REG_SET (set, regno);
> + }
> + return value ? nregs - old : -old;
> +}


Looks slightly obfuscated to me with the special case for pseudos. Please combine the two loops into one by setting a NREGS variable to 1 for pseudos.

> +   RVALUE == 0: destination
> +   RVALUE == 1: source
> +   RVALUE == -1: source, ignore SET_DEST of SET / clobber.  */

Better an enum.

> + it already registerd. */

"registered"

> + change *= IMPOSSIBLE_MOVE_FACTOR;

Just an assignment would be enough? Wouldn't like this to overflow, and I don't think we need different grades of impossible.

> +/* Do only the struct_equiv SET_DEST processing for SETs and CLOBBERs. */
> +static bool
> +struct_equiv_set (rtx x, rtx y, struct equiv_info *info)


Not too happy with some of these function names. "set_dest_equiv_p"?
Should mention return value in comment. The rest of the description could also be enlarged to describe the necessary details of the algorithm to understand why we need this function.


> +/* Record state about current inputs / local registers / lifeness
> +   in CHECKPOINT[N] of INFO.  */
> +static void
> +struct_equiv_make_checkpoint (int n, struct equiv_info *info)
> +{
> +  struct struct_equiv_checkpoint *p = &info->checkpoint[n];
> +
> +  p->ninsns = info->ninsns;
> +  p->version = info->version;
> +  p->local_count = info->local_count;
> +  p->input_count = info->input_count;
> +  p->x_start = info->x_start;
> +  p->y_start = info->y_start;
> +  p->input_valid = info->input_valid;
> +}

Seems like you could use a checkpoint structure inside struct equiv_info and do a struct copy here. That would give struct equiv_info a bit more
structure.


> +/* Restore state about current inputs / local registers / lifeness
> +   from CHECKPOINT[N] of INFO.  */
> +static void
> +struct_equiv_restore_checkpoint (int n, struct equiv_info *info)
> +{
> +  struct struct_equiv_checkpoint *p = &info->checkpoint[n];
> +
> +  info->ninsns = p->ninsns;
> +  info->x_start = p->x_start;
> +  info->y_start = p->y_start;
> +  info->input_count = p->input_count;
> +  info->input_valid = p->input_valid;
> +  while (info->local_count > p->local_count)
> +    {
> +      info->local_count--;
> +      info->version--;
> +      if (REGNO_REG_SET_P (info->x_local_live,
> +                        REGNO (info->x_local[info->local_count])))
> +     {
> +       assign_reg_reg_set (info->x_local_live,
> +                           info->x_local[info->local_count], 0);
> +       assign_reg_reg_set (info->y_local_live,
> +                           info->y_local[info->local_count], 0);
> +       info->version--;
> +     }
> +    }

I'm not sure I understand what's going on here. This goes back and gets reg numbers out of x_local and y_local, indexed by local_count. For this to work, these arrays must always (except for in this function) grow and never shrink - correct?
But there's code after the comment
/* This might be a register local to each block. See if we have
it already registerd. */
which seems to shrink the arrays and overwrite entries. How does this work together?


> + if (info->version != p->version)
> + info->need_rerun = true;
> +}
> +
> +/* Check if the set of REG_DEAD notes attached to I1 and I2 allows us to
> + go ahead with merging I1 and I2, which otherwise look fine.
> + Inputs / local registers for the inputs of I1 and I2 have already been
> + set up. */
> +static bool
> +struct_equiv_death (rtx i1 ATTRIBUTE_UNUSED, rtx i2 ATTRIBUTE_UNUSED,
> + struct equiv_info *info ATTRIBUTE_UNUSED)


"death_notes_match_p"?

> +{
> +#ifdef STACK_REGS
> + /* If cross_jump_death_matters is not 0, the insn's mode
> + indicates whether or not the insn contains any stack-like regs. */
> +
> + if (INSN_P (i1)
> + && (info->mode & CLEANUP_POST_REGSTACK) && stack_regs_mentioned (i1))


Is this bit always set after regstack, or do we need an extra test such as reload_completed?

> +/* Return number of matched insns. */

Matched how? I'd like a human-readable overview of the algorithm somewhere.

> +/* This function must be called up to three times for a successful cross-jump
> + match:
> + first to find out which instructions do match. While trying to match
> + another instruction that doesn't match, we destroy information in info
> + about the actual inputs. So if there have been any before the last
> + match attempt, we need to callthis function again to recompute the


"call this"

> +   actual inputs up to the actual start of the matching sequence.
> +   When we are then satisfied that the cross-jump is worth while, we

"worthwhile"

> + if (info->x_start != x_stop) for (;;)

Formatting.

> + /* The following code helps take care of G++ cleanups. */

Comment not helpful. How does it help? What's special about G++ cleanups?


Bernd



Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]