This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH: PR/40314] extract common address for memory access to fields of large struct


On Sun, May 31, 2009 at 11:12 AM, Carrot Wei <carrot@google.com> wrote:
> ChangeLog:
> 2009-05-31 ?Carrot Wei ?<carrot@google.com>
>
> ? ? ? ?PR/40314
> ? ? ? ?* config/arm/arm.c: add functions arm_address_offset_range,
> ? ? ? ?thumb1_address_offset_range, thumb2_address_offset_range,
> ? ? ? ?arm_address_offset_range_1.
> ? ? ? ?* config/arm/arm-protos.h: add prototype of arm_address_offset_range.
> ? ? ? ?* doc/tm.texi: add entry for TARGET_ADDRESS_OFFSET_RANGE.
> ? ? ? ?* Makefile.in: add rule for extract_common_addr.
> ? ? ? ?* target.h: add a hook function get_address_offset_range.
> ? ? ? ?* target-def.h: add a hook function TARGET_ADDRESS_OFFSET_RANGE.
> ? ? ? ?* tree-pass.h: add a pass_extract_common_addr.
> ? ? ? ?* passes.c: add a pass_extract_common_addr.
> ? ? ? ?* extract-common-addr.c: implementation of the new
> ? ? ? ?pass_extract_common_addr.
>
> Test:
> This patch was applied to trunk GCC and tested on the arm emulator with newlib.
> No new failures.

And what does it do for code size?  Compile time?  Do you know CSiBE?
It's a nice benchmark to check those two things, too.


And then some random comments:

First, it would be nice to have some general description at the top of
the file of how the algorithm works.

It looks to me like you're collecting base+offset per basic block (so
the optimization is local) and then process the recorded information
globally.  Why not record-and-process one basic block at a time?

Perhaps you can add some comments in process_one_base in the part with
the fuzzy math?  ;-)  I got all confused by how you calculate the new
base+offset and alignment.


> +/* Given a register rtx, look up the offset_table. If we found it and they
> + ? have same machine mode returns the pointer to the offset information.
> + ? Otherwise returns NULL. ?*/
> +
> +static inline struct offset_info *
> +find_reg (rtx x)

A more descriptive function name, perhaps?  Everyone else also calls
their functions find_reg already... :-)


> +/* If x is MEM, and the address is in the form of
> + ? (plus base_reg offset_reg)
> + ? remove the corresponding item from offset table and
> + ? insert it into a base_reg_info object. */

Describe the return value of the function too, please.  Just saying
it's called via for_each_rtx is enough.
And again, a more descriptive function name, perhaps?  Explain what is
passed in DATA.


> +static int
> +find_mem (rtx x, void *data)
> +{
(...)
> + ?/* If the offset is not large enough, ignore it. ?*/
> + ?targetm.get_address_offset_range (GET_MODE (x), reg1,
> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?&min_offset, &max_offset, &alignment);
> + ?if (offset_item->offset >= min_offset
> + ? ? ?&& offset_item->offset <= max_offset)
> + ? ?return -1;
> +
> + ?/* If the base register and offset register don't have the same
> + ? ? machine mode, ignore this offset. ?*/
> + ?if (GET_MODE (reg1) != offset_item->mode)
> + ? ?return -1;

Move this second check before the targetm.get_address_offset_range check?


> + ?/* Look up the corresponding base register information.
> + ? ? If there is no one, create it. ?*/
> + ?base_reg_item = &base_table[REGNO (reg1)];
> + ?if (base_reg_item->offsets == NULL)
> + ? ?{
> + ? ? ?base_reg_item->mode = GET_MODE (reg1);
> + ? ? ?base_reg_item->base_regno = REGNO (reg1);
> + ? ? ?base_reg_item->base_reg = reg1;
> + ? ?}
> +
> + ?/* Associate this offset information with the base information. ?*/
> + ?offset_item->mem = x;
> + ?offset_item->mem_insn = (rtx) data;
> + ?offset_item->next = base_reg_item->offsets;
> + ?base_reg_item->offsets = offset_item;

Why collect them in a list here when you later need the offsets in an array?


> +/* If a reg is clobbered in insn, remove it from offset table. ?*/
> +
> +static void
> +clobber_insn_reg (rtx insn)
> +{

Can write this and clobber_reg by walking over DF_INSN_DEFS looking
for refs with DF_REF_MUST_CLOBBER, and with DF_REF_REAL_REG.


> +/* If this insn is in the form of
> + ? (set reg offset)
> + ? insert this offset info into offset table. ?*/
> +
> +static void
> +find_offset (rtx insn)
> +{
> + ?rtx src, dest;
> + ?unsigned int regno;
> + ?HOST_WIDE_INT offset_val;
> + ?struct offset_info *offset;
> + ?rtx x = single_set (insn);
> + ?if (!x)
> + ? ?return;
> +
> + ?dest = SET_DEST (x);
> + ?if (!REG_P (dest))
> + ? ?return;
> + ?regno = REGNO (dest);
> + ?if (regno < FIRST_PSEUDO_REGISTER)
> + ? ?return;

if (HARD_REGISTER_NUM_P (regno))
  return;

Same with other FIRST_PSEUDO_REGISTER.  You should use
HARD_REGISTER_NUM_P and HARD_REGISTER_P instead of comparing against
FIRST_PSEUDO_REGISTER.


> + ?src = SET_SRC (x);
> + ?if (GET_CODE (src) != CONST_INT)
> + ? ?return;
> + ?offset_val = INTVAL (src);
> + ?if (offset_val <= 0)
> + ? ?return;
> +
> + ?offset = (struct offset_info *) pool_alloc (offset_element_pool);
> + ?offset->mem = NULL;
> + ?offset->offset_regno = regno;
> + ?offset->mode = GET_MODE (dest);
> + ?offset->offset = offset_val;
> + ?offset->next = NULL;
> + ?offset_table[regno] = offset;

Maybe pool_free the existing offset_table[regno] entry?  Or are there
still base_reg_items that can point to that entry?  Or are you sure
that offset_table[regno] is always NULL at this point?  If the latter,
maybe gcc_assert that?


> +/* Look for optimization opportunities. ?*/
> +
> +static void
> +collect_candidates (void)
> +{
> + ?basic_block block;
> +
> + ?FOR_EACH_BB (block)
> + ? ?{
> + ? ? ?rtx insn;
> + ? ? ?memset (offset_table, 0, sizeof (struct offset_info *) * max_reg_num ());

You may want to experiment with something to track the registers for
which you have recorded something in the offset_table.  I think the
number of offsets you find is generally very small compared to
max_reg_num().  It may speed things up a bit if you keep a bitmap of
registers that you recorded something for and only clean those
entries, instead of using memset.


> +}
> +
> +/* Optimize a list of memory addressing relative to one base. ?*/
> +static void
> +process_one_base (struct base_reg_info *base_reg)
> +{

Describe the function arguments please, always.

> + ?HOST_WIDE_INT max_offset, min_offset;
> + ?HOST_WIDE_INT base_offset, max_base;
> + ?int alignment, new_alignment;
> +
> + ?struct offset_info **offset_array;
> + ?struct offset_info *offset = base_reg->offsets;
> + ?int count = 0;
> + ?int i, group_count;
> +
> + ?/* Count the number of memory accesses using the same base register. */
> + ?while (offset)
> + ? ?{
> + ? ? ?count++;
> + ? ? ?offset = offset->next;
> + ? ?}
> + ?if (count < 2)
> + ? ?return;
> +
> + ?/* Sort the memory accesses according to their offset. ?*/
> + ?offset_array = XNEWVEC (struct offset_info *, count);
> + ?offset = base_reg->offsets;
> + ?for (i = 0; i < count; i++)
> + ? ?{
> + ? ? ?offset_array[i] = offset;
> + ? ? ?offset = offset->next;
> + ? ?}
> + ?qsort (offset_array, count, sizeof(struct offset_info *), compare_offset);

So you have recorded the offsets in the list base_reg->offsets.  But
you want to know the number of offsets of a base reg and you want to
sort the list.  So you walk the list to count and the stuff the list
into an array to sort it.

Why not record the offsets into an array straight away, then?  Looks
like you could better use a VEC instead, e.g.:

typedef struct offset_info *offset_info_t
DEF_VEC_P(offset_info_t);
DEF_VEC_ALLOC_P(offset_info_t, heap);

/* Memory base address information.  */
struct base_reg_info
{
  /* All offset information used with this base register.  */
  VEC(offset_info_t, heap) *offsets;
(...)
};

and use VEC_safe_push to record the offset_info entries.  You can use
VEC_length to get the number of memory accesses using the same base
register.  You can qsort a VEC in place, grep for "qsort (VEC_address"
for examples.


> +static unsigned int
> +rest_of_handle_extract_common_addr (void)
> +{
> + ?int i;
> + ?int reg_count = max_reg_num();
> + ?offset_table = XNEWVEC (struct offset_info *, reg_count);
> + ?base_table = XNEWVEC (struct base_reg_info, reg_count);
> + ?memset (offset_table, 0, sizeof (struct offset_info *) * reg_count);
> + ?memset (base_table, 0, sizeof (struct base_reg_info) * reg_count);

XCNEWVEC instead of XNEWVEC+memset.


> +struct rtl_opt_pass pass_extract_common_addr =
> +{
> + {
> + ?RTL_PASS,
> + ?"extract_common_addr", ? ? ? ? ? ? ? ?/* name */
> + ?gate_handle_common_addr, ? ? ? ? ? ? ?/* gate */
> + ?rest_of_handle_extract_common_addr, ? /* execute */
> + ?NULL, ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? /* sub */
> + ?NULL, ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? /* next */
> + ?0, ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?/* static_pass_number */
> + ?TV_NONE, ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?/* tv_id */

New pass needs a new timevar.


Ciao!
Steven


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]