Fix PR tree-optimization/41857 (Re: m32c support for named addr spaces branch)

Richard Guenther richard.guenther@gmail.com
Fri Oct 30 09:51:00 GMT 2009


On Thu, Oct 29, 2009 at 7:29 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
>> On Tue, Oct 27, 2009 at 7:22 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
>> > I now seem to be running into a related problem with __ea on SPU, in a
>> > loop where the loop optimizer decides to use a DImode IV to implement
>> > a 64-bit __ea pointer.
>> >
>> > This causes create_mem_ref to be called with an affine combination
>> > that contains only a single value, of 64-bit integral type.
>> >
>> > Because this is not a pointer type, addr_to_parts interprets this
>> > value as "index", not "base".  (I'm not quite sure I understand
>> > how this is intended to work anyway: it seems to me that loop IVs
>> > use (nearly?) always integral types.  Why does move_pointer_to_base
>> > check for pointer type then?)
>
> I've now opened PR 41857 for this particular problem.
>
> The core problem in this case is that pointers may be converted to
> sizetype by loop IV optimizations.  This is particularly aggravated
> by the fact that create_mem_ref has a hard time determining which
> of the variables in the affine expression it gets as input
> represents the base of the reference.
>
> The heuristics used to determine this looks for variables of pointer
> type.  However, due to problems related to undefined overflow, the
> IV optimizations have switched to always using integer types for
> induction variables, even those that represent memory addresses.
> This makes the create_mem_ref heuristics mostly useless in many
> cases ...
>
> However, rewrite_use_address (the call site of create_mem_ref) does
> actually have the information; it knows whether the IV candidate
> is based on a memory object.  The information is simply not passed
> to create_mem_ref.  The patch below changes this by adding a new
> BASE_HINT argument to create_mem_ref, which is set to the IV that
> is to be used as base of the reference, if any.
>
> create_mem_ref and its subroutines then use this variable (casted
> to the appropriate pointer type) as base.
>
> As a side effect, the patch also fixes a FIXME in
> most_expensive_mult_to_index.
>
> While not a complete fix to the general sizetype problem, this
> patch fixes the ICE-on-valid bug on the testcase in the PR, and
> it seems a good idea to properly identify the base anyway; this
> is needed (or at least beneficial) on some platforms.
>
> Tested on s390x-ibm-linux and spu-elf.
> OK for mainline?

This looks good to me, though I'd like Zdenek to have a 2nd look.

Thanks,
Richard.

> Bye,
> Ulrich
>
>
> ChangeLog:
>
> gcc/
>        PR tree-optimization/41857
>        * tree-flow.h (rewrite_use_address): Add BASE_HINT argument.
>        * tree-ssa-loop-ivopts.c (rewrite_use_address): Pass base hint
>        to create_mem_ref.
>        * tree-ssa-address.c (move_hint_to_base): New function.
>        (most_expensive_mult_to_index): Add TYPE argument.  Use mode and
>        address space associated with TYPE.
>        (addr_to_parts): Add TYPE and BASE_HINT arguments.  Pass TYPE to
>        most_expensive_mult_to_index.  Call move_hint_to_base.
>        (create_mem_ref): Add BASE_HINT argument.  Pass BASE_HINT and
>        TYPE to addr_to_parts.
>
> gcc/testsuite/
>        PR tree-optimization/41857
>        * gcc.target/spu/ea/pr41857.c: New file.
>
>
> Index: gcc/tree-ssa-loop-ivopts.c
> ===================================================================
> *** gcc/tree-ssa-loop-ivopts.c  (revision 153683)
> --- gcc/tree-ssa-loop-ivopts.c  (working copy)
> *************** rewrite_use_address (struct ivopts_data
> *** 5510,5515 ****
> --- 5510,5516 ----
>  {
>    aff_tree aff;
>    gimple_stmt_iterator bsi = gsi_for_stmt (use->stmt);
> +   tree base_hint = NULL_TREE;
>    tree ref;
>    bool ok;
>
> *************** rewrite_use_address (struct ivopts_data
> *** 5517,5523 ****
>    gcc_assert (ok);
>    unshare_aff_combination (&aff);
>
> !   ref = create_mem_ref (&bsi, TREE_TYPE (*use->op_p), &aff, data->speed);
>    copy_ref_info (ref, *use->op_p);
>    *use->op_p = ref;
>  }
> --- 5518,5539 ----
>    gcc_assert (ok);
>    unshare_aff_combination (&aff);
>
> !   /* To avoid undefined overflow problems, all IV candidates use unsigned
> !      integer types.  The drawback is that this makes it impossible for
> !      create_mem_ref to distinguish an IV that is based on a memory object
> !      from one that represents simply an offset.
> !
> !      To work around this problem, we pass a hint to create_mem_ref that
> !      indicates which variable (if any) in aff is an IV based on a memory
> !      object.  Note that we only consider the candidate.  If this is not
> !      based on an object, the base of the reference is in some subexpression
> !      of the use -- but these will use pointer types, so they are recognized
> !      by the create_mem_ref heuristics anyway.  */
> !   if (cand->iv->base_object)
> !     base_hint = var_at_stmt (data->current_loop, cand, use->stmt);
> !
> !   ref = create_mem_ref (&bsi, TREE_TYPE (*use->op_p), &aff, base_hint,
> !                       data->speed);
>    copy_ref_info (ref, *use->op_p);
>    *use->op_p = ref;
>  }
> Index: gcc/tree-ssa-address.c
> ===================================================================
> *** gcc/tree-ssa-address.c      (revision 153683)
> --- gcc/tree-ssa-address.c      (working copy)
> *************** move_fixed_address_to_symbol (struct mem
> *** 392,397 ****
> --- 392,424 ----
>    aff_combination_remove_elt (addr, i);
>  }
>
> + /* If ADDR contains an instance of BASE_HINT, move it to PARTS->base.  */
> +
> + static void
> + move_hint_to_base (tree type, struct mem_address *parts, tree base_hint,
> +                  aff_tree *addr)
> + {
> +   unsigned i;
> +   tree val = NULL_TREE;
> +
> +   for (i = 0; i < addr->n; i++)
> +     {
> +       if (!double_int_one_p (addr->elts[i].coef))
> +       continue;
> +
> +       val = addr->elts[i].val;
> +       if (operand_equal_p (val, base_hint, 0))
> +       break;
> +     }
> +
> +   if (i == addr->n)
> +     return;
> +
> +   /* Cast value to appropriate pointer type.  */
> +   parts->base = fold_convert (build_pointer_type (type), val);
> +   aff_combination_remove_elt (addr, i);
> + }
> +
>  /* If ADDR contains an address of a dereferenced pointer, move it to
>     PARTS->base.  */
>
> *************** add_to_parts (struct mem_address *parts,
> *** 453,461 ****
>     element(s) to PARTS.  */
>
>  static void
> ! most_expensive_mult_to_index (struct mem_address *parts, aff_tree *addr,
> !                             bool speed)
>  {
>    HOST_WIDE_INT coef;
>    double_int best_mult, amult, amult_neg;
>    unsigned best_mult_cost = 0, acost;
> --- 480,490 ----
>     element(s) to PARTS.  */
>
>  static void
> ! most_expensive_mult_to_index (tree type, struct mem_address *parts,
> !                             aff_tree *addr, bool speed)
>  {
> +   addr_space_t as = TYPE_ADDR_SPACE (type);
> +   enum machine_mode address_mode = targetm.addr_space.address_mode (as);
>    HOST_WIDE_INT coef;
>    double_int best_mult, amult, amult_neg;
>    unsigned best_mult_cost = 0, acost;
> *************** most_expensive_mult_to_index (struct mem
> *** 469,483 ****
>        if (!double_int_fits_in_shwi_p (addr->elts[i].coef))
>        continue;
>
> -       /* FIXME: Should use the correct memory mode rather than Pmode.  */
> -
>        coef = double_int_to_shwi (addr->elts[i].coef);
>        if (coef == 1
> !         || !multiplier_allowed_in_address_p (coef, Pmode,
> !                                              ADDR_SPACE_GENERIC))
>        continue;
>
> !       acost = multiply_by_cost (coef, Pmode, speed);
>
>        if (acost > best_mult_cost)
>        {
> --- 498,509 ----
>        if (!double_int_fits_in_shwi_p (addr->elts[i].coef))
>        continue;
>
>        coef = double_int_to_shwi (addr->elts[i].coef);
>        if (coef == 1
> !         || !multiplier_allowed_in_address_p (coef, TYPE_MODE (type), as))
>        continue;
>
> !       acost = multiply_by_cost (coef, address_mode, speed);
>
>        if (acost > best_mult_cost)
>        {
> *************** most_expensive_mult_to_index (struct mem
> *** 520,527 ****
>    parts->step = double_int_to_tree (sizetype, best_mult);
>  }
>
> ! /* Splits address ADDR into PARTS.
> !
>     TODO -- be more clever about the distribution of the elements of ADDR
>     to PARTS.  Some architectures do not support anything but single
>     register in address, possibly with a small integer offset; while
> --- 546,555 ----
>    parts->step = double_int_to_tree (sizetype, best_mult);
>  }
>
> ! /* Splits address ADDR for a memory access of type TYPE into PARTS.
> !    If BASE_HINT is non-NULL, it specifies an SSA name to be used
> !    preferentially as base of the reference.
> !
>     TODO -- be more clever about the distribution of the elements of ADDR
>     to PARTS.  Some architectures do not support anything but single
>     register in address, possibly with a small integer offset; while
> *************** most_expensive_mult_to_index (struct mem
> *** 530,536 ****
>     addressing modes is useless.  */
>
>  static void
> ! addr_to_parts (aff_tree *addr, struct mem_address *parts, bool speed)
>  {
>    tree part;
>    unsigned i;
> --- 558,565 ----
>     addressing modes is useless.  */
>
>  static void
> ! addr_to_parts (tree type, aff_tree *addr, tree base_hint,
> !              struct mem_address *parts, bool speed)
>  {
>    tree part;
>    unsigned i;
> *************** addr_to_parts (aff_tree *addr, struct me
> *** 550,561 ****
>
>    /* First move the most expensive feasible multiplication
>       to index.  */
> !   most_expensive_mult_to_index (parts, addr, speed);
>
>    /* Try to find a base of the reference.  Since at the moment
>       there is no reliable way how to distinguish between pointer and its
>       offset, this is just a guess.  */
> !   if (!parts->symbol)
>      move_pointer_to_base (parts, addr);
>
>    /* Then try to process the remaining elements.  */
> --- 579,592 ----
>
>    /* First move the most expensive feasible multiplication
>       to index.  */
> !   most_expensive_mult_to_index (type, parts, addr, speed);
>
>    /* Try to find a base of the reference.  Since at the moment
>       there is no reliable way how to distinguish between pointer and its
>       offset, this is just a guess.  */
> !   if (!parts->symbol && base_hint)
> !     move_hint_to_base (type, parts, base_hint, addr);
> !   if (!parts->symbol && !parts->base)
>      move_pointer_to_base (parts, addr);
>
>    /* Then try to process the remaining elements.  */
> *************** gimplify_mem_ref_parts (gimple_stmt_iter
> *** 592,604 ****
>
>  tree
>  create_mem_ref (gimple_stmt_iterator *gsi, tree type, aff_tree *addr,
> !               bool speed)
>  {
>    tree mem_ref, tmp;
>    tree atype;
>    struct mem_address parts;
>
> !   addr_to_parts (addr, &parts, speed);
>    gimplify_mem_ref_parts (gsi, &parts);
>    mem_ref = create_mem_ref_raw (type, &parts);
>    if (mem_ref)
> --- 623,635 ----
>
>  tree
>  create_mem_ref (gimple_stmt_iterator *gsi, tree type, aff_tree *addr,
> !               tree base_hint, bool speed)
>  {
>    tree mem_ref, tmp;
>    tree atype;
>    struct mem_address parts;
>
> !   addr_to_parts (type, addr, base_hint, &parts, speed);
>    gimplify_mem_ref_parts (gsi, &parts);
>    mem_ref = create_mem_ref_raw (type, &parts);
>    if (mem_ref)
> Index: gcc/tree-flow.h
> ===================================================================
> *** gcc/tree-flow.h     (revision 153683)
> --- gcc/tree-flow.h     (working copy)
> *************** struct mem_address
> *** 921,927 ****
>
>  struct affine_tree_combination;
>  tree create_mem_ref (gimple_stmt_iterator *, tree,
> !                    struct affine_tree_combination *, bool);
>  rtx addr_for_mem_ref (struct mem_address *, addr_space_t, bool);
>  void get_address_description (tree, struct mem_address *);
>  tree maybe_fold_tmr (tree);
> --- 921,927 ----
>
>  struct affine_tree_combination;
>  tree create_mem_ref (gimple_stmt_iterator *, tree,
> !                    struct affine_tree_combination *, tree, bool);
>  rtx addr_for_mem_ref (struct mem_address *, addr_space_t, bool);
>  void get_address_description (tree, struct mem_address *);
>  tree maybe_fold_tmr (tree);
> *** /dev/null   2009-10-07 18:17:03.469375344 +0200
> --- gcc/testsuite/gcc.target/spu/ea/pr41857.c   2009-10-29 14:11:18.000000000 +0100
> ***************
> *** 0 ****
> --- 1,29 ----
> + /* Copyright (C) 2009 Free Software Foundation, Inc.
> +
> +    This file is free software; you can redistribute it and/or modify it under
> +    the terms of the GNU General Public License as published by the Free
> +    Software Foundation; either version 3 of the License, or (at your option)
> +    any later version.
> +
> +    This file is distributed in the hope that it will be useful, but WITHOUT
> +    ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +    FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +    for more details.
> +
> +    You should have received a copy of the GNU General Public License
> +    along with this file; see the file COPYING3.  If not see
> +    <http://www.gnu.org/licenses/>.  */
> +
> + /* { dg-do compile } */
> +
> + __ea char *strchr_ea (__ea const char *s, int c);
> + __ea char *foo (__ea char *s)
> + {
> +   __ea char *ret = s;
> +   int i;
> +
> +   for (i = 0; i < 3; i++)
> +     ret = strchr_ea (ret, s[i]);
> +
> +   return ret;
> + }
>
>
> --
>  Dr. Ulrich Weigand
>  GNU Toolchain for Linux on System z and Cell BE
>  Ulrich.Weigand@de.ibm.com
>



More information about the Gcc-patches mailing list