Fix PR tree-optimization/41857 (Re: m32c support for named addr spaces branch)
Richard Guenther
richard.guenther@gmail.com
Fri Oct 30 09:51:00 GMT 2009
On Thu, Oct 29, 2009 at 7:29 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
>> On Tue, Oct 27, 2009 at 7:22 PM, Ulrich Weigand <uweigand@de.ibm.com> wrote:
>> > I now seem to be running into a related problem with __ea on SPU, in a
>> > loop where the loop optimizer decides to use a DImode IV to implement
>> > a 64-bit __ea pointer.
>> >
>> > This causes create_mem_ref to be called with an affine combination
>> > that contains only a single value, of 64-bit integral type.
>> >
>> > Because this is not a pointer type, addr_to_parts interprets this
>> > value as "index", not "base". (I'm not quite sure I understand
>> > how this is intended to work anyway: it seems to me that loop IVs
>> > use (nearly?) always integral types. Why does move_pointer_to_base
>> > check for pointer type then?)
>
> I've now opened PR 41857 for this particular problem.
>
> The core problem in this case is that pointers may be converted to
> sizetype by loop IV optimizations. This is particularly aggravated
> by the fact that create_mem_ref has a hard time determining which
> of the variables in the affine expression it gets as input
> represents the base of the reference.
>
> The heuristics used to determine this looks for variables of pointer
> type. However, due to problems related to undefined overflow, the
> IV optimizations have switched to always using integer types for
> induction variables, even those that represent memory addresses.
> This makes the create_mem_ref heuristics mostly useless in many
> cases ...
>
> However, rewrite_use_address (the call site of create_mem_ref) does
> actually have the information; it knows whether the IV candidate
> is based on a memory object. The information is simply not passed
> to create_mem_ref. The patch below changes this by adding a new
> BASE_HINT argument to create_mem_ref, which is set to the IV that
> is to be used as base of the reference, if any.
>
> create_mem_ref and its subroutines then use this variable (casted
> to the appropriate pointer type) as base.
>
> As a side effect, the patch also fixes a FIXME in
> most_expensive_mult_to_index.
>
> While not a complete fix to the general sizetype problem, this
> patch fixes the ICE-on-valid bug on the testcase in the PR, and
> it seems a good idea to properly identify the base anyway; this
> is needed (or at least beneficial) on some platforms.
>
> Tested on s390x-ibm-linux and spu-elf.
> OK for mainline?
This looks good to me, though I'd like Zdenek to have a 2nd look.
Thanks,
Richard.
> Bye,
> Ulrich
>
>
> ChangeLog:
>
> gcc/
> PR tree-optimization/41857
> * tree-flow.h (rewrite_use_address): Add BASE_HINT argument.
> * tree-ssa-loop-ivopts.c (rewrite_use_address): Pass base hint
> to create_mem_ref.
> * tree-ssa-address.c (move_hint_to_base): New function.
> (most_expensive_mult_to_index): Add TYPE argument. Use mode and
> address space associated with TYPE.
> (addr_to_parts): Add TYPE and BASE_HINT arguments. Pass TYPE to
> most_expensive_mult_to_index. Call move_hint_to_base.
> (create_mem_ref): Add BASE_HINT argument. Pass BASE_HINT and
> TYPE to addr_to_parts.
>
> gcc/testsuite/
> PR tree-optimization/41857
> * gcc.target/spu/ea/pr41857.c: New file.
>
>
> Index: gcc/tree-ssa-loop-ivopts.c
> ===================================================================
> *** gcc/tree-ssa-loop-ivopts.c (revision 153683)
> --- gcc/tree-ssa-loop-ivopts.c (working copy)
> *************** rewrite_use_address (struct ivopts_data
> *** 5510,5515 ****
> --- 5510,5516 ----
> {
> aff_tree aff;
> gimple_stmt_iterator bsi = gsi_for_stmt (use->stmt);
> + tree base_hint = NULL_TREE;
> tree ref;
> bool ok;
>
> *************** rewrite_use_address (struct ivopts_data
> *** 5517,5523 ****
> gcc_assert (ok);
> unshare_aff_combination (&aff);
>
> ! ref = create_mem_ref (&bsi, TREE_TYPE (*use->op_p), &aff, data->speed);
> copy_ref_info (ref, *use->op_p);
> *use->op_p = ref;
> }
> --- 5518,5539 ----
> gcc_assert (ok);
> unshare_aff_combination (&aff);
>
> ! /* To avoid undefined overflow problems, all IV candidates use unsigned
> ! integer types. The drawback is that this makes it impossible for
> ! create_mem_ref to distinguish an IV that is based on a memory object
> ! from one that represents simply an offset.
> !
> ! To work around this problem, we pass a hint to create_mem_ref that
> ! indicates which variable (if any) in aff is an IV based on a memory
> ! object. Note that we only consider the candidate. If this is not
> ! based on an object, the base of the reference is in some subexpression
> ! of the use -- but these will use pointer types, so they are recognized
> ! by the create_mem_ref heuristics anyway. */
> ! if (cand->iv->base_object)
> ! base_hint = var_at_stmt (data->current_loop, cand, use->stmt);
> !
> ! ref = create_mem_ref (&bsi, TREE_TYPE (*use->op_p), &aff, base_hint,
> ! data->speed);
> copy_ref_info (ref, *use->op_p);
> *use->op_p = ref;
> }
> Index: gcc/tree-ssa-address.c
> ===================================================================
> *** gcc/tree-ssa-address.c (revision 153683)
> --- gcc/tree-ssa-address.c (working copy)
> *************** move_fixed_address_to_symbol (struct mem
> *** 392,397 ****
> --- 392,424 ----
> aff_combination_remove_elt (addr, i);
> }
>
> + /* If ADDR contains an instance of BASE_HINT, move it to PARTS->base. */
> +
> + static void
> + move_hint_to_base (tree type, struct mem_address *parts, tree base_hint,
> + aff_tree *addr)
> + {
> + unsigned i;
> + tree val = NULL_TREE;
> +
> + for (i = 0; i < addr->n; i++)
> + {
> + if (!double_int_one_p (addr->elts[i].coef))
> + continue;
> +
> + val = addr->elts[i].val;
> + if (operand_equal_p (val, base_hint, 0))
> + break;
> + }
> +
> + if (i == addr->n)
> + return;
> +
> + /* Cast value to appropriate pointer type. */
> + parts->base = fold_convert (build_pointer_type (type), val);
> + aff_combination_remove_elt (addr, i);
> + }
> +
> /* If ADDR contains an address of a dereferenced pointer, move it to
> PARTS->base. */
>
> *************** add_to_parts (struct mem_address *parts,
> *** 453,461 ****
> element(s) to PARTS. */
>
> static void
> ! most_expensive_mult_to_index (struct mem_address *parts, aff_tree *addr,
> ! bool speed)
> {
> HOST_WIDE_INT coef;
> double_int best_mult, amult, amult_neg;
> unsigned best_mult_cost = 0, acost;
> --- 480,490 ----
> element(s) to PARTS. */
>
> static void
> ! most_expensive_mult_to_index (tree type, struct mem_address *parts,
> ! aff_tree *addr, bool speed)
> {
> + addr_space_t as = TYPE_ADDR_SPACE (type);
> + enum machine_mode address_mode = targetm.addr_space.address_mode (as);
> HOST_WIDE_INT coef;
> double_int best_mult, amult, amult_neg;
> unsigned best_mult_cost = 0, acost;
> *************** most_expensive_mult_to_index (struct mem
> *** 469,483 ****
> if (!double_int_fits_in_shwi_p (addr->elts[i].coef))
> continue;
>
> - /* FIXME: Should use the correct memory mode rather than Pmode. */
> -
> coef = double_int_to_shwi (addr->elts[i].coef);
> if (coef == 1
> ! || !multiplier_allowed_in_address_p (coef, Pmode,
> ! ADDR_SPACE_GENERIC))
> continue;
>
> ! acost = multiply_by_cost (coef, Pmode, speed);
>
> if (acost > best_mult_cost)
> {
> --- 498,509 ----
> if (!double_int_fits_in_shwi_p (addr->elts[i].coef))
> continue;
>
> coef = double_int_to_shwi (addr->elts[i].coef);
> if (coef == 1
> ! || !multiplier_allowed_in_address_p (coef, TYPE_MODE (type), as))
> continue;
>
> ! acost = multiply_by_cost (coef, address_mode, speed);
>
> if (acost > best_mult_cost)
> {
> *************** most_expensive_mult_to_index (struct mem
> *** 520,527 ****
> parts->step = double_int_to_tree (sizetype, best_mult);
> }
>
> ! /* Splits address ADDR into PARTS.
> !
> TODO -- be more clever about the distribution of the elements of ADDR
> to PARTS. Some architectures do not support anything but single
> register in address, possibly with a small integer offset; while
> --- 546,555 ----
> parts->step = double_int_to_tree (sizetype, best_mult);
> }
>
> ! /* Splits address ADDR for a memory access of type TYPE into PARTS.
> ! If BASE_HINT is non-NULL, it specifies an SSA name to be used
> ! preferentially as base of the reference.
> !
> TODO -- be more clever about the distribution of the elements of ADDR
> to PARTS. Some architectures do not support anything but single
> register in address, possibly with a small integer offset; while
> *************** most_expensive_mult_to_index (struct mem
> *** 530,536 ****
> addressing modes is useless. */
>
> static void
> ! addr_to_parts (aff_tree *addr, struct mem_address *parts, bool speed)
> {
> tree part;
> unsigned i;
> --- 558,565 ----
> addressing modes is useless. */
>
> static void
> ! addr_to_parts (tree type, aff_tree *addr, tree base_hint,
> ! struct mem_address *parts, bool speed)
> {
> tree part;
> unsigned i;
> *************** addr_to_parts (aff_tree *addr, struct me
> *** 550,561 ****
>
> /* First move the most expensive feasible multiplication
> to index. */
> ! most_expensive_mult_to_index (parts, addr, speed);
>
> /* Try to find a base of the reference. Since at the moment
> there is no reliable way how to distinguish between pointer and its
> offset, this is just a guess. */
> ! if (!parts->symbol)
> move_pointer_to_base (parts, addr);
>
> /* Then try to process the remaining elements. */
> --- 579,592 ----
>
> /* First move the most expensive feasible multiplication
> to index. */
> ! most_expensive_mult_to_index (type, parts, addr, speed);
>
> /* Try to find a base of the reference. Since at the moment
> there is no reliable way how to distinguish between pointer and its
> offset, this is just a guess. */
> ! if (!parts->symbol && base_hint)
> ! move_hint_to_base (type, parts, base_hint, addr);
> ! if (!parts->symbol && !parts->base)
> move_pointer_to_base (parts, addr);
>
> /* Then try to process the remaining elements. */
> *************** gimplify_mem_ref_parts (gimple_stmt_iter
> *** 592,604 ****
>
> tree
> create_mem_ref (gimple_stmt_iterator *gsi, tree type, aff_tree *addr,
> ! bool speed)
> {
> tree mem_ref, tmp;
> tree atype;
> struct mem_address parts;
>
> ! addr_to_parts (addr, &parts, speed);
> gimplify_mem_ref_parts (gsi, &parts);
> mem_ref = create_mem_ref_raw (type, &parts);
> if (mem_ref)
> --- 623,635 ----
>
> tree
> create_mem_ref (gimple_stmt_iterator *gsi, tree type, aff_tree *addr,
> ! tree base_hint, bool speed)
> {
> tree mem_ref, tmp;
> tree atype;
> struct mem_address parts;
>
> ! addr_to_parts (type, addr, base_hint, &parts, speed);
> gimplify_mem_ref_parts (gsi, &parts);
> mem_ref = create_mem_ref_raw (type, &parts);
> if (mem_ref)
> Index: gcc/tree-flow.h
> ===================================================================
> *** gcc/tree-flow.h (revision 153683)
> --- gcc/tree-flow.h (working copy)
> *************** struct mem_address
> *** 921,927 ****
>
> struct affine_tree_combination;
> tree create_mem_ref (gimple_stmt_iterator *, tree,
> ! struct affine_tree_combination *, bool);
> rtx addr_for_mem_ref (struct mem_address *, addr_space_t, bool);
> void get_address_description (tree, struct mem_address *);
> tree maybe_fold_tmr (tree);
> --- 921,927 ----
>
> struct affine_tree_combination;
> tree create_mem_ref (gimple_stmt_iterator *, tree,
> ! struct affine_tree_combination *, tree, bool);
> rtx addr_for_mem_ref (struct mem_address *, addr_space_t, bool);
> void get_address_description (tree, struct mem_address *);
> tree maybe_fold_tmr (tree);
> *** /dev/null 2009-10-07 18:17:03.469375344 +0200
> --- gcc/testsuite/gcc.target/spu/ea/pr41857.c 2009-10-29 14:11:18.000000000 +0100
> ***************
> *** 0 ****
> --- 1,29 ----
> + /* Copyright (C) 2009 Free Software Foundation, Inc.
> +
> + This file is free software; you can redistribute it and/or modify it under
> + the terms of the GNU General Public License as published by the Free
> + Software Foundation; either version 3 of the License, or (at your option)
> + any later version.
> +
> + This file is distributed in the hope that it will be useful, but WITHOUT
> + ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
> + for more details.
> +
> + You should have received a copy of the GNU General Public License
> + along with this file; see the file COPYING3. If not see
> + <http://www.gnu.org/licenses/>. */
> +
> + /* { dg-do compile } */
> +
> + __ea char *strchr_ea (__ea const char *s, int c);
> + __ea char *foo (__ea char *s)
> + {
> + __ea char *ret = s;
> + int i;
> +
> + for (i = 0; i < 3; i++)
> + ret = strchr_ea (ret, s[i]);
> +
> + return ret;
> + }
>
>
> --
> Dr. Ulrich Weigand
> GNU Toolchain for Linux on System z and Cell BE
> Ulrich.Weigand@de.ibm.com
>
More information about the Gcc-patches
mailing list