This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix PR19431 a bit, make PRE remove loads


> On Tue, 17 Apr 2007, Richard Guenther wrote:
>
> >
> > This improves PR19431 by teaching PRE on how to remove trivial loads
like
> > in (from std::min):
> >
> > <bb 2>:
> >   D.2477_6 = a;
> >   D.2478_7 = b;
> >   if (D.2477_6 < D.2478_7) goto <L1>; else goto <L2>;
> >
> > <L1>:;
> >
> >   # a_8 = PHI <&b(2), &a(3)>
> > <L2>:;
> >   D.2459_2 = *a_8;
> >   return D.2459_2;
> >
> > so that phiopt can later turn this into a MIN_EXPR.
> >
> > Fortunately a more complete solution has been posted (but keeps being
> > ignored) here:
> > http://gcc.gnu.org/ml/gcc-patches/2007-02/msg01331.html
> >
> > So I'll post this variant again which improves PRE but that lacks in
> > several ways:
> >
> >  - PRE is run too late, only the phiopt pass after loop (and
vectorizer)
> >    recognizes the new opportunity
> >
> >  - nothing marks no longer addressable vars as registers after PRE, so
> >    the C testcase has to be XFAILed for now.
> >
> >  - It doesn't work for chains like std::min(std::max(a, b), c)
> >
> >
> > Still, bootstrap and regtest in progress.  Ok for mainline if it
passes?
>
> So the vectorizer doesn't like if we apply this to loads in loops, so
> I'm back to the original variant which requires all loads to be simple.

just curious: how does the load look like to the vectorizer, and why
doesn't it like it? (I wonder if it makes sense to extend the vectorizer to
handle such non-simple loads? are we losing a lot of PRE opportunities by
reverting to the original variant?)

thanks,
dorit

> Also there's a fortran FE problem (PR31608) which makes us ICE on
> gfortran.dg/achar_4.f90.
>
> Richard.
>
> For reference:
>
> 2007-04-17  Richard Guenther  <rguenther@suse.de>
>
>    PR tree-optimization/19431
>    * tree-ssa-pre.c (do_regular_insertion): Do insertion if
>    that can remove all loads.
>
>    * gcc.dg/tree-ssa/ssa-pre-17.c: New testcase.
>    * g++.dg/tree-ssa/ssa-pre-1.C: Likewise.
>
> Index: tree-ssa-pre.c
> ===================================================================
> *** tree-ssa-pre.c   (revision 123910)
> --- tree-ssa-pre.c   (working copy)
> *************** do_regular_insertion (basic_block block,
> *** 2596,2601 ****
> --- 2596,2602 ----
>        bool by_some = false;
>        bool cant_insert = false;
>        bool all_same = true;
> +      bool is_load, all_loads_simple;
>        tree first_s = NULL;
>        edge pred;
>        basic_block bprime;
> *************** do_regular_insertion (basic_block block,
> *** 2612,2617 ****
> --- 2613,2621 ----
>            continue;
>          }
>
> +      is_load = TREE_CODE (expr) == INDIRECT_REF;
> +      all_loads_simple = true;
> +
>        avail = XCNEWVEC (tree, last_basic_block);
>        FOR_EACH_EDGE (pred, ei, block->preds)
>          {
> *************** do_regular_insertion (basic_block block,
> *** 2645,2650 ****
> --- 2649,2671 ----
>           break;
>         }
>
> +          /* See if this is a load and if so, if on this edge
> +        the load can be trivially removed.  */
> +          if (is_load
> +         && TREE_CODE (eprime) == INDIRECT_REF)
> +       {
> +         tree vhe = TREE_OPERAND (eprime, 0);
> +         if (TREE_CODE (vhe) == VALUE_HANDLE)
> +           {
> +             bitmap_set_t exprset = VALUE_HANDLE_EXPR_SET (vhe);
> +             unsigned int firstbit;
> +             firstbit = bitmap_first_set_bit (exprset->expressions);
> +             vhe = expression_for_id (firstbit);
> +           }
> +         if (TREE_CODE (vhe) != ADDR_EXPR)
> +           all_loads_simple = false;
> +       }
> +
>            eprime = fully_constant_expression (eprime);
>            vprime = get_value_handle (eprime);
>            gcc_assert (vprime);
> *************** do_regular_insertion (basic_block block,
> *** 2670,2676 ****
>           already existing along every predecessor, and
>           it's defined by some predecessor, it is
>           partially redundant.  */
> !      if (!cant_insert && !all_same && by_some)
>          {
>            if (insert_into_preds_of_block (block, get_expression_id
(expr),
>                        avail))
> --- 2691,2698 ----
>           already existing along every predecessor, and
>           it's defined by some predecessor, it is
>           partially redundant.  */
> !      if (!cant_insert && !all_same
> !          && (by_some || (is_load && all_loads_simple)))
>          {
>            if (insert_into_preds_of_block (block, get_expression_id
(expr),
>                        avail))
> Index: testsuite/gcc.dg/tree-ssa/ssa-pre-17.c
> ===================================================================
> *** testsuite/gcc.dg/tree-ssa/ssa-pre-17.c   (revision 0)
> --- testsuite/gcc.dg/tree-ssa/ssa-pre-17.c   (revision 0)
> ***************
> *** 0 ****
> --- 1,28 ----
> + /* { dg-do compile } */
> + /* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> + /* We want this optimized as
> + <bb 2>:
> +   if (k != 0) goto <L2>; else goto <L3>;
> +
> + <L3>:;
> +   i1 = j1;
> +
> + <L2>:;
> +   return i1;
> +
> +   This requires that i1 and j1 are changed into registers after they
> +   no longer have their address taken.  */
> +
> + int f(int k, int i1, int j1)
> + {
> +   int *f1;
> +   if(k)
> +    f1 = &i1;
> +   else
> +    f1 = &j1;
> +   return *f1;
> + }
> +
> + /* { dg-final { scan-tree-dump-times "goto" 1 "optimized" { xfail
> *-*-* } } } */
> + /* { dg-final { cleanup-tree-dump "optimized" } } */
> Index: testsuite/g++.dg/tree-ssa/ssa-pre-1.C
> ===================================================================
> *** testsuite/g++.dg/tree-ssa/ssa-pre-1.C   (revision 0)
> --- testsuite/g++.dg/tree-ssa/ssa-pre-1.C   (revision 0)
> ***************
> *** 0 ****
> --- 1,13 ----
> + /* { dg-do compile } */
> + /* { dg-options "-O2 -fdump-tree-optimized" } */
> +
> + template<class T> static inline const T &ref_min(const T &a, const T
&b)
> + { return a<b ? a : b; }
> +
> + int foo(int a, int b)
> + {
> +   return ref_min(a, b);
> + }
> +
> + /* { dg-final { scan-tree-dump "MIN_EXPR" "optimized" } } */
> + /* { dg-final { cleanup-tree-dump "optimized" } } */


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]