This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Fix PR63761: stmt set to wrong basic block


On Tue, Nov 11, 2014 at 5:58 PM, Thomas Preud'homme
<thomas.preudhomme@arm.com> wrote:
> Hi,
>
> The bswap pass iterate over the statements in a basic block and replace them by a bswap, load or load+bswap when equivalent. However, to keep the code equivalent the load need to be perform at the same place as where one of the original load was made so that they have the same VUSE. Consider for example the following:
>
> incorrect_read_le32 (char *x, char *y)
> {
>   unsigned char c0, c1, c2, c3;
>   uint32_t result;
>
>   c0 = x[0];
>   c1 = x[1];
>   c2 = x[2];
>   c3 = x[3];
>   *y = 1;
>   if (c0 %2)
>     result = c0 | c1 << 8 | c2 << 16 | c3 << 24;
>   else
>     result = c1;
>   return result;
> }
>
> On ARM this would be replaced by a 32bit load but this needs to be done before the *y = 1 line. This implies to move the result line and can make it change basic block. This can then lead to problems as the iterator used to iterate over the statements in a basic block still points to the original basic block of this statement while the statement itself points to the new basic block. A second bswap can then copy this wrong basic block information in a newly created gimple statement while it links to statement in another basic block, which is the error observed.
>
> The patch consist in making sure the iterator points to the statement just before the statement being consider for bswap replacement. The patch also adds a number of comments to give the rational why some (potentially surprising things) happen. The testsuite was constructed by using the gimple representation of the binutils file causing the ICE and then reduced with creduce (creduce could not reduce the binutils file directly).
>
> ChangeLog entries are as follows:
>
> *** gcc/ChangeLog ***
>
> 2014-11-09  Thomas Preud'homme  <thomas.preudhomme@arm.com>
>
>         PR tree-optimization/63761
>         * tree-ssa-math-opts.c (bswap_replace): Construct gsi from cur_stmt
>         rather than taking it as a parameter. Add some comments to explain the
>         gsi_move_before in case of load and why canonicalization of bswap into
>         a rotation is only done for 16bit values.
>         (pass_optimize_bswap::execute): Adapt for loop via gsi to make gsi
>         refer to the statement just before cur_stmt. Ignore 16bit bswap that
>         are already in canonical form. Adapt bswap_replace to removal of its
>         gsi parameter.
>
>
> *** gcc/testsuite/ChangeLog ***
>
> 2014-11-09  Thomas Preud'homme  <thomas.preudhomme@arm.com>
>
>         PR tree-optimization/63761
>         * /gcc.c-torture/compile/pr63761.c: New test.
>
> Testing done:
> * GCC with the patch was boostrapped on x86_64-linux-gnu and the testsuite ran without any regression.
> * Equally GCC with the patch was built for aarch64-none-linux-gnu target and the tests ran without regression.
>
>
> diff --git a/gcc/testsuite/gcc.c-torture/compile/pr63761.c b/gcc/testsuite/gcc.c-torture/compile/pr63761.c
> new file mode 100644
> index 0000000..5cda3f1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.c-torture/compile/pr63761.c
> @@ -0,0 +1,17 @@
> +int a, b;
> +short c;
> +
> +void fn1 ();
> +
> +void
> +fn2 (unsigned short p1)
> +{
> +  int d;
> +
> +  c = p1 >> 8 | p1 << 8;
> +  d = b;
> +  if (d)
> +    fn1 ();
> +  a = d >> 8 & 0x00FF
> +    | d << 8 & 0xFF00;
> +}
> diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
> index 12fa4e8..aab056c 100644
> --- a/gcc/tree-ssa-math-opts.c
> +++ b/gcc/tree-ssa-math-opts.c
> @@ -2172,23 +2172,28 @@ public:
>
>  }; // class pass_optimize_bswap
>
> -/* Perform the bswap optimization: replace the statement CUR_STMT at
> -   GSI with a load of type, VUSE and set-alias as described by N if a
> -   memory source is involved (N->base_addr is non null), followed by
> -   the builtin bswap invocation in FNDECL if BSWAP is true.  SRC_STMT
> -   gives where should the replacement be made.  It also gives the
> -   source on which CUR_STMT is operating via its rhs's first tree nad
> -   N->range gives the size of the expression involved for maintaining
> -   some statistics.  */
> +/* Perform the bswap optimization: replace the expression computed in the rhs
> +   of CUR_STMT by an equivalent bswap, load or load + bswap expression.
> +   Which of these alternatives replace the rhs is given by N->base_addr (non
> +   null if a load is needed) and BSWAP.  The type, VUSE and set-alias of the
> +   load to perform are also given in N while the builtin bswap invoke is given
> +   in FNDEL.  Finally, if a load is involved, SRC_STMT refers to one of the
> +   load statements involved to construct the rhs in CUR_STMT and N->range gives
> +   the size of the rhs expression for maintaining some statistics.
> +
> +   Note that if the replacement involve a load, CUR_STMT is moved just after
> +   SRC_STMT to do the load with the same VUSE which can lead to CUR_STMT
> +   changing of basic block.  */
>
>  static bool
> -bswap_replace (gimple cur_stmt, gimple_stmt_iterator gsi, gimple src_stmt,
> -              tree fndecl, tree bswap_type, tree load_type,
> -              struct symbolic_number *n, bool bswap)
> +bswap_replace (gimple cur_stmt, gimple src_stmt, tree fndecl, tree bswap_type,
> +              tree load_type, struct symbolic_number *n, bool bswap)
>  {
> +  gimple_stmt_iterator gsi;
>    tree src, tmp, tgt;
>    gimple bswap_stmt;
>
> +  gsi = gsi_for_stmt (cur_stmt);
>    src = gimple_assign_rhs1 (src_stmt);
>    tgt = gimple_assign_lhs (cur_stmt);
>
> @@ -2207,6 +2212,9 @@ bswap_replace (gimple cur_stmt, gimple_stmt_iterator gsi, gimple src_stmt,
>           && SLOW_UNALIGNED_ACCESS (TYPE_MODE (load_type), align))
>         return false;
>
> +      /* Move cur_stmt just before  one of the load of the original
> +        to ensure it has the same VUSE.  See PR61517 for what could
> +        go wrong.  */
>        gsi_move_before (&gsi, &gsi_ins);
>        gsi = gsi_for_stmt (cur_stmt);
>
> @@ -2293,7 +2301,10 @@ bswap_replace (gimple cur_stmt, gimple_stmt_iterator gsi, gimple src_stmt,
>
>    tmp = src;
>
> -  /* Canonical form for 16 bit bswap is a rotate expression.  */
> +  /* Canonical form for 16 bit bswap is a rotate expression.  Only 16bit values
> +     are considered as rotation of 2N bit values by N bits is generally not
> +     equivalent to a bswap.  Consider for instance 0x01020304 >> 16 which gives
> +     0x03040102 while a bswap for that value is 0x04030201.  */
>    if (bswap && n->range == 16)
>      {
>        tree count = build_int_cst (NULL, BITS_PER_UNIT);
> @@ -2393,10 +2404,10 @@ pass_optimize_bswap::execute (function *fun)
>        gimple_stmt_iterator gsi;
>
>        /* We do a reverse scan for bswap patterns to make sure we get the
> -        widest match. As bswap pattern matching doesn't handle
> -        previously inserted smaller bswap replacements as sub-
> -        patterns, the wider variant wouldn't be detected.  */
> -      for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi); gsi_prev (&gsi))
> +        widest match. As bswap pattern matching doesn't handle previously
> +        inserted smaller bswap replacements as sub-patterns, the wider
> +        variant wouldn't be detected.  */
> +      for (gsi = gsi_last_bb (bb); !gsi_end_p (gsi);)
>          {
>           gimple src_stmt, cur_stmt = gsi_stmt (gsi);
>           tree fndecl = NULL_TREE, bswap_type = NULL_TREE, load_type;
> @@ -2404,6 +2415,14 @@ pass_optimize_bswap::execute (function *fun)
>           struct symbolic_number n;
>           bool bswap;
>
> +         /* This gsi_prev (&gsi) is not part of the for loop because cur_stmt
> +            might be moved to a different basic block by bswap_replace and gsi
> +            must not points to it if that's the case.  Moving the gsi_prev
> +            there make sure that gsi points to the statement previous to
> +            cur_stmt while still making sure that all statements are
> +            considered in this basic block.  */
> +         gsi_prev (&gsi);
> +
>           if (!is_gimple_assign (cur_stmt))
>             continue;
>
> @@ -2431,6 +2450,9 @@ pass_optimize_bswap::execute (function *fun)
>           switch (n.range)
>             {
>             case 16:
> +             /* Already in canonical form, nothing to do.  */
> +             if (code == LROTATE_EXPR || code == RROTATE_EXPR)
> +               continue;
>               load_type = uint16_type_node;
>               if (bswap16_p)
>                 {
> @@ -2461,8 +2483,8 @@ pass_optimize_bswap::execute (function *fun)
>           if (bswap && !fndecl)
>             continue;
>
> -         if (bswap_replace (cur_stmt, gsi, src_stmt, fndecl, bswap_type,
> -                            load_type, &n, bswap))
> +         if (bswap_replace (cur_stmt, src_stmt, fndecl, bswap_type, load_type,
> +                            &n, bswap))
>             changed = true;
>         }
>      }
>
> Is this ok for trunk?

Ok.

Thanks,
Richard.

> Best regards,
>
> Thomas
>
>
>


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]