This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [RFC] __atomic_compare_exchange* optimizations (PR middle-end/66867)


On Thu, 23 Jun 2016, Jakub Jelinek wrote:

> Hi!
> 
> This PR is about 2 issues with the *atomic_compare_exchange* APIs that
> didn't exist with __sync_*_compare_and_swap:
> 1) the APIs make the expected argument addressable, although very often
>    it is an automatic variable that is addressable only because of
>    these APIs (see the sketch below)
> 2) for fear that expected might point to memory accessed by multiple
>    threads, the old value is stored to that location only conditionally
>    (if the compare and swap failed) - while again, in the common case
>    where it is a local, otherwise non-addressable automatic var, it can
>    be stored unconditionally.
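> 
>    For illustration, a minimal sketch of the common pattern (a made-up
>    example, not one of the attached testcases):
> 
>      int
>      f (int *p, int newval)
>      {
>        int expected = 0;  /* Addressable only because of the call below.  */
>        __atomic_compare_exchange_n (p, &expected, newval, 0,
>                                     __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
>        return expected;
>      }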
> 
> To resolve this, we effectively need a call (or some other stmt) that
> returns two values.  We also need that for the __builtin_*_overflow*
> builtins and have solved it there by returning a complex int value from
> an internal-fn call, where REALPART_EXPR of it is one result and
> IMAGPART_EXPR the other (bool-ish) result.
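> 
>    For reference, that existing convention looks roughly like this at the
>    source level (a made-up example):
> 
>      /* Returns the overflow flag; the sum is written through RES.
>         Internally GCC represents this as an ADD_OVERFLOW internal call
>         returning a complex integer, with REALPART_EXPR being the sum and
>         IMAGPART_EXPR the overflow flag.  */
>      _Bool
>      add_check (int a, int b, int *res)
>      {
>        return __builtin_add_overflow (a, b, res);
>      }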
> 
> The following patch handles it the same way, by folding
> __atomic_compare_exchange_N early into an internal call (with a
> conditional store in the IL), and then later on, if the expected var
> becomes non-addressable and is rewritten into SSA, optimizing the
> conditional store into an unconditional one (that is the gimple-fold.c
> part).
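> 
>    Roughly, following the comments in the patch below, the folded form
>    with the conditional store is:
> 
>      t = ATOMIC_COMPARE_EXCHANGE (p, e_1, d, w * 256 + N, s, f);
>      r = IMAGPART_EXPR <t>;
>      if (!r)
>        e_2 = REALPART_EXPR <t>;
>      e_3 = PHI <e_1, e_2>;
> 
>    and once the expected var is an otherwise non-addressable local
>    rewritten into SSA, REALPART_EXPR <t> can be used unconditionally,
>    because on success it necessarily equals e_1:
> 
>      t = ATOMIC_COMPARE_EXCHANGE (p, e_1, d, w * 256 + N, s, f);
>      r = IMAGPART_EXPR <t>;
>      e_3 = REALPART_EXPR <t>;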
> 
> Thinking about this again, there could be another option - keep
> __atomic_compare_exchange_N in the IL, but under certain conditions
> (similar to what the patch uses in fold_builtin_atomic_compare_exchange)
> ignore the &var in the second argument of these builtins, and if we
> actually manage to make var non-addressable, convert the builtin call
> similarly to what fold_builtin_atomic_compare_exchange does in the patch
> (except the store would then be unconditional, and the gimple-fold.c
> part wouldn't be needed).
> 
> Any preferences?

I wonder if always expanding from the internal-fn would
eventually generate worse code if the value is already addressable.
If that is the case then doing it conditionally on the arg becoming
non-addressable (update-address-taken I guess) would be preferred.

Otherwise you can't use immediate uses from gimple-fold; you have
to stick the transform into tree-ssa-forwprop.c instead (gimple-fold
can only rely on up-to-date use-def chains, stmt operands are _not_
reliably up-to-date).

Thanks,
Richard.

>  This version has been bootstrapped/regtested on
> x86_64-linux and i686-linux.  Attached are various testcases I've been using
> to see if the generated code improved (tried x86_64, powerpc64le, s390x and
> aarch64).  E.g. on x86_64-linux, in the first testcase at -O2 the
> improvement in f1/f2 is the removal of the dead
>        movl    $0, -4(%rsp)
> in f4
> -       movl    $0, -4(%rsp)
>         lock; cmpxchgl  %edx, (%rdi)
> -       je      .L7
> -       movl    %eax, -4(%rsp)
> -.L7:
> -       movl    -4(%rsp), %eax
> etc.
> 
> 2016-06-23  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR middle-end/66867
> 	* builtins.c: Include gimplify.h.
> 	(expand_ifn_atomic_compare_exchange_into_call,
> 	expand_ifn_atomic_compare_exchange,
> 	fold_builtin_atomic_compare_exchange): New functions.
> 	(fold_builtin_varargs): Handle BUILT_IN_ATOMIC_COMPARE_EXCHANGE_*.
> 	* internal-fn.c (expand_ATOMIC_COMPARE_EXCHANGE): New function.
> 	* tree.h (build_call_expr_internal_loc): Rename to ...
> 	(build_call_expr_internal_loc_array): ... this.  Fix up type of
> 	last argument.
> 	* internal-fn.def (ATOMIC_COMPARE_EXCHANGE): New internal fn.
> 	* predict.c (expr_expected_value_1): Handle IMAGPART_EXPR of
> 	ATOMIC_COMPARE_EXCHANGE result.
> 	* builtins.h (expand_ifn_atomic_compare_exchange): New prototype.
> 	* gimple-fold.c (fold_ifn_atomic_compare_exchange): New function.
> 	(gimple_fold_call): Handle IFN_ATOMIC_COMPARE_EXCHANGE.
> 
> 	* gfortran.dg/coarray_atomic_4.f90: Add -O0 to dg-options.
> 
> --- gcc/builtins.c.jj	2016-06-08 21:01:25.000000000 +0200
> +++ gcc/builtins.c	2016-06-23 09:17:51.053713986 +0200
> @@ -65,6 +65,7 @@ along with GCC; see the file COPYING3.
>  #include "internal-fn.h"
>  #include "case-cfn-macros.h"
>  #include "gimple-fold.h"
> +#include "gimplify.h"
>  
>  
>  struct target_builtins default_target_builtins;
> @@ -5158,6 +5159,123 @@ expand_builtin_atomic_compare_exchange (
>    return target;
>  }
>  
> +/* Helper function for expand_ifn_atomic_compare_exchange - expand
> +   internal ATOMIC_COMPARE_EXCHANGE call into __atomic_compare_exchange_N
> +   call.  The weak parameter must be dropped to match the expected parameter
> +   list and the expected argument changed from value to pointer to memory
> +   slot.  */
> +
> +static void
> +expand_ifn_atomic_compare_exchange_into_call (gcall *call, machine_mode mode)
> +{
> +  unsigned int z;
> +  vec<tree, va_gc> *vec;
> +
> +  vec_alloc (vec, 5);
> +  vec->quick_push (gimple_call_arg (call, 0));
> +  tree expected = gimple_call_arg (call, 1);
> +  rtx x = assign_stack_temp_for_type (mode, GET_MODE_SIZE (mode),
> +				      TREE_TYPE (expected));
> +  rtx expd = expand_expr (expected, x, mode, EXPAND_NORMAL);
> +  if (expd != x)
> +    emit_move_insn (x, expd);
> +  tree v = make_tree (TREE_TYPE (expected), x);
> +  vec->quick_push (build1 (ADDR_EXPR,
> +			   build_pointer_type (TREE_TYPE (expected)), v));
> +  vec->quick_push (gimple_call_arg (call, 2));
> +  /* Skip the boolean weak parameter.  */
> +  for (z = 4; z < 6; z++)
> +    vec->quick_push (gimple_call_arg (call, z));
> +  built_in_function fncode
> +    = (built_in_function) ((int) BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1
> +			   + exact_log2 (GET_MODE_SIZE (mode)));
> +  tree fndecl = builtin_decl_explicit (fncode);
> +  tree fn = build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (fndecl)),
> +		    fndecl);
> +  tree exp = build_call_vec (boolean_type_node, fn, vec);
> +  tree lhs = gimple_call_lhs (call);
> +  rtx boolret = expand_call (exp, NULL_RTX, lhs == NULL_TREE);
> +  if (lhs)
> +    {
> +      rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
> +      if (GET_MODE (boolret) != mode)
> +	boolret = convert_modes (mode, GET_MODE (boolret), boolret, 1);
> +      x = force_reg (mode, x);
> +      write_complex_part (target, boolret, true);
> +      write_complex_part (target, x, false);
> +    }
> +}
> +
> +/* Expand IFN_ATOMIC_COMPARE_EXCHANGE internal function.  */
> +
> +void
> +expand_ifn_atomic_compare_exchange (gcall *call)
> +{
> +  int size = tree_to_shwi (gimple_call_arg (call, 3)) & 255;
> +  gcc_assert (size == 1 || size == 2 || size == 4 || size == 8 || size == 16);
> +  machine_mode mode = mode_for_size (BITS_PER_UNIT * size, MODE_INT, 0);
> +  rtx expect, desired, mem, oldval, boolret;
> +  enum memmodel success, failure;
> +  tree lhs;
> +  bool is_weak;
> +  source_location loc
> +    = expansion_point_location_if_in_system_header (gimple_location (call));
> +
> +  success = get_memmodel (gimple_call_arg (call, 4));
> +  failure = get_memmodel (gimple_call_arg (call, 5));
> +
> +  if (failure > success)
> +    {
> +      warning_at (loc, OPT_Winvalid_memory_model,
> +		  "failure memory model cannot be stronger than success "
> +		  "memory model for %<__atomic_compare_exchange%>");
> +      success = MEMMODEL_SEQ_CST;
> +    }
> +
> +  if (is_mm_release (failure) || is_mm_acq_rel (failure))
> +    {
> +      warning_at (loc, OPT_Winvalid_memory_model,
> +		  "invalid failure memory model for "
> +		  "%<__atomic_compare_exchange%>");
> +      failure = MEMMODEL_SEQ_CST;
> +      success = MEMMODEL_SEQ_CST;
> +    }
> +
> +  if (!flag_inline_atomics)
> +    {
> +      expand_ifn_atomic_compare_exchange_into_call (call, mode);
> +      return;
> +    }
> +
> +  /* Expand the operands.  */
> +  mem = get_builtin_sync_mem (gimple_call_arg (call, 0), mode);
> +
> +  expect = expand_expr_force_mode (gimple_call_arg (call, 1), mode);
> +  desired = expand_expr_force_mode (gimple_call_arg (call, 2), mode);
> +
> +  is_weak = (tree_to_shwi (gimple_call_arg (call, 3)) & 256) != 0;
> +
> +  boolret = NULL;
> +  oldval = NULL;
> +
> +  if (!expand_atomic_compare_and_swap (&boolret, &oldval, mem, expect, desired,
> +				       is_weak, success, failure))
> +    {
> +      expand_ifn_atomic_compare_exchange_into_call (call, mode);
> +      return;
> +    }
> +
> +  lhs = gimple_call_lhs (call);
> +  if (lhs)
> +    {
> +      rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
> +      if (GET_MODE (boolret) != mode)
> +	boolret = convert_modes (mode, GET_MODE (boolret), boolret, 1);
> +      write_complex_part (target, boolret, true);
> +      write_complex_part (target, oldval, false);
> +    }
> +}
> +
>  /* Expand the __atomic_load intrinsic:
>     	TYPE __atomic_load (TYPE *object, enum memmodel)
>     EXP is the CALL_EXPR.
> @@ -9515,6 +9633,63 @@ fold_builtin_object_size (tree ptr, tree
>    return NULL_TREE;
>  }
>  
> +/* Fold
> +     r = __atomic_compare_exchange_N (p, &e, d, w, s, f);
> +   into
> +     _Complex uintN_t t = ATOMIC_COMPARE_EXCHANGE (p, e, d, w * 256 + N, s, f);
> +     i = IMAGPART_EXPR <t>;
> +     r = (_Bool) i;
> +     if (!r)
> +       e = REALPART_EXPR <t>;  */
> +
> +static tree
> +fold_builtin_atomic_compare_exchange (location_t loc, tree fndecl,
> +				      tree *args, int nargs)
> +{
> +  if (nargs != 6
> +      || !flag_inline_atomics
> +      || !optimize
> +      || (flag_sanitize & (SANITIZE_THREAD | SANITIZE_ADDRESS)) != 0
> +      || (!integer_zerop (args[3]) && !integer_onep (args[3])))
> +    return NULL_TREE;
> +
> +  tree argsc[6];
> +  tree parmt = TYPE_ARG_TYPES (TREE_TYPE (fndecl));
> +  tree itype = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (parmt)));
> +  machine_mode mode = TYPE_MODE (itype);
> +
> +  if (direct_optab_handler (atomic_compare_and_swap_optab, mode)
> +      == CODE_FOR_nothing
> +      && optab_handler (sync_compare_and_swap_optab, mode) == CODE_FOR_nothing)
> +    return NULL_TREE;
> +
> +  tree ctype = build_complex_type (itype);
> +  tree alias_type = build_pointer_type_for_mode (itype, ptr_mode, true);
> +  tree alias_off = build_int_cst (alias_type, 0);
> +  tree expected = fold_build2_loc (loc, MEM_REF, itype, args[1], alias_off);
> +  memcpy (argsc, args, sizeof (argsc));
> +  argsc[1] = expected;
> +  argsc[3] = build_int_cst (integer_type_node,
> +			    (integer_onep (args[3]) ? 256 : 0)
> +			    + int_size_in_bytes (itype));
> +  tree var = create_tmp_var_raw (ctype);
> +  DECL_CONTEXT (var) = current_function_decl;
> +  tree call
> +    = build_call_expr_internal_loc_array (loc, IFN_ATOMIC_COMPARE_EXCHANGE,
> +					  ctype, 6, argsc);
> +  var = build4 (TARGET_EXPR, ctype, var, call, NULL, NULL);
> +  tree ret
> +    = fold_convert_loc (loc, boolean_type_node,
> +			build1 (IMAGPART_EXPR, itype, var));
> +  tree condstore
> +    = build3_loc (loc, COND_EXPR, void_type_node, ret,
> +		  void_node, build2_loc (loc, MODIFY_EXPR, void_type_node,
> +					 unshare_expr (expected),
> +					 build1 (REALPART_EXPR, itype, var)));
> +  return build2_loc (loc, COMPOUND_EXPR, boolean_type_node, condstore,
> +		     unshare_expr (ret));
> +}
> +
>  /* Builtins with folding operations that operate on "..." arguments
>     need special handling; we need to store the arguments in a convenient
>     data structure before attempting any folding.  Fortunately there are
> @@ -9533,6 +9708,13 @@ fold_builtin_varargs (location_t loc, tr
>        ret = fold_builtin_fpclassify (loc, args, nargs);
>        break;
>  
> +    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1:
> +    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_2:
> +    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4:
> +    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_8:
> +    case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_16:
> +      return fold_builtin_atomic_compare_exchange (loc, fndecl, args, nargs);
> +
>      default:
>        break;
>      }
> --- gcc/internal-fn.c.jj	2016-06-15 19:09:09.000000000 +0200
> +++ gcc/internal-fn.c	2016-06-22 15:30:06.838951934 +0200
> @@ -2143,6 +2143,14 @@ expand_ATOMIC_BIT_TEST_AND_RESET (intern
>    expand_ifn_atomic_bit_test_and (call);
>  }
>  
> +/* Expand atomic compare and exchange.  */
> +
> +static void
> +expand_ATOMIC_COMPARE_EXCHANGE (internal_fn, gcall *call)
> +{
> +  expand_ifn_atomic_compare_exchange (call);
> +}
> +
>  /* Expand a call to FN using the operands in STMT.  FN has a single
>     output operand and NARGS input operands.  */
>  
> --- gcc/tree.h.jj	2016-06-20 21:16:07.000000000 +0200
> +++ gcc/tree.h	2016-06-21 17:35:19.806362408 +0200
> @@ -3985,8 +3985,8 @@ extern tree build_call_expr_loc (locatio
>  extern tree build_call_expr (tree, int, ...);
>  extern tree build_call_expr_internal_loc (location_t, enum internal_fn,
>  					  tree, int, ...);
> -extern tree build_call_expr_internal_loc (location_t, enum internal_fn,
> -					  tree, int, tree *);
> +extern tree build_call_expr_internal_loc_array (location_t, enum internal_fn,
> +						tree, int, const tree *);
>  extern tree maybe_build_call_expr_loc (location_t, combined_fn, tree,
>  				       int, ...);
>  extern tree build_string_literal (int, const char *);
> --- gcc/internal-fn.def.jj	2016-05-03 13:36:50.000000000 +0200
> +++ gcc/internal-fn.def	2016-06-21 17:10:23.516879436 +0200
> @@ -193,6 +193,7 @@ DEF_INTERNAL_FN (SET_EDOM, ECF_LEAF | EC
>  DEF_INTERNAL_FN (ATOMIC_BIT_TEST_AND_SET, ECF_LEAF | ECF_NOTHROW, NULL)
>  DEF_INTERNAL_FN (ATOMIC_BIT_TEST_AND_COMPLEMENT, ECF_LEAF | ECF_NOTHROW, NULL)
>  DEF_INTERNAL_FN (ATOMIC_BIT_TEST_AND_RESET, ECF_LEAF | ECF_NOTHROW, NULL)
> +DEF_INTERNAL_FN (ATOMIC_COMPARE_EXCHANGE, ECF_LEAF | ECF_NOTHROW, NULL)
>  
>  #undef DEF_INTERNAL_INT_FN
>  #undef DEF_INTERNAL_FLT_FN
> --- gcc/predict.c.jj	2016-06-22 11:17:44.763444374 +0200
> +++ gcc/predict.c	2016-06-22 14:26:08.894724088 +0200
> @@ -1978,6 +1978,25 @@ expr_expected_value_1 (tree type, tree o
>        if (TREE_CONSTANT (op0))
>  	return op0;
>  
> +      if (code == IMAGPART_EXPR)
> +	{
> +	  if (TREE_CODE (TREE_OPERAND (op0, 0)) == SSA_NAME)
> +	    {
> +	      def = SSA_NAME_DEF_STMT (TREE_OPERAND (op0, 0));
> +	      if (is_gimple_call (def)
> +		  && gimple_call_internal_p (def)
> +		  && (gimple_call_internal_fn (def)
> +		      == IFN_ATOMIC_COMPARE_EXCHANGE))
> +		{
> +		  /* Assume that any given atomic operation has low contention,
> +		     and thus the compare-and-swap operation succeeds.  */
> +		  if (predictor)
> +		    *predictor = PRED_COMPARE_AND_SWAP;
> +		  return build_one_cst (TREE_TYPE (op0));
> +		}
> +	    }
> +	}
> +
>        if (code != SSA_NAME)
>  	return NULL_TREE;
>  
> --- gcc/builtins.h.jj	2016-05-03 13:36:50.000000000 +0200
> +++ gcc/builtins.h	2016-06-21 18:05:11.678635858 +0200
> @@ -72,6 +72,7 @@ extern tree std_canonical_va_list_type (
>  extern void std_expand_builtin_va_start (tree, rtx);
>  extern void expand_builtin_trap (void);
>  extern void expand_ifn_atomic_bit_test_and (gcall *);
> +extern void expand_ifn_atomic_compare_exchange (gcall *);
>  extern rtx expand_builtin (tree, rtx, rtx, machine_mode, int);
>  extern rtx expand_builtin_with_bounds (tree, rtx, rtx, machine_mode, int);
>  extern enum built_in_function builtin_mathfn_code (const_tree);
> --- gcc/gimple-fold.c.jj	2016-06-16 21:00:08.000000000 +0200
> +++ gcc/gimple-fold.c	2016-06-23 11:48:48.081789706 +0200
> @@ -2980,6 +2980,236 @@ arith_overflowed_p (enum tree_code code,
>    return wi::min_precision (wres, sign) > TYPE_PRECISION (type);
>  }
>  
> +/* Recognize:
> +     _Complex uintN_t t = ATOMIC_COMPARE_EXCHANGE (p, _1, d, w, s, f);
> +     r = IMAGPART_EXPR <t>;
> +     b = (_Bool) r;
> +     if (!b)
> +       _2 = REALPART_EXPR <t>;
> +     _3 = PHI<_1, _2>;
> +   and, because when b is true REALPART_EXPR <t> is necessarily equal to
> +   _1, use REALPART_EXPR <t> unconditionally.  This happens when the
> +   expected argument of __atomic_compare_exchange* is addressable
> +   only because its address had to be passed to __atomic_compare_exchange*,
> +   but otherwise is a local variable.  We don't need to worry about any
> +   race conditions in that case.  */
> +
> +static bool
> +fold_ifn_atomic_compare_exchange (gcall *call)
> +{
> +  tree lhs = gimple_call_lhs (call);
> +  imm_use_iterator imm_iter;
> +  use_operand_p use_p;
> +
> +  if (cfun->cfg == NULL
> +      || lhs == NULL_TREE
> +      || TREE_CODE (lhs) != SSA_NAME
> +      || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
> +    return false;
> +  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs)
> +    {
> +      gimple *use_stmt = USE_STMT (use_p);
> +      if (is_gimple_debug (use_stmt))
> +	continue;
> +      if (!is_gimple_assign (use_stmt))
> +	break;
> +      if (gimple_assign_rhs_code (use_stmt) == REALPART_EXPR)
> +	continue;
> +      if (gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR)
> +	break;
> +
> +      tree lhs2 = gimple_assign_lhs (use_stmt);
> +      if (TREE_CODE (lhs2) != SSA_NAME)
> +	continue;
> +
> +      use_operand_p use2_p;
> +      gimple *use2_stmt;
> +      if (!single_imm_use (lhs2, &use2_p, &use2_stmt)
> +	  || !gimple_assign_cast_p (use2_stmt))
> +	continue;
> +
> +      tree lhs3 = gimple_assign_lhs (use2_stmt);
> +      if (TREE_CODE (lhs3) != SSA_NAME)
> +	continue;
> +
> +      imm_use_iterator imm3_iter;
> +      use_operand_p use3_p;
> +      FOR_EACH_IMM_USE_FAST (use3_p, imm3_iter, lhs3)
> +	{
> +	  gimple *use3_stmt = USE_STMT (use3_p);
> +	  if (gimple_code (use3_stmt) != GIMPLE_COND
> +	      || gimple_cond_lhs (use3_stmt) != lhs3
> +	      || !integer_zerop (gimple_cond_rhs (use3_stmt))
> +	      || (gimple_cond_code (use3_stmt) != EQ_EXPR
> +		  && gimple_cond_code (use3_stmt) != NE_EXPR))
> +	    continue;
> +
> +	  basic_block bb = gimple_bb (use3_stmt);
> +	  basic_block bb1, bb2;
> +	  edge e1, e2;
> +
> +	  e1 = EDGE_SUCC (bb, 0);
> +	  bb1 = e1->dest;
> +	  e2 = EDGE_SUCC (bb, 1);
> +	  bb2 = e2->dest;
> +
> +	  /* We cannot do the optimization on abnormal edges.  */
> +	  if ((e1->flags & EDGE_ABNORMAL) != 0
> +	      || (e2->flags & EDGE_ABNORMAL) != 0
> +	      || bb1 == NULL
> +	      || bb2 == NULL)
> +	    continue;
> +
> +	  /* Find the bb which is the fall through to the other.  */
> +	  if (single_succ_p (bb1) && single_succ (bb1) == bb2)
> +	    ;
> +	  else if (single_succ_p (bb2) && single_succ (bb2) == bb1)
> +	    {
> +	      std::swap (bb1, bb2);
> +	      std::swap (e1, e2);
> +	    }
> +	  else
> +	    continue;
> +
> +	  e1 = single_succ_edge (bb1);
> +
> +	  /* Make sure that bb1 is just a fall through.  */
> +	  if ((e1->flags & EDGE_FALLTHRU) == 0)
> +	    continue;
> +
> +	  /* Make sure bb1 is the block executed when the atomic
> +	     operation failed.  */
> +	  if ((gimple_cond_code (use3_stmt) == NE_EXPR)
> +	      ^ ((e2->flags & EDGE_TRUE_VALUE) != 0))
> +	    continue;
> +
> +	  /* Also make sure that bb1 has only one predecessor and that it
> +	     is bb.  */
> +	  if (!single_pred_p (bb1) || single_pred (bb1) != bb)
> +	    continue;
> +
> +	  gimple_stmt_iterator gsi = gsi_start_nondebug_after_labels_bb (bb1);
> +	  if (gsi_end_p (gsi))
> +	    continue;
> +
> +	  gimple *rp_stmt = gsi_stmt (gsi);
> +	  if (!is_gimple_assign (rp_stmt)
> +	      || gimple_assign_rhs_code (rp_stmt) != REALPART_EXPR
> +	      || TREE_OPERAND (gimple_assign_rhs1 (rp_stmt), 0) != lhs)
> +	    continue;
> +
> +	  gsi_next_nondebug (&gsi);
> +
> +	  tree lhs4 = gimple_assign_lhs (rp_stmt);
> +	  if (TREE_CODE (lhs4) != SSA_NAME)
> +	    continue;
> +
> +	  use_operand_p use4_p;
> +	  gimple *use4_stmt;
> +	  if (!single_imm_use (lhs4, &use4_p, &use4_stmt))
> +	    continue;
> +
> +	  tree_code cvt = ERROR_MARK;
> +
> +	  /* See if there is extra cast, like:
> +	     _1 = VIEW_CONVERT_EXPR<uintN_t>(_4);
> +	     _Complex uintN_t t = ATOMIC_COMPARE_EXCHANGE (p, _1, d, w, s, f);
> +	     r = IMAGPART_EXPR <t>;
> +	     b = (_Bool) r;
> +	     if (!b) {
> +	       _2 = REALPART_EXPR <t>;
> +	       _5 = (intN_t) _2;
> +	     }
> +	     _3 = PHI<_4, _5>;  */
> +	  if (gimple_assign_cast_p (use4_stmt)
> +	      && gimple_bb (use4_stmt) == bb1
> +	      && use4_stmt == gsi_stmt (gsi))
> +	    {
> +	      tree rhstype = TREE_TYPE (lhs4);
> +	      lhs4 = gimple_assign_lhs (use4_stmt);
> +	      cvt = gimple_assign_rhs_code (use4_stmt);
> +	      if (cvt != VIEW_CONVERT_EXPR
> +		  && (!CONVERT_EXPR_CODE_P (cvt)
> +		      || !INTEGRAL_TYPE_P (TREE_TYPE (lhs4))
> +		      || (TYPE_PRECISION (TREE_TYPE (lhs4))
> +			  != TYPE_PRECISION (rhstype))))
> +		continue;
> +	      if (!single_imm_use (lhs4, &use4_p, &use4_stmt))
> +		continue;
> +	      gsi_next_nondebug (&gsi);
> +	    }
> +
> +	  if (gimple_code (use4_stmt) != GIMPLE_PHI
> +	      || gimple_bb (use4_stmt) != bb2)
> +	    continue;
> +
> +	  if (!gsi_end_p (gsi))
> +	    continue;
> +
> +	  use_operand_p val_p = PHI_ARG_DEF_PTR_FROM_EDGE (use4_stmt, e2);
> +	  tree val = USE_FROM_PTR (val_p);
> +	  tree arge = gimple_call_arg (call, 1);
> +	  if (!operand_equal_p (val, arge, 0))
> +	    {
> +	      if (cvt == ERROR_MARK)
> +		continue;
> +	      else if (TREE_CODE (val) == SSA_NAME)
> +		{
> +		  if (TREE_CODE (arge) != SSA_NAME)
> +		    continue;
> +		  gimple *def = SSA_NAME_DEF_STMT (arge);
> +		  if (!gimple_assign_cast_p (def))
> +		    continue;
> +		  tree arg = gimple_assign_rhs1 (def);
> +		  switch (gimple_assign_rhs_code (def))
> +		    {
> +		    case VIEW_CONVERT_EXPR:
> +		      arg = TREE_OPERAND (arg, 0);
> +		      break;
> +		    CASE_CONVERT:
> +		      if (!INTEGRAL_TYPE_P (TREE_TYPE (arge))
> +			  || (TYPE_PRECISION (TREE_TYPE (arge))
> +			      != TYPE_PRECISION (TREE_TYPE (arg))))
> +			continue;
> +		      break;
> +		    default:
> +		      continue;
> +		    }
> +		  if (!operand_equal_p (val, arg, 0))
> +		    continue;
> +		}
> +	      else if (TREE_CODE (arge) == SSA_NAME
> +		       || !operand_equal_p (val, fold_build1 (cvt,
> +							      TREE_TYPE (lhs4),
> +							      arge), 0))
> +		continue;
> +	    }
> +
> +	  gsi = gsi_for_stmt (use3_stmt);
> +	  tree type = TREE_TYPE (TREE_TYPE (lhs));
> +	  gimple *g = gimple_build_assign (make_ssa_name (type),
> +					   build1 (REALPART_EXPR, type, lhs));
> +	  gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> +	  if (cvt != ERROR_MARK)
> +	    {
> +	      tree arg = gimple_assign_lhs (g);
> +	      if (cvt == VIEW_CONVERT_EXPR)
> +		arg = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs4), arg);
> +	      g = gimple_build_assign (make_ssa_name (TREE_TYPE (lhs4)),
> +				       cvt, arg);
> +	      gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> +	    }
> +	  SET_USE (val_p, gimple_assign_lhs (g));
> +	  val_p = PHI_ARG_DEF_PTR_FROM_EDGE (use4_stmt, e1);
> +	  SET_USE (val_p, gimple_assign_lhs (g));
> +	  update_stmt (use4_stmt);
> +	  return true;
> +	}
> +    }
> +  return false;
> +}
> +
>  /* Attempt to fold a call statement referenced by the statement iterator GSI.
>     The statement may be replaced by another statement, e.g., if the call
>     simplifies to a constant value. Return true if any changes were made.
> @@ -3166,6 +3396,10 @@ gimple_fold_call (gimple_stmt_iterator *
>  	      return true;
>  	    }
>  	  break;
> +	case IFN_ATOMIC_COMPARE_EXCHANGE:
> +	  if (fold_ifn_atomic_compare_exchange (stmt))
> +	    changed = true;
> +	  break;
>  	case IFN_GOACC_DIM_SIZE:
>  	case IFN_GOACC_DIM_POS:
>  	  result = fold_internal_goacc_dim (stmt);
> --- gcc/testsuite/gfortran.dg/coarray_atomic_4.f90.jj	2015-05-29 15:03:08.000000000 +0200
> +++ gcc/testsuite/gfortran.dg/coarray_atomic_4.f90	2016-06-23 12:11:55.507093867 +0200
> @@ -1,5 +1,5 @@
>  ! { dg-do compile }
> -! { dg-options "-fcoarray=single -fdump-tree-original" }
> +! { dg-options "-fcoarray=single -fdump-tree-original -O0" }
>  !
>  use iso_fortran_env, only: atomic_int_kind, atomic_logical_kind
>  implicit none
> 
> 	Jakub
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)

