This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [RFC] __atomic_compare_exchange* optimizations (PR middle-end/66867)
- From: Richard Biener <rguenther at suse dot de>
- To: Jakub Jelinek <jakub at redhat dot com>
- Cc: Jeff Law <law at redhat dot com>, Andrew MacLeod <amacleod at redhat dot com>, Richard Henderson <rth at redhat dot com>, gcc-patches at gcc dot gnu dot org
- Date: Tue, 28 Jun 2016 10:09:13 +0200 (CEST)
- Subject: Re: [RFC] __atomic_compare_exchange* optimizations (PR middle-end/66867)
- Authentication-results: sourceware.org; auth=none
- References: <20160623152321 dot GB7387 at tucnak dot redhat dot com>
On Thu, 23 Jun 2016, Jakub Jelinek wrote:
> Hi!
>
> This PR is about two issues with the *atomic_compare_exchange* APIs, which
> didn't exist with __sync_*_compare_and_swap:
> 1) the APIs make the expected argument addressable, although very
> commonly it is an automatic variable that is addressable only because of
> these APIs
> 2) out of fear that expected might be a pointer to memory accessed by
> multiple threads, the store of the old value to that location is only
> conditional (done if the compare and swap failed) - while again for the
> common case when it is a local, otherwise non-addressable automatic
> var, it can be stored unconditionally.
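For illustration, the common pattern behind both issues might look like the following (a hypothetical example, not taken from the patch or its attachments): the expected value lives in an automatic variable whose address escapes only into the builtin.

```c
#include <stdatomic.h>

/* Hypothetical illustration of issues 1) and 2): "expected" is an
   automatic variable that is addressable only because the API takes
   its address, so forcing it into a stack slot and storing the old
   value back conditionally is unnecessary.  */
int
cas_old_value (atomic_int *p, int desired)
{
  int expected = 0;             /* addressable only because of the API */
  atomic_compare_exchange_strong (p, &expected, desired);
  return expected;              /* on failure, the value loaded from *p */
}
```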
>
> To resolve this, we effectively need a call (or some other stmt) that
> returns two values. We need that also for the __builtin_*_overflow*
> builtins and have solved it by returning from an internal-fn call
> a complex int value, where REALPART_EXPR of it is one result and
> IMAGPART_EXPR the other (bool-ish) result.
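At the source level, the same two-value convention is already visible with the overflow builtins; a minimal sketch (not part of the patch):

```c
#include <limits.h>
#include <stdbool.h>

/* __builtin_add_overflow is lowered to an internal-fn call whose
   _Complex int result carries the sum in its real part and the
   overflow flag in its imaginary part.  */
bool
add_overflows (int a, int b, int *res)
{
  return __builtin_add_overflow (a, b, res);
}
```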
>
> The following patch handles it the same, by folding
> __atomic_compare_exchange_N early to an internal call (with conditional
> store in the IL), and then later on if the expected var becomes
> non-addressable and is rewritten into SSA, optimizing the conditional store
> into unconditional (that is the gimple-fold.c part).
>
> Thinking about this again, there could be another option - keep
> __atomic_compare_exchange_N in the IL, but under certain conditions (similar
> to what the patch uses in fold_builtin_atomic_compare_exchange) for these
> builtins ignore &var on the second argument, and if we actually turn var
> into non-addressable, convert the builtin call similarly to what
> fold_builtin_atomic_compare_exchange does in the patch (except the store
> would be non-conditional then; the gimple-fold.c part wouldn't be needed
> then).
>
> Any preferences?
I wonder if always expanding from the internal-fn would
eventually generate worse code if the value is addressable already.
If that is the case then doing it conditionally on the arg becoming
non-addressable (update-address-taken I guess) would be preferred.
Otherwise you can't use immediate uses from gimple-fold but you have
to stick the transform into tree-ssa-forwprop.c instead (gimple-fold
can only rely on up-to-date use-def chains, stmt operands are _not_
up-to-date reliably).
Thanks,
Richard.
> This version has been bootstrapped/regtested on
> x86_64-linux and i686-linux. Attached are various testcases I've been using
> to see if the generated code improved (tried x86_64, powerpc64le, s390x and
> aarch64). E.g. on x86_64-linux, in the first testcase at -O2 the
> improvement in f1/f2 is removal of dead
> movl $0, -4(%rsp)
> in f4
> - movl $0, -4(%rsp)
> lock; cmpxchgl %edx, (%rdi)
> - je .L7
> - movl %eax, -4(%rsp)
> -.L7:
> - movl -4(%rsp), %eax
> etc.
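The attached testcases are not reproduced in the archive; a plausible shape for the f1-style function mentioned above (names and body are guesses) would be:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* With the patch, once "e" becomes non-addressable and is rewritten
   into SSA, the dead initialization store of "e" to its stack slot
   can be removed at -O2.  */
bool
f1 (atomic_int *p, int d)
{
  int e = 0;
  return atomic_compare_exchange_strong (p, &e, d);
}
```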
>
> 2016-06-23 Jakub Jelinek <jakub@redhat.com>
>
> PR middle-end/66867
> * builtins.c: Include gimplify.h.
> (expand_ifn_atomic_compare_exchange_into_call,
> expand_ifn_atomic_compare_exchange,
> fold_builtin_atomic_compare_exchange): New functions.
> (fold_builtin_varargs): Handle BUILT_IN_ATOMIC_COMPARE_EXCHANGE_*.
> * internal-fn.c (expand_ATOMIC_COMPARE_EXCHANGE): New function.
> * tree.h (build_call_expr_internal_loc): Rename to ...
> (build_call_expr_internal_loc_array): ... this. Fix up type of
> last argument.
> * internal-fn.def (ATOMIC_COMPARE_EXCHANGE): New internal fn.
> * predict.c (expr_expected_value_1): Handle IMAGPART_EXPR of
> ATOMIC_COMPARE_EXCHANGE result.
> * builtins.h (expand_ifn_atomic_compare_exchange): New prototype.
> * gimple-fold.c (fold_ifn_atomic_compare_exchange): New function.
> (gimple_fold_call): Handle IFN_ATOMIC_COMPARE_EXCHANGE.
>
> * gfortran.dg/coarray_atomic_4.f90: Add -O0 to dg-options.
>
> --- gcc/builtins.c.jj 2016-06-08 21:01:25.000000000 +0200
> +++ gcc/builtins.c 2016-06-23 09:17:51.053713986 +0200
> @@ -65,6 +65,7 @@ along with GCC; see the file COPYING3.
> #include "internal-fn.h"
> #include "case-cfn-macros.h"
> #include "gimple-fold.h"
> +#include "gimplify.h"
>
>
> struct target_builtins default_target_builtins;
> @@ -5158,6 +5159,123 @@ expand_builtin_atomic_compare_exchange (
> return target;
> }
>
> +/* Helper function for expand_ifn_atomic_compare_exchange - expand
> + internal ATOMIC_COMPARE_EXCHANGE call into __atomic_compare_exchange_N
> + call. The weak parameter must be dropped to match the expected parameter
> + list and the expected argument changed from value to pointer to memory
> + slot. */
> +
> +static void
> +expand_ifn_atomic_compare_exchange_into_call (gcall *call, machine_mode mode)
> +{
> + unsigned int z;
> + vec<tree, va_gc> *vec;
> +
> + vec_alloc (vec, 5);
> + vec->quick_push (gimple_call_arg (call, 0));
> + tree expected = gimple_call_arg (call, 1);
> + rtx x = assign_stack_temp_for_type (mode, GET_MODE_SIZE (mode),
> + TREE_TYPE (expected));
> + rtx expd = expand_expr (expected, x, mode, EXPAND_NORMAL);
> + if (expd != x)
> + emit_move_insn (x, expd);
> + tree v = make_tree (TREE_TYPE (expected), x);
> + vec->quick_push (build1 (ADDR_EXPR,
> + build_pointer_type (TREE_TYPE (expected)), v));
> + vec->quick_push (gimple_call_arg (call, 2));
> + /* Skip the boolean weak parameter. */
> + for (z = 4; z < 6; z++)
> + vec->quick_push (gimple_call_arg (call, z));
> + built_in_function fncode
> + = (built_in_function) ((int) BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1
> + + exact_log2 (GET_MODE_SIZE (mode)));
> + tree fndecl = builtin_decl_explicit (fncode);
> + tree fn = build1 (ADDR_EXPR, build_pointer_type (TREE_TYPE (fndecl)),
> + fndecl);
> + tree exp = build_call_vec (boolean_type_node, fn, vec);
> + tree lhs = gimple_call_lhs (call);
> + rtx boolret = expand_call (exp, NULL_RTX, lhs == NULL_TREE);
> + if (lhs)
> + {
> + rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
> + if (GET_MODE (boolret) != mode)
> + boolret = convert_modes (mode, GET_MODE (boolret), boolret, 1);
> + x = force_reg (mode, x);
> + write_complex_part (target, boolret, true);
> + write_complex_part (target, x, false);
> + }
> +}
> +
> +/* Expand IFN_ATOMIC_COMPARE_EXCHANGE internal function. */
> +
> +void
> +expand_ifn_atomic_compare_exchange (gcall *call)
> +{
> + int size = tree_to_shwi (gimple_call_arg (call, 3)) & 255;
> + gcc_assert (size == 1 || size == 2 || size == 4 || size == 8 || size == 16);
> + machine_mode mode = mode_for_size (BITS_PER_UNIT * size, MODE_INT, 0);
> + rtx expect, desired, mem, oldval, boolret;
> + enum memmodel success, failure;
> + tree lhs;
> + bool is_weak;
> + source_location loc
> + = expansion_point_location_if_in_system_header (gimple_location (call));
> +
> + success = get_memmodel (gimple_call_arg (call, 4));
> + failure = get_memmodel (gimple_call_arg (call, 5));
> +
> + if (failure > success)
> + {
> + warning_at (loc, OPT_Winvalid_memory_model,
> + "failure memory model cannot be stronger than success "
> + "memory model for %<__atomic_compare_exchange%>");
> + success = MEMMODEL_SEQ_CST;
> + }
> +
> + if (is_mm_release (failure) || is_mm_acq_rel (failure))
> + {
> + warning_at (loc, OPT_Winvalid_memory_model,
> + "invalid failure memory model for "
> + "%<__atomic_compare_exchange%>");
> + failure = MEMMODEL_SEQ_CST;
> + success = MEMMODEL_SEQ_CST;
> + }
> +
> + if (!flag_inline_atomics)
> + {
> + expand_ifn_atomic_compare_exchange_into_call (call, mode);
> + return;
> + }
> +
> + /* Expand the operands. */
> + mem = get_builtin_sync_mem (gimple_call_arg (call, 0), mode);
> +
> + expect = expand_expr_force_mode (gimple_call_arg (call, 1), mode);
> + desired = expand_expr_force_mode (gimple_call_arg (call, 2), mode);
> +
> + is_weak = (tree_to_shwi (gimple_call_arg (call, 3)) & 256) != 0;
> +
> + boolret = NULL;
> + oldval = NULL;
> +
> + if (!expand_atomic_compare_and_swap (&boolret, &oldval, mem, expect, desired,
> + is_weak, success, failure))
> + {
> + expand_ifn_atomic_compare_exchange_into_call (call, mode);
> + return;
> + }
> +
> + lhs = gimple_call_lhs (call);
> + if (lhs)
> + {
> + rtx target = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
> + if (GET_MODE (boolret) != mode)
> + boolret = convert_modes (mode, GET_MODE (boolret), boolret, 1);
> + write_complex_part (target, boolret, true);
> + write_complex_part (target, oldval, false);
> + }
> +}
> +
> /* Expand the __atomic_load intrinsic:
> TYPE __atomic_load (TYPE *object, enum memmodel)
> EXP is the CALL_EXPR.
> @@ -9515,6 +9633,63 @@ fold_builtin_object_size (tree ptr, tree
> return NULL_TREE;
> }
>
> +/* Fold
> + r = __atomic_compare_exchange_N (p, &e, d, w, s, f);
> + into
> + _Complex uintN_t t = ATOMIC_COMPARE_EXCHANGE (p, e, d, w * 256 + N, s, f);
> + i = IMAGPART_EXPR <t>;
> + r = (_Bool) i;
> +     if (!r)
> + e = REALPART_EXPR <t>; */
> +
> +static tree
> +fold_builtin_atomic_compare_exchange (location_t loc, tree fndecl,
> + tree *args, int nargs)
> +{
> + if (nargs != 6
> + || !flag_inline_atomics
> + || !optimize
> + || (flag_sanitize & (SANITIZE_THREAD | SANITIZE_ADDRESS)) != 0
> + || (!integer_zerop (args[3]) && !integer_onep (args[3])))
> + return NULL_TREE;
> +
> + tree argsc[6];
> + tree parmt = TYPE_ARG_TYPES (TREE_TYPE (fndecl));
> + tree itype = TREE_VALUE (TREE_CHAIN (TREE_CHAIN (parmt)));
> + machine_mode mode = TYPE_MODE (itype);
> +
> + if (direct_optab_handler (atomic_compare_and_swap_optab, mode)
> + == CODE_FOR_nothing
> + && optab_handler (sync_compare_and_swap_optab, mode) == CODE_FOR_nothing)
> + return NULL_TREE;
> +
> + tree ctype = build_complex_type (itype);
> + tree alias_type = build_pointer_type_for_mode (itype, ptr_mode, true);
> + tree alias_off = build_int_cst (alias_type, 0);
> + tree expected = fold_build2_loc (loc, MEM_REF, itype, args[1], alias_off);
> + memcpy (argsc, args, sizeof (argsc));
> + argsc[1] = expected;
> + argsc[3] = build_int_cst (integer_type_node,
> + (integer_onep (args[3]) ? 256 : 0)
> + + int_size_in_bytes (itype));
> + tree var = create_tmp_var_raw (ctype);
> + DECL_CONTEXT (var) = current_function_decl;
> + tree call
> + = build_call_expr_internal_loc_array (loc, IFN_ATOMIC_COMPARE_EXCHANGE,
> + ctype, 6, argsc);
> + var = build4 (TARGET_EXPR, ctype, var, call, NULL, NULL);
> + tree ret
> + = fold_convert_loc (loc, boolean_type_node,
> + build1 (IMAGPART_EXPR, itype, var));
> + tree condstore
> + = build3_loc (loc, COND_EXPR, void_type_node, ret,
> + void_node, build2_loc (loc, MODIFY_EXPR, void_type_node,
> + unshare_expr (expected),
> + build1 (REALPART_EXPR, itype, var)));
> + return build2_loc (loc, COMPOUND_EXPR, boolean_type_node, condstore,
> + unshare_expr (ret));
> +}
> +
> /* Builtins with folding operations that operate on "..." arguments
> need special handling; we need to store the arguments in a convenient
> data structure before attempting any folding. Fortunately there are
> @@ -9533,6 +9708,13 @@ fold_builtin_varargs (location_t loc, tr
> ret = fold_builtin_fpclassify (loc, args, nargs);
> break;
>
> + case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_1:
> + case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_2:
> + case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_4:
> + case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_8:
> + case BUILT_IN_ATOMIC_COMPARE_EXCHANGE_16:
> + return fold_builtin_atomic_compare_exchange (loc, fndecl, args, nargs);
> +
> default:
> break;
> }
> --- gcc/internal-fn.c.jj 2016-06-15 19:09:09.000000000 +0200
> +++ gcc/internal-fn.c 2016-06-22 15:30:06.838951934 +0200
> @@ -2143,6 +2143,14 @@ expand_ATOMIC_BIT_TEST_AND_RESET (intern
> expand_ifn_atomic_bit_test_and (call);
> }
>
> +/* Expand atomic compare and exchange.  */
> +
> +static void
> +expand_ATOMIC_COMPARE_EXCHANGE (internal_fn, gcall *call)
> +{
> + expand_ifn_atomic_compare_exchange (call);
> +}
> +
> /* Expand a call to FN using the operands in STMT. FN has a single
> output operand and NARGS input operands. */
>
> --- gcc/tree.h.jj 2016-06-20 21:16:07.000000000 +0200
> +++ gcc/tree.h 2016-06-21 17:35:19.806362408 +0200
> @@ -3985,8 +3985,8 @@ extern tree build_call_expr_loc (locatio
> extern tree build_call_expr (tree, int, ...);
> extern tree build_call_expr_internal_loc (location_t, enum internal_fn,
> tree, int, ...);
> -extern tree build_call_expr_internal_loc (location_t, enum internal_fn,
> - tree, int, tree *);
> +extern tree build_call_expr_internal_loc_array (location_t, enum internal_fn,
> + tree, int, const tree *);
> extern tree maybe_build_call_expr_loc (location_t, combined_fn, tree,
> int, ...);
> extern tree build_string_literal (int, const char *);
> --- gcc/internal-fn.def.jj 2016-05-03 13:36:50.000000000 +0200
> +++ gcc/internal-fn.def 2016-06-21 17:10:23.516879436 +0200
> @@ -193,6 +193,7 @@ DEF_INTERNAL_FN (SET_EDOM, ECF_LEAF | EC
> DEF_INTERNAL_FN (ATOMIC_BIT_TEST_AND_SET, ECF_LEAF | ECF_NOTHROW, NULL)
> DEF_INTERNAL_FN (ATOMIC_BIT_TEST_AND_COMPLEMENT, ECF_LEAF | ECF_NOTHROW, NULL)
> DEF_INTERNAL_FN (ATOMIC_BIT_TEST_AND_RESET, ECF_LEAF | ECF_NOTHROW, NULL)
> +DEF_INTERNAL_FN (ATOMIC_COMPARE_EXCHANGE, ECF_LEAF | ECF_NOTHROW, NULL)
>
> #undef DEF_INTERNAL_INT_FN
> #undef DEF_INTERNAL_FLT_FN
> --- gcc/predict.c.jj 2016-06-22 11:17:44.763444374 +0200
> +++ gcc/predict.c 2016-06-22 14:26:08.894724088 +0200
> @@ -1978,6 +1978,25 @@ expr_expected_value_1 (tree type, tree o
> if (TREE_CONSTANT (op0))
> return op0;
>
> + if (code == IMAGPART_EXPR)
> + {
> + if (TREE_CODE (TREE_OPERAND (op0, 0)) == SSA_NAME)
> + {
> + def = SSA_NAME_DEF_STMT (TREE_OPERAND (op0, 0));
> + if (is_gimple_call (def)
> + && gimple_call_internal_p (def)
> + && (gimple_call_internal_fn (def)
> + == IFN_ATOMIC_COMPARE_EXCHANGE))
> + {
> + /* Assume that any given atomic operation has low contention,
> + and thus the compare-and-swap operation succeeds. */
> + if (predictor)
> + *predictor = PRED_COMPARE_AND_SWAP;
> + return build_one_cst (TREE_TYPE (op0));
> + }
> + }
> + }
> +
> if (code != SSA_NAME)
> return NULL_TREE;
>
> --- gcc/builtins.h.jj 2016-05-03 13:36:50.000000000 +0200
> +++ gcc/builtins.h 2016-06-21 18:05:11.678635858 +0200
> @@ -72,6 +72,7 @@ extern tree std_canonical_va_list_type (
> extern void std_expand_builtin_va_start (tree, rtx);
> extern void expand_builtin_trap (void);
> extern void expand_ifn_atomic_bit_test_and (gcall *);
> +extern void expand_ifn_atomic_compare_exchange (gcall *);
> extern rtx expand_builtin (tree, rtx, rtx, machine_mode, int);
> extern rtx expand_builtin_with_bounds (tree, rtx, rtx, machine_mode, int);
> extern enum built_in_function builtin_mathfn_code (const_tree);
> --- gcc/gimple-fold.c.jj 2016-06-16 21:00:08.000000000 +0200
> +++ gcc/gimple-fold.c 2016-06-23 11:48:48.081789706 +0200
> @@ -2980,6 +2980,236 @@ arith_overflowed_p (enum tree_code code,
> return wi::min_precision (wres, sign) > TYPE_PRECISION (type);
> }
>
> +/* Recognize:
> + _Complex uintN_t t = ATOMIC_COMPARE_EXCHANGE (p, _1, d, w, s, f);
> + r = IMAGPART_EXPR <t>;
> + b = (_Bool) r;
> + if (!b)
> + _2 = REALPART_EXPR <t>;
> + _3 = PHI<_1, _2>;
> + and, because when b is true REALPART_EXPR <t> is necessarily equal to
> + _1, use REALPART_EXPR <t> unconditionally. This happens when the
> + expected argument of __atomic_compare_exchange* is addressable
> + only because its address had to be passed to __atomic_compare_exchange*,
> + but otherwise is a local variable. We don't need to worry about any
> + race conditions in that case. */
> +
> +static bool
> +fold_ifn_atomic_compare_exchange (gcall *call)
> +{
> + tree lhs = gimple_call_lhs (call);
> + imm_use_iterator imm_iter;
> + use_operand_p use_p;
> +
> + if (cfun->cfg == NULL
> + || lhs == NULL_TREE
> + || TREE_CODE (lhs) != SSA_NAME
> + || SSA_NAME_OCCURS_IN_ABNORMAL_PHI (lhs))
> + return false;
> + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs)
> + {
> + gimple *use_stmt = USE_STMT (use_p);
> + if (is_gimple_debug (use_stmt))
> + continue;
> + if (!is_gimple_assign (use_stmt))
> + break;
> + if (gimple_assign_rhs_code (use_stmt) == REALPART_EXPR)
> + continue;
> + if (gimple_assign_rhs_code (use_stmt) != IMAGPART_EXPR)
> + break;
> +
> + tree lhs2 = gimple_assign_lhs (use_stmt);
> + if (TREE_CODE (lhs2) != SSA_NAME)
> + continue;
> +
> + use_operand_p use2_p;
> + gimple *use2_stmt;
> + if (!single_imm_use (lhs2, &use2_p, &use2_stmt)
> + || !gimple_assign_cast_p (use2_stmt))
> + continue;
> +
> + tree lhs3 = gimple_assign_lhs (use2_stmt);
> + if (TREE_CODE (lhs3) != SSA_NAME)
> + continue;
> +
> + imm_use_iterator imm3_iter;
> + use_operand_p use3_p;
> + FOR_EACH_IMM_USE_FAST (use3_p, imm3_iter, lhs3)
> + {
> + gimple *use3_stmt = USE_STMT (use3_p);
> + if (gimple_code (use3_stmt) != GIMPLE_COND
> + || gimple_cond_lhs (use3_stmt) != lhs3
> + || !integer_zerop (gimple_cond_rhs (use3_stmt))
> + || (gimple_cond_code (use3_stmt) != EQ_EXPR
> + && gimple_cond_code (use3_stmt) != NE_EXPR))
> + continue;
> +
> + basic_block bb = gimple_bb (use3_stmt);
> + basic_block bb1, bb2;
> + edge e1, e2;
> +
> + e1 = EDGE_SUCC (bb, 0);
> + bb1 = e1->dest;
> + e2 = EDGE_SUCC (bb, 1);
> + bb2 = e2->dest;
> +
> + /* We cannot do the optimization on abnormal edges. */
> + if ((e1->flags & EDGE_ABNORMAL) != 0
> + || (e2->flags & EDGE_ABNORMAL) != 0
> + || bb1 == NULL
> + || bb2 == NULL)
> + continue;
> +
> + /* Find the bb which is the fall through to the other. */
> + if (single_succ_p (bb1) && single_succ (bb1) == bb2)
> + ;
> + else if (single_succ_p (bb2) && single_succ (bb2) == bb1)
> + {
> + std::swap (bb1, bb2);
> + std::swap (e1, e2);
> + }
> + else
> + continue;
> +
> + e1 = single_succ_edge (bb1);
> +
> + /* Make sure that bb1 is just a fall through. */
> + if ((e1->flags & EDGE_FALLTHRU) == 0)
> + continue;
> +
> + /* Make sure bb1 is executed if the atomic operation
> + failed. */
> + if ((gimple_cond_code (use3_stmt) == NE_EXPR)
> + ^ ((e2->flags & EDGE_TRUE_VALUE) != 0))
> + continue;
> +
> + /* Also make sure that bb1 has only one predecessor and that it
> + is bb. */
> + if (!single_pred_p (bb1) || single_pred (bb1) != bb)
> + continue;
> +
> + gimple_stmt_iterator gsi = gsi_start_nondebug_after_labels_bb (bb1);
> + if (gsi_end_p (gsi))
> + continue;
> +
> + gimple *rp_stmt = gsi_stmt (gsi);
> + if (!is_gimple_assign (rp_stmt)
> + || gimple_assign_rhs_code (rp_stmt) != REALPART_EXPR
> + || TREE_OPERAND (gimple_assign_rhs1 (rp_stmt), 0) != lhs)
> + continue;
> +
> + gsi_next_nondebug (&gsi);
> +
> + tree lhs4 = gimple_assign_lhs (rp_stmt);
> + if (TREE_CODE (lhs4) != SSA_NAME)
> + continue;
> +
> + use_operand_p use4_p;
> + gimple *use4_stmt;
> + if (!single_imm_use (lhs4, &use4_p, &use4_stmt))
> + continue;
> +
> + tree_code cvt = ERROR_MARK;
> +
> + /* See if there is extra cast, like:
> + _1 = VIEW_CONVERT_EXPR<uintN_t>(_4);
> + _Complex uintN_t t = ATOMIC_COMPARE_EXCHANGE (p, _1, d, w, s, f);
> + r = IMAGPART_EXPR <t>;
> + b = (_Bool) r;
> + if (!b) {
> + _2 = REALPART_EXPR <t>;
> + _5 = (intN_t) _2;
> + }
> + _3 = PHI<_4, _5>; */
> + if (gimple_assign_cast_p (use4_stmt)
> + && gimple_bb (use4_stmt) == bb1
> + && use4_stmt == gsi_stmt (gsi))
> + {
> + tree rhstype = TREE_TYPE (lhs4);
> + lhs4 = gimple_assign_lhs (use4_stmt);
> + cvt = gimple_assign_rhs_code (use4_stmt);
> + if (cvt != VIEW_CONVERT_EXPR
> + && (!CONVERT_EXPR_CODE_P (cvt)
> + || !INTEGRAL_TYPE_P (TREE_TYPE (lhs4))
> + || (TYPE_PRECISION (TREE_TYPE (lhs4))
> + != TYPE_PRECISION (rhstype))))
> + continue;
> + if (!single_imm_use (lhs4, &use4_p, &use4_stmt))
> + continue;
> + gsi_next_nondebug (&gsi);
> + }
> +
> + if (gimple_code (use4_stmt) != GIMPLE_PHI
> + || gimple_bb (use4_stmt) != bb2)
> + continue;
> +
> + if (!gsi_end_p (gsi))
> + continue;
> +
> + use_operand_p val_p = PHI_ARG_DEF_PTR_FROM_EDGE (use4_stmt, e2);
> + tree val = USE_FROM_PTR (val_p);
> + tree arge = gimple_call_arg (call, 1);
> + if (!operand_equal_p (val, arge, 0))
> + {
> +
> + if (cvt == ERROR_MARK)
> + continue;
> + else if (TREE_CODE (val) == SSA_NAME)
> + {
> + if (TREE_CODE (arge) != SSA_NAME)
> + continue;
> + gimple *def = SSA_NAME_DEF_STMT (arge);
> + if (!gimple_assign_cast_p (def))
> + continue;
> + tree arg = gimple_assign_rhs1 (def);
> + switch (gimple_assign_rhs_code (def))
> + {
> + case VIEW_CONVERT_EXPR:
> + arg = TREE_OPERAND (arg, 0);
> + break;
> + CASE_CONVERT:
> + if (!INTEGRAL_TYPE_P (TREE_TYPE (arge))
> + || (TYPE_PRECISION (TREE_TYPE (arge))
> + != TYPE_PRECISION (TREE_TYPE (arg))))
> + continue;
> + break;
> + default:
> + continue;
> + }
> + if (!operand_equal_p (val, arg, 0))
> + continue;
> + }
> + else if (TREE_CODE (arge) == SSA_NAME
> + || !operand_equal_p (val, fold_build1 (cvt,
> + TREE_TYPE (lhs4),
> + arge), 0))
> + continue;
> + }
> +
> + gsi = gsi_for_stmt (use3_stmt);
> + tree type = TREE_TYPE (TREE_TYPE (lhs));
> + gimple *g = gimple_build_assign (make_ssa_name (type),
> + build1 (REALPART_EXPR, type, lhs));
> + gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> + if (cvt != ERROR_MARK)
> + {
> + tree arg = gimple_assign_lhs (g);
> + if (cvt == VIEW_CONVERT_EXPR)
> + arg = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (lhs4), arg);
> + g = gimple_build_assign (make_ssa_name (TREE_TYPE (lhs4)),
> + cvt, arg);
> + gsi_insert_before (&gsi, g, GSI_SAME_STMT);
> + }
> + SET_USE (val_p, gimple_assign_lhs (g));
> + val_p = PHI_ARG_DEF_PTR_FROM_EDGE (use4_stmt, e1);
> + SET_USE (val_p, gimple_assign_lhs (g));
> + update_stmt (use4_stmt);
> + return true;
> + }
> + }
> + return false;
> +}
> +
> /* Attempt to fold a call statement referenced by the statement iterator GSI.
> The statement may be replaced by another statement, e.g., if the call
> simplifies to a constant value. Return true if any changes were made.
> @@ -3166,6 +3396,10 @@ gimple_fold_call (gimple_stmt_iterator *
> return true;
> }
> break;
> + case IFN_ATOMIC_COMPARE_EXCHANGE:
> + if (fold_ifn_atomic_compare_exchange (stmt))
> + changed = true;
> + break;
> case IFN_GOACC_DIM_SIZE:
> case IFN_GOACC_DIM_POS:
> result = fold_internal_goacc_dim (stmt);
> --- gcc/testsuite/gfortran.dg/coarray_atomic_4.f90.jj 2015-05-29 15:03:08.000000000 +0200
> +++ gcc/testsuite/gfortran.dg/coarray_atomic_4.f90 2016-06-23 12:11:55.507093867 +0200
> @@ -1,5 +1,5 @@
> ! { dg-do compile }
> -! { dg-options "-fcoarray=single -fdump-tree-original" }
> +! { dg-options "-fcoarray=single -fdump-tree-original -O0" }
> !
> use iso_fortran_env, only: atomic_int_kind, atomic_logical_kind
> implicit none
>
> Jakub
>
--
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)