[stage1][PATCH] Lower VEC_COND_EXPR into internal functions.
Martin Liška
mliska@suse.cz
Thu Jun 18 08:10:28 GMT 2020
On 6/17/20 3:15 PM, Richard Biener wrote:
> On Wed, Jun 17, 2020 at 10:50 AM Richard Biener
> <richard.guenther@gmail.com> wrote:
>>
>> On Mon, Jun 15, 2020 at 2:20 PM Martin Liška <mliska@suse.cz> wrote:
>>>
>>> On 6/15/20 1:59 PM, Richard Biener wrote:
>>>> On Mon, Jun 15, 2020 at 1:19 PM Martin Liška <mliska@suse.cz> wrote:
>>>>>
>>>>> On 6/15/20 9:14 AM, Richard Biener wrote:
>>>>>> On Fri, Jun 12, 2020 at 3:24 PM Martin Liška <mliska@suse.cz> wrote:
>>>>>>>
>>>>>>> On 6/12/20 11:43 AM, Richard Biener wrote:
>>>>>>>> So ... how far are you with enforcing a split VEC_COND_EXPR?
>>>>>>>> Thus can we avoid the above completely (even as intermediate
>>>>>>>> state)?
>>>>>>>
>>>>>>> Apparently, I'm quite close. Using the attached patch I see only 2 testsuite
>>>>>>> failures:
>>>>>>>
>>>>>>> FAIL: gcc.dg/tree-ssa/pr68714.c scan-tree-dump-times reassoc1 " <= " 1
>>>>>>> FAIL: gcc.target/i386/pr78102.c scan-assembler-times pcmpeqq 3
>>>>>>>
>>>>>>> The first one is about teaching reassoc about the SSA_NAMEs in VEC_COND_EXPR. I haven't
>>>>>>> analyze the second failure.
>>>>>>>
>>>>>>> I'm also not sure about the gimlification change, I see a superfluous assignments:
>>>>>>> vec_cond_cmp.5 = _1 == _2;
>>>>>>> vec_cond_cmp.6 = vec_cond_cmp.5;
>>>>>>> vec_cond_cmp.7 = vec_cond_cmp.6;
>>>>>>> _3 = VEC_COND_EXPR <vec_cond_cmp.7, { -1, -1, -1, -1, -1, -1, -1, -1 }, { 0, 0, 0, 0, 0, 0, 0, 0 }>;
>>>>>>> ?
>>>>>>>
>>>>>>> So with the suggested patch, the EH should be gone as you suggested. Right?
>>>>>>
>>>>>> Right, it should be on the comparison already from the start.
>>>>>>
>>>>>> @@ -14221,9 +14221,13 @@ gimplify_expr (tree *expr_p, gimple_seq
>>>>>> *pre_p, gimple_seq *post_p,
>>>>>> case VEC_COND_EXPR:
>>>>>> {
>>>>>> enum gimplify_status r0, r1, r2;
>>>>>> -
>>>>>> r0 = gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p,
>>>>>> post_p, is_gimple_condexpr, fb_rvalue);
>>>>>> + tree xop0 = TREE_OPERAND (*expr_p, 0);
>>>>>> + tmp = create_tmp_var_raw (TREE_TYPE (xop0), "vec_cond_cmp");
>>>>>> + gimple_add_tmp_var (tmp);
>>>>>> + gimplify_assign (tmp, xop0, pre_p);
>>>>>> + TREE_OPERAND (*expr_p, 0) = tmp;
>>>>>> r1 = gimplify_expr (&TREE_OPERAND (*expr_p, 1), pre_p,
>>>>>> post_p, is_gimple_val, fb_rvalue);
>>>>>>
>>>>>> all of VEC_COND_EXPR can now be a simple goto expr_3;
>>>>>
>>>>> Works for me, thanks!
>>>>>
>>>>>>
>>>>>> diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
>>>>>> index 494c9e9c20b..090fb52a2f1 100644
>>>>>> --- a/gcc/tree-ssa-forwprop.c
>>>>>> +++ b/gcc/tree-ssa-forwprop.c
>>>>>> @@ -3136,6 +3136,10 @@ pass_forwprop::execute (function *fun)
>>>>>> if (code == COND_EXPR
>>>>>> || code == VEC_COND_EXPR)
>>>>>> {
>>>>>> + /* Do not propagate into VEC_COND_EXPRs. */
>>>>>> + if (code == VEC_COND_EXPR)
>>>>>> + break;
>>>>>> +
>>>>>>
>>>>>> err - remove the || code == VEC_COND_EXPR instead?
>>>>>
>>>>> Yep.
>>>>>
>>>>>>
>>>>>> @@ -2221,24 +2226,12 @@ expand_vector_operations (void)
>>>>>> {
>>>>>> gimple_stmt_iterator gsi;
>>>>>> basic_block bb;
>>>>>> - bool cfg_changed = false;
>>>>>>
>>>>>> FOR_EACH_BB_FN (bb, cfun)
>>>>>> - {
>>>>>> - for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>>>>>> - {
>>>>>> - expand_vector_operations_1 (&gsi);
>>>>>> - /* ??? If we do not cleanup EH then we will ICE in
>>>>>> - verification. But in reality we have created wrong-code
>>>>>> - as we did not properly transition EH info and edges to
>>>>>> - the piecewise computations. */
>>>>>> - if (maybe_clean_eh_stmt (gsi_stmt (gsi))
>>>>>> - && gimple_purge_dead_eh_edges (bb))
>>>>>> - cfg_changed = true;
>>>>>> - }
>>>>>> - }
>>>>>>
>>>>>> I'm not sure about this. Consider the C++ testcase where
>>>>>> the ?: is replaced by a division. If veclower needs to replace
>>>>>> that with four scalrar division statements then the above
>>>>>> still applies - veclower does not correctly duplicate EH info
>>>>>> and EH edges to the individual divisions (and we do not know
>>>>>> which component might trap).
>>>>>>
>>>>>> So please leave the above in. You can try if using integer
>>>>>> division makes it break and add such a testcase if there's
>>>>>> no coverage for this in the testsuite.
>>>>>
>>>>> I'm leaving that above. Can you please explain how can a division test-case
>>>>> be created?
>>>>
>>>> typedef long v2di __attribute__((vector_size(16)));
>>>>
>>>> v2di foo (v2di a, v2di b)
>>>> {
>>>> try
>>>> {
>>>> v2di res = a / b;
>>>> return res;
>>>> }
>>>> catch (...)
>>>> {
>>>> return (v2di){};
>>>> }
>>>> }
>>>>
>>>> with -fnon-call-exceptions I see in t.ii.090t.ehdisp (correctly):
>>>>
>>>> ;; basic block 2, loop depth 0
>>>> ;; pred: ENTRY
>>>> [LP 1] _6 = a_4(D) / b_5(D);
>>>> ;; succ: 5
>>>> ;; 3
>>>>
>>>> while after t.ii.226t.veclower we have
>>>>
>>>> ;; basic block 2, loop depth 0
>>>> ;; pred: ENTRY
>>>> _13 = BIT_FIELD_REF <a_4(D), 64, 0>;
>>>> _14 = BIT_FIELD_REF <b_5(D), 64, 0>;
>>>> _15 = _13 / _14;
>>>> _16 = BIT_FIELD_REF <a_4(D), 64, 64>;
>>>> _17 = BIT_FIELD_REF <b_5(D), 64, 64>;
>>>> _18 = _16 / _17;
>>>> _6 = {_15, _18};
>>>> res_7 = _6;
>>>> _8 = res_7;
>>>> ;; succ: 3
>>>>
>>>> and all EH is gone and we'd ICE if you remove the above hunk. Hopefully.
>>>
>>> Yes, it ICEs then:
>>>
>>>
>>> ./xg++ -B. ~/Programming/testcases/ice.c -c -fnon-call-exceptions -O3
>>> /home/marxin/Programming/testcases/ice.c: In function ‘v2di foo(v2di, v2di)’:
>>> /home/marxin/Programming/testcases/ice.c:3:6: error: statement marked for throw, but doesn’t
>>> 3 | v2di foo (v2di a, v2di b)
>>> | ^~~
>>> _6 = {_12, _15};
>>> during GIMPLE pass: veclower2
>>> /home/marxin/Programming/testcases/ice.c:3:6: internal compiler error: verify_gimple failed
>>> 0x10e308a verify_gimple_in_cfg(function*, bool)
>>> /home/marxin/Programming/gcc/gcc/tree-cfg.c:5461
>>> 0xfc9caf execute_function_todo
>>> /home/marxin/Programming/gcc/gcc/passes.c:1985
>>> 0xfcaafc do_per_function
>>> /home/marxin/Programming/gcc/gcc/passes.c:1640
>>> 0xfcaafc execute_todo
>>> /home/marxin/Programming/gcc/gcc/passes.c:2039
>>> Please submit a full bug report,
>>> with preprocessed source if appropriate.
>>> Please include the complete backtrace with any bug report.
>>> See <https://gcc.gnu.org/bugs/> for instructions.
>>>
>>>>
>>>> We still generate wrong-code obviously as we'd need to duplicate the
>>>> EH info on each component division (and split blocks and generate
>>>> extra EH edges). That's a pre-existing bug of course. I just wanted
>>>> to avoid to create a new instance just because of the early instruction
>>>> selection for VEC_COND_EXPR.
>>>
>>> Fine!
>>>
>>>>
>>>>>>
>>>>>> What's missing from the patch is adjusting
>>>>>> verify_gimple_assign_ternary from
>>>>>>
>>>>>> if (((rhs_code == VEC_COND_EXPR || rhs_code == COND_EXPR)
>>>>>> ? !is_gimple_condexpr (rhs1) : !is_gimple_val (rhs1))
>>>>>> || !is_gimple_val (rhs2)
>>>>>> || !is_gimple_val (rhs3))
>>>>>> {
>>>>>> error ("invalid operands in ternary operation");
>>>>>> return true;
>>>>>>
>>>>>> to the same with the rhs_code == VEC_COND_EXPR case removed.
>>>>>
>>>>> Hmm. I'm not sure I've got this comment. Why do we want to change it
>>>>> and is it done wright in the patch?
>>>>
>>>> Ah, I missed the hunk you added.
>>>
>>> That explains the confusion I got.
>>>
>>>> But the check should be an inclusive
>>>> one, not an exclusive one and earlier accepting a is_gimple_condexpr
>>>> is superfluous when you later reject the tcc_comparison part. Just
>>>> testing is_gimple_val is better. So yes, remove your tree-cfg.c hunk
>>>> and just adjust the above test.
>>>
>>> I simplified that.
>>>
>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Please double-check the changelog
>>
>> (do_store_flag):
>>
>> + tree-vect-isel.o \
>>
>> IMHO we want to move more of the pattern matching magic of RTL
>> expansion here to obsolete TER. So please name it gimple-isel.cc
>> (.cc!, not .c)
>>
>> + gassign *assign = dyn_cast<gassign *> (SSA_NAME_DEF_STMT (cond));
>> + if (stmt != NULL
>> + && TREE_CODE_CLASS (gimple_assign_rhs_code (assign)) != tcc_comparison)
>> + return ERROR_MARK;
>>
>> you want stmt == NULL || TREE_CODE_CLASS (...)
>>
>> in case the def stmt is a call.
>>
>> + gimple_seq seq;
>> + tree exp = force_gimple_operand (comb, &seq, true, NULL_TREE);
>> + if (seq)
>> + {
>> + gimple_stmt_iterator gsi = gsi_for_stmt (vcond0);
>> + gsi_insert_before (&gsi, seq, GSI_SAME_STMT);
>> + }
>>
>> use force_gimple_operand_gsi that makes the above simpler.
>>
>> if (invert)
>> - std::swap (*gimple_assign_rhs2_ptr (stmt0),
>> - *gimple_assign_rhs3_ptr (stmt0));
>> - update_stmt (stmt0);
>> + std::swap (*gimple_assign_rhs2_ptr (vcond0),
>> + *gimple_assign_rhs3_ptr (vcond0));
>>
>> use swap_ssa_operands.
>>
>> + gimple_assign_set_rhs1 (vcond0, exp);
>> + update_stmt (vcond0);
>>
>> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
>> index cf2d979fea1..710b17a7c5c 100644
>> --- a/gcc/tree-vect-stmts.c
>> +++ b/gcc/tree-vect-stmts.c
>> @@ -9937,8 +9937,8 @@ vectorizable_condition (vec_info *vinfo,
>> {
>> vec_cond_rhs = vec_oprnds1[i];
>> if (bitop1 == NOP_EXPR)
>> - vec_compare = build2 (cond_code, vec_cmp_type,
>> - vec_cond_lhs, vec_cond_rhs);
>> + vec_compare = gimplify_build2 (gsi, cond_code, vec_cmp_type,
>> + vec_cond_lhs, vec_cond_rhs);
>> else
>> {
>>
>> please don't introduce more uses of gimplify_buildN - I'd like to
>> get rid of those. You can use
>>
>> gimple_seq stmts = NULL;
>> vec_compare = gimple_build (&stmts, cond_code, ...);
>> gsi_insert_seq_before/after (...);
>>
>> OK with those changes.
>
> Applying the patch caused
>
> Running target unix//-m32
> FAIL: gcc.c-torture/execute/ieee/pr50310.c execution, -O3
> -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer
> -finline-functions
> FAIL: gcc.c-torture/execute/ieee/pr50310.c execution, -O3 -g
I can't reproduce that with current master. Can you?
>
> and
>
> FAIL: ext/random/simd_fast_mersenne_twister_engine/operators/inequal.cc
> (test for excess errors)
> UNRESOLVED: ext/random/simd_fast_mersenne_twister_engine/operators/inequal.cc
> compilation failed to produce executable
I've just fixed this one.
Martin
>
> Richard.
>
>> Thanks,
>> Richard.
>>
>>
>>> Thanks,
>>> Martin
>>>
>>>>
>>>>>>
>>>>>> You'll likely figure the vectorizer still creates some VEC_COND_EXPRs
>>>>>> with embedded comparisons.
>>>>>
>>>>> I've fixed 2 failing test-cases I mentioned in the previous email.
>>>>>
>>>>> Martin
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Richard.
>>>>>>
>>>>>>
>>>>>>> Martin
>>>>>
>>>
More information about the Gcc-patches
mailing list