This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC PR43721] Optimize a/b and a%b to single divmod call

On Fri, Oct 30, 2015 at 8:39 AM, Prathamesh Kulkarni
<> wrote:
> Hi,
> I have attached revamped version of Kugan's patch
> ( for PR43721.
> Test-case:
> divmod pass dump:
> Assembly:
> The approach I took is similar to sincos pass, which involves two parts:
> a) divmod pass that transforms:
> t1 = a / b;
> t2 = a % b;
> to the following sequence:
> complex_tmp = DIVMOD (a, b);
> t1 = REALPART_EXPR(a);
> t2 = IMAGPART_EXPR(b);
> b) DIVMOD(a, b) is represented as an internal-fn and is expanded by
> expand_DIVMOD().
> I am not sure if I have done this correctly. I was somehow looking to
> reuse expand_divmod() but am not able to figure out how to do that
> (AFAIU expand_divmod() strictly returns either the quotient or
> remainder but never both which is what I want), so ended up with
> manually expanding to rtx.
> While going through the sincos pass in execute_cse_sincos_1, I didn't
> understand the following:
>  if (gimple_purge_dead_eh_edges (gimple_bb (stmt)))
>           cfg_changed = true;
> Um why is the call to gimple_purge_dead_eh_edges necessary here?

The call replacement might no longer throw.

> A silly question, what options to pass to gcc to print statistics ? I
> added statistics_counter_event to keep track of number of calls
> inserted/used but not sure how to print them :/

-fdump-tree-XXX-stats or -fdump-statistics-stats

> I would be grateful for suggestions for improving the patch.

First of all new functions need comments (expand_DIVMOD).

Second - what's the reasoning for the pass placement seen?  I think
the transform is a lowering one, improving RTL expansion.  The
DIVMOD representation is harmful for GIMPLE optimizers.

Third - I'd have integrated this with an existing pass - we have another
lowering / RTL expansion kind pass which is pass_optimize_widening_mul.
Please piggy-back on it.

You seem to do the transform unconditionally even if the target has
instructions for division and modulus but no instruction for divmod
in which case you'll end up emitting a library call in the end instead
of a div and mod instruction.  I think the transform should be gated on
missing div/mod instructions or the availability of a divmod one.

+         if (is_gimple_assign (stmt)
+             && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_binary)
+           {
+             if (gimple_assign_rhs_code (stmt) == TRUNC_DIV_EXPR)
+               cfg_changed |= execute_divmod_1 (stmt);

you can directly check gimple_assign_rhs_code.  I'd check for TRUNC_MOD_EXPR
which seems to be less common and thus should cut down compile-time.

+  /* Case 2: When both are in top_bb */
+  else if (op1_def_in_bb && op2_def_in_bb)
+    {
+      /* FIXME: better way to compare iterators?.  */
+      gimple_stmt_iterator gsi = get_later_stmt (top_bb,
def_stmt_op1, def_stmt_op2);

You can't use get_later_stmt, it's a vectorizer speciality.  To do
this test efficiently
you need stmt UIDs computed.

I miss a testcase (or two).


> Thank you,
> Prathamesh

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]