This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [RFC] Elimination of zext/sext - type promotion pass


On Fri, May 1, 2015 at 6:41 AM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
>
>>> Thanks for the comments. Here is a prototype patch that implements a
>>> type promotion pass. This pass records SSA variables that will have
>>> values in higher bits (than the original type precision) if promoted and
>>> uses this information in inserting appropriate truncations and
>>> extensions. This pass also classifies some of the stmts that sets ssa's
>>> to be unsafe to promote. Here is a gimple difference for the type
>>> promotion as compared to previous dump for a testcase.
>>
>> Note that while GIMPLE has a way to zero-extend (using BIT_AND_EXPR)
>> it has no convenient way to sign-extend other than truncating to a signed
>> (non-promoted) type and then extending to the promoted type.  Thus
>> I think such pass should be accompanied with a new tree code,
>> SEXT_EXPR.  Otherwise we end up with "spurious" un-promoted
>> signed types which later optimizations may be confused about.
>>
>> Not sure if that is the actual issue though.
>>
>> Instead op "prmt" and "prmtn" I'd spell out promote and tree-type-prmtn
>> should be gimple-ssa-type-promote.c.  In the end all targets with
>> non-trivial PROMOTE_MODE should run the pass as a lowering step
>> so it should be enabled even at -O0 (and not disablable).
>>
>> I'd definitely run the pass _after_ pass_lower_vector_ssa (and in the
>> end I'd like to run it before IVOPTs ... which means moving IVOPTs
>> later, after VRP which should be the pass optimizing away some of
>> the extensions).
>>
>> In get_promoted_type I don't understand why you preserve qualifiers.
>> Also even for targets without PROMOTE_MODE it may be
>> beneficial to expose truncations required by expanding bit-precision
>> arithmetic earlier (that is, if !PROMOTE_MODE at least promote
>> to GET_MODE_PRECISION (TYPE_MODE (type))).  A testcase
>> for that is for example
>>
>> struct { long i : 33; long j : 33; } a;
>> return a.i + a.j;
>>
>> where bitfields of type > int do not promote so you get a
>> 33 bit add which we expand to a 64bit add plus a sign-extension
>> (and nothing optimizes that later usually).
>>
>> insert_next_bb sounds like you want to use insert_on_edge
>> somewhere.
>>
>> in assign_rhs_promotable_p you handle comparisons special
>> but the ternary COND_EXPR and VEC_COND_EXPR can have
>> comparisons embedded in their first operand.  The comment
>> confuses me though - with proper sign- or zero-extensions inserted
>> you should be able to promote them anyway?
>>
>> You seem to miss that a GIMPLE_ASSIGN can have 3 operands
>> in promote_cst_in_stmt as well.
>>
>> In promote_assign_stmt_use I consider a default: case that ends
>> up doing nothing dangerous ;)  Please either use gcc_unreachable ()
>> or do the safe thing (fix = true;?).  You seem to be working with
>> a lattice of some kind - fixing up stmt uses the way you do - walking
>> over immediate uses - is not very cache friendly.  Why not use
>> a lattice for this - record promoted vars to be used for old SSA names
>> and walk over all stmts instead, replacing SSA uses on them?
>> Btw, you don't need to call update_stmt if you SET_USE and not
>> replace an SSA name with a constant.
>>
>> You seem to "fix" with a single stmt but I don't see where you insert
>> zero- or sign-extensions for ssa_overflows_p cases?
>>
>> Note that at least for SSA names with !SSA_NAME_VAR (thus
>> anonymous vars) you want to do a cheaper promotion by not
>> allocating a new SSA name but simply "fixing" its type by
>> assigning to its TREE_TYPE.   For SSA names with SSA_NAME_VAR
>> there is of course debug-info to consider and thus doing what you
>> do is better (but probably still will wreck debuginfo?).
>>
>> GIMPLE_NOPs are not only used for parameters but also uninitialized
>> uses - for non-parameters you should simply adjust their type.  No
>> need to fixup their value.
>>
>> The pass needs more comments.
>>
>> It looks like you are not promoting all variables but only those
>> where compensation code (zero-/sign-extensions) is not necessary?
>>
>
> Thanks for the comments. Please find an updated version of this which
> addresses your review comments above. I am still to do full benchmarking
> on this, but tried with few small benchmarks. I will do proper
> benchmarking after getting feedback on the implementation. I have
> however bootstrapped on x86-64-none-linux and regression tested on
> x86-64, ARM and AArch64.
>
> I am also not clear with how I should handle the gimple debug statements
> when the intermediate temporary variable that maps to the original
> variable is promoted.

A few notes.

+/*  Sign-extend operation.  */
+DEFTREECODE (SEXT_EXPR, "sext_expr", tcc_binary, 2)

this needs an extended comment documenting the operands.

+    case SEXT_EXPR:
+       {
+         rtx op0 = expand_normal (treeop0);
+         rtx temp;
+         if (!target)
+           target = gen_reg_rtx (TYPE_MODE (TREE_TYPE (treeop0)));
+
+         machine_mode inner_mode = smallest_mode_for_size
(tree_to_shwi (treeop1),
+                                                           MODE_INT);
+         temp = convert_modes (inner_mode,
+                               TYPE_MODE (TREE_TYPE (treeop0)), op0, 0);
+         convert_move (target, temp, 0);
+         return target;
+       }

I think that if you allow arbitrary treeop1 you have to properly implement
fallbacks for the case where direct expansion to
(sign_extend:<target-mode> (subreg:<inner-mode> reg)) does not work
which is the intended operation modeled by SEXT_EXPR.

Direct expansion to that RTL would also be best I suppose.

+/* Return the promoted type for TYPE.  */
+static tree
+get_promoted_type (tree type)
+{
+  tree promoted_type;
+  enum machine_mode mode;
+  int uns;
+  if (POINTER_TYPE_P (type)
+      || TYPE_PRECISION (type) == 1
+      || !INTEGRAL_TYPE_P (type))
+    return type;

you should check for INTEGRAL_TYPE_P before looking at TYPE_PRECISION.

+#ifdef PROMOTE_MODE
+  mode = TYPE_MODE (type);
+  uns = TYPE_SIGN (type);
+  PROMOTE_MODE (mode, uns, type);
+#else
+  mode = smallest_mode_for_size (GET_MODE_PRECISION (TYPE_MODE (type)),
+                                MODE_INT);
+#endif

That smallest_mode_for_size should be a no-op.  Just hoist out
mode = TYPE_MODE (type).

Now before I get into too much details at this point.

You compute which promotions are unsafe, like sources/sinks of memory
(I think you miss call arguments/return values and also asm operands here).
But instead of simply marking those SSA names as not to be promoted
I'd instead split their life-ranges, thus replace

  _1 = mem;

with

  _2 = mem;
  _1 = [zs]ext (_2, ...);

and promote _1 anyway.  So in the first phase I'd do that (and obviously
note that _2 isn't to be promoted in the specific example).

For promotions that apply I wouldn't bother allocating new SSA names
but just "fix" their types (assign to their TREE_TYPE).  This also means
they have to become anonymous and if they didn't have a !DECL_IGNORED_P
decl before then a debug stmt should be inserted at the point of the
promotions.  So

  bar_3 = _1 + _2;

when promoted would become

 _4 = _1 + _2;
 _3 = sext <_4, ...>;
 # DEBUG bar = (orig-type) _4;  // or _3?

so you'd basically always promote defs (you have a lot of stmt/operand
walking code I didn't look too closely at - but it looks like too much) and
the uses get promoted automagically (because you promote the original
SSA name). Promotion of constants has to remain, of course.

I wouldn't promote pointers at all (are targets doing that?)

There are existing various helpers for stuff you re-invent.  I've just
spotted get_single_successor_bb for which there is single_succ_p ()
plus single_succ ().

Generally most of your stmt walking code could either use walk_stmt ()
or the low-level gimple_op (...) interface and a loop over all gimple_num_ops ()
operands.

You seem to mix in optimization and lowering - you are extending
at uses, not at defs for example.  IMHO that complicates the code
or do you think that a later optimization pass cannot recover from some
obviously bad decisions here?  If so then I suggest to implement sth
less ad-hoc by using a SSA lattice to track this and propagate the
info properly, still emitting the truncations at the defs where necessary.

Thanks,
Richard.

> Thanks,
> Kugan
>
>
> gcc/ChangeLog:
>
> 2015-05-01  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>         * Makefile.in: Add gimple-ssa-type-promote.o.
>         * cfgexpand.c (expand_debug_expr): Handle SEXT_EXPR.
>         * common.opt: New option -ftree-type-promote.
>         * expr.c (expand_expr_real_2): Handle SEXT_EXPR.
>         * fold-const.c (int_const_binop_1):
>         * gimple-ssa-type-promote.c: New file.
>         * passes.def: Define new pass_type_promote.
>         * timevar.def: Define new TV_TREE_TYPE_PROMOTE.
>         * tree-cfg.c (verify_gimple_assign_binary): Handle SEXT_EXPR.
>         * tree-inline.c (estimate_operator_cost):
>         * tree-pass.h (make_pass_type_promote): New.
>         * tree-pretty-print.c (dump_generic_node): Handle SEXT_EXPR.
>         (op_symbol_code): Likewise.
>         * tree-vrp.c (extract_range_from_binary_expr_1): Likewise.
>         * tree.def: Define new SEXT_EXPR.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]