This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: rtl expansion without zero/sign extension based on VRP
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Kugan <kugan dot vivekanandarajah at linaro dot org>
- Cc: gcc-patches at gcc dot gnu dot org, ebotcazou at libertysurf dot fr, ramana dot radhakrishnan at arm dot com, Richard Earnshaw <Richard dot Earnshaw at arm dot com>
- Date: Mon, 13 May 2013 10:17:08 +0200
- Subject: Re: rtl expansion without zero/sign extension based on VRP
- References: <519061CA dot 9020408 at linaro dot org>
On Mon, May 13, 2013 at 5:45 AM, Kugan
<kugan.vivekanandarajah@linaro.org> wrote:
> Hi,
>
> This patch removes some of the redundant sign/zero
> extensions using value ranges informations generated by VRP.
>
> When GIMPLE_ASSIGN stmts with LHS type smaller than word is expanded to
> rtl, if we can prove that RHS expression value can always fit in LHS
> type and there is no sign conversion, truncation and extension to fit
> the type is redundant. Subreg and Zero/sign extensions are therefore
> redundant.
>
> For example, when an expression is evaluated and it's value is assigned
> to variable of type short, the generated rtl would look something like
> the following.
>
> (set (reg:SI 110)
> (zero_extend:SI (subreg:HI (reg:SI 117) 0)))
>
> However, if during value range propagation, if we can say for certain
> that the value of the expression which is present in register 117 is
> within the limits of short and there is no sign conversion, we don’t
> need to perform the subreg and zero_extend; instead we can generate the
> following rtl.
>
> (set (reg:SI 110)
> (reg:SI 117)))
>
> Attached patch adds value range information to gimple stmts during value
> range propagation pass and expands rtl without subreg and zero/sign
> extension if the value range is within the limits of it's type.
>
> This change improve the geomean of spec2k int benchmark with ref by about
> ~3.5% on an arm chromebook.
>
> Tested on X86_64 and ARM.
>
> I would like review comments on this.
A few comments on the way you preserve this information.
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -191,6 +191,11 @@ struct GTY((chain_next ("%h.next")))
gimple_statement_base {
in there. */
unsigned int subcode : 16;
+ /* if an assignment gimple statement has RHS expression that can fit
+ LHS type, zero/sign extension to truncate is redundant.
+ Set this if we detect extension as redundant during VRP. */
+ unsigned sign_zero_ext_redundant : 1;
+
this enlarges all gimple statements by 8 bytes, so it's out of the question.
My original plan to preserve value-range information was to re-use
SSA_NAME_PTR_INFO which is where we already store value-range
information for pointers:
struct GTY(()) ptr_info_def
{
/* The points-to solution. */
struct pt_solution pt;
/* Alignment and misalignment of the pointer in bytes. Together
align and misalign specify low known bits of the pointer.
ptr & (align - 1) == misalign. */
/* When known, this is the power-of-two byte alignment of the object this
pointer points into. This is usually DECL_ALIGN_UNIT for decls and
MALLOC_ABI_ALIGNMENT for allocated storage. When the alignment is not
known, it is zero. Do not access directly but use functions
get_ptr_info_alignment, set_ptr_info_alignment,
mark_ptr_info_alignment_unknown and similar. */
unsigned int align;
/* When alignment is known, the byte offset this pointer differs from the
above alignment. Access only through the same helper functions as align
above. */
unsigned int misalign;
};
what you'd do is to make the ptr_info_def reference from tree_ssa_name a
reference to a union of the above (for pointer SSA names) and an alternate
info like
struct range_info_def {
double_int min;
double_int max;
};
or a more compressed format (a reference to a mode or type it fits for example).
The very specific case in question also points at the fact that we have
two conversion tree codes, NOP_EXPR and CONVERT_EXPR, and we've
tried to unify them (but didn't finish up that task)...
+static void
+process_stmt_for_ext_elimination ()
+{
we already do all the analysis when looking for opportunities to eliminate
the intermediate cast in (T2)(T1)x in simplify_conversion_using_ranges,
that's the place that should handle this case.
Richard.
> Thanks,
> Kugan
>
>
> 2013-05-09 Kugan Vivekanandarajah <kuganv@linaro.org>
>
> * gcc/gimple.h (gimple_is_exp_fit_lhs, gimple_set_exp_fit_lhs): New
> function.
> * gcc/tree-vrp.c (process_stmt_for_ext_elimination): Likewise.
> * (is_msb_set, range_fits_type): Likewise.
> * (vrp_finalize): Call process_stmt_for_ext_elimination.
> * gcc/dojump.c (do_compare_and_jump): generates rtl without
> zero/sign extension
> if process_stmt_for_ext_elimination tells so.
> * gcc/cfgexpand.c (expand_gimple_stmt_1): Likewise.
>
>