This is the mail archive of the gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [patch] Reduce over-promotion of vector operations
- From: Richard Guenther <richard dot guenther at gmail dot com>
- To: Ira Rosen <ira dot rosen at linaro dot org>
- Cc: gcc-patches at gcc dot gnu dot org, Patch Tracking <patches at linaro dot org>
- Date: Tue, 19 Jul 2011 10:57:00 +0200
- Subject: Re: [patch] Reduce over-promotion of vector operations
- References: <CAKSNEw48+5OkNwa+Z0Uw5FO50RNsUTRLGVwiqAF_fwnpECwTng@mail.gmail.com>
On Tue, Jul 19, 2011 at 8:44 AM, Ira Rosen <firstname.lastname@example.org> wrote:
> This patch tries to reduce over-promotion of vector operations that
> could be done with narrower elements, e.g., for
> char a;
> int b, c;
> short d;
> b = (int) a;
> c = b << 2;
> d = (short) c;
> we currently produce six vec_unpack_lo/hi_expr statements for
> char->int conversion and then two vec_pack_trunc_expr for int->short.
> However, the shift can be performed on short, using only two
> vec_unpack_lo/hi_expr operations for char->short conversion in this
> example.
> With this patch we detect such over-promoted sequences that start with
> a type promotion operation and end with a type demotion operation. The
> statements in between are checked to see whether they can be performed
> using a smaller type (this patch only adds support for shifts and bit
> operations with a constant). If a sequence is detected, we create a
> sequence of scalar pattern statements to be vectorized instead of the
> original one. Since there may be two pattern statements created for
> the same original statement - the operation itself (on an intermediate
> type) and a type promotion (from a smaller type to the intermediate
> type) for the non-constant operand - this patch adds a new field to
> struct _stmt_vec_info to keep that pattern def statement.
> Bootstrapped and tested on powerpc64-suse-linux.
> Comments are welcome.
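
The scalar sequence quoted above can be sketched as two loops to illustrate why the narrower intermediate type is enough. This is only an illustration, not code from the patch: the function names `shift_wide` and `shift_narrow` are hypothetical, and the input is assumed to be `unsigned char` so that the left shift is well defined in ISO C for every input value.

```c
#include <stddef.h>

/* Wide form, mirroring the quoted sequence: promote to int, shift, demote. */
void shift_wide(const unsigned char *a, short *d, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int b = (int) a[i];   /* char -> int promotion */
        int c = b << 2;       /* shift on the wide type */
        d[i] = (short) c;     /* int -> short demotion */
    }
}

/* Narrow form: the same computation with a 16-bit intermediate.
   For inputs in 0..255 the shifted value is at most 1020, so it
   fits in 16 bits and both forms store the same result. */
void shift_narrow(const unsigned char *a, short *d, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        unsigned short b = (unsigned short) a[i];      /* char -> short */
        unsigned short c = (unsigned short) (b << 2);  /* shift, kept in 16 bits */
        d[i] = (short) c;
    }
}
```

Both functions produce identical output for every 8-bit input, which is the property the pattern recognizer relies on when it replaces the wide sequence.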
I wonder if we should do this optimization for scalars as well. We still
do something like that in the frontends' shorten_* functions, and I
recently added the capability to remove intermediate conversions to VRP.
At least it looks like VRP could be a good place to re-write operations
in narrower types. That is, for a truncation statement
d = (short) c;
see if that truncation is value-preserving by looking at the value-range
of c, then check whether all related defs of c can be rewritten to that
truncated type until you reach only stmts that need no further processing
(not sure if that might be too expensive - at least I could imagine
some artificial testcases that would exhibit quadratic behavior).
You'd need to make VRP handle new SSA names during substitute_and_fold.
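
As a rough sketch of the value-range test described above, the check for a value-preserving truncation to short could look like the following. None of this is GCC's VRP API: `struct range_sketch`, `scale_range_pow2` and `truncation_to_short_preserves_value` are hypothetical names, and ranges are assumed to be small enough that the multiplication cannot overflow `long`.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical closed interval [min, max] standing in for a value range. */
struct range_sketch {
    long min;
    long max;
};

/* Range of (x << k) for x in r, computed as multiplication by 2^k.
   Assumes k is small and the bounds do not overflow long. */
struct range_sketch scale_range_pow2(struct range_sketch r, unsigned k)
{
    struct range_sketch out = { r.min * (1L << k), r.max * (1L << k) };
    return out;
}

/* A truncation to short keeps the value if the whole range fits in short. */
bool truncation_to_short_preserves_value(struct range_sketch r)
{
    return r.min >= SHRT_MIN && r.max <= SHRT_MAX;
}
```

For an operand known to lie in 0..255, shifting by 2 gives a range that fits in short, so the truncation preserves the value and the shift is a candidate for rewriting in the narrower type; shifting by 8 does not pass the check.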
>   * tree-vectorizer.h (struct _stmt_vec_info): Add new field for
>   pattern def statement, and its access macro.
>   (NUM_PATTERNS): Set to 5.
>   * tree-vect-loop.c (vect_determine_vectorization_factor): Handle
>   pattern def statement.
>   (vect_transform_loop): Likewise.
>   * tree-vect-patterns.c (vect_vect_recog_func_ptrs): Add new
>   function vect_recog_over_widening_pattern ().
>   (vect_operation_fits_smaller_type): New function.
>   (vect_recog_over_widening_pattern, vect_mark_pattern_stmts):
>   Likewise.
>   (vect_pattern_recog_1): Move the code that marks pattern
>   statements to vect_mark_pattern_stmts (), and call it.  Update
>   documentation.
>   * tree-vect-stmts.c (vect_supportable_shift): New function.
>   (vect_analyze_stmt): Handle pattern def statement.
>   (new_stmt_vec_info): Initialize pattern def statement.
>   * gcc.dg/vect/vect-over-widen-1.c: New test.
>   * gcc.dg/vect/vect-over-widen-2.c: New test.
>   * gcc.dg/vect/vect-over-widen-3.c: New test.
>   * gcc.dg/vect/vect-over-widen-4.c: New test.