This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [14/n] PR85694: Rework overwidening detection


Christophe Lyon <christophe.lyon@linaro.org> writes:
> On Fri, 29 Jun 2018 at 13:36, Richard Sandiford
> <richard.sandiford@arm.com> wrote:
>>
>> Richard Sandiford <richard.sandiford@arm.com> writes:
>> > This patch is the main part of PR85694.  The aim is to recognise at least:
>> >
>> >   signed char *a, *b, *c;
>> >   ...
>> >   for (int i = 0; i < 2048; i++)
>> >     c[i] = (a[i] + b[i]) >> 1;
>> >
>> > as an over-widening pattern, since the addition and shift can be done
>> > on shorts rather than ints.  However, it ended up being a lot more
>> > general than that.
>> >
>> > The current over-widening pattern detection is limited to a few simple
>> > cases: logical ops with immediate second operands, and shifts by a
>> > constant.  These cases are enough for common pixel-format conversion
>> > and can be detected in a peephole way.
>> >
>> > The loop above requires two generalisations of the current code: support
>> > for addition as well as logical ops, and support for non-constant second
>> > operands.  These are harder to detect in the same peephole way, so the
>> > patch tries to take a more global approach.
>> >
>> > The idea is to get information about the minimum operation width
>> > in two ways:
>> >
>> > (1) by using the range information attached to the SSA_NAMEs
>> >     (effectively a forward walk, since the range info is
>> >     context-independent).
>> >
>> > (2) by back-propagating the number of output bits required by
>> >     users of the result.
>> >
>> > As explained in the comments, there's a balance to be struck between
>> > narrowing an individual operation and fitting in with the surrounding
>> > code.  The approach is pretty conservative: if we could narrow an
>> > operation to N bits without changing its semantics, it's OK to do that if:
>> >
>> > - no operations later in the chain require more than N bits; or
>> >
>> > - all internally-defined inputs are extended from N bits or fewer,
>> >   and at least one of them is single-use.
>> >
>> > See the comments for the rationale.
>> >
>> > I didn't bother adding STMT_VINFO_* wrappers for the new fields
>> > since the code seemed more readable without.
>> >
>> > Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
>>
>> Here's a version rebased on top of current trunk.  Changes from last time:
>>
>> - reintroduce dump_generic_expr_loc, with the obvious change to the
>>   prototype
>>
>> - fix a typo in a comment
>>
>> - use vect_element_precision from the new version of 12/n.
>>
>> Tested as before.  OK to install?
>>
>
> Hi Richard,
>
> This patch introduces regressions on arm-none-linux-gnueabihf:
>     gcc.dg/vect/vect-over-widen-1-big-array.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 2
>     gcc.dg/vect/vect-over-widen-1-big-array.c scan-tree-dump-times
> vect "vect_recog_widen_shift_pattern: detected" 2
>     gcc.dg/vect/vect-over-widen-1.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 2
>     gcc.dg/vect/vect-over-widen-1.c scan-tree-dump-times vect
> "vect_recog_widen_shift_pattern: detected" 2
>     gcc.dg/vect/vect-over-widen-4-big-array.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 2
>     gcc.dg/vect/vect-over-widen-4-big-array.c scan-tree-dump-times
> vect "vect_recog_widen_shift_pattern: detected" 2
>     gcc.dg/vect/vect-over-widen-4.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 2
>     gcc.dg/vect/vect-over-widen-4.c scan-tree-dump-times vect
> "vect_recog_widen_shift_pattern: detected" 2
>     gcc.dg/vect/vect-widen-shift-s16.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 8
>     gcc.dg/vect/vect-widen-shift-s16.c scan-tree-dump-times vect
> "vect_recog_widen_shift_pattern: detected" 8
>     gcc.dg/vect/vect-widen-shift-s8.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 1
>     gcc.dg/vect/vect-widen-shift-s8.c scan-tree-dump-times vect
> "vect_recog_widen_shift_pattern: detected" 1
>     gcc.dg/vect/vect-widen-shift-u16.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 1
>     gcc.dg/vect/vect-widen-shift-u16.c scan-tree-dump-times vect
> "vect_recog_widen_shift_pattern: detected" 1
>     gcc.dg/vect/vect-widen-shift-u8.c -flto -ffat-lto-objects
> scan-tree-dump-times vect "vect_recog_widen_shift_pattern: detected" 2
>     gcc.dg/vect/vect-widen-shift-u8.c scan-tree-dump-times vect
> "vect_recog_widen_shift_pattern: detected" 2

Sorry about that, was caused by a stupid typo.  I've applied the
below as obvious.

(For the record, it was actually 12/n that caused this.  14/n hasn't
been applied yet.)

Thanks,
Richard


2018-07-02  Richard Sandiford  <richard.sandiford@arm.com>

gcc/
	* tree-vect-patterns.c (vect_recog_widen_shift_pattern): Fix typo
	in dump string.

Index: gcc/tree-vect-patterns.c
===================================================================
--- gcc/tree-vect-patterns.c	2018-07-02 14:30:57.000000000 +0100
+++ gcc/tree-vect-patterns.c	2018-07-02 14:30:57.383750450 +0100
@@ -1739,7 +1739,7 @@ vect_recog_widen_shift_pattern (vec<gimp
 {
   return vect_recog_widen_op_pattern (stmts, type_out, LSHIFT_EXPR,
 				      WIDEN_LSHIFT_EXPR, true,
-				      "vect_widen_shift_pattern");
+				      "vect_recog_widen_shift_pattern");
 }
 
 /* Detect a rotate pattern wouldn't be otherwise vectorized:


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]