This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching

From: Richard Guenther <richard dot guenther at gmail dot com>
To: Andrew Stubbs <andrew dot stubbs at gmail dot com>
Cc: Michael Matz <matz at suse dot de>, gcc-patches at gcc dot gnu dot org, patches at linaro dot org
Date: Thu, 7 Jul 2011 14:28:33 +0200
Subject: Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching
References: <4E034EF2.3070503@codesourcery.com> <4E03504B.9060305@codesourcery.com> <BANLkTi=a8B-DZZdG2bmOWQ8A+-pCSEsAbQ@mail.gmail.com> <4E044559.5000105@linaro.org> <BANLkTimfKyrARS+eRy+MWYVO2gLqp85JsQ@mail.gmail.com> <1A77B5B39081C241A68E6CF16983025F020906F6@EU1-MAIL.mgc.mentorg.com> <BANLkTi=1TRM7uHWxLD3Se=S-ibe0C13T-Q@mail.gmail.com> <4E09B142.4020402@codesourcery.com> <BANLkTim0wLcSTgAL9isO+dVXxgq-U8B4Sw@mail.gmail.com> <Pine.LNX.4.64.1106281741170.17115@wotan.suse.de> <4E09FDEA.3000004@gmail.com> <Pine.LNX.4.64.1106281827230.17115@wotan.suse.de> <1A77B5B39081C241A68E6CF16983025F0209071D@EU1-MAIL.mgc.mentorg.com> <BANLkTimJob8C2L8kYkd-aU6FbM5nckc8Yg@mail.gmail.com> <4E11CCD1.4010505@codesourcery.com> <CAFiYyc3WczNWC19j-dWWRtDdOk=9E7vRtqidDpv8BW3V=W1Fpw@mail.gmail.com> <4E1589D8.2060108@codesourcery.com> <4E159BEA.9070201@codesourcery.com>

On Thu, Jul 7, 2011 at 1:43 PM, Andrew Stubbs <andrew.stubbs@gmail.com> wrote:
> On 07/07/11 11:26, Andrew Stubbs wrote:
>>
>> On 07/07/11 10:58, Richard Guenther wrote:
>>>
>>> I think you should assume that series of widenings,
>>> (int)(short)char_variable
>>> are already combined.  Thus I believe you only need to consider a single
>>> conversion in valid_types_for_madd_p.
>>
>> Hmm, I'm not so sure. I'll look into it a bit further.
>
> OK, here's a test case that gives multiple conversions:
>
>  long long
>  foo (long long a, signed char b, signed char c)
>  {
>    int bc = b * c;
>    return a + (short)bc;
>  }
>
> The dump right before the widen_mult pass gives:
>
>  foo (long long int a, signed char b, signed char c)
>  {
>    int bc;
>    long long int D.2018;
>    short int D.2017;
>    long long int D.2016;
>    int D.2015;
>    int D.2014;
>
>  <bb 2>:
>    D.2014_2 = (int) b_1(D);
>    D.2015_4 = (int) c_3(D);
>    bc_5 = D.2014_2 * D.2015_4;
>    D.2017_6 = (short int) bc_5;

Ok, so you have a truncation that is a no-op value-wise.  I would
argue that this truncation should be removed independent of
whether we have a widening multiply instruction or not.

The technically most capable place to remove non-value-changing
truncations (and combine them with a successive conversion)
would be value-range propagation, which already knows:

Value ranges after VRP:

b_1(D): VARYING
D.2698_2: [-128, 127]
c_3(D): VARYING
D.2699_4: [-128, 127]
bc_5: [-16256, 16384]
D.2701_6: [-16256, 16384]
D.2702_7: [-16256, 16384]
a_8(D): VARYING
D.2700_9: VARYING

thus truncating bc_5 to short does not change the value.
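
The range for bc_5 can be double-checked by brute force.  A minimal
sketch, assuming 8-bit signed char, 16-bit short and 32-bit int; the
function names are made up for this illustration and are not part of GCC:

```c
#include <limits.h>

/* Hypothetical helper: compute the exact range of b * c over all
   signed char inputs.  */
static void
product_range (int *min, int *max)
{
  *min = INT_MAX;
  *max = INT_MIN;
  for (int b = SCHAR_MIN; b <= SCHAR_MAX; b++)
    for (int c = SCHAR_MIN; c <= SCHAR_MAX; c++)
      {
        int bc = b * c;
        if (bc < *min)
          *min = bc;
        if (bc > *max)
          *max = bc;
      }
}

/* Hypothetical helper: return 1 if truncating the product to short
   and widening it back never changes the value, 0 otherwise.  */
static int
truncation_is_noop (void)
{
  for (int b = SCHAR_MIN; b <= SCHAR_MAX; b++)
    for (int c = SCHAR_MIN; c <= SCHAR_MAX; c++)
      {
        int bc = b * c;
        if ((short) bc != bc)
          return 0;
      }
  return 1;
}
```

The extremes come from -128 * 127 and -128 * -128, both of which lie
well inside the range of a 16-bit short.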

The simplification could be made when looking at the
statement

>    D.2018_7 = (long long int) D.2017_6;

in vrp_fold_stmt, based on the fact that this conversion
converts from a value-preserving intermediate conversion.
Thus the transform would replace the D.2017_6 operand
with bc_5.
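
The test such a transform needs can be sketched roughly as follows.
This is only an illustration of the idea, not the code in vrp_fold_stmt:
the struct and both function names are hypothetical, and it assumes the
intermediate type is a signed integer type of at most 63 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a value range; not GCC's internal
   range representation.  */
struct int_range
{
  int64_t min;
  int64_t max;
};

/* True if every value in R is representable in a signed integer type
   of PRECISION bits (1 <= PRECISION <= 63).  */
bool
range_fits_signed_p (struct int_range r, unsigned precision)
{
  int64_t type_max = ((int64_t) 1 << (precision - 1)) - 1;
  int64_t type_min = -type_max - 1;
  return r.min >= type_min && r.max <= type_max;
}

/* For outer = (wide) (narrow) inner: if the range of INNER fits in
   the signed NARROW type, the truncation is a no-op value-wise and
   the outer conversion can use INNER directly.  */
bool
intermediate_conversion_removable_p (struct int_range inner,
                                     unsigned narrow_precision)
{
  return range_fits_signed_p (inner, narrow_precision);
}
```

With the range [-16256, 16384] for bc_5 and a 16-bit intermediate type
the check succeeds; with an 8-bit intermediate type it would not.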

So yes, the case appears - but it shouldn't ;)

I'll cook up a quick patch for VRP.

Thanks,
Richard.

>    D.2016_9 = D.2018_7 + a_8(D);
>    return D.2016_9;
>
>  }
>
> Here we have a multiply and accumulate done the long way. The 8-bit inputs
> are widened to 32 bits, multiplied to give a 32-bit result (of which only the
> lower 16 bits contain meaningful data), then truncated to 16 bits, and
> sign-extended up to 64 bits ready for the 64-bit addition.
>
> This is slightly contrived, perhaps, but not unlike the sort of thing that
> might occur when you have inline functions and macros, and most importantly
> - it is mathematically valid!
>
>
> So, here's the output from my patched widen_mult pass:
>
>  foo (long long int a, signed char b, signed char c)
>  {
>    int bc;
>    long long int D.2018;
>    short int D.2017;
>    long long int D.2016;
>    int D.2015;
>    int D.2014;
>
>  <bb 2>:
>    D.2014_2 = (int) b_1(D);
>    D.2015_4 = (int) c_3(D);
>    bc_5 = b_1(D) w* c_3(D);
>    D.2017_6 = (short int) bc_5;
>    D.2018_7 = (long long int) D.2017_6;
>    D.2016_9 = WIDEN_MULT_PLUS_EXPR <b_1(D), c_3(D), a_8(D)>;
>    return D.2016_9;
>
>  }
>
> As you can see, everything except the WIDEN_MULT_PLUS_EXPR statement is now
> redundant. (Ideally, this would be removed now, but in fact it doesn't get
> eliminated until the RTL into_cfglayout pass. This is not new behaviour.)
>
>
> My point is that it's possible to have at least two conversions to examine.
> Is it possible to have more? I don't know, but once I'm dealing with two I
> might as well deal with an arbitrary number.
>
> Andrew
>
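
For reference, the equivalence the test case relies on can be checked
at the source level.  A small sketch, assuming 8-bit signed char and
16-bit short; foo is the test case from the quoted mail, while foo_madd
and foo_matches_madd are hypothetical names introduced only here:

```c
#include <limits.h>

/* The test case from the quoted mail: multiply-and-accumulate done
   the long way, with an intermediate truncation to short.  */
long long
foo (long long a, signed char b, signed char c)
{
  int bc = b * c;
  return a + (short) bc;
}

/* The same computation written as a direct widening
   multiply-and-accumulate, with no intermediate truncation.  */
long long
foo_madd (long long a, signed char b, signed char c)
{
  return a + (long long) b * c;
}

/* Return 1 if both forms agree for every b and c at the given A.
   A must be small enough that the addition cannot overflow.  */
int
foo_matches_madd (long long a)
{
  for (int b = SCHAR_MIN; b <= SCHAR_MAX; b++)
    for (int c = SCHAR_MIN; c <= SCHAR_MAX; c++)
      if (foo (a, (signed char) b, (signed char) c)
          != foo_madd (a, (signed char) b, (signed char) c))
        return 0;
  return 1;
}
```

The two forms agree because the product of two signed chars always
fits in a short, so the truncation never discards significant bits.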
