This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH (3/7)] Widening multiply-and-accumulate pattern matching


On 07/07/11 11:26, Andrew Stubbs wrote:
On 07/07/11 10:58, Richard Guenther wrote:
I think you should assume that series of widenings,
(int)(short)char_variable
are already combined.  Thus I believe you only need to consider a single
conversion in valid_types_for_madd_p.

Hmm, I'm not so sure. I'll look into it a bit further.

OK, here's a test case that gives multiple conversions:


  long long
  foo (long long a, signed char b, signed char c)
  {
    int bc = b * c;
    return a + (short)bc;
  }

The dump right before the widen_mult pass gives:

  foo (long long int a, signed char b, signed char c)
  {
    int bc;
    long long int D.2018;
    short int D.2017;
    long long int D.2016;
    int D.2015;
    int D.2014;

  <bb 2>:
    D.2014_2 = (int) b_1(D);
    D.2015_4 = (int) c_3(D);
    bc_5 = D.2014_2 * D.2015_4;
    D.2017_6 = (short int) bc_5;
    D.2018_7 = (long long int) D.2017_6;
    D.2016_9 = D.2018_7 + a_8(D);
    return D.2016_9;

}

Here we have a multiply and accumulate done the long way. The 8-bit inputs are widened to 32-bit, multiplied to give a 32-bit result (of which only the lower 16-bits contain meaningful data), then truncated to 16-bits, and sign-extended up to 64-bits ready for the 64-bit addition.

This is slight contrived, perhaps, but not unlike the sort of thing that might occur when you have inline functions and macros, and most importantly - it is mathematically valid!


So, here's the output from my patched widen_mult pass:


  foo (long long int a, signed char b, signed char c)
  {
    int bc;
    long long int D.2018;
    short int D.2017;
    long long int D.2016;
    int D.2015;
    int D.2014;

  <bb 2>:
    D.2014_2 = (int) b_1(D);
    D.2015_4 = (int) c_3(D);
    bc_5 = b_1(D) w* c_3(D);
    D.2017_6 = (short int) bc_5;
    D.2018_7 = (long long int) D.2017_6;
    D.2016_9 = WIDEN_MULT_PLUS_EXPR <b_1(D), c_3(D), a_8(D)>;
    return D.2016_9;

}

As you can see, everything except the WIDEN_MULT_PLUS_EXPR statement is now redundant. (Ideally, this would be removed now, but in fact it doesn't get eliminated until the RTL into_cfglayout pass. This is not new behaviour.)


My point is that it's possible to have at least two conversions to examine. Is it possible to have more? I don't know, but once I'm dealing with two I might as well deal with an arbitrary number.


Andrew


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]