[RFC/RFA] [PATCH 03/12] RISC-V: Add CRC expander to generate faster CRC.

Sun Jun 9 07:14:27 GMT 2024

On Sat, Jun 8, 2024, 09:53 Richard Sandiford <richard.sandiford@arm.com>
wrote:

> Thanks a lot for doing this!  It's a really nice series.
>

Thank you for your positive feedback and for your review and suggestions on
the patch series.

Just had a comment on the long division helper:
>
> Mariam Arutunian <mariamarutunian@gmail.com> writes:
> > +/* Return the quotient of polynomial long division of x^2N by POLYNOMIAL
> > +   in GF (2^N).  */
>
> It looks like there might be an off-by-one discrepancy between the comment
> and the code.  The comment suggests that N is the degree of the polynomial
> (crc_size), whereas the callers seem to pass crc_size + 1.  This doesn't
> matter in practice since...
>
> > +
> > +unsigned HOST_WIDE_INT
> > +gf2n_poly_long_div_quotient (unsigned HOST_WIDE_INT polynomial, size_t
> n)
> > +{
> > +  vec<short> x2n;
> > +  vec<bool> pol, q;
> > +  /* Create vector of bits, for the polynomial.  */
> > +  pol.create (n + 1);
> > +  for (size_t i = 0; i < n; i++)
> > +    {
> > +      pol.quick_push (polynomial & 1);
> > +      polynomial >>= 1;
> > +    }
> > +  pol.quick_push (1);
> > +
> > +  /* Create vector for x^2n polynomial.  */
> > +  x2n.create (2 * n - 1);
> > +  for (size_t i = 0; i < 2 * (n - 1); i++)
> > +    x2n.safe_push (0);
> > +  x2n.safe_push (1);
>
> ...this compensates by setting the dividend to x^(2N-2).  And although
> the first loop reads crc_size+1 bits from polynomial before adding the
> implicit leading 1, only the low crc_size elements of poly affect the
> result.
>

Yes. Initially, I implemented it quickly using an implementation I found
with the intention of refining it later.

> If we do pass crc_size as N, a simpler way of writing the routine might be:
>
> {
>   /* The result has degree N, so needs N + 1 bits.  */
>   gcc_assert (n < 64);
>
>   /* Perform a division step for the x^2N coefficient.  At this point the
>      quotient and remainder have N implicit trailing zeros.  */
>   unsigned HOST_WIDE_INT quotient = 1;
>   unsigned HOST_WIDE_INT remainder = polynomial;
>
>   /* Process the coefficients for x^(2N-1) down to x^N, with each step
>      reducing the number of implicit trailing zeros by one.  */
>   for (unsigned int i = 0; i < n; ++i)
>     {
>       bool coeff = remainder & (HOST_WIDE_INT_1U << (n - 1));
>       quotient = (quotient << 1) | coeff;
>       remainder = (remainder << 1) ^ (coeff ? polynomial : 0);
>     }
>   return quotient;
> }
>
> I realise there are many ways of writing this out there though,
> so that's just a suggestion.  (And only lightly tested.)
>

Thanks, I appreciate your input.
I'm currently on vacation, but after I return, I'll apply all the changes
and send a new version.

> FWIW, we could easily extend the interface to work on wide_ints if we
> ever need it for N>63.

 The problem is keeping the whole quotient. For 64 degree, it may need 65
bits. Jeff already answered this. Alternatively, there might be a method to
perform the calculation without retaining that extra bit, but I haven't
explored it yet.

Thank you again,
Mariam

Thanks,
> Richard
>
> > +
> > +  q.create (n);
> > +  for (size_t i = 0; i < n; i++)
> > +    q.quick_push (0);
> > +
> > +  /* Calculate the quotient of x^2n/polynomial.  */
> > +  for (int i = n - 1; i >= 0; i--)
> > +    {
> > +      int d = x2n[i + n - 1];
> > +      if (d == 0)
> > +     continue;
> > +      for (int j = i + n - 1; j >= i; j--)
> > +     x2n[j] ^= (pol[j - i]);
> > +      q[i] = 1;
> > +    }
> > +
> > +  /* Get the number from the vector of 0/1s.  */
> > +  unsigned HOST_WIDE_INT quotient = 0;
> > +  for (size_t i = 0; i < q.length (); i++)
> > +    {
> > +      quotient <<= 1;
> > +      quotient = quotient | q[q.length () - i - 1];
> > +    }
> > +  return quotient;
> > +}
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://gcc.gnu.org/pipermail/gcc-patches/attachments/20240609/1b7b79df/attachment.htm>