[PATCH][v4] GIMPLE store merging pass

Mon Oct 3 16:43:00 GMT 2016

On October 3, 2016 3:02:04 PM GMT+02:00, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
>Hi Richard,
>another question as I'm working through your comments...
>
>On 29/09/16 11:45, Richard Biener wrote:
>>
>>> +      /* The region from the byte array that we're inserting into.  */
>>> +      tree ptr_wide_int
>>> +	= native_interpret_expr (dest_int_type, ptr + first_byte,
>>> +				 total_bytes);
>>> +
>>> +      gcc_assert (ptr_wide_int);
>>> +      wide_int dest_wide_int
>>> +	= wi::to_wide (ptr_wide_int, TYPE_PRECISION (dest_int_type));
>>> +      wide_int expr_wide_int
>>> +	= wi::to_wide (tmp_int, byte_size * BITS_PER_UNIT);
>>> +      if (BYTES_BIG_ENDIAN)
>>> +	{
>>> +	  unsigned int insert_pos
>>> +	    = byte_size * BITS_PER_UNIT - bitlen - (bitpos % BITS_PER_UNIT);
>>> +	  dest_wide_int
>>> +	    = wi::insert (dest_wide_int, expr_wide_int, insert_pos, bitlen);
>>> +	}
>>> +      else
>>> +	dest_wide_int = wi::insert (dest_wide_int, expr_wide_int,
>>> +				    bitpos % BITS_PER_UNIT, bitlen);
>>> +
>>> +      tree res = wide_int_to_tree (dest_int_type, dest_wide_int);
>>> +      native_encode_expr (res, ptr + first_byte, total_bytes, 0);
>>> +
>> OTOH this whole dance looks as complicated and way more expensive than
>> using native_encode_expr into a temporary buffer and then a
>> manually implemented "bit-merging" of it at ptr + first_byte + bitpos.
>> AFAICS that operation is even endianness agnostic.
>
>If the quantity we're inserting at a non-byte boundary
>is more than a byte wide we still have to shift the value
>to position properly across the bytes it straddles, so I don't
>see how we can avoid creating a wide_int here.
>Consider inserting a 10-bit value at bit position 3 (I hope the mailer
>doesn't screw up the indentation):
>value:  xxxxxxxxxx
>before: |--------||--------|
>        | byte 1 || byte 2 |
>after:  |---xxxxx||xxxxx---|
>
>We'll native_encode_expr the value into a two-byte buffer, but then we
>can't just shift each byte by 3 to insert it into the destination
>buffer; we need to form the whole 10-bit value and shift it as a whole
>so as not to lose any bits.

Native encode will encode into a byte array in target representation / endianness.

I think you can work byte-wise by properly merging the 'lost' bits from adjacent bytes.  You need at most two source bytes per 'target byte'.
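
A minimal sketch of such a byte-wise merge, to make the "at most two bytes per target byte" point concrete.  It is not code from the patch: merge_bits_le is a hypothetical helper, and it assumes 8-bit bytes, a little-endian encoding of the source value and LSB-first bit numbering.  A big-endian target would need the mirrored shifts.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch.  Merge the low BITLEN bits
   of SRC (little-endian bytes, LSB-first bit numbering) into DST starting
   at bit offset BITPOS.  Bits of DST outside the field are preserved.  */
static void
merge_bits_le (uint8_t *dst, const uint8_t *src,
               unsigned bitpos, unsigned bitlen)
{
  unsigned shift = bitpos % 8;
  size_t first_byte = bitpos / 8;
  size_t src_bytes = (bitlen + 7) / 8;
  size_t dst_bytes = (shift + bitlen + 7) / 8;

  for (size_t i = 0; i < dst_bytes; i++)
    {
      /* Bits of source byte I moved up by SHIFT.  */
      unsigned lo = i < src_bytes ? (unsigned) src[i] << shift : 0;
      /* Bits 'lost' off the top of source byte I - 1.  */
      unsigned hi = (i > 0 && shift) ? (unsigned) src[i - 1] >> (8 - shift) : 0;
      unsigned val = (lo | hi) & 0xff;

      /* Mask of the bits in this target byte that belong to the field.  */
      unsigned start = i == 0 ? shift : 0;
      unsigned end = shift + bitlen - (unsigned) i * 8;
      if (end > 8)
        end = 8;
      unsigned mask = ((1u << end) - 1) & ~((1u << start) - 1);

      dst[first_byte + i]
        = (uint8_t) ((dst[first_byte + i] & ~mask) | (val & mask));
    }
}
```

For the 10-bit example above, the two source bytes 0xff 0x03 merged at bit position 3 into a zeroed buffer give 0xf8 0x1f, with no intermediate value wider than a byte plus the shift.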

>And if a value crosses bytes then we need to care about
>BYTES_BIG_ENDIAN when writing the bytes back into the buffer, no?

If you shift a quantity wider than a byte on the host (wide-ints are in host endianness) then you indeed need to watch out for endianness.
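
To illustrate, a small sketch of what "watching out" means when a value is shifted as a whole on the host.  store_u16_target is a hypothetical helper, not a GCC function: it writes the bytes out explicitly in the target's order instead of relying on the host's in-memory layout.  It only covers byte order; the different bit position needed for BYTES_BIG_ENDIAN (insert_pos in the patch) is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper.  Store the host integer VAL into BUF in the byte
   order of the target, independently of the host's own byte order.  */
static void
store_u16_target (uint8_t *buf, uint16_t val, bool target_big_endian)
{
  uint8_t low = (uint8_t) (val & 0xff);
  uint8_t high = (uint8_t) (val >> 8);

  if (target_big_endian)
    {
      buf[0] = high;
      buf[1] = low;
    }
  else
    {
      buf[0] = low;
      buf[1] = high;
    }
}
```

Shifting the 10-bit value 0x3ff left by 3 on the host gives 0x1ff8; this is stored as f8 1f for a little-endian target and as 1f f8 for a big-endian one.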

But as we deal with target memory representation plus bit offsets into memory I think it's natural to work with bytes.

Richard.

>Thanks,
>Kyrill