This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][RFC] bitpacking rewrite
On Sat, 12 Jun 2010, Jan Hubicka wrote:
> >
> > The following preliminary patch rewrites bitpacking to happen
> > word-by-word and on the stack (well, in the preliminary form
> > only on the stack after inlining).
> >
> > Comments?
>
> I like the change. Probably instead of bp/bp_s game, I would go with
> adding &bp to every caller, but I do not care.
I did that in the final patch.
> > + static inline void
> > + bp_pack_value (struct bitpack_d *bp, bitpack_word_t val, unsigned nbits)
> > + {
> > +   bitpack_word_t word = bp->word;
> > +   int pos = bp->pos;
> > +   /* If val does not fit into the current bitpack word switch to the
> > +      next one.  */
> > +   if (pos + nbits > BITS_PER_BITPACK_WORD)
> > +     {
> > +       lto_output_uleb128_stream ((struct lto_output_stream *) bp->stream, word);
>
> Hehe, it never occurred to me that one can actually output the bitpack when it
> overflows the first word ;)
> I guess you want to add some checking asserts that nbits does not exceed BITS_PER_BITPACK_WORD.
Well - it's worse. We'd have to assert that val actually fits in nbits
bits, otherwise we corrupt subsequent packs (but that's true for the
old implementation as well).
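As a rough illustration of the check discussed above (a sketch only, not the code from the patch): struct bitpack_sketch, bp_value_fits and bp_pack_value_checked are hypothetical names, the word type is assumed to be 64 bits wide, and flushing a full word to the stream is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the patch's types; a 64-bit word is an
   assumption made for this sketch.  */
typedef uint64_t bitpack_word_t;
#define BITS_PER_BITPACK_WORD 64

struct bitpack_sketch
{
  bitpack_word_t word;  /* Bits accumulated so far.  */
  unsigned pos;         /* Number of bits already used in WORD.  */
};

/* Return nonzero if VAL is representable in NBITS bits.  */
static int
bp_value_fits (bitpack_word_t val, unsigned nbits)
{
  if (nbits >= BITS_PER_BITPACK_WORD)
    return nbits == BITS_PER_BITPACK_WORD;
  return (val >> nbits) == 0;
}

/* Pack VAL into BP using NBITS bits.  The caller is assumed to have
   flushed BP already if the value would not fit in the current word.  */
static void
bp_pack_value_checked (struct bitpack_sketch *bp, bitpack_word_t val,
                       unsigned nbits)
{
  assert (nbits <= BITS_PER_BITPACK_WORD);
  /* Stray high bits in VAL would corrupt whatever is packed next.  */
  assert (bp_value_fits (val, nbits));
  assert (bp->pos + nbits <= BITS_PER_BITPACK_WORD);
  if (nbits == 0)
    return;  /* Avoid a shift by the full word width below.  */
  bp->word |= val << bp->pos;
  bp->pos += nbits;
}
```

With this in place, packing 8 into 3 bits trips the second assert instead of silently spilling into the next field.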
> So next step would be to optimize ulebs/slebs, right? :)
Yeah. The encoding is quite funny and likely causes a lot of
mispredicts.
void
lto_output_uleb128_stream (struct lto_output_stream *obs,
                           unsigned HOST_WIDE_INT work)
{
  do
    {
      unsigned int byte = (work & 0x7f);
      work >>= 7;
      if (work != 0)
        /* More bytes to follow.  */
        byte |= 0x80;
      lto_output_1_stream (obs, byte);
    }
  while (work != 0);
}
so for a 64-bit HWI we output at most 10 bytes (7 payload bits per
byte), one byte at a time, via
void
lto_output_1_stream (struct lto_output_stream *obs, char c)
{
  /* No space left.  */
  if (obs->left_in_block == 0)
    append_block (obs);
  /* Write the actual character.  */
  *obs->current_pointer = c;
  obs->current_pointer++;
  obs->total_size++;
  obs->left_in_block--;
}
which is neither particularly fast nor efficient. I suppose
encoding into a temporary buffer and writing that in one go
would already speed things up significantly.
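A minimal sketch of the temporary-buffer idea, assuming a simplified single-block stream. struct out_stream_sketch, stream_write_bytes and stream_output_uleb128 are made-up names for illustration, not the LTO streamer API; a real version would chain blocks via append_block instead of failing when the block is full.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Worst case for a 64-bit value: ceil (64 / 7) = 10 bytes.  */
#define ULEB128_MAX_BYTES 10

/* Encode WORK as ULEB128 into BUF, which must hold at least
   ULEB128_MAX_BYTES bytes.  Return the number of bytes written.  */
static size_t
uleb128_encode (uint64_t work, unsigned char *buf)
{
  size_t n = 0;
  do
    {
      unsigned char byte = work & 0x7f;
      work >>= 7;
      if (work != 0)
        /* More bytes to follow.  */
        byte |= 0x80;
      buf[n++] = byte;
    }
  while (work != 0);
  assert (n <= ULEB128_MAX_BYTES);
  return n;
}

/* Hypothetical simplified output stream: one fixed block, no chaining.  */
struct out_stream_sketch
{
  unsigned char block[64];
  size_t used;
};

/* Append LEN bytes from DATA with a single space check and a single
   copy.  Return 0 on success, -1 if the block cannot hold them.  */
static int
stream_write_bytes (struct out_stream_sketch *obs,
                    const unsigned char *data, size_t len)
{
  if (len > sizeof obs->block - obs->used)
    return -1;
  memcpy (obs->block + obs->used, data, len);
  obs->used += len;
  return 0;
}

/* Encode on the stack first, then hand the bytes over in one go.  */
static int
stream_output_uleb128 (struct out_stream_sketch *obs, uint64_t work)
{
  unsigned char buf[ULEB128_MAX_BYTES];
  size_t n = uleb128_encode (work, buf);
  return stream_write_bytes (obs, buf, n);
}
```

The space check and the bookkeeping updates then happen once per value rather than once per byte.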
Richard.