[PATCH][RFC] Sparse on entry cache for Ranger.

Richard Biener richard.guenther@gmail.com
Mon Jun 7 17:50:43 GMT 2021


On Mon, Jun 7, 2021 at 7:26 PM Andrew MacLeod <amacleod@redhat.com> wrote:
>
> On 6/7/21 4:40 AM, Richard Biener wrote:
> > On Wed, Jun 2, 2021 at 11:15 PM Andrew MacLeod <amacleod@redhat.com> wrote:
> >>
> >> My thoughts are I would put this into trunk, and assuming nothing comes
> >> up  over the next couple of days, port it back to GCC11 to resolve
> >> 100299 and other excessive memory consumption PRs there as well. given
> >> that its reusing bitmap code for the sparse representation, it seems
> >> like it would be low risk.
> >>
> >> Are we OK with the addition of the bitmap_get_quad and bitmap_set_quad
> >> routines in bitmap.c?  It seems like they might be useful to others.
> >> They are simple tweaks of bitmap_set_bit and bitmap_bit_p.. just dealing
> >> with 4 bits at a time.  I could make them local if this is a problem,
> >> but i don't have access to the bitmap internals there.
> > I think _quad is a bit too specific - it's aligned chunks so maybe
> >
> > void bitmap_set_aligned_chunk (bitmap, unsigned int chunk, unsigned
> > int chunk_size, BITMAP_WORD chunk_value);
> >
> > and
> >
> > BITMAP_WORD bitmap_get_aligned_chunk (bitmap, unsigned int chunk,
> > unsigned chunk_size);
> >
> > and assert that chunk_size is power-of-two and fits in BITMAP_WORD?
> >
> > (also note using unsigned ints and BITMAP_WORD for the data type)
> >
> > I've been using two-bit representations in a few places (but mostly
> > setting/testing the
> > respective bits independently), I suppose for example
>
> That's exactly how this started.. I was using a pair of bits for
> pointers. UNDEFINED, zero, non-zero and varying... and checking the bits
> independently. when I decided I needed 3 bits, the whole quad thing
> evolved since picking up 3 or 4 consecutive bits one at a time seemed
> too inefficient.
>
> >
> > static dep_state
> > query_loop_dependence (class loop *loop, im_mem_ref *ref, dep_kind kind)
> > {
> >    unsigned first_bit = 6 * loop->num + kind * 2;
> >    if (bitmap_bit_p (&ref->dep_loop, first_bit))
> >      return dep_independent;
> >    else if (bitmap_bit_p (&ref->dep_loop, first_bit + 1))
> >      return dep_dependent;
> >
> > could use a chunk size of 2 and a single bitmap query.  Incidentially this
> > specific code uses 6 bits, so it's not fully aligned ...
> >
> > /* We use six bits per loop in the ref->dep_loop bitmap to record
> >     the dep_kind x dep_state combinations.  */
> >
> > enum dep_kind { lim_raw, sm_war, sm_waw };
> > enum dep_state { dep_unknown, dep_independent, dep_dependent };
> >
> > ... but there's also at most a single bit set.
> >
> > Anyway, I'm OK with adding API to access aligned power-of-two sized chunks.
> > Even not power-of-two sized unaligned chunks should be quite straight
> > forward to implement if we limit the chunk size to BITMAP_WORD by
> > simply advancing to the next bitmap word / element when necessary.
> >
> > An alternative low-level API would provide accesses to whole BITMAP_WORD
> > entries and the quads could be implemented on top of that
> > (bitmap_set_word/_get_word)
> >
> > Richard.
> >
> I think I'll stick to the power of 2 limitation for now.  If someone
> finds a pressing need or desire, they can enhance it :-)
>
> Wanna eyeball this an make sure I'm not doing something unportable.. I
> just used your original 2 function names, and swapped out the 4 and a
> couple of constants for computed values. Works fine for me.
>
> I also made the self test process 2, 4 and 8 bit quantities.
>
> Its going thru a test cycle now.

+  gcc_checking_assert (__builtin_popcount (chunk_size) == 1);

please use pow2p_hwi (chunk_size) instead, __builtin_popcount might
not be available with non-GCC host compilers.

Otherwise looks good to me.

Thanks,
Richard.

> Andrew
>
>


More information about the Gcc-patches mailing list