This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][store merging][RFA] Re-implement merging code
- From: Richard Biener <rguenther at suse dot de>
- To: Kyrill Tkachov <kyrylo dot tkachov at foss dot arm dot com>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>
- Date: Mon, 10 Oct 2016 12:24:40 +0200 (CEST)
- Subject: Re: [PATCH][store merging][RFA] Re-implement merging code
- Authentication-results: sourceware.org; auth=none
- References: <57FB4BC6.6080403@foss.arm.com> <alpine.LSU.2.11.1610101219030.26629@t29.fhfr.qr>
On Mon, 10 Oct 2016, Richard Biener wrote:
> On Mon, 10 Oct 2016, Kyrill Tkachov wrote:
>
> > Hi Richard,
> >
> > As I mentioned, here is the patch applying to the main store merging patch to
> > re-implement encode_tree_to_bitpos
> > to operate on the bytes directly.
> >
> > This works fine on little-endian but breaks on big-endian, even for merging
> > bitfields within a single byte.
> > Consider the code snippet from gcc.dg/store_merging_6.c:
> >
> > struct bar {
> > int a : 3;
> > unsigned char b : 4;
> > unsigned char c : 1;
> > char d;
> > char e;
> > char f;
> > char g;
> > };
> >
> > void
> > foo1 (struct bar *p)
> > {
> > p->b = 3;
> > p->a = 2;
> > p->c = 1;
> > p->d = 4;
> > p->e = 5;
> > }
> >
> > The correct GIMPLE for these merged stores on big-endian is:
> > MEM[(voidD.49 *)p_2(D)] = 18180;
> > MEM[(charD.8 *)p_2(D) + 2B] = 5;
> >
> > whereas with this patch we emit:
> > MEM[(voidD.49 *)p_2(D)] = 39428;
> > MEM[(charD.8 *)p_2(D) + 2B] = 5;
> >
> > The dump for merging the individual stores without this patch (using the
> > correct but costly wide_int approach in the base patch) is:
> > After writing 3 of size 4 at position 3 the merged region contains:
> > 6 0 0 0 0 0
> > After writing 2 of size 3 at position 0 the merged region contains:
> > 46 0 0 0 0 0
> > After writing 1 of size 1 at position 7 the merged region contains:
> > 47 0 0 0 0 0
> > After writing 4 of size 8 at position 8 the merged region contains:
> > 47 4 0 0 0 0
> > After writing 5 of size 8 at position 16 the merged region contains:
> > 47 4 5 0 0 0
> >
> >
> > And with this patch it is:
> > After writing 3 of size 4 at position 3 the merged region contains:
> > 18 0 0 0 0 0
> > After writing 2 of size 3 at position 0 the merged region contains:
> > 1a 0 0 0 0 0
> > After writing 1 of size 1 at position 7 the merged region contains:
> > 9a 0 0 0 0 0
> > After writing 4 of size 8 at position 8 the merged region contains:
> > 9a 4 0 0 0 0
> > After writing 5 of size 8 at position 16 the merged region contains:
> > 9a 4 5 0 0 0
> >
> > (Note the dump just dumps the byte array from index 0 to <len> so the first
> > thing printed is the lowest numbered byte.
> > Also, each byte is dumped in hex.)
> >
> > The code as included here doesn't do any byte swapping for big-endian but as
> > seen from the dump even writing a sub-byte
> > bitfield goes wrong so it would be nice to resolve that before going forward.
> > Any help with debugging this is hugely appreciated. I've included an ASCII
> > diagram of the steps in the algorithm
> > in the patch itself.
>
> Ah, I think you need to account for BITS_BIG_ENDIAN in
> shift_bytes_in_array. You have to shift towards MSB which means changing
> left to right shifts for BITS_BIG_ENDIAN.
>
> You also seem to miss to account for amnt / BITS_PER_UNIT != 0.
> Independently of BYTES_BIG_ENDIAN it would be
Ok, that would matter only if you'd merge shift_bytes_in_array,
clear_bit_region and the |-ring of that into the final buffer
(which should be possible).
Richard.