[PATCH][v6] GIMPLE store merging pass

Richard Biener rguenther@suse.de
Fri Oct 21 12:37:00 GMT 2016


On Tue, 18 Oct 2016, Kyrill Tkachov wrote:

> Hi Richard,
> 
> This patch is a merge of [1] and [2] and implements the manual merging of
> bitfields
> as outlined in [1] but actually makes it work on BYTES_BIG_ENDIAN too.
> It caused me a lot of headeache because the bit offset is counted from the
> most significant bit
> in the byte, even though BITS_BIG_ENDIAN was 0 (BITS_BIG_ENDIAN looks
> irrelevant for store merging
> anyway as it's just used to described RTL extract operations).
> I've included ASCII diagrams of the steps in the merging algorithm.

Heh, thanks.

> Bootstrapped and tested on arm-none-linux-gnueabihf, aarch64-none-linux-gnu,
> x86_64-unknown-linux-gnu.
> Also tested on aarch64_be-none-elf.
> 
> How does this version look now?

Mostly good.  For

+bool
+pass_store_merging::terminate_all_aliasing_chains (tree dest, tree base,
+                                                  gimple *stmt)
+{
...
+  /* Check if the assignment destination (BASE) is part of a store chain.
+     This is to catch non-constant stores to destinations that may be 
part
+     of a chain.  */
+  if (base)
+    {
+      chain_info = m_stores.get (base);
+      if (chain_info)
+       {
+         struct store_immediate_info *info;
+         unsigned int i;
+         FOR_EACH_VEC_ELT ((*chain_info)->m_store_info, i, info)
+           {
+             if (refs_may_alias_p (info->dest, dest))
+               {

I suppose the chain is not yet sorted in any way?

At least for 'dest' which do not have a known constant offset we
could do

   if (base)
     terminate_and_release_chain (base);

to speed things up?  IIRC we do not terminate chains early in 
this phase when we have enough stores to form a group, so
writing a testcase that triggers quadraticness would be as simple
as having

char a[1000000];

void foo ()
{
 a[0] = 1;
 a[1] = 2;
 ....
 a[9999999] = 3;
}

?

so I think you probably want to limit the number of stores you
ever put onto a chain and if you reach that limit, terminate
and release it?  Like just choose 16 or 64?  (and experiment
with the above kind of testcases)

+                 bit_off = byte_off << LOG2_BITS_PER_UNIT;
+                 if (!wi::neg_p (bit_off) && wi::fits_shwi_p (bit_off))
+                   {
+                     bitpos += bit_off.to_shwi ();
+

I think you want bit_off += bitpos before the fits_shwi check
otherwise this add may still overflow.

+                     base_addr = copy_node (base_addr);
+                     TREE_OPERAND (base_addr, 1)
+                       = build_zero_cst (TREE_TYPE (TREE_OPERAND (
+                                                      base_addr, 1)));

I'd prefer

      base_addr = build2 (MEM_REF, ...);

here.

Thanks,
Richard.

> Thanks,
> Kyrill
> 
> [1] https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00573.html
> [2] https://gcc.gnu.org/ml/gcc-patches/2016-10/msg00572.html
> 
> 2016-10-18  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
> 
>     PR middle-end/22141
>     * Makefile.in (OBJS): Add gimple-ssa-store-merging.o.
>     * common.opt (fstore-merging): New Optimization option.
>     * opts.c (default_options_table): Add entry for
>     OPT_ftree_store_merging.
>     * fold-const.h (can_native_encode_type_p): Declare prototype.
>     * fold-const.c (can_native_encode_type_p): Define.
>     * params.def (PARAM_STORE_MERGING_ALLOW_UNALIGNED): Define.
>     * passes.def: Insert pass_tree_store_merging.
>     * tree-pass.h (make_pass_store_merging): Declare extern
>     prototype.
>     * gimple-ssa-store-merging.c: New file.
>     * doc/invoke.texi (Optimization Options): Document
>     -fstore-merging.
> 
> 2016-10-18  Kyrylo Tkachov  <kyrylo.tkachov@arm.com>
>             Jakub Jelinek  <jakub@redhat.com>
>             Andrew Pinski  <pinskia@gmail.com>
> 
>     PR middle-end/22141
>     PR rtl-optimization/23684
>     * gcc.c-torture/execute/pr22141-1.c: New test.
>     * gcc.c-torture/execute/pr22141-2.c: Likewise.
>     * gcc.target/aarch64/ldp_stp_1.c: Adjust for -fstore-merging.
>     * gcc.target/aarch64/ldp_stp_4.c: Likewise.
>     * gcc.dg/store_merging_1.c: New test.
>     * gcc.dg/store_merging_2.c: Likewise.
>     * gcc.dg/store_merging_3.c: Likewise.
>     * gcc.dg/store_merging_4.c: Likewise.
>     * gcc.dg/store_merging_5.c: Likewise.
>     * gcc.dg/store_merging_6.c: Likewise.
>     * gcc.dg/store_merging_7.c: Likewise.
>     * gcc.target/i386/pr22141.c: Likewise.
>     * gcc.target/i386/pr34012.c: Add -fno-store-merging to dg-options.
>     * g++.dg/init/new17.C: Likewise.
> 

-- 
Richard Biener <rguenther@suse.de>
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nuernberg)



More information about the Gcc-patches mailing list