This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible
- From: Richard Biener <richard dot guenther at gmail dot com>
- To: Bin Cheng <Bin dot Cheng at arm dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, nd <nd at arm dot com>
- Date: Wed, 16 Mar 2016 13:20:33 +0100
- Subject: Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible
- Authentication-results: sourceware.org; auth=none
- References: <AM4PR08MB11404693376F44CBBF8D9E4AE78A0 at AM4PR08MB1140 dot eurprd08 dot prod dot outlook dot com>
On Wed, Mar 16, 2016 at 10:59 AM, Bin Cheng <Bin.Cheng@arm.com> wrote:
> Hi,
> One issue revealed in tree ifcvt is the pass stores/tracks DRs against its memory references in IR. This causes difficulty in identifying same memory references appearing in different forms. Given below example:
>
> void foo (int a[], int b[])
> {
> int i;
> for (i = 0; i < 100; i++)
> {
> if (a[i] ==0)
> a[i] = b[i]*4;
> else
> a[i] = b[i]*3;
> }
> }
>
> The gimple dump before tree ifcvt is as:
>
> <bb 2>:
>
> <bb 3>:
> # i_24 = PHI <i_19(7), 0(2)>
> # ivtmp_28 = PHI <ivtmp_23(7), 100(2)>
> _5 = (long unsigned int) i_24;
> _6 = _5 * 4;
> _8 = a_7(D) + _6;
> _9 = *_8;
> if (_9 == 0)
> goto <bb 4>;
> else
> goto <bb 5>;
>
> <bb 4>:
> _11 = b_10(D) + _6;
> _12 = *_11;
> _13 = _12 * 4;
> goto <bb 6>;
>
> <bb 5>:
> _15 = b_10(D) + _6;
> _16 = *_15;
> _17 = _16 * 3;
>
> <bb 6>:
> # cstore_1 = PHI <_13(4), _17(5)>
> *_8 = cstore_1;
> i_19 = i_24 + 1;
> ivtmp_23 = ivtmp_28 - 1;
> if (ivtmp_23 != 0)
> goto <bb 7>;
> else
> goto <bb 8>;
>
> <bb 7>:
> goto <bb 3>;
>
> <bb 8>:
> return;
>
> The two memory references "*_11" and "*_15" are actually the same thing, but ifcvt failed to discover this because they are recorded in different forms. This patch fixes the issue by recording/tracking memory reference against its innermost_loop_behavior if: the memory reference has innermost_loop_behavior and it is not a compound reference.
> BTW, PR56625 reported that this case couldn't be vectorized even tree if-conv can handle it. Interesting thing is at current time, it's tree if-conv that could not handle the case. Once it's if-converted with this patch, vectorizer are able to handle it too.
>
> Bootstrap and test on x86_64 and AArch64. Is it OK, not sure if it's GCC 7?
Hmm.
+ equal_p = true;
+ if (e1->base_address && e2->base_address)
+ equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
+ if (e1->offset && e2->offset)
+ equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
surely better to return false early.
I think we don't want this in tree-data-refs.h also because of ...
@@ -615,15 +619,29 @@
hash_memrefs_baserefs_and_store_DRs_read_written_info
(data_reference_p a)
data_reference_p *master_dr, *base_master_dr;
tree ref = DR_REF (a);
tree base_ref = DR_BASE_OBJECT (a);
+ innermost_loop_behavior *innermost = &DR_INNERMOST (a);
tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
bool exist1, exist2;
- while (TREE_CODE (ref) == COMPONENT_REF
- || TREE_CODE (ref) == IMAGPART_EXPR
- || TREE_CODE (ref) == REALPART_EXPR)
- ref = TREE_OPERAND (ref, 0);
+ /* If reference in DR has innermost loop behavior and it is not
+ a compound memory reference, we store it to innermost_DR_map,
+ otherwise to ref_DR_map. */
+ if (TREE_CODE (ref) == COMPONENT_REF
+ || TREE_CODE (ref) == IMAGPART_EXPR
+ || TREE_CODE (ref) == REALPART_EXPR
+ || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
+ || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
+ {
+ while (TREE_CODE (ref) == COMPONENT_REF
+ || TREE_CODE (ref) == IMAGPART_EXPR
+ || TREE_CODE (ref) == REALPART_EXPR)
+ ref = TREE_OPERAND (ref, 0);
+
+ master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
+ }
+ else
+ master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
we don't want an extra hashmap but replace ref_DR_map entirely. So we'd need to
strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
and REALPART) before creating the DR (or adjust the equality function
and hashing
to disregard them which means subtracting their offset from DR_INIT.
To adjust the references we collect you'd maybe could use a callback
to get_references_in_stmt
to adjust them.
OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple as
Index: tree-if-conv.c
===================================================================
--- tree-if-conv.c (revision 234215)
+++ tree-if-conv.c (working copy)
@@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
for (i = 0; refs->iterate (i, &dr); i++)
{
+ tree *refp = &DR_REF (dr);
+ while ((TREE_CODE (*refp) == COMPONENT_REF
+ && TREE_OPERAND (*refp, 2) == NULL_TREE)
+ || TREE_CODE (*refp) == IMAGPART_EXPR
+ || TREE_CODE (*refp) == REALPART_EXPR)
+ refp = &TREE_OPERAND (*refp, 0);
+ if (refp != &DR_REF (dr))
+ {
+ tree saved_base = *refp;
+ *refp = integer_zero_node;
+
+ if (DR_INIT (dr))
+ {
+ tree poffset;
+ int punsignedp, preversep, pvolatilep;
+ machine_mode pmode;
+ HOST_WIDE_INT pbitsize, pbitpos;
+ get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, &poffset,
+ &pmode, &punsignedp, &preversep, &pvolatilep,
+ false);
+ gcc_assert (poffset == NULL_TREE);
+
+ DR_INIT (dr)
+ = wide_int_to_tree (ssizetype,
+ wi::sub (DR_INIT (dr),
+ pbitpos / BITS_PER_UNIT));
+ }
+
+ *refp = saved_base;
+ DR_REF (dr) = *refp;
+ }
+
dr->aux = XNEW (struct ifc_dr);
DR_BASE_W_UNCONDITIONALLY (dr) = false;
DR_RW_UNCONDITIONALLY (dr) = false;
Can you add a testcase for the vectorization?
Richard.
> Thanks,
> bin
>
>
> 2016-03-11 Bin Cheng <bin.cheng@arm.com>
>
> PR tree-optimization/56625
> PR tree-optimization/69489
> * tree-data-ref.h (innermost_loop_behavior_hash): New class for
> hashing struct innermost_loop_behavior.
> (DR_INNERMOST): New macro.
> * tree-if-conv.c (innermost_DR_map): New map.
> (ref_DR_map, baseref_DR_map): Revise the comment.
> (hash_memrefs_baserefs_and_store_DRs_read_written_info): Store DR
> to innermost_DR_map if it has innermost loop behavior and is not
> a compound reference.
> (ifcvt_memrefs_wont_trap): Get DR from innermost_DR_map if it has
> innermost loop behavior and is not a compound reference.
> (if_convertible_loop_p_1): Initialize innermost_DR_map.
> (if_convertible_loop_p): Release innermost_DR_map.
>
> gcc/testsuite/ChangeLog
> 2016-03-11 Bin Cheng <bin.cheng@arm.com>
>
> PR tree-optimization/56625
> PR tree-optimization/69489
> * gcc.dg/tree-ssa/ifc-pr69489-1.c: New test.
>