[patch] Enhance conditional store sinking
Richard Guenther
richard.guenther@gmail.com
Wed Mar 23 15:37:00 GMT 2011
On Tue, Mar 22, 2011 at 2:28 PM, Ira Rosen <ira.rosen@linaro.org> wrote:
> On 17 March 2011 16:48, Richard Guenther <richard.guenther@gmail.com> wrote:
>
>>> + then_datarefs = VEC_alloc (data_reference_p, heap, 1);
>>> + else_datarefs = VEC_alloc (data_reference_p, heap, 1);
>>> + then_ddrs = VEC_alloc (ddr_p, heap, 1);
>>> + else_ddrs = VEC_alloc (ddr_p, heap, 1);
>>> + if (!compute_data_dependences_for_bb (then_bb, false, &then_datarefs,
>>> + &then_ddrs)
>>
>> Can we avoid computing dependencies if the other BB would have no
>> data-refs? Thus, split collecting datarefs and computing dependences?
>
> Done.
>
>>
>>> + || !compute_data_dependences_for_bb (else_bb, false, &else_datarefs,
>>> + &else_ddrs)
>>> + || !VEC_length (data_reference_p, then_datarefs)
>>> + || !VEC_length (data_reference_p, else_datarefs))
>>> + {
>>> + free_data_refs (then_datarefs);
>>> + free_data_refs (else_datarefs);
>>> + free_dependence_relations (then_ddrs);
>>> + free_dependence_relations (else_ddrs);
>>> + return false;
>>> + }
>>> +
>>> + /* Check that there are no read-after-write or write-after-write dependencies
>>> + in THEN_BB. */
>>> + FOR_EACH_VEC_ELT (ddr_p, then_ddrs, i, ddr)
>>> + {
>>> + struct data_reference *dra = DDR_A (ddr);
>>> + struct data_reference *drb = DDR_B (ddr);
>>> +
>>> + if (DDR_ARE_DEPENDENT (ddr) != chrec_known
>>> + && ((DR_IS_READ (dra) && DR_IS_WRITE (drb)
>>> + && gimple_uid (DR_STMT (dra)) > gimple_uid (DR_STMT (drb)))
>>> + || (DR_IS_READ (drb) && DR_IS_WRITE (dra)
>>> + && gimple_uid (DR_STMT (drb)) > gimple_uid (DR_STMT (dra)))
>>> + || (DR_IS_WRITE (dra) && DR_IS_WRITE (drb))))
>>
>> The gimple_uids are not initialized here, you need to make sure to
>> call renumber_gimple_stmt_uids () before starting. Note that phiopt
>> incrementally changes the IL, so I'm not sure those uids will stay
>> valid as stmts are moved across blocks.
>
> I added a call to renumber_gimple_stmt_uids_in_blocks() before data
> dependence checks, and there are no code changes between that and the
> checks, so, I think, it should be OK.
>
>>
>>> + {
>>> + free_data_refs (then_datarefs);
>>> + free_data_refs (else_datarefs);
>>> + free_dependence_relations (then_ddrs);
>>> + free_dependence_relations (else_ddrs);
>>> + return false;
>>> + }
>>> + }
>>> +
>>> + /* Check that there are no read-after-write or write-after-write dependencies
>>> + in ELSE_BB. */
>>> + FOR_EACH_VEC_ELT (ddr_p, else_ddrs, i, ddr)
>>> + {
>>> + struct data_reference *dra = DDR_A (ddr);
>>> + struct data_reference *drb = DDR_B (ddr);
>>> +
>>> + if (DDR_ARE_DEPENDENT (ddr) != chrec_known
>>> + && ((DR_IS_READ (dra) && DR_IS_WRITE (drb)
>>> + && gimple_uid (DR_STMT (dra)) > gimple_uid (DR_STMT (drb)))
>>> + || (DR_IS_READ (drb) && DR_IS_WRITE (dra)
>>> + && gimple_uid (DR_STMT (drb)) > gimple_uid (DR_STMT (dra)))
>>> + || (DR_IS_WRITE (dra) && DR_IS_WRITE (drb))))
>>> + {
>>> + free_data_refs (then_datarefs);
>>> + free_data_refs (else_datarefs);
>>> + free_dependence_relations (then_ddrs);
>>> + free_dependence_relations (else_ddrs);
>>> + return false;
>>> + }
>>> + }
>>> +
>>> + /* Found pairs of stores with equal LHS. */
>>> + FOR_EACH_VEC_ELT (data_reference_p, then_datarefs, i, then_dr)
>>> + {
>>> + if (DR_IS_READ (then_dr))
>>> + continue;
>>> +
>>> + then_store = DR_STMT (then_dr);
>>> + then_lhs = gimple_assign_lhs (then_store);
>>> + found = false;
>>> +
>>> + FOR_EACH_VEC_ELT (data_reference_p, else_datarefs, j, else_dr)
>>> + {
>>> + if (DR_IS_READ (else_dr))
>>> + continue;
>>> +
>>> + else_store = DR_STMT (else_dr);
>>> + else_lhs = gimple_assign_lhs (else_store);
>>> +
>>> + if (operand_equal_p (then_lhs, else_lhs, 0))
>>> + {
>>> + found = true;
>>> + break;
>>> + }
>>> + }
>>> +
>>> + if (!found)
>>> + continue;
>>> +
>>> + res = cond_if_else_store_replacement_1 (then_bb, else_bb, join_bb,
>>> + then_store, else_store);
>>
>> So you are executing if-else store replacement for common data reference
>> pairs only. I think it's cheaper to collect those pairs before computing
>> dependences and only if there is at least one pair perform the optimization.
>
> OK, I changed the order.
>
>>
>> You basically perform store sinking, creating a PHI node for each store
>> you sink (that's then probably if-converted by if-conversion later, eventually
>> redundant with -ftree-loop-if-convert-stores?)
>>
>> I am concerned that having no bound on the number of stores sunk will
>> increase register pressure and does not allow scheduling of the stores
>> in an optimal way. Consider two BBs similar to
>>
>> t = a + b;
>> *p = t;
>> t = c + d;
>> *q = t;
>>
>> where the transformation undoes a good schedule and makes fixing it
>> impossible if the remaining statements are not if-convertible.
>>
>> Thus, I'd rather make this transformation only if in the end the conditional
>> can be completely if-converted.
>>
>> I realize that we already do unbound and very aggressive if-conversion
>> in tree-ifcvt.c regardless of whether the loop will be vectorized or not
>> (including leaking the if-converted loops to the various loop versions
>> we create during vectorization, causing only code-size bloat). But it's
>> not a good reason to continue down this road ;)
>>
>> I suppose a simple maximum on the number of stores to sink
>> controllable by a param should do, eventually disabling this
>> extended transformation when vectorization is disabled?
>
> I added a param, and it's set to 0 if either vectorization or if-conversion
> is disabled.
>
>>
>> Otherwise the implementation looks good.
>
> Bootstrapped and tested on powerpc64-suse-linux.
> OK to apply?
Ok with
+ if (find_data_references_in_bb (NULL, then_bb, &then_datarefs)
+ == chrec_dont_know
+ || !VEC_length (data_reference_p, then_datarefs)
+ || find_data_references_in_bb (NULL, else_bb, &else_datarefs)
+ == chrec_dont_know
paranteses added around the call & comparison to make auto-indent
work.
Thanks,
Richard.
> Thanks,
> Ira
>
> ChangeLog:
>
> * doc/invoke.texi (max-stores-to-sink): Document.
> * params.h (MAX_STORES_TO_SINK): Define.
> * opts.c (finish_options): Set MAX_STORES_TO_SINK to 0
> if either vectorization or if-conversion is disabled.
> * tree-data-ref.c (dr_equal_offsets_p1): Moved and renamed from
> tree-vect-data-refs.c vect_equal_offsets.
> (dr_equal_offsets_p): New function.
> (find_data_references_in_bb): Remove static.
> * tree-data-ref.h (find_data_references_in_bb): Declare.
> (dr_equal_offsets_p): Likewise.
> * tree-vect-data-refs.c (vect_equal_offsets): Move to tree-data-ref.c.
> (vect_drs_dependent_in_basic_block): Update calls to vect_equal_offsets.
> (vect_check_interleaving): Likewise.
> * tree-ssa-phiopt.c: Include cfgloop.h and tree-data-ref.h.
> (cond_if_else_store_replacement): Rename to...
> (cond_if_else_store_replacement_1): ... this. Change arguments and
> documentation.
> (cond_if_else_store_replacement): New function.
> * Makefile.in (tree-ssa-phiopt.o): Adjust dependencies.
> * params.def (PARAM_MAX_STORES_TO_SINK): Define.
>
> testsuite/ChangeLog:
>
> * gcc.dg/vect/vect-cselim-1.c: New test.
> * gcc.dg/vect/vect-cselim-2.c: New test.
>
More information about the Gcc-patches
mailing list