This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [patch] Enhance conditional store sinking
On Thu, Mar 24, 2011 at 1:39 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Thu, Mar 24, 2011 at 11:03 AM, H.J. Lu <hjl.tools@gmail.com> wrote:
>> On Tue, Mar 22, 2011 at 6:28 AM, Ira Rosen <ira.rosen@linaro.org> wrote:
>>> On 17 March 2011 16:48, Richard Guenther <richard.guenther@gmail.com> wrote:
>>>
>>>>> + ?then_datarefs = VEC_alloc (data_reference_p, heap, 1);
>>>>> + ?else_datarefs = VEC_alloc (data_reference_p, heap, 1);
>>>>> + ?then_ddrs = VEC_alloc (ddr_p, heap, 1);
>>>>> + ?else_ddrs = VEC_alloc (ddr_p, heap, 1);
>>>>> + ?if (!compute_data_dependences_for_bb (then_bb, false, &then_datarefs,
>>>>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?&then_ddrs)
>>>>
>>>> Can we avoid computing dependencies if the other BB would have no
>>>> data-refs? ?Thus, split collecting datarefs and computing dependences?
>>>
>>> Done.
>>>
>>>>
>>>>> + ? ? ?|| !compute_data_dependences_for_bb (else_bb, false, &else_datarefs,
>>>>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? &else_ddrs)
>>>>> + ? ? ?|| !VEC_length (data_reference_p, then_datarefs)
>>>>> + ? ? ?|| !VEC_length (data_reference_p, else_datarefs))
>>>>> + ? ?{
>>>>> + ? ? ?free_data_refs (then_datarefs);
>>>>> + ? ? ?free_data_refs (else_datarefs);
>>>>> + ? ? ?free_dependence_relations (then_ddrs);
>>>>> + ? ? ?free_dependence_relations (else_ddrs);
>>>>> + ? ? ?return false;
>>>>> + ? ?}
>>>>> +
>>>>> + ?/* Check that there are no read-after-write or write-after-write dependencies
>>>>> + ? ? in THEN_BB. ?*/
>>>>> + ?FOR_EACH_VEC_ELT (ddr_p, then_ddrs, i, ddr)
>>>>> + ? ?{
>>>>> + ? ? ?struct data_reference *dra = DDR_A (ddr);
>>>>> + ? ? ?struct data_reference *drb = DDR_B (ddr);
>>>>> +
>>>>> + ? ? ?if (DDR_ARE_DEPENDENT (ddr) != chrec_known
>>>>> + ? ? ? ? ?&& ((DR_IS_READ (dra) && DR_IS_WRITE (drb)
>>>>> + ? ? ? ? ? ? ? && gimple_uid (DR_STMT (dra)) > gimple_uid (DR_STMT (drb)))
>>>>> + ? ? ? ? ? ? ?|| (DR_IS_READ (drb) && DR_IS_WRITE (dra)
>>>>> + ? ? ? ? ? ? ? ? ?&& gimple_uid (DR_STMT (drb)) > gimple_uid (DR_STMT (dra)))
>>>>> + ? ? ? ? ? ? ?|| (DR_IS_WRITE (dra) && DR_IS_WRITE (drb))))
>>>>
>>>> The gimple_uids are not initialized here, you need to make sure to
>>>> call renumber_gimple_stmt_uids () before starting. ?Note that phiopt
>>>> incrementally changes the IL, so I'm not sure those uids will stay
>>>> valid as stmts are moved across blocks.
>>>
>>> I added a call to renumber_gimple_stmt_uids_in_blocks() before data
>>> dependence checks, and there are no code changes between that and the
>>> checks, so, I think, it should be OK.
>>>
>>>>
>>>>> + ? ? ? ?{
>>>>> + ? ? ? ? ?free_data_refs (then_datarefs);
>>>>> + ? ? ? ? ?free_data_refs (else_datarefs);
>>>>> + ? ? ? ? ?free_dependence_relations (then_ddrs);
>>>>> + ? ? ? ? ?free_dependence_relations (else_ddrs);
>>>>> + ? ? ? ? ?return false;
>>>>> + ? ? ? ?}
>>>>> + ? ?}
>>>>> +
>>>>> + ?/* Check that there are no read-after-write or write-after-write dependencies
>>>>> + ? ? in ELSE_BB. ?*/
>>>>> + ?FOR_EACH_VEC_ELT (ddr_p, else_ddrs, i, ddr)
>>>>> + ? ?{
>>>>> + ? ? ?struct data_reference *dra = DDR_A (ddr);
>>>>> + ? ? ?struct data_reference *drb = DDR_B (ddr);
>>>>> +
>>>>> + ? ? ?if (DDR_ARE_DEPENDENT (ddr) != chrec_known
>>>>> + ? ? ? ? ?&& ((DR_IS_READ (dra) && DR_IS_WRITE (drb)
>>>>> + ? ? ? ? ? ? ? && gimple_uid (DR_STMT (dra)) > gimple_uid (DR_STMT (drb)))
>>>>> + ? ? ? ? ? ? ?|| (DR_IS_READ (drb) && DR_IS_WRITE (dra)
>>>>> + ? ? ? ? ? ? ? ? ?&& gimple_uid (DR_STMT (drb)) > gimple_uid (DR_STMT (dra)))
>>>>> + ? ? ? ? ? ? ?|| (DR_IS_WRITE (dra) && DR_IS_WRITE (drb))))
>>>>> + ? ? ? ?{
>>>>> + ? ? ? ? ?free_data_refs (then_datarefs);
>>>>> + ? ? ? ? ?free_data_refs (else_datarefs);
>>>>> + ? ? ? ? ?free_dependence_relations (then_ddrs);
>>>>> + ? ? ? ? ?free_dependence_relations (else_ddrs);
>>>>> + ? ? ? ? ?return false;
>>>>> + ? ? ? ?}
>>>>> + ? ?}
>>>>> +
>>>>> + ?/* Found pairs of stores with equal LHS. ?*/
>>>>> + ?FOR_EACH_VEC_ELT (data_reference_p, then_datarefs, i, then_dr)
>>>>> + ? ?{
>>>>> + ? ? ?if (DR_IS_READ (then_dr))
>>>>> + ? ? ? ?continue;
>>>>> +
>>>>> + ? ? ?then_store = DR_STMT (then_dr);
>>>>> + ? ? ?then_lhs = gimple_assign_lhs (then_store);
>>>>> + ? ? ?found = false;
>>>>> +
>>>>> + ? ? ?FOR_EACH_VEC_ELT (data_reference_p, else_datarefs, j, else_dr)
>>>>> + ? ? ? ?{
>>>>> + ? ? ? ? ?if (DR_IS_READ (else_dr))
>>>>> + ? ? ? ? ? ?continue;
>>>>> +
>>>>> + ? ? ? ? ?else_store = DR_STMT (else_dr);
>>>>> + ? ? ? ? ?else_lhs = gimple_assign_lhs (else_store);
>>>>> +
>>>>> + ? ? ? ? ?if (operand_equal_p (then_lhs, else_lhs, 0))
>>>>> + ? ? ? ? ? ?{
>>>>> + ? ? ? ? ? ? ?found = true;
>>>>> + ? ? ? ? ? ? ?break;
>>>>> + ? ? ? ? ? ?}
>>>>> + ? ? ? ?}
>>>>> +
>>>>> + ? ? ?if (!found)
>>>>> + ? ? ? ?continue;
>>>>> +
>>>>> + ? ? ?res = cond_if_else_store_replacement_1 (then_bb, else_bb, join_bb,
>>>>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?then_store, else_store);
>>>>
>>>> So you are executing if-else store replacement for common data reference
>>>> pairs only. ?I think it's cheaper to collect those pairs before computing
>>>> dependences and only if there is at least one pair perform the optimization.
>>>
>>> OK, I changed the order.
>>>
>>>>
>>>> You basically perform store sinking, creating a PHI node for each store
>>>> you sink (that's then probably if-converted by if-conversion later, eventually
>>>> redundant with -ftree-loop-if-convert-stores?)
>>>>
>>>> I am concerned that having no bound on the number of stores sunk will
>>>> increase register pressure and does not allow scheduling of the stores
>>>> in an optimal way. ?Consider two BBs similar to
>>>>
>>>> ?t = a + b;
>>>> ?*p = t;
>>>> ?t = c + d;
>>>> ?*q = t;
>>>>
>>>> where the transformation undoes a good schedule and makes fixing it
>>>> impossible if the remaining statements are not if-convertible.
>>>>
>>>> Thus, I'd rather make this transformation only if in the end the conditional
>>>> can be completely if-converted.
>>>>
>>>> I realize that we already do unbound and very aggressive if-conversion
>>>> in tree-ifcvt.c regardless of whether the loop will be vectorized or not
>>>> (including leaking the if-converted loops to the various loop versions
>>>> we create during vectorization, causing only code-size bloat). ?But it's
>>>> not a good reason to continue down this road ;)
>>>>
>>>> I suppose a simple maximum on the number of stores to sink
>>>> controllable by a param should do, eventually disabling this
>>>> extended transformation when vectorization is disabled?
>>>
>>> I added a param, and it's set to 0 if either vectorization or if-conversion
>>> is disabled.
>>>
>>>>
>>>> Otherwise the implementation looks good.
>>>
>>> Bootstrapped and tested on powerpc64-suse-linux.
>>> OK to apply?
>>>
>>> Thanks,
>>> Ira
>>>
>>> ChangeLog:
>>>
>>> ? ? * doc/invoke.texi (max-stores-to-sink): Document.
>>> ? ? * params.h (MAX_STORES_TO_SINK): Define.
>>> ? ? * opts.c (finish_options): Set MAX_STORES_TO_SINK to 0
>>> ? ? if either vectorization or if-conversion is disabled.
>>> ? ? * tree-data-ref.c (dr_equal_offsets_p1): Moved and renamed from
>>> ? ? tree-vect-data-refs.c vect_equal_offsets.
>>> ? ? (dr_equal_offsets_p): New function.
>>> ? ? (find_data_references_in_bb): Remove static.
>>> ? ? * tree-data-ref.h (find_data_references_in_bb): Declare.
>>> ? ? (dr_equal_offsets_p): Likewise.
>>> ? ? * tree-vect-data-refs.c (vect_equal_offsets): Move to tree-data-ref.c.
>>> ? ? (vect_drs_dependent_in_basic_block): Update calls to vect_equal_offsets.
>>> ? ? (vect_check_interleaving): Likewise.
>>> ? ? * tree-ssa-phiopt.c: Include cfgloop.h and tree-data-ref.h.
>>> ? ? (cond_if_else_store_replacement): Rename to...
>>> ? ? (cond_if_else_store_replacement_1): ... this. Change arguments and
>>> ? ? documentation.
>>> ? ? (cond_if_else_store_replacement): New function.
>>> ? ? * Makefile.in (tree-ssa-phiopt.o): Adjust dependencies.
>>> ? ? * params.def (PARAM_MAX_STORES_TO_SINK): Define.
>>>
>>> testsuite/ChangeLog:
>>>
>>> ? ? * gcc.dg/vect/vect-cselim-1.c: New test.
>>> ? ? * gcc.dg/vect/vect-cselim-2.c: New test.
>>>
>>
>> This caused:
>>
>> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=48270
>
> Yep, I see
>
> FAIL: gcc.c-torture/execute/builtins/strlen-2.c compilation, ?-O3 -fomit-frame-p
> ointer ?(internal compiler error)
>
> on x86_64-linux with
>
> #0 ?0x0000000001f53253 in gimple_uid (g=0x0)
> ? ?at /space/rguenther/src/svn/trunk/gcc/gimple.h:1297
> 1297 ? ? ?return g->gsbase.uid;
> (gdb)
> Bottom (innermost) frame selected; you cannot go down.
> (gdb) up
> #1 ?0x0000000001f7753d in cond_if_else_store_replacement (
> ? ?then_bb=0x7ffff533abc8, else_bb=0x7ffff533ac30, join_bb=0x7ffff533ac98)
> ? ?at /space/rguenther/src/svn/trunk/gcc/tree-ssa-phiopt.c:1511
> 1511 ? ? ? ? ?if (DDR_ARE_DEPENDENT (ddr) != chrec_known
>
> DR_STMT of dra is NULL. ?No idea how that can happen.
Maybe because
1495 compute_all_dependences (then_datarefs, &then_ddrs, NULL, false);
1496 compute_all_dependences (else_datarefs, &else_ddrs, NULL, false);
1497 free_data_refs (then_datarefs);
1498 free_data_refs (else_datarefs);
frees the data refs too early?
Richard.