This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 3/5] The IPA-SRA itself.
- From: Martin Jambor <mjambor at suse dot cz>
- To: Richard Guenther <rguenther at suse dot de>
- Cc: GCC Patches <gcc-patches at gcc dot gnu dot org>, Jan Hubicka <hubicka at ucw dot cz>
- Date: Thu, 10 Sep 2009 17:21:01 +0200
- Subject: Re: [PATCH 3/5] The IPA-SRA itself.
- References: <20090805152128.023505229@virgil.suse.cz> <20090805152152.312738451@virgil.suse.cz> <alpine.LNX.2.00.0908071137370.16347@zhemvz.fhfr.qr> <20090903194217.GD10997@virgil.suse.cz> <alpine.LNX.2.00.0909041106420.28140@zhemvz.fhfr.qr>
Hi,
On Fri, Sep 04, 2009 at 11:21:58AM +0200, Richard Guenther wrote:
> On Thu, 3 Sep 2009, Martin Jambor wrote:
>
> > Hi,
> >
> > On Fri, Aug 07, 2009 at 11:50:02AM +0200, Richard Guenther wrote:
> > > On Wed, 5 Aug 2009, Martin Jambor wrote:
> > >
> > > > Hi,
> > > >
> > > > this patch contains the implementation of IPA-SRA which is built on
> > > > top of the new intraprocedural SRA.
> > > >
> > > > This is a re-submission in which I incorporated most of the feedback.
> > > > Disqualification of addressable types and direct uses are done when
> > > > selecting candidates in the initial phase. The check whether it is
> > > > legal to move dereferences to the caller has been re-written from
> > > > scratch. It now does not utilize dominators but rather propagates the
> > > > dereference information across the control flow graph in a fashion
> > > > similar to the simplest constant-propagation algorithms. In short,
> > > > for each BB and pointer candidate I note down in local_dereferences
> > > > whether (and how much of it) it was dereferenced there. Then, when
> > > > actually checking the dereference info, I propagate this in
> > > > global_dereferences to other basic blocks if all of their predecessors
> > > > also are marked as having dereferenced the parameter in
> > > > global_dereferences. I propagate this information until it stabilizes
> > > > and then check it for BBs which can abort the progressing of the
> > > > functions (returns, calls of non-pure functions, potential infinite
> > > > loops, external exceptions).
> > >
> > > I'm still confused about all this complexity and why you need to consider
> > > offsets at all.
> >
> > The concern here is that the caller might allocate only a part of the
> > whole agggregate if it knows only a part is used.
>
> Uh, ok. Can you add a testcase covering this case?
Originally I thought this was primarily necessary for unbounded
arrays at the end of structures but apparently those are not even
considered candidates for IPA-SRA which checks that it knows the
aggregate size. But similar tricks can be played with unions (unless
someone finds a reason why it is invalid C, it looks a bit odd):
---------- cut here ----------
struct big
{
int data[1000000];
};
struct small
{
int data[10];
};
union both
{
struct big big;
struct small small;
};
extern void *malloc(__SIZE_TYPE__);
extern void free (void *);
static int
__attribute__((noinline))
foo (int fail, union both *agg)
{
int r;
if (fail)
r = agg->big.data[999999];
else
r = agg->small.data[0];
return r;
}
int main (int argc, char *argv[])
{
union both *agg = malloc (sizeof (struct small));
int r;
r = foo ((argc > 2000), agg);
free (agg);
return r;
}
---------- cut here ----------
>
> > > For simplicity I would simply check dereferencing in
> > > ptr_parm_has_direct_uses where you already walk all direct uses:
> > >
> > > > +ptr_parm_has_direct_uses (tree parm)
> > > > +{
> > > > + imm_use_iterator ui;
> > > > + gimple stmt;
> > > > + tree name = gimple_default_def (cfun, parm);
> > > > + bool ret = false;
> > > > +
> > > > + FOR_EACH_IMM_USE_STMT (stmt, ui, name)
> > > > + {
> > > > + if (gimple_assign_single_p (stmt))
> > > > + {
> > > > + tree rhs = gimple_assign_rhs1 (stmt);
> > > > + if (rhs == name)
> > > > + ret = true;
> > > > + else if (TREE_CODE (rhs) == ADDR_EXPR)
> > > > + {
> > > > + do
> > > > + {
> > > > + rhs = TREE_OPERAND (rhs, 0);
> > > > + }
> > > > + while (handled_component_p (rhs));
> > > > + if (INDIRECT_REF_P (rhs) && TREE_OPERAND (rhs, 0) == name)
> > > > + ret = true;
> > > > + }
> > >
> > > lhs = gimple_assign_lhs (stmt);
> > > if (TREE_CODE (lhs) != SSA_NAME)
> > > {
> > > while (handled_component_p (lhs))
> > > lhs = TREE_OPERAND (lhs, 0);
> > > if (INDIRECT_REF_P (lhs) && TREE_OPERAND (lhs, 0) == name
> > > && (gimple_bb (stmt) == single_succ
> > > (ENTRY_BLOCK_PTR)
> > > || dominated_by_p (CDI_POST_DOMINATORS,
> > > single_succ (ENTRY_BLOCK_PTR),
> > > gimple_bb (stmt))))
> > > {
> > > note parm can be dereferenced in the caller
> > > }
> > > }
> > >
> > > same for the rhs (and maybe call lhs and call args, but that
> > > probably doesn't make too much of a difference).
> > >
> >
> > As far as I can remember, me and Honza (actually, the whole algorithm
> > was Honza's idea, and most probably a good one) have decided not to
> > rely on dominance but rather do it in this way because of cases like
> > this:
> >
> > if (condition)
> > {
> > a = param->field + 1;
> > /* maybe do a lot of othe stuff */
> > }
> > else
> > {
> > a = param->field + 5;
> > /* maybe do a lot of othe stuff */
> > }
> >
> > Neither of the loads (dereferences) post-dominates exit but it is
> > still safe to proceed with dereferencing in the caller. In the test
> > cases I have examined the loads remained in the branches and were not
> > moved before the condition. Moreover, the scenario seems very frequent.
>
> Hmm, ok.
>
> I don't quite follow analyze_caller_derference_legality though
> (or rather local vs global dereferences). Local dereferences are
> built up during access building and in a_c_d_l you propagate them
> into the global ones. Why do you update local_dereferences i
> during this propagation?
I remember that I deliberately added the global array for some
specific reason but looking at the code I don't see any now. So I
guess that either the reason went away or I simply messed up. I will
try to remove it.
> In fact - why have two arrays anyway and
> not just propagate in-place? Also in the caller of a_c_d_l you
> check for each BB and for each param whether the group is not
> necessarily dereferenced - but you just propagated the offsets, so
> checking the entry basic block should be enough?
I propagate the offsets "downwards," i.e. in the direction of the
control flow edges and check it at each block marked as "final"
(i.e. returning, externally throwing etc...).
>
> That I don't get it means it lacks documentation - what is the
> converged state after propagation supposed to look like? Thus, what
> can you actually do with it?
OK, I will add this to the source and re-post.
The bigger worry at the moment is that IPA-SRA causes different output
when generating debug info. Hopefully I can find out what causes that
soon.
Thanks a lot,
Martin