This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH 3/5] The IPA-SRA itself.


Hi,

On Fri, Sep 04, 2009 at 11:21:58AM +0200, Richard Guenther wrote:
> On Thu, 3 Sep 2009, Martin Jambor wrote:
> 
> > Hi,
> > 
> > On Fri, Aug 07, 2009 at 11:50:02AM +0200, Richard Guenther wrote:
> > > On Wed, 5 Aug 2009, Martin Jambor wrote:
> > > 
> > > > Hi,
> > > > 
> > > > this patch  contains the implementation  of IPA-SRA which is  built on
> > > > top of the new intraprocedural SRA.
> > > > 
> > > > This is a re-submission in  which I incorporated most of the feedback.
> > > > Disqualification of  addressable types and  direct uses are  done when
> > > > selecting candidates  in the initial  phase.  The check whether  it is
> > > > legal  to move  dereferences to  the caller  has been  re-written from
> > > > scratch.  It now does not utilize dominators but rather propagates the
> > > > dereference  information across the  control flow  graph in  a fashion
> > > > similar  to the simplest  constant-propagation algorithms.   In short,
> > > > for each  BB and pointer  candidate I note down  in local_dereferences
> > > > whether (and  how much of it)  it was dereferenced  there.  Then, when
> > > > actually  checking   the  dereference   info,  I  propagate   this  in
> > > > global_dereferences to other basic blocks if all of their predecessors
> > > > also   are   marked   as   having  dereferenced   the   parameter   in
> > > > global_dereferences.  I propagate this information until it stabilizes
> > > > and  then check  it for  BBs which  can abort  the progressing  of the
> > > > functions  (returns, calls of  non-pure functions,  potential infinite
> > > > loops, external exceptions).
> > > 
> > > I'm still confused about all this complexity and why you need to consider
> > > offsets at all.
> > 
> > The concern here is that the caller might allocate only a part of the
> > whole agggregate if it knows only a part is used.
> 
> Uh, ok.  Can you add a testcase covering this case?

Originally I thought this was primarily necessary for unbounded
arrays at the end of structures but apparently those are not even
considered candidates for IPA-SRA which checks that it knows the
aggregate size.  But similar tricks can be played with unions (unless
someone finds a reason why it is invalid C, it looks a bit odd):

---------- cut here ----------
struct big
{
  int data[1000000];
};

struct small
{
  int data[10];
};

union both
{
  struct big big;
  struct small small;
};

extern void *malloc(__SIZE_TYPE__);
extern void free (void *);

static int
__attribute__((noinline))
foo (int fail, union both *agg)
{
  int r;
  if (fail)
    r = agg->big.data[999999];
  else
    r = agg->small.data[0];
  return r;
}

int main (int argc, char *argv[])
{
  union both *agg = malloc (sizeof (struct small));
  int r;

  r = foo ((argc > 2000), agg);

  free (agg);
  return r;
}
---------- cut here ----------


> 
> > > For simplicity I would simply check dereferencing in 
> > > ptr_parm_has_direct_uses where you already walk all direct uses:
> > > 
> > > > +ptr_parm_has_direct_uses (tree parm)
> > > > +{
> > > > +  imm_use_iterator ui;
> > > > +  gimple stmt;
> > > > +  tree name = gimple_default_def (cfun, parm);
> > > > +  bool ret = false;
> > > > +
> > > > +  FOR_EACH_IMM_USE_STMT (stmt, ui, name)
> > > > +    {
> > > > +      if (gimple_assign_single_p (stmt))
> > > > +	{
> > > > +	  tree rhs = gimple_assign_rhs1 (stmt);
> > > > +	  if (rhs == name)
> > > > +	    ret = true;
> > > > +	  else if (TREE_CODE (rhs) == ADDR_EXPR)
> > > > +	    {
> > > > +	      do
> > > > +		{
> > > > +		  rhs = TREE_OPERAND (rhs, 0);
> > > > +		}
> > > > +	      while (handled_component_p (rhs));
> > > > +	      if (INDIRECT_REF_P (rhs) && TREE_OPERAND (rhs, 0) == name)
> > > > +		ret = true;
> > > > +	    }
> > > 
> > >         lhs = gimple_assign_lhs (stmt);
> > >         if (TREE_CODE (lhs) != SSA_NAME)
> > >           {
> > >             while (handled_component_p (lhs))
> > >              lhs = TREE_OPERAND (lhs, 0);
> > >             if (INDIRECT_REF_P (lhs) && TREE_OPERAND (lhs, 0) == name
> > >                 && (gimple_bb (stmt) == single_succ 
> > > (ENTRY_BLOCK_PTR)
> > >                     || dominated_by_p (CDI_POST_DOMINATORS,
> > > 				   single_succ (ENTRY_BLOCK_PTR),
> > >                                    gimple_bb (stmt))))
> > >               {
> > >                 note parm can be dereferenced in the caller
> > >               }
> > >           }
> > > 
> > >         same for the rhs (and maybe call lhs and call args, but that
> > >         probably doesn't make too much of a difference).
> > > 
> > 
> > As far as I can remember, me and Honza (actually, the whole algorithm
> > was Honza's idea, and most probably a good one) have decided not to
> > rely on dominance but rather do it in this way because of cases like
> > this:
> > 
> >   if (condition)
> >     {
> >       a = param->field + 1;
> >       /* maybe do a lot of othe stuff */
> >     }
> >   else
> >     {
> >       a = param->field + 5;
> >       /* maybe do a lot of othe stuff */
> >     }
> > 
> > Neither of the loads (dereferences) post-dominates exit but it is
> > still safe to proceed with dereferencing in the caller.  In the test
> > cases I have examined the loads remained in the branches and were not
> > moved before the condition.  Moreover, the scenario seems very frequent.
> 
> Hmm, ok.
> 
> I don't quite follow analyze_caller_derference_legality though
> (or rather local vs global dereferences).  Local dereferences are
> built up during access building and in a_c_d_l you propagate them
> into the global ones.  Why do you update local_dereferences i
> during this propagation? 

I remember that I deliberately added the global array for some
specific reason but looking at the code I don't see any now.  So I
guess that either the reason went away or I simply messed up.  I will
try to remove it.

> In fact - why have two arrays anyway and
> not just propagate in-place?  Also in the caller of a_c_d_l you
> check for each BB and for each param whether the group is not
> necessarily dereferenced - but you just propagated the offsets, so
> checking the entry basic block should be enough?

I propagate the offsets "downwards," i.e. in the direction of the
control flow edges and check it at each block marked as "final"
(i.e. returning, externally throwing etc...).

> 
> That I don't get it means it lacks documentation - what is the
> converged state after propagation supposed to look like?  Thus, what
> can you actually do with it?

OK, I will add this to the source and re-post.

The bigger worry at the moment is that IPA-SRA causes different output
when generating debug info.  Hopefully I can find out what causes that
soon.

Thanks a lot,

Martin


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]