[PATCH 3/5] New intraprocedural Scalar Reduction of Aggregates.

Martin Jambor mjambor@suse.cz
Tue May 12 00:24:00 GMT 2009


Hi,

thanks for a quick reply.  Some clarifications below:

On Sun, May 10, 2009 at 01:48:01PM +0200, Richard Guenther wrote:
> On Sun, 10 May 2009, Martin Jambor wrote:
> 
> > > >       expr = TREE_OPERAND (expr, 0);
> > > >       bit_ref = true;
> > > >     }
> > > >   else
> > > >     bit_ref = false;
> > > > 
> > > >   while (TREE_CODE (expr) == NOP_EXPR
> > > 
> > > CONVERT_EXPR_P (expr)
> > 
> > OK... but  at another place  in the email  you said it might  not even
> > appear in a valid gimple statement?  Should I remove it altogether?
> 
> Indeed.  If you not build trees from tuple stmts then a
> NOP_EXPR cannot appear as a rhs1 or rhs2 of an assignment (instead
> it is always the subcode in gimple stmt and the rhs1 is simply sth
> valid for is_gimple_val).

OK

> 
> > > > 	 || TREE_CODE (expr) == VIEW_CONVERT_EXPR
> > > > 	 || TREE_CODE (expr) == REALPART_EXPR
> > > > 	 || TREE_CODE (expr) == IMAGPART_EXPR)
> > > >     expr = TREE_OPERAND (expr, 0);
> > > 
> > > Why do this here btw, and not just lump ...
> > > 
> > > >   switch (TREE_CODE (expr))
> > > >     {
> > > >     case ADDR_EXPR:
> > > >     case SSA_NAME:
> > > >     case INDIRECT_REF:
> > > >       break;
> > > > 
> > > >     case VAR_DECL:
> > > >     case PARM_DECL:
> > > >     case RESULT_DECL:
> > > >     case COMPONENT_REF:
> > > >     case ARRAY_REF:
> > > >       ret = create_access (expr, write);
> > > >       break;
> > > 
> > > ... this ...
> > > 
> > > >     case REALPART_EXPR:
> > > >     case IMAGPART_EXPR:
> > > >       expr = TREE_OPERAND (expr, 0);
> > > >       ret = create_access (expr, write);
> > > 
> > > ... and this together?  Won't you create bogus accesses if you
> > > strip for example IMAGPART_EXPR (which has non-zero offset)?
> > 
> > That would  break the complex  number into its components.   I thought
> > that they are  meant to stay together for  some reason, otherwise they
> > would not be represented explicitly  in gimple... do you think it does
> > not matter?  What about vectors then?
> > 
> > The access is not bogus because modification functions take care of
> > these statements in a special way.  However, if it is indeed OK to
> > split complex numbers into their components, I will gladly simplify
> > this as you suggested.
> 
> Yes, it is valid to split them (and complex lowering indeed does that).
> It _might_ be useful to keep a complex together in a single SSA_NAME
> for optimization purposes, but I guess you detect that anyway if there
> is a read of the whole complex element into a register and keep it
> that way.
> 
> I would favor simplifying SRA in this case and just split them if
> that is valid.

OK, I will try to do that and incormporate it into the patch if it
indeed simplifies things.  At least analysis of access trees will OTOH
become more complex.  Unfortunately, I need to concentrate on another
two things this week so I'll do that early the next one.

> > > >     return SRA_SA_NONE;
> > > >
> > > >   lhs_ptr = gimple_assign_lhs_ptr (stmt);
> > > >   rhs_ptr = gimple_assign_rhs1_ptr (stmt);
> > > 
> > > you probably don't need to pass pointers to trees everywhere as you
> > > are not changing them.
> > 
> > Well, this  function is a  callback called by scan_function  which can
> > also  call  sra_modify_expr  in  the  last  stage  of  the  pass  when
> > statements  are modified.   I have  considered splitting  the function
> > into two but  in the end I  thought they would be too  similar and the
> > overhead is hopefully manageable.
> 
> Yeah, I noticed this later.  It is somewhat confusing at first sight,
> so maybe just amending the comment before this function could
> clarify things.

OK, for my part, I've realized that build_access_from_expr_1 indeed
does not use the gsi parameter (and is not a callback, unlike
build_access_from_expr).

> > > >   if (disqualify_ops_if_throwing_stmt (stmt, lhs_ptr, rhs_ptr))
> > > >     return SRA_SA_NONE;
> > > > 
> > > >   racc = build_access_from_expr_1 (rhs_ptr, gsi, false);
> > > >   lacc = build_access_from_expr_1 (lhs_ptr, gsi, true);
> > > 
> > > just avoid calling into build_access_from_expr_1 for SSA_NAMEs
> > > or is_gimple_min_invariant lhs/rhs, that should make that
> > > function more regular.
> > 
> > In what sense?  build_access_from_expr_1 looks at TREE_CODE anyway and
> > can discard the two cases,  without for example looking into ADR_EXPRs
> > like is_gimple_min_invariant().
> > 
> > But if you really think it is indeed beneficial, I can do that, sure -
> > to me it just looks ugly).
> 
> Ok, just keep it as is.
> 
> > > >   if (lacc && racc
> > > >       && !lacc->grp_unscalarizable_region
> > > >       && !racc->grp_unscalarizable_region
> > > >       && AGGREGATE_TYPE_P (TREE_TYPE (*lhs_ptr))
> > > >       && lacc->size <= racc->size
> > > >       && useless_type_conversion_p (lacc->type, racc->type))
> > > 
> > > useless_type_conversion_p should be always true here.
> > 
> > I don't think so, build_access_from_expr_1 can look through V_C_Es and
> > the types of accesses are the type of the operand in such cases..
> 
> Ok, but what is the point of looking through V_C_Es there if it makes
> this test fail?  Hmm, IIRC this was only to track struct copies, right?
> I guess it's ok then.

Yes, only for that.  The point of looking through V_C_Es is the size
incosistency.

> 
> > > That would just be useless information.  I guess you copied this
> > > from old SRA?
> > 
> > Yes.  All this fancy naming stuff  is quite useless but I find it very
> > handy when debugging SRA issues.
> 
> Yeah, sort of.  Still using no name in that case will do exactly
> the same thing ;)
> 
> > > > static tree
> > > > create_access_replacement (struct access *access)
> > > > {
> > > >   tree repl;
> > > > 
> > > >   repl = make_rename_temp (access->type, "SR");
> > > >   get_var_ann (repl);
> > > >   add_referenced_var (repl);
> > > > 
> > > >   DECL_SOURCE_LOCATION (repl) = DECL_SOURCE_LOCATION (access->base);
> > > >   DECL_ARTIFICIAL (repl) = 1;
> > > > 
> > > >   if (DECL_NAME (access->base) && !DECL_IGNORED_P (access->base))
> > > 
> > > at least && !DECL_ARTIFICIAL (access->base) I think.
> > 
> > This part is also largely copied from the old SRA.  So far it seems to
> > work nicely, replacements of artificial declarations get SRsome_number
> > fancy names and that makes  them easy to distinguish.  Nevertheless, I
> > can change the condition if it is somehow wrong.  Or do you expect any
> > other problems beside not-so-fancy fancy names?
> 
> No, it merely uses up memory.  Not that other passes do not do this ...
> 
> Thus, I probably do not care too much.

Well, I'd like to have the fancy names when SRA gets merged and new
issues are likely to come up.  We can always remove this later on.

 
> > > >   if (access->grp_bfr_lhs)
> > > >     DECL_GIMPLE_REG_P (repl) = 0;
> > > 
> > > But you never set it (see update_address_taken for more cases,
> > > most notably VIEW_CONVERT_EXPR on the lhs which need to be taken
> > > care of).  You should set it for COMPLEX_TYPE and VECTOR_TYPE 
> > > replacements.
> > 
> > This function  is the  only place where  I still  use make_rename_temp
> > which sets it  exactly in these two cases.  I did  not really know why
> > it is  required in these two  cases and only  in these two cases  so I
> > left it there, at least for  now.  I guess I understand that now after
> > seeing update_address_taken.
> > 
> > I can  replace this  with calling create_tmp_var()  and doing  all the
> > rest  that make_rename_temp does  - I  believe that  you intend  to to
> > remove it - I have just not found out why it is so bad.
> 
> The bad thing about it is that it supports using the SSA renamer
> to write a single variable into SSA.  That is usually more costly
> than just manually allocating SSA_NAMEs and updating SSA form,
> which is usally very easy.
> 
> It's not used much, in which case the easiest thing might be to
> fix all remaining uses to manually update SSA form.
> 
> But yes, I now see why that zeroing is necessary.

OK, I'll use create_tmp_var_here too.  But at this point I cannot
create SSA_NAMEs manually and  will basically have to do all that
make_rename_temp does.
 
> > > 
> > > CONVERT_EXPR_P (expr)
> > > 
> > > >       || TREE_CODE (expr) == VIEW_CONVERT_EXPR)
> > > 
> > > VIEW_CONVERT_EXPR is also a handled_component_p.
> > > 
> > > Note that NOP_EXPR should never occur here - that would be invalid
> > > gimple.  So I think you can (and should) just delete the above.
> > 
> > I haven't  seen a NOP_EXPR for a  while, do they still  exist in lower
> > gimple?  Thus I have removed their handling.
> > 
> > Removing diving through V_C_E breaks ADA, though.  The reason is that
> > we get a different size (and max_size) when calling
> > get_ref_base_and_extent on the V_C_E and on its argument.  However, I
> > believe both should be represented by a single access representative.
> 
> Yeah, I remember this :/  It is technically invalid GIMPLE that the
> Ada FE generates though.  The size of the V_C_E result has to match
> that of the operand.
> 
> Please add a FIXME before this stripping refering to the Ada problem.

OK

> > > >   tree repl = get_access_replacement (access);
> > > >   if (!TREE_ADDRESSABLE (type))
> > > >     {
> > > >       tree tmp = create_tmp_var (type, "SRvce");
> > > > 
> > > >       add_referenced_var (tmp);
> > > >       if (is_gimple_reg_type (type))
> > > > 	tmp = make_ssa_name (tmp, NULL);
> > > 
> > > Should be always is_gimple_reg_type () if it is a type suitable for
> > > a SRA scalar replacement. 
> > 
> > No, it is the type suitable for  the statement, it can be a union type
> > or a record with only one field. But see the more thorough explanation
> > below...
> 
> I think it should be always a register type, but see below... ;)
> 
> > > But you should set DECL_GIMPLE_REG_P for
> > > VECTOR and COMPLEX types here.
> > > 
> > > >       if (write)
> > > > 	{
> > > > 	  gimple stmt;
> > > > 	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp);
> > > 
> > > This needs to either always fold to plain 'tmp' or tmp has to be a
> > > non-register.  Otherwise you will create invalid gimple.
> > > 
> > > > 	  *expr = tmp;
> > > > 	  if (is_gimple_reg_type (type))
> > > > 	    SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi);
> > > 
> > > See above.
> > > 
> > > > 	  stmt = gimple_build_assign (repl, conv);
> > > > 	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> > > > 	  update_stmt (stmt);
> > > > 	}
> > > >       else
> > > > 	{
> > > > 	  gimple stmt;
> > > > 	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl);
> > > > 
> > > > 	  stmt = gimple_build_assign (tmp, conv);
> > > > 	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> > > > 	  if (is_gimple_reg_type (type))
> > > > 	    SSA_NAME_DEF_STMT (tmp) = stmt;
> > > 
> > > See above.  (I wonder if the patch still passes bootstrap & regtest
> > > after the typecking patch)
> > > 
> > > > 	  *expr = tmp;
> > > > 	  update_stmt (stmt);
> > > > 	}
> > > >     }
> > > >   else
> > > >     {
> > > >       if (write)
> > > > 	{
> > > > 	  gimple stmt;
> > > > 
> > > > 	  stmt = gimple_build_assign (repl, unshare_expr (access->expr));
> > > > 	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
> > > > 	  update_stmt (stmt);
> > > > 	}
> > > >       else
> > > > 	{
> > > > 	  gimple stmt;
> > > > 
> > > > 	  stmt = gimple_build_assign (unshare_expr (access->expr), repl);
> > > > 	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
> > > > 	  update_stmt (stmt);
> > > > 	}
> > > 
> > > I don't understand this path.  Are the types here always compatible?
> > 
> > And I don't really understand the comments.  The function is called by
> > sra_modify_expr (the function doing the replacements in all non-assign
> > statements) when it  needs to replace a reference by  a scalar but the
> > types don't  match.  This can happen  when replacing a  V_C_E, a union
> > access  when we  picked a  different  type that  the one  used in  the
> > statement or (and this case can be remarkably irritating) an access to
> > a records with only one (scalar) field.
> > 
> > My original idea  was to simply put a V_C_E in  the place.  However, I
> > believe there are places where this  is not possible - or at least one
> > case, a  LHS of  a call statement  because V_C_Es  of gimple_registers
> > (ssa_names) are not allowed on  LHSs.  My initial idea to handle these
> > cases  were to  create a  new temporary  with a  matching and  a V_C_E
> > assign statement  (with the V_C_E always  on the RHS -  I believe that
> > works even  with gimple  registers) that would  do the  conversion and
> > load/store  it   to  the  replacement  variable  (this   is  what  the
> > !TREE_ADDRESSABLE branch does).
> > 
> > The problem  with this idea  are TREE_ADDRESSABLE types.   These types
> > need to be  constructed and thus we cannot  create temporary variables
> > of these types.   On the other hand they absolutely  need to be SRAed,
> > not doing  so slows down tramp3d by  a factor of two  (and the current
> > SRA also breaks them up).  And  quite a few C++ classes are such types
> > that   are  "non-addressable"   and  have   only  one   scalar  field.
> > Identifying  such records  is possible,  I  soon realized  that I  can
> > simply leave the statement as it  is and produce a new statement to do
> > load/store  from the original  field (that's  what the  outermost else
> > branch does).
> > 
> > Does this make sense or is there some fundamental flaw in my reasoning
> > about gimple again?  Does this explain what the function does?
> 
> Ok, so the case in question is
> 
>   struct X { int i; } x;
>   x = foo ();
> 
> where you want to scalarize x.  Indeed the obvious scalarization would
> be
> 
>   x = foo ();
>   SR_1 = x.i;
> 
> For all other LHS cases (not calls) you can move the V_C_E to the RHS
> and should be fine.

Well, I  have tried  to remove the  function and  have sra_modify_expr
handle this  particular case only  to discover another one  where it's
probably required  (given that  gimple verification is  correct).  The
new  problem  is  that  I  cannot   put  a  V_C_E  as  an  operand  of
GIMPLE_RETURN.   Specifically  I got  an  "invalid  operand in  return
statement" failure when verifying:

return VIEW_CONVERT_EXPR<struct gnat__perfect_hash_generators__key_type>(SR.2108_12);

Thus I will probably keep the function to be always safe but change it
to the following:

/* Substitute into *EXPR an expression of type TYPE with the value of the
   replacement of ACCESS.  This is done either by producing a special V_C_E
   assignment statement converting the replacement to a new temporary of the
   requested type if TYPE is is_gimple_reg_type or by going through the base
   aggregate if it is not.  */

static void
fix_incompatible_types_for_expr (tree *expr, tree type, struct access *access,
				 gimple_stmt_iterator *gsi, bool write)
{
  tree repl = get_access_replacement (access);

  if (is_gimple_reg_type (type))
    {
      tree tmp = create_tmp_var (type, "SRvce");

      add_referenced_var (tmp);
      tmp = make_ssa_name (tmp, NULL);

      if (write)
	{
	  gimple stmt;
	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, TREE_TYPE (repl), tmp);

	  *expr = tmp;
	  SSA_NAME_DEF_STMT (tmp) = gsi_stmt (*gsi);
	  stmt = gimple_build_assign (repl, conv);
	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  update_stmt (stmt);
	}
      else
	{
	  gimple stmt;
	  tree conv = fold_build1 (VIEW_CONVERT_EXPR, type, repl);

	  stmt = gimple_build_assign (tmp, conv);
	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  SSA_NAME_DEF_STMT (tmp) = stmt;
	  *expr = tmp;
	  update_stmt (stmt);
	}
    }
  else
    {
      if (write)
	{
	  gimple stmt;

	  stmt = gimple_build_assign (repl, unshare_expr (access->expr));
	  gsi_insert_after (gsi, stmt, GSI_NEW_STMT);
	  update_stmt (stmt);
	}
      else
	{
	  gimple stmt;

	  stmt = gimple_build_assign (unshare_expr (access->expr), repl);
	  gsi_insert_before (gsi, stmt, GSI_SAME_STMT);
	  update_stmt (stmt);
	}
    }
}



> I don't understand the TREE_ADDRESSABLE thingy yet.  Especially why
> the type should be a non-register type.

I guess it does not matter now, but in your example above type would be
struct x, not int.

> > It certainly passes bootstrap and testing, I use --enable-checking=yes.
> 
> That's good.
> 
> > > >     {
> > > >       /* I have never seen this code path trigger but if it can happen the
> > > > 	 following should handle it gracefully.  */
> > > 
> > > It can trigger for vector constants.
> > 
> > OK, I'll remove the comment.  Apparently there are none in the
> > testsuite, I believe I tested with a gcc_unreachable here.
> 
> Err, for vector constants we have VECTOR_CST, so it triggers for
> non-constant vector constructors like
> 
>  vector int x = { a, b, c, d };

I see.
 
> > > >   update_stmt (aux_stmt);
> > > >   new_stmt = gimple_build_assign (get_access_replacement (access),
> > > > 				  fold_build2 (COMPLEX_EXPR, access->type,
> > > > 					       rp, ip));
> > > >   gsi_insert_after (gsi, new_stmt, GSI_NEW_STMT);
> > > >   update_stmt (new_stmt);
> > > 
> > > Hm.  So you do what complex lowering does here.  Note that this may
> > > create loads from uninitialized memory with all its problems.
> > 
> > Yes,  but I  have not  had any  such problems  with complex  types (as
> > opposed to  simple loads from half-initialized  records, for example).
> > OTOH, I  have also contemplated  setting DECL_GIMPLE_REG_P to  zero of
> > complex replacement which appear in IMAG_PART or REAL_PART on a LHS of
> > a statement.
> 
> Yes, that's necessary.  I still think SRA should not bother about
> this at all ;)
> 
> > > WRT the complex stuff.  If you would do scalarization and analysis
> > > just on the components (not special case REAL/IMAGPART_EXPR everywhere)
> > > it should work better, correct?  You still could handle group
> > > scalarization for the case of for example passing a complex argument
> > > to a function.
> > 
> > Well, my reasoning was that if complex types were first-class citizens
> > in gimple  (as opposed to a record),  there was a reason  to keep them
> > together  and  so  I  attempted   that.   But  again,  if  that  is  a
> > misconception of mine and there  is no point in keeping them together,
> > I will gladly remove this.
> 
> It's not clear.  Complex lowering decomposes all complex variables
> to components if possible.  Again, simplifying SRA is probably better.

OK, I will find out how much this would actually simplify things and
what new problems might arise.  

I have already tried relaxing the  access tree analysis s that it does
not  prevent scalarization  of  subaccesses of  scalar accesses  which
would be necessary for decomposing complex components and uncovered an
unrelated problem, again in my favorite testcase entry_4.f90.

Can you please check whether the following snippet is a valid gimple?
The expander ICEs on an assert when trying to crunch the V_C_E at the
end.  Looking at it myself, I start to doubt that I can always handle
union type-punning with V_C_Es.

----------------------------------------------------------------------
master.2.f3 (integer(kind=8) __entry, integer(kind=4) * b, integer(kind=4) * a)
{
  real(kind=4) __result_master.2.f3$r;
  union munion.2.f3 __result_master.2.f3;
  union munion.2.f3 D.1641;
  complex(kind=4) D.1640;
  real(kind=4) D.1639;
  integer(kind=4) D.1638;
  logical(kind=4) D.1637;
  integer(kind=4) D.1636;
  real(kind=4) D.1635;
  integer(kind=4) D.1634;
  integer(kind=4) D.1633;

<bb 2>:
  switch (__entry_1(D)) <default: <L3>, case 0: <L3>, case 1: <L10>, case 2: <L9>>

<L3>:
  D.1633_3 = *a_2(D);
  D.1634_4 = D.1633_3 + 15;
  D.1635_5 = (real(kind=4)) D.1634_4;
  __result_master.2.f3$r_15 = D.1635_5;
  goto <bb 6>;

<L10>:
  D.1636_7 = *b_6(D);
  D.1637_8 = D.1636_7 == 42;
  __result_master.2.f3$r_19 = VIEW_CONVERT_EXPR<real(kind=4)>(D.1637_8);
----------------------------------------------------------------------

> 
> > > void bar(_Complex float);
> > > void foo(float x, float y)
> > > {
> > >   _Complex float z = x;
> > >   __imag z = y;
> > >   bar(z);
> > > }
> > > 
> > > The same applies for vectors - the REAL/IMAGPART_EXPRs equivalent
> > > there is BIT_FIELD_REF.
> > 
> > These are handled  by setting DECL_GIMPLE_REG_P to zero  if a B_F_R is
> > on a LHS.  I believe the current SRA does the same.  It works fine and
> > there's a lot less fuss about them.
> >  
> > > >   return SRA_SA_PROCESSED;
> > > > }
> > > > 
> > > > /* Return true iff T has a VIEW_CONVERT_EXPR among its handled components.  */
> > > > 
> > > > static bool
> > > > contains_view_convert_expr_p (tree t)
> > > > {
> > > >   while (1)
> > > >     {
> > > >       if (TREE_CODE (t) == VIEW_CONVERT_EXPR)
> > > > 	return true;
> > > >       if (!handled_component_p (t))
> > > > 	return false;
> > > >       t = TREE_OPERAND (t, 0);
> > > >     }
> > > > }
> > > 
> > > Place this in tree-flow-inline.h next to ref_contains_array_ref, also
> > > structure the loop in the same way.
> > 
> > OK,  but I'd like  the function  to work  if passed  declarations too.
> > Thus I cannot really use a  do-while loop.  I'll send it in a separate
> > patch.
> > 
> > > > /* Change STMT to assign compatible types by means of adding component or array
> > > >    references or VIEW_CONVERT_EXPRs.  All parameters have the same meaning as
> > > >    variable with the same names in sra_modify_assign.  This is done in a
> > > >    such a complicated way in order to make
> > > >    testsuite/g++.dg/tree-ssa/ssa-sra-2.C happy and so it helps in at least some
> > > >    cases.  */
> > > > 
> > > > static void
> > > > fix_modified_assign_compatibility (gimple_stmt_iterator *gsi, gimple *stmt,
> > > > 				   struct access *lacc, struct access *racc,
> > > > 				   tree lhs, tree *rhs, tree ltype, tree rtype)
> > > > {
> > > >   if (racc && racc->grp_to_be_replaced && AGGREGATE_TYPE_P (ltype)
> > > >       && !access_has_children_p (lacc))
> > > >     {
> > > >       tree expr = unshare_expr (lhs);
> > > >       bool found = build_ref_for_offset (&expr, ltype, racc->offset, rtype,
> > > > 					 false);
> > > >       if (found)
> > > > 	{
> > > > 	  gimple_assign_set_lhs (*stmt, expr);
> > > > 	  return;
> > > > 	}
> > > >     }
> > > > 
> > > >   if (lacc && lacc->grp_to_be_replaced && AGGREGATE_TYPE_P (rtype)
> > > >       && !access_has_children_p (racc))
> > > >     {
> > > >       tree expr = unshare_expr (*rhs);
> > > >       bool found = build_ref_for_offset (&expr, rtype, lacc->offset, ltype,
> > > > 					 false);
> > > >       if (found)
> > > > 	{
> > > > 	  gimple_assign_set_rhs1 (*stmt, expr);
> > > > 	  return;
> > > > 	}
> > > >     }
> > > > 
> > > >   *rhs = fold_build1 (VIEW_CONVERT_EXPR, ltype, *rhs);
> > > >   gimple_assign_set_rhs_from_tree (gsi, *rhs);
> > > >   *stmt = gsi_stmt (*gsi);
> > > 
> > > Reading this I have a deja-vu - isn't there another function in this
> > > file doing the same thing?  You are doing much unsharing even though
> > > you re-build the access tree from scratch?
> > 
> > This function has a similar purpose as fix_incompatible_types_for_expr
> > but this time  only for assign statements.  That  is easier because we
> > can always put the  V_C_E on the RHS and be safe  and so no additional
> > statements need to be generated.
> > 
> > However,  the V_C_Es  rather than  COMPONENT_REFs and  ARRAY_REFs feel
> > unnatural for  accessing fields from  single field records  and unions
> > and single  element arrays.  According to  the comment I  used to have
> > problems of some sort with that  in the ssa-sra-2.C testcase but I can
> > no longer reproduce them (and don't remember them).
> > 
> > I  call  unshare_expr  in  this  context  only when  one  side  of  an
> > assignment statement is  a scalar replacement and the  other one is an
> > aggregate (but not necessarily a declaration) which can happen only in
> > the cases listed  above.  That is not very many  calls and chances are
> > good that build_ref_for_offset succeeds.
> > 
> > Does that explain what is going on here?
> 
> Yes.  I guess merging the two functions would make sense to me
> (or putting them next to each other at least).  The function signatures
> also look seemingly weird (that you store into *rhs but still set
> the stmt rhs, etc.).  I have another look here with the new version.

*rhs is a local variable of the caller, not a pointer in the
statement.  (It has only one callsite and thus this double indirection
will be inlined away).  I tried using gimple_assign_set_rhs1 instead
but it ICEs when passed a V_C_E.

The two functions have similar goal but work differently, in
particular fix_modified_assign_compatibility never generates new
statements while fix_incompatible_types_for_expr always does.  So I
don't think it makes sense to merge them.  But I can move this one
upward, sure.

> > > > }
> > > > 
> > > > /* Callback of scan_function to process assign statements.  It examines both
> > > >    sides of the statement, replaces them with a scalare replacement if there is
> > > >    one and generating copying of replacements if scalarized aggregates have been
> > > >    used in the assignment.  STMT is a pointer to the assign statement, GSI is
> > > >    used to hold generated statements for type conversions and subtree
> > > >    copying.  */
> > > > 
> > > > static enum scan_assign_result
> > > > sra_modify_assign (gimple *stmt, gimple_stmt_iterator *gsi,
> > > > 		   void *data ATTRIBUTE_UNUSED)
> > > > {
> > > >   struct access *lacc, *racc;
> > > >   tree ltype, rtype;
> > > >   tree lhs, rhs;
> > > >   bool modify_this_stmt;
> > > > 
> > > >   if (gimple_assign_rhs2 (*stmt))
> > > 
> > > !gimple_assign_single_p (*stmt)
> > > 
> > > (the only gimple assign that may access memory)
> > 
> > OK
> > 
> > > 
> > > >     return SRA_SA_NONE;
> > > >   lhs = gimple_assign_lhs (*stmt);
> > > >   rhs = gimple_assign_rhs1 (*stmt);
> > > > 
> > > >   if (TREE_CODE (rhs) == CONSTRUCTOR)
> > > >     return sra_modify_constructor_assign (stmt, gsi);
> > > > 
> > > >   if (TREE_CODE (lhs) == REALPART_EXPR || TREE_CODE (lhs) == IMAGPART_EXPR)
> > > >     return sra_modify_partially_complex_lhs (*stmt, gsi);
> > > > 
> > > >   if (TREE_CODE (rhs) == REALPART_EXPR || TREE_CODE (rhs) == IMAGPART_EXPR
> > > >       || TREE_CODE (rhs) == BIT_FIELD_REF || TREE_CODE (lhs) == BIT_FIELD_REF)
> > > >     {
> > > >       modify_this_stmt = sra_modify_expr (gimple_assign_rhs1_ptr (*stmt),
> > > > 					  gsi, false, data);
> > > >       modify_this_stmt |= sra_modify_expr (gimple_assign_lhs_ptr (*stmt),
> > > > 					   gsi, true, data);
> > > >       return modify_this_stmt ? SRA_SA_PROCESSED : SRA_SA_NONE;
> > > >     }
> > > > 
> > > >   lacc = get_access_for_expr (lhs);
> > > >   racc = get_access_for_expr (rhs);
> > > >   if (!lacc && !racc)
> > > >     return SRA_SA_NONE;
> > > > 
> > > >   modify_this_stmt = ((lacc && lacc->grp_to_be_replaced)
> > > > 		      || (racc && racc->grp_to_be_replaced));
> > > > 
> > > >   if (lacc && lacc->grp_to_be_replaced)
> > > >     {
> > > >       lhs = get_access_replacement (lacc);
> > > >       gimple_assign_set_lhs (*stmt, lhs);
> > > >       ltype = lacc->type;
> > > >     }
> > > >   else
> > > >     ltype = TREE_TYPE (lhs);
> > > > 
> > > >   if (racc && racc->grp_to_be_replaced)
> > > >     {
> > > >       rhs = get_access_replacement (racc);
> > > >       gimple_assign_set_rhs1 (*stmt, rhs);
> > > >       rtype = racc->type;
> > > >     }
> > > >   else
> > > >     rtype = TREE_TYPE (rhs);
> > > > 
> > > >   /* The possibility that gimple_assign_set_rhs_from_tree() might reallocate
> > > >      the statement makes the position of this pop_stmt_changes() a bit awkward
> > > >      but hopefully make some sense.  */
> > > 
> > > I don't see pop_stmt_changes().
> > 
> > Yeah, the comment is outdated. I've removed it.
> >  
> > > >   if (modify_this_stmt)
> > > >     {
> > > >       if (!useless_type_conversion_p (ltype, rtype))
> > > > 	fix_modified_assign_compatibility (gsi, stmt, lacc, racc,
> > > > 					   lhs, &rhs, ltype, rtype);
> > > >     }
> > > > 
> > > >   if (contains_view_convert_expr_p (rhs) || contains_view_convert_expr_p (lhs)
> > > >       || (access_has_children_p (racc)
> > > > 	  && !ref_expr_for_all_replacements_p (racc, lhs, racc->offset))
> > > >       || (access_has_children_p (lacc)
> > > > 	  && !ref_expr_for_all_replacements_p (lacc, rhs, lacc->offset)))
> > > 
> > > ?  A comment is missing what this case is about ...
> > > 
> > > (this smells like fixup that could be avoided by doing things correct
> > > in the first place)
> > 
> > From this  point on,  the function deals  with assignments  in between
> > aggregates  when at least  one has  scalar reductions  of some  of its
> > components.  There are three possible  scenarios: Both the LHS and RHS
> > have to-be-scalarized components,  2) only the RHS has  or 3) only the
> > LHS has.
> > 
> > In the first  case, we would like to load the  LHS components from RHS
> > components whenever possible.  If that  is not possible, we would like
> > to read it  directly from the RHS (after updating it  by storing in it
> > its own components).  If there are some necessary unscalarized data in
> > the  LHS, those will  be loaded  by the  original assignment  too.  If
> > neither of these cases happen,  the original statement can be removed.
> > Most of this is done by load_assign_lhs_subreplacements.
> > 
> > In  the  second  case, we  would  like  to  store all  RHS  scalarized
> > components  directly  into  LHS   and  if  they  cover  the  aggregate
> > completely, remove the statement too.   In the third case, we want the
> > LHS components to be loaded directly from the RHS (DSE will remove the
> > original statement if it becomes redundant).
> > 
> > This is a bit complex but  manageable when types match and when unions
> > do not cause confusion in a way that we cannot really load a component
> > of LHS from the RHS or  vice versa (the access representing this level
> > can  have subaccesses  that are  accessible only  through  a different
> > union field  at a higher  level - different  from the one used  in the
> > examined expression).  Unions are fun.
> > 
> > Therefore, I specially handle a fourth case, happening when there is a
> > specific  type  cast  or  it  is impossible  to  locate  a  scalarized
> > subaccess on  the other  side of the  expression.  If that  happens, I
> > simply "refresh"  the RHS  by storing in  it is  scalarized components
> > leave the original statement there to do the copying and then load the
> > scalar replacements of the LHS.  This is what the first branch does.
> > 
> > Is it  clearer now?  Perhaps I  should put these five  paragraphs as a
> > comment into the function?
> 
> Yes - that would be nice.
> 

I have  just started a bootstrap  of my latest  version (still keeping
complex  numbers  together).  If  it  passes,  I'll  mail it  tomorrow
(otherwise, I'll have to wait for a couple of days).

Thanks again,

Martin



More information about the Gcc-patches mailing list