GCC Bugzilla will be upgraded from version 4.4.9 to 5.0rc3 on Saturday, April 25, starting around 17:00 UTC. The upgrade process should only last a few minutes. Check bug 64968 for details.
Bug 33136 - [4.2 Regression] wrong code due to alias with allocation in loop
[4.2 Regression] wrong code due to alias with allocation in loop
Status: RESOLVED FIXED
Product: gcc
Classification: Unclassified
Component: tree-optimization
4.2.1
: P1 normal
: 4.3.0
Assigned To: Jakub Jelinek
http://gcc.gnu.org/ml/gcc-patches/200...
: alias, wrong-code
Depends on:
Blocks: 33784
  Show dependency treegraph
 
Reported: 2007-08-21 12:22 UTC by Denis Vlasenko
Modified: 2009-03-30 22:16 UTC (History)
12 users (show)

See Also:
Host: i386-pc-linux-gnu
Target: i386-pc-linux-gnu
Build: i386-pc-linux-gnu
Known to work: 4.3.0
Known to fail: 4.2.5
Last reconfirmed: 2007-10-15 18:51:59


Attachments
testcase (4.49 KB, text/plain)
2007-08-21 12:23 UTC, Denis Vlasenko
Details
assembler output generated by 4.1.1 (5.38 KB, text/plain)
2007-08-21 12:26 UTC, Denis Vlasenko
Details
assembler output generated by 4.2.1 (8.06 KB, text/plain)
2007-08-21 12:26 UTC, Denis Vlasenko
Details
gcc43-pr33136.patch (4.33 KB, patch)
2007-08-24 15:38 UTC, Jakub Jelinek
Details | Diff
gcc43-pr33136.patch (6.62 KB, patch)
2007-09-11 17:21 UTC, Jakub Jelinek
Details | Diff
gcc43-pr33136.patch (12.09 KB, patch)
2007-09-11 19:26 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Denis Vlasenko 2007-08-21 12:22:54 UTC
When you compile attached testcase with 4.1.1 and 4.2.1, resulting .o modules have following sizes:

$ size insmod.i-411.o insmod.i-421.o
   text    data     bss     dec     hex filename
    288       0       0     288     120 insmod.i-411.o
    543       0       0     543     21f insmod.i-421.o

Testcase was built using following command line:

gcc -std=gnu99 \
-Os -fno-builtin-strlen -finline-limit=0 -fomit-frame-pointer \
-ffunction-sections -fdata-sections -fno-guess-branch-probability \
-funsigned-char -static-libgcc \
-falign-functions=1 -falign-jumps=1 -falign-labels=1 -falign-loops=1 \
-march=i386 -mpreferred-stack-boundary=2 \
-fverbose-asm -S insmod.i.c
Comment 1 Denis Vlasenko 2007-08-21 12:23:49 UTC
Created attachment 14086 [details]
testcase
Comment 2 Denis Vlasenko 2007-08-21 12:26:25 UTC
Created attachment 14087 [details]
assembler output generated by 4.1.1

Sorry, don't have native gcc 4.1.1 at this machine, but have i486-linux-uclibc-gcc 4.1.1 and 4.2.1. Thus attaching asm output generated by these. Generated code should be identical, I think.
Comment 3 Denis Vlasenko 2007-08-21 12:26:49 UTC
Created attachment 14088 [details]
assembler output generated by 4.2.1
Comment 4 Richard Biener 2007-08-21 13:10:49 UTC
4.1 somehow has better alias information, you can verify this by adding
-fno-strict-aliasing to 4.1 and get the same (worse) result as with 4.2.
Comment 5 Richard Biener 2007-08-21 13:12:38 UTC
trunk regresses even more
Comment 6 Andrew Pinski 2007-08-21 16:20:43 UTC
Actually I think this might be related to removing iterative jump threading and not aliasing.
Comment 7 Jakub Jelinek 2007-08-22 13:27:06 UTC
Actually, to me this doesn't look like missed-optimization at all.
And you should be happy for 4.2.1 generated bigger code, 4.1.1 optimized out
something it shouldn't.
Below is an stripped down testcase, which works on 3.4/4.1/4.2/4.3 with -O0
or -O2 -fno-strict-aliasing, works even with -O2 with 3.4 and 4.2, segfaults
with 4.1 (x86_64 -m64, aborts with x86_64 -m32) and issues a bogus warning about unitialized common_head and aborts on the trunk.

/* { dg-do run } */
/* { dg-options "-O2" } */

typedef __SIZE_TYPE__ size_t;
extern void exit (int);
extern void *malloc (size_t);
extern void *realloc (void *, size_t);
extern void abort (void);

void *
__attribute__ ((noinline))
xmalloc (size_t size)
{
  void *p = malloc (size);
  if (p == 0)
    exit (0);
  return p;
}

void *
__attribute__ ((noinline))
xrealloc (void *old, size_t size)
{
  void *p = realloc (old, size);
  if (p == 0)
    exit (0);
  return p;
}

struct obj_section
{
  unsigned int sh_type;
  unsigned int sh_flags;
  unsigned int sh_size;
  unsigned int sh_addralign;
  const char *name;
  char *contents;
  struct obj_section *load_next;
  int idx;
};

struct obj_symbol
{
  struct obj_symbol *next;
  unsigned long value;
  unsigned long size;
  int secidx;
};

struct obj_file
{
  unsigned short e_shnum;
  struct obj_section **sections;
  struct obj_symbol *symtab[521];
};

static void
__attribute__((noinline))
obj_allocate_commons (struct obj_file *f)
{
  struct common_entry
  {
    struct common_entry *next;
    struct obj_symbol *sym;
  } *common_head = (void *) 0;
  unsigned long i;

  for (i = 0; i < 521; ++i)
    {
      struct obj_symbol *sym;
      for (sym = f->symtab[i]; sym; sym = sym->next)
        if (sym->secidx == 0xfff2)
          {
            struct common_entry **p, *n;
            for (p = &common_head; *p; p = &(*p)->next)
              if (sym->size <= (*p)->sym->size)
                break;
            n = __builtin_alloca (sizeof (*n));
            n->next = *p;
            n->sym = sym;
            *p = n;
          }
    }

  if (common_head)
    {
      for (i = 0; i < f->e_shnum; ++i)
        if (f->sections[i]->sh_type == 8)
          break;
      if (i == f->e_shnum)
        {
          struct obj_section *sec;
          f->sections = xrealloc (f->sections, (i + 1) * sizeof (sec));
          f->sections[i] = sec = xmalloc (sizeof (struct obj_section));
          f->e_shnum = i + 1;
          __builtin_memset (sec, 0, sizeof (*sec));
          sec->sh_type = 1;
          sec->sh_flags = (1 << 0) | (1 << 1);
          sec->name = ".bss";
          sec->idx = i;
        }
    }
}

int
main (void)
{
  struct obj_file of;
  struct obj_symbol s;
  struct obj_section *sec;
  __builtin_memset (&s, 0, sizeof (s));
  s.value = 4;
  s.size = 4;
  s.secidx = 0xfff2;
  __builtin_memset (&of, 0, sizeof (of));
  of.e_shnum = 2;
  of.sections = xmalloc (2 * sizeof (sec));
  of.sections[0] = sec = xmalloc (sizeof (struct obj_section));
  __builtin_memset (sec, 0, sizeof (*sec));
  sec->sh_type = 4;
  sec->sh_flags = (1 << 0) | (1 << 1);
  sec->name = ".foo";
  sec->idx = 0;
  of.sections[1] = sec = xmalloc (sizeof (struct obj_section));
  __builtin_memset (sec, 0, sizeof (*sec));
  sec->sh_type = 4;
  sec->sh_flags = (1 << 0) | (1 << 1);
  sec->name = ".bar";
  sec->idx = 1;
  of.symtab[0] = &s;
  obj_allocate_commons (&of);
  if (of.e_shnum != 3)
    abort ();
  return 0;
}

Even on the original testcase, obj_allocate_commons is so short because
the whole if (common_head) { ... } huge block has been completely optimized
out (watch for xrealloc call e.g.).

To me the code looks valid, there is no type puning involved.
Comment 8 Jakub Jelinek 2007-08-22 13:54:48 UTC
Even shorter testcase:

/* { dg-do run } */
/* { dg-options "-O2" } */

extern void abort (void);

struct S
{
  struct S *a;
  int b;
};
#ifdef VAR
struct S t;
#endif

int
main (void)
{
  struct S *s = (struct S *) 0, **p, *n;
  for (p = &s; *p; p = &(*p)->a);
  n = (struct S *) __builtin_alloca (sizeof (*n));
  n->a = *p;
  n->b = 1;
  *p = n;

  if (!s)
    abort ();
  return 0;
}

This one behaves identically with 4.1/4.2/trunk, -O2 -UVAR -> abort,
-O2 -DVAR -> success, -O2 -fno-strict-aliasing -UVAR -> success.
Comment 9 Jakub Jelinek 2007-08-22 13:56:56 UTC
When struct S is not defined at global scope, but within main, then no matter if
struct S t; is present or not 4.1/4.2/trunk aborts with -O2.
Comment 11 Jakub Jelinek 2007-08-22 14:29:18 UTC
Yeah, PR30708 seems to be stripped down from the same source.
But the stripped down testcase here in c#8 is 4.1/4.2/4.3 regression, not
just 4.1 regression.
Comment 12 Richard Biener 2007-08-22 15:15:31 UTC
Indeed.

<bb 2>:
  # s_13 = VDEF <s_12(D)>
  s = 0B;
  goto <bb 4>;

<bb 4>:
  # p_1 = PHI <&s(2), p_5(3)>
  # VUSE <HEAP.5_14(D), SMT.7_15(D)>
  D.2019_3 = *p_1;

is wrong alias info (from trunk salias dump).  The load from *p_1 misses
a use of s_13 - or the assignment to s should use SMT.7 instead.
Comment 13 Daniel Berlin 2007-08-22 15:52:51 UTC
At least for 4.3, ipa-type-escape is not looking into phi_nodes for address taking, so we end up returning false for may_alias_p (p, s) because we believe nobody ever takes the address of s.



IE if (ipa_type_escape_field_does_not_clobber_p (var_type,                                                             TREE_TYPE (ptr)))

incorrectly returns true.
ipa-type-escape.c needs to be changed to scan refs in phi_nodes (it was written when we had just plain old gimple during IPA)
Comment 14 Jakub Jelinek 2007-08-22 17:24:36 UTC
--- ipa-type-escape.c.jj13      2007-08-13 15:11:18.000000000 +0200
+++ ipa-type-escape.c   2007-08-22 19:21:07.000000000 +0200
@@ -1704,6 +1704,21 @@ analyze_function (struct cgraph_node *fn
     FOR_EACH_BB_FN (this_block, this_cfun)
       {
        block_stmt_iterator bsi;
+       tree phi, op;
+       use_operand_p use;
+       ssa_op_iter iter;
+
+       /* Find the addresses taken in phi node arguments.  */
+       for (phi = phi_nodes (this_block); phi; phi = PHI_CHAIN (phi))
+         {
+           FOR_EACH_PHI_ARG (use, phi, iter, SSA_OP_USE)
+             {
+               op = USE_FROM_PTR (use);
+               if (TREE_CODE (op) == ADDR_EXPR)
+                 check_rhs_var (op);
+             }
+         }
+
        for (bsi = bsi_start (this_block); !bsi_end_p (bsi); bsi_next (&bsi))
          walk_tree (bsi_stmt_ptr (bsi), scan_for_refs, 
                     fn, visited_nodes);

doesn't help here at all (it could have helped if one of the PHI arguments
contained say &s.some_field, but as it contains just &s, it doesn't do anything.
mark_interesting_addressof is only called for addresses of aggr fields
and has_proper_scope_for_analysis is a nop for automatic variables.
Comment 15 Daniel Berlin 2007-08-22 20:10:14 UTC
(In reply to comment #14)
> --- ipa-type-escape.c.jj13      2007-08-13 15:11:18.000000000 +0200
> +++ ipa-type-escape.c   2007-08-22 19:21:07.000000000 +0200
> @@ -1704,6 +1704,21 @@ analyze_function (struct cgraph_node *fn
>      FOR_EACH_BB_FN (this_block, this_cfun)
>        {
>         block_stmt_iterator bsi;
> +       tree phi, op;
> +       use_operand_p use;
> +       ssa_op_iter iter;
> +
> +       /* Find the addresses taken in phi node arguments.  */
> +       for (phi = phi_nodes (this_block); phi; phi = PHI_CHAIN (phi))
> +         {
> +           FOR_EACH_PHI_ARG (use, phi, iter, SSA_OP_USE)
> +             {
> +               op = USE_FROM_PTR (use);
> +               if (TREE_CODE (op) == ADDR_EXPR)
> +                 check_rhs_var (op);
> +             }
> +         }
> +
>         for (bsi = bsi_start (this_block); !bsi_end_p (bsi); bsi_next (&bsi))
>           walk_tree (bsi_stmt_ptr (bsi), scan_for_refs, 
>                      fn, visited_nodes);
> 
> doesn't help here at all (it could have helped if one of the PHI arguments
> contained say &s.some_field, but as it contains just &s, it doesn't do
> anything.
> mark_interesting_addressof is only called for addresses of aggr fields
> and has_proper_scope_for_analysis is a nop for automatic variables.
> 

Sigh.  It should be considering &s to be an address taking of the first field.  

Looks like look_for_address_of needs to be fixed.

Comment 16 Jakub Jelinek 2007-08-23 12:13:11 UTC
But doesn't ipa_type_escape_field_does_not_clobber_p do what is documented?
At least for the uses in nonoverlapping_memrefs_p where
ipa_type_escape_field_does_not_clobber_p is always called on some field, first
argument is a DECL_FIELD_CONTEXT of some field ans second argument is its type.
Then IMHO ipa_type_escape_field_does_not_clobber_p does the right thing.
If you take address of the whole struct rather than some specific field and
that address doesn't escape the CU, then that still doesn't explain how
could a pointer var with first field's type point to the struct.  Say for
struct A { int i; float f; } you still need (int *) ptrA where ptrA is
struct A *, or &ptrA->i.  Optimizations before ipa-type-escape will transform
the former to the latter and if they don't, I believe check_cast will handle
it anyway.  The only problem in alias.c might be if exprx is COMPONENT_REF
of the first field and expry is a var pointer to the whole struct A (or vice versa), then any taking of address of the whole struct anywhere would mean
the MEMs could overlap.  But in that case both MEMs will have different alias
sets, don't they?

The ipa_type_escape_field_does_not_clobber_p call in may_alias_p is very different though.  Here we don't necessarily call it with some record (or union)
type and a type of one of its fields, but rather with some record type (or pointer to it, pointer to pointer etc.) and some possibly completely unrelated other pointer type.  Well, because of previous
if (!alias_sets_conflict_p (mem_alias_set, var_alias_set)) return false;
it shouldn't be completely unrelated.  In most cases it will actually be the same
type and that's something ipa_type_escape_field_does_not_clobber_p wasn't
meant to answer.  I have instrumented may_alias_p, so that if
ipa_type_escape_field_does_not_clobber_p returned false for reasons other than
!initialized or !ipa_type_escape_type_contained_p, it would abort.
The only testcase in the whole make check-{gcc,g++,gfortran} testsuite triggering this was gcc.c-torture/execute/builtins/pr22237.c, where var had a union type which contained ptr's type as one of its subfields.

The whole use of ipa_type_escape_field_does_not_clobber_p in may_alias_p
is very much unclear to me.
E.g.:
              else if (ptr_star_count == 0)
                {
                  /* If PTR_TYPE was not really a pointer to type, it cannot
                     alias.  */
                  alias_stats.structnoaddress_queries++;
                  alias_stats.structnoaddress_resolved++;
                  alias_stats.alias_noalias++;
                  return false;
                }
Isn't ptr guaranteed to be have POINTER_TYPE or REFERENCE_TYPE?  Both from the
way how is ->pointers array populated and e.g. that PTR_IS_REF_ALL is used
before may_alias_p on the p_map->var resp. p_map1->var?  That implies
ptr_star_count > 0, so the above listed chunk is never executed.  Also, as
we don't care in the code whether ptr_star_count is 37 or just 1, I don't see
the point in computing ptr_star_count at all, nor the existence of ptr_type
variable.

If ipa_type_escape_star_count_of_interesting_type (var_type) > 0 (i.e. var is
a pointer to struct/union rather than struct/union itself, how is the fact that
something took address of fields within the struct itself relevant to whether some pointer may point to the pointer var or not?  If
ipa_type_escape_star_count_of_interesting_type (var_type) == 0, then
ipa_type_escape_field_does_not_clobber_p call would make sense if ptr was
a pointer to some field (field of field etc.), but then it needs to be called
with TREE_TYPE (TREE_TYPE (ptr)) as second argument, otherwise it is asking
a wrong question.  But if ptr's type is the struct/union itself, all we care is
if that record/union's address has been taken, which is not something ipa-type-escape won't answer.  Isn't that what TREE_ADDRESSABLE can be used for?

Please explain.
Comment 17 Daniel Berlin 2007-08-23 13:45:09 UTC
Subject: Re:  [4.1/4.2/4.3 Regression] wrong code due to alias with allocation in loop

On 23 Aug 2007 12:13:13 -0000, jakub at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #16 from jakub at gcc dot gnu dot org  2007-08-23 12:13 -------
> But doesn't ipa_type_escape_field_does_not_clobber_p do what is documented?

I was there when it was written. It may be documented to do one thing,
but the intent was to do another :)

> At least for the uses in nonoverlapping_memrefs_p where
> ipa_type_escape_field_does_not_clobber_p is always called on some field, first
> argument is a DECL_FIELD_CONTEXT of some field ans second argument is its type.
> Then IMHO ipa_type_escape_field_does_not_clobber_p does the right thing.
It doesn't.
> If you take address of the whole struct rather than some specific field and
> that address doesn't escape the CU, then that still doesn't explain how
> could a pointer var with first field's type point to the struct.

> Say for
> struct A { int i; float f; } you still need (int *) ptrA

Uh, no.

&ptrA will do just fine. You can clobber all fields through it if it escapes.

>
> The ipa_type_escape_field_does_not_clobber_p call in may_alias_p is very
> different though.  Here we don't necessarily call it with some record (or
> union)
> type and a type of one of its fields, but rather with some record type (or
> pointer to it, pointer to pointer etc.) and some possibly completely unrelated
> other pointer type.  Well, because of previous
> if (!alias_sets_conflict_p (mem_alias_set, var_alias_set)) return false;
> it shouldn't be completely unrelated.  In most cases it will actually be the
> same
> type and that's something ipa_type_escape_field_does_not_clobber_p wasn't
> meant to answer.  I have instrumented may_alias_p, so that if
> ipa_type_escape_field_does_not_clobber_p returned false for reasons other than
> !initialized or !ipa_type_escape_type_contained_p, it would abort.
> The only testcase in the whole make check-{gcc,g++,gfortran} testsuite
> triggering this was gcc.c-torture/execute/builtins/pr22237.c, where var had a
> union type which contained ptr's type as one of its subfields.
>
> The whole use of ipa_type_escape_field_does_not_clobber_p in may_alias_p
> is very much unclear to me.

The whole *point* of ipa_type_escape was to be used in may_alias_p.
The use in alias.c was actually an afterthought.

> E.g.:
>               else if (ptr_star_count == 0)
>                 {
>                   /* If PTR_TYPE was not really a pointer to type, it cannot
>                      alias.  */
>                   alias_stats.structnoaddress_queries++;
>                   alias_stats.structnoaddress_resolved++;
>                   alias_stats.alias_noalias++;
>                   return false;
>                 }
> Isn't ptr guaranteed to be have POINTER_TYPE or REFERENCE_TYPE?

yes.

> Both from the
> way how is ->pointers array populated and e.g. that PTR_IS_REF_ALL is used
> before may_alias_p on the p_map->var resp. p_map1->var?  That implies
> ptr_star_count > 0, so the above listed chunk is never executed.

Also true.
> Also, as
> we don't care in the code whether ptr_star_count is 37 or just 1, I don't see
> the point in computing ptr_star_count at all, nor the existence of ptr_type
> variable.
>
> If ipa_type_escape_star_count_of_interesting_type (var_type) > 0 (i.e. var is
> a pointer to struct/union rather than struct/union itself, how is the fact that
> something took address of fields within the struct itself relevant to whether
> some pointer may point to the pointer var or not?

If the address was never taken anywhere, it can't be pointed to.
Type-escape tries to go a little further, and see if, when the address
of a field is taken, if that address is ever cast'd,
incremented/decremented, or escapes.  If not, then only that field is
clobbbered, not the entire structure.  Otherwise, it is equivalent to
calculating TREE_ADDRESSABLE.

The other part of type-escape was to see if the address ever actually
escapes the CU, because if it does not, it could be transformed by
struct-reorg.

>   If
> ipa_type_escape_star_count_of_interesting_type (var_type) == 0, then
> ipa_type_escape_field_does_not_clobber_p call would make sense if ptr was
> a pointer to some field (field of field etc.), but then it needs to be called
> with TREE_TYPE (TREE_TYPE (ptr)) as second argument, otherwise it is asking
> a wrong question.

Also possible.  Kenny is not always good at knowing the intricacies of
our compiler.

> Please explain.
Comment 18 Jakub Jelinek 2007-08-23 14:49:39 UTC
Subject: Re:  [4.1/4.2/4.3 Regression] wrong code due to alias with allocation in loop

On Thu, Aug 23, 2007 at 01:45:11PM -0000, dberlin at dberlin dot org wrote:
> > If you take address of the whole struct rather than some specific field and
> > that address doesn't escape the CU, then that still doesn't explain how
> > could a pointer var with first field's type point to the struct.
> 
> > Say for
> > struct A { int i; float f; } you still need (int *) ptrA
> 
> Uh, no.
> 
> &ptrA will do just fine. You can clobber all fields through it if it escapes.

See "that address doesn't escape the CU" above I wrote.  But if so,
it will be in global_types_full_escape and so 
ipa_type_escape_field_does_not_clobber_p will return false.

> > If ipa_type_escape_star_count_of_interesting_type (var_type) > 0 (i.e. var is
> > a pointer to struct/union rather than struct/union itself, how is the fact that
> > something took address of fields within the struct itself relevant to whether
> > some pointer may point to the pointer var or not?
> 
> If the address was never taken anywhere, it can't be pointed to.
> Type-escape tries to go a little further, and see if, when the address
> of a field is taken, if that address is ever cast'd,
> incremented/decremented, or escapes.  If not, then only that field is
> clobbbered, not the entire structure.  Otherwise, it is equivalent to
> calculating TREE_ADDRESSABLE.

I was talking about say
struct S { int i; float f; struct S **s1; struct S ***s2; struct S ****s3; };
struct S ***var;
struct S ****ptr;

If we are asking whether
may_alias (ptr, get_alias_set (TREE_TYPE (TREE_TYPE (ptr))), var, get_alias_set (var), false)
then we are only interested in whether var is TREE_ADDRESSABLE,
and this is IMHO exactly the same as asking in case of:
void ***var;
void ****ptr;

But in the former case ipa_type_escape_star_count_of_interesting_type
(var_type) == 3 and so we will call
ipa_type_escape_field_does_not_clobber_p, in the latter case
not.  How are these two different?

In the c#8 testcase this is similar,
ipa_type_escape_star_count_of_interesting_type (var_type) == 1,
we have
struct S *var;
struct S **ptr;
again, why does it matter if *var is a struct or not?  What matters
is if var is TREE_ADDRESSABLE, if there is no &var, then it can't
point to it, otherwise it can.  The same as for
void *var;
void **ptr;

But if var isn't TREE_ADDRESSABLE, I'd bet nothing ever calls
may_alias_p with it as third argument.

So, do you agree the only case may_alias_p should handle with
ipa_type_escape_field_does_not_clobber_p is
ipa_type_escape_star_count_of_interesting_type (var_type) == 0 ?

	Jakub
Comment 19 Daniel Berlin 2007-08-23 17:08:38 UTC
Subject: Re:  [4.1/4.2/4.3 Regression] wrong code due to alias with allocation in loop

On 23 Aug 2007 14:49:43 -0000, jakub at redhat dot com
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #18 from jakub at redhat dot com  2007-08-23 14:49 -------
> Subject: Re:  [4.1/4.2/4.3 Regression] wrong code due to alias with allocation
> in loop
>
> On Thu, Aug 23, 2007 at 01:45:11PM -0000, dberlin at dberlin dot org wrote:
> > > If you take address of the whole struct rather than some specific field and
> > > that address doesn't escape the CU, then that still doesn't explain how
> > > could a pointer var with first field's type point to the struct.
> >
> > > Say for
> > > struct A { int i; float f; } you still need (int *) ptrA
> >
> > Uh, no.
> >
> > &ptrA will do just fine. You can clobber all fields through it if it escapes.
>
> See "that address doesn't escape the CU" above I wrote.  But if so,
> it will be in global_types_full_escape and so
> ipa_type_escape_field_does_not_clobber_p will return false.
>
> > > If ipa_type_escape_star_count_of_interesting_type (var_type) > 0 (i.e. var is
> > > a pointer to struct/union rather than struct/union itself, how is the fact that
> > > something took address of fields within the struct itself relevant to whether
> > > some pointer may point to the pointer var or not?
> >
> > If the address was never taken anywhere, it can't be pointed to.
> > Type-escape tries to go a little further, and see if, when the address
> > of a field is taken, if that address is ever cast'd,
> > incremented/decremented, or escapes.  If not, then only that field is
> > clobbbered, not the entire structure.  Otherwise, it is equivalent to
> > calculating TREE_ADDRESSABLE.
>
> I was talking about say
> struct S { int i; float f; struct S **s1; struct S ***s2; struct S ****s3; };
> struct S ***var;
> struct S ****ptr;
>
> If we are asking whether
> may_alias (ptr, get_alias_set (TREE_TYPE (TREE_TYPE (ptr))), var, get_alias_set
> (var), false)
> then we are only interested in whether var is TREE_ADDRESSABLE,
> and this is IMHO exactly the same as asking in case of:
> void ***var;
> void ****ptr;
>
> But in the former case ipa_type_escape_star_count_of_interesting_type
> (var_type) == 3 and so we will call
> ipa_type_escape_field_does_not_clobber_p, in the latter case
> not.  How are these two different?

They aren't.

>
> In the c#8 testcase this is similar,
> ipa_type_escape_star_count_of_interesting_type (var_type) == 1,
> we have
> struct S *var;
> struct S **ptr;
> again, why does it matter if *var is a struct or not?  What matters
> is if var is TREE_ADDRESSABLE, if there is no &var, then it can't
> point to it, otherwise it can.  The same as for
> void *var;
> void **ptr;
>
> But if var isn't TREE_ADDRESSABLE, I'd bet nothing ever calls
> may_alias_p with it as third argument.

I'm not sure I believe that.   We probably still ask for globals anyway.

>
> So, do you agree the only case may_alias_p should handle with
> ipa_type_escape_field_does_not_clobber_p is
> ipa_type_escape_star_count_of_interesting_type (var_type) == 0 ?
Yes

It does look, however, that we should be using
field_does_not_clobber_p during call clobbering computation to avoid
clobbering entire structures when fields are clobbered (if possible).
Comment 20 Jakub Jelinek 2007-08-24 15:38:27 UTC
Created attachment 14103 [details]
gcc43-pr33136.patch

Here is what I have been playing with.
But I'd really like to see some testcases where the ipa_type_escape_field_does_not_clobber_p is supposed to help.

I tried to write:

/* PR tree-optimization/33136 */
/* { dg-do compile } */
/* { dg-options "-O2" } */

struct S
{
  void *a;
  int b;
  float f;
};

static struct S s;

static int *
__attribute__((noinline, const))
foo (void)
{
  return &s.b;
}

float
__attribute__((noinline))
bar (float *f)
{
  s.f = 1.0;
  *f = 4.0;
  return s.f;
}

int
__attribute__((noinline))
baz (int *x)
{
  s.b = 1;
  *x = 4;
  return s.b;
}

int
t (void)
{
  float f = 8.0;
  return bar (&f) + baz (foo ());
}
My understanding would be this is a perfect example where this optimization should help, ipa-type-escape-var analysis says
ipa_type_escape_field_does_not_clobber_p (<struct S>, <void *>) == 1
ipa_type_escape_field_does_not_clobber_p (<struct S>, <int>) == 0
ipa_type_escape_field_does_not_clobber_p (<struct S>, <float>) == 1
which is IMHO correct.  In the baz function, we aren't supposed to optimize
out the read from s.b, as x may point to it (and in the testcase actually does).
In bar on the other side, I believe we can optimize this to
float
__attribute__((noinline))
bar (float *f)
{
  s.f = 1.0;
  *f = 4.0;
  return 1.0;
}
because as ipa-type-escap analysis found nothing ever takes address of any
float field in struct S, so f can't point to it.
But may_alias_p in that case is not called with var with struct S type (nor any pointer thereof), so ipa_type_escape_star_count_of_interesting_type will
always return -1.  It is called just with STRUCT_FIELD_TAGs with float or int or void * type or f var_decl in function t.
Comment 21 Daniel Berlin 2007-08-24 15:42:46 UTC
Subject: Re:  [4.1/4.2/4.3 Regression] wrong code due to alias with allocation in loop

On 24 Aug 2007 15:38:58 -0000, jakub at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #20 from jakub at gcc dot gnu dot org  2007-08-24 15:38 -------
> Created an attachment (id=14103)
>  --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14103&action=view)
> gcc43-pr33136.patch
>
> Here is what I have been playing with.
> But I'd really like to see some testcases where the
> ipa_type_escape_field_does_not_clobber_p is supposed to help.
>
> I tried to write:
>
> /* PR tree-optimization/33136 */
> /* { dg-do compile } */
> /* { dg-options "-O2" } */
>
> struct S
> {
>   void *a;
>   int b;
>   float f;
> };
>
> static struct S s;
>
> static int *
> __attribute__((noinline, const))
> foo (void)
> {
>   return &s.b;
> }
>
> float
> __attribute__((noinline))
> bar (float *f)
> {
>   s.f = 1.0;
>   *f = 4.0;
>   return s.f;
> }
>
> int
> __attribute__((noinline))
> baz (int *x)
> {
>   s.b = 1;
>   *x = 4;
>   return s.b;
> }
>
> int
> t (void)
> {
>   float f = 8.0;
>   return bar (&f) + baz (foo ());
> }
> My understanding would be this is a perfect example where this optimization
> should help, ipa-type-escape-var analysis says
> ipa_type_escape_field_does_not_clobber_p (<struct S>, <void *>) == 1
> ipa_type_escape_field_does_not_clobber_p (<struct S>, <int>) == 0
> ipa_type_escape_field_does_not_clobber_p (<struct S>, <float>) == 1
> which is IMHO correct.  In the baz function, we aren't supposed to optimize
> out the read from s.b, as x may point to it (and in the testcase actually
> does).
> In bar on the other side, I believe we can optimize this to
> float
> __attribute__((noinline))
> bar (float *f)
> {
>   s.f = 1.0;
>   *f = 4.0;
>   return 1.0;
> }
> because as ipa-type-escap analysis found nothing ever takes address of any
> float field in struct S, so f can't point to it.
> But may_alias_p in that case is not called with var with struct S type (nor any
> pointer thereof), so ipa_type_escape_star_count_of_interesting_type will
> always return -1.  It is called just with STRUCT_FIELD_TAGs with float or int
> or void * type or f var_decl in function t.

The call to field_not_clobber_p needs to be in the call clobbering
functions (set_initial_properties and mark_aliases_clobbered) and
access_can_touch_variable to have any real effect.
Comment 22 Jakub Jelinek 2007-08-24 16:16:41 UTC
Not sure how, could I pass the buck to you then?

My thought was just that may_alias_p could for STRUCT_FIELD_TAG call
ipa_type_escape_field_does_not_clobber (TREE_TYPE (SFT_PARENT_VAR (var)), TREE_TYPE (var)) (i.e. don't care about PTRs type, just say if that field's
address wasn't ever taken that it can't alias.  That would be similar to how
alias.c uses this function (again, doesn't use the pointer type at all).

Anyway, I found that current GCC 4.2 miscompiles following modified testcase
(works with 4.1 and the trunk, though neither 4.1 nor 4.3 actually optimize bar function as they could).  4.2 optimizes bar and misoptimizes baz, by assuming
*x = 4; will not clobber s.b.

/* { dg-do run } */
/* { dg-options "-O2" } */

extern void abort (void);

struct S
{
  struct S *a;
  int b;
  float f;
};

static struct S s;

static int *
__attribute__((noinline, const))
foo (void)
{
  return &s.b;
}

float
__attribute__((noinline))
bar (float *f)
{
  s.f = 1.0;
  *f = 4.0;
  return s.f;
}

int
__attribute__((noinline))
baz (int *x)
{
  s.b = 1;
  *x = 4;
  return s.b;
}

int
t (void)
{
  float f = 8.0;
  return bar (&f) + baz (foo ());
}

int
main (void)
{
  if (t () != 5)
    abort ();
  return 0;
}
Comment 23 Daniel Berlin 2007-08-24 16:21:17 UTC
Subject: Re:  [4.1/4.2/4.3 Regression] wrong code due to alias with allocation in loop

On 24 Aug 2007 16:16:44 -0000, jakub at gcc dot gnu dot org
<gcc-bugzilla@gcc.gnu.org> wrote:
>
>
> ------- Comment #22 from jakub at gcc dot gnu dot org  2007-08-24 16:16 -------
> Not sure how, could I pass the buck to you then?
>
> My thought was just that may_alias_p could for STRUCT_FIELD_TAG call
> ipa_type_escape_field_does_not_clobber (TREE_TYPE (SFT_PARENT_VAR (var)),
> TREE_TYPE (var)) (i.e. don't care about PTRs type, just say if that field's
> address wasn't ever taken that it can't alias.  That would be similar to how
> alias.c uses this function (again, doesn't use the pointer type at all).
>
This would work too.
I will take care of it in about a week if you don't get their first.

> Anyway, I found that current GCC 4.2 miscompiles following modified testcase
> (works with 4.1 and the trunk, though neither 4.1 nor 4.3 actually optimize bar
> function as they could).  4.2 optimizes bar and misoptimizes baz, by assuming
> *x = 4; will not clobber s.b.
For 4.2, the safest thing to do is just remove the call from may_alias_p.
I serously doubt this will have any real effect on performance.
Comment 24 Jakub Jelinek 2007-08-24 16:36:52 UTC
Removing that call in 4.1/4.2 would indeed cure c#8 testcase.
c#22 would be still broken (ipa_type_escape_field_does_not_clobber_p isn't
called at all on that testcase, neither in 4.1, nor 4.2, nor 4.3.
Guess c#22 testcase should go as a separate bug.
Comment 25 Mark Mitchell 2007-09-05 01:08:57 UTC
It's not clear to me what's going on in this PR.  At one point, Jakub seems to be saying that 4.2 does a better job than 4.1, which would suggest that this is just a 4.1.x PR?  Can we split this into one PR for 4.1-only issues, and another one or two, for 4.2 and/or 4.3 issues?
Comment 26 Jakub Jelinek 2007-09-05 05:46:51 UTC
This is just one bug, present in GCC 4.1 and onwards, no need to have
several bug ids.
tree-ssa-alias.c just uses ipa_type_escape_field_does_not_clobber_p
incorrectly, it asks an unrelated question and based on the answer decides
if things can or can't alias.  There are testcases where this bug exhibits
only in some gcc versions (e.g. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33136#c7 ), because it depends
on what exactly other optimization passes do, and other testcases, like #c8,
which fail in all 4.1+ GCCs.
Comment 27 Jakub Jelinek 2007-09-11 17:21:23 UTC
Created attachment 14191 [details]
gcc43-pr33136.patch

Updated patch which tries to handle SFTs as well.
Unfortunately it causes a regression with gcc.dg/torture/pr26587.c at -O2 and above.  That to me looks like an ipa-type-escape.c bug.
In particular, may_alias_p is now called with a SFT whose type is ARRAY_TYPE.
The first argument to ipa_type_escape_does_not_clobber_p is
 <record_type 0x2aaaaea4f1a0 type_0 BLK
    size <integer_cst 0x2aaaae941b40 type <integer_type 0x2aaaae9340d0 bit_size_type> constant invariant 576>
    unit size <integer_cst 0x2aaaaea4e150 type <integer_type 0x2aaaae934000 long unsigned int> constant invariant 72>
    align 32 symtab 0 alias set -1 canonical type 0x2aaaaea4f1a0
    fields <field_decl 0x2aaaae9f10b0 P
        type <array_type 0x2aaaaea4f0d0 BF_key type <integer_type 0x2aaaaea39a90 BF_word>
            BLK size <integer_cst 0x2aaaae941b40 576> unit size <integer_cst 0x2aaaaea4e150 72>
            align 32 symtab 0 alias set -1 canonical type 0x2aaaaea39ea0 domain <integer_type 0x2aaaaea39b60>
            pointer_to_this <pointer_type 0x2aaaaea4f410>>
        BLK file pr26587.c line 8 size <integer_cst 0x2aaaae941b40 576> unit size <integer_cst 0x2aaaaea4e150 72>
        align 32 offset_align 128
        offset <integer_cst 0x2aaaae925750 constant invariant 0>
        bit offset <integer_cst 0x2aaaae9413f0 constant invariant 0> context <record_type 0x2aaaaea4f1a0>> context <translation_unit_decl 0x2aaaaea4f750 D.1521>
    pointer_to_this <pointer_type 0x2aaaaea4f820> chain <type_decl 0x2aaaaea4f270 D.1500>>
and the second is
 <array_type 0x2aaaaea4f0d0 BF_key
    type <integer_type 0x2aaaaea39a90 BF_word sizes-gimplified public unsigned SI
        size <integer_cst 0x2aaaae925ab0 constant invariant 32>
        unit size <integer_cst 0x2aaaae925720 constant invariant 4>
        align 32 symtab 0 alias set -1 canonical type 0x2aaaae934680 precision 32 min <integer_cst 0x2aaaae925ae0 0> max <integer_cst 0x2aaaae925a80 4294967295>
        pointer_to_this <pointer_type 0x2aaaaea39d00>>
    BLK
    size <integer_cst 0x2aaaae941b40 type <integer_type 0x2aaaae9340d0 bit_size_type> constant invariant 576>
    unit size <integer_cst 0x2aaaaea4e150 type <integer_type 0x2aaaae934000 long unsigned int> constant invariant 72>
    align 32 symtab 0 alias set -1 canonical type 0x2aaaaea39ea0
    domain <integer_type 0x2aaaaea39b60
        type <integer_type 0x2aaaae934000 long unsigned int public unsigned sizetype DI
            size <integer_cst 0x2aaaae925ba0 constant invariant 64>
            unit size <integer_cst 0x2aaaae925bd0 constant invariant 8>
            align 64 symtab 0 alias set -1 canonical type 0x2aaaae942270 precision 64 min <integer_cst 0x2aaaae925c00 0> max <integer_cst 0x2aaaae9414e0 -1>>
        public DI size <integer_cst 0x2aaaae925ba0 64> unit size <integer_cst 0x2aaaae925bd0 8>
        align 64 symtab 0 alias set -1 canonical type 0x2aaaaea39b60 precision 64 min <integer_cst 0x2aaaae925750 0> max <integer_cst 0x2aaaaea4e090 17>>
    pointer_to_this <pointer_type 0x2aaaaea4f410>>

ipa_type_escape_field_does_not_clobber_p calls get_canon_type_uid (, true, true)
on this, which returns TYPE_UID of an unsigned int.
But if I check what actually mark_interesting_addressof marked, it is TYPE_UID
of a different ARRAY_TYPE - 
 <array_type 0x2aaaaea39c30
    type <integer_type 0x2aaaaea39a90 BF_word sizes-gimplified public unsigned SI
        size <integer_cst 0x2aaaae925ab0 constant invariant 32>
        unit size <integer_cst 0x2aaaae925720 constant invariant 4>
        align 32 symtab 0 alias set -1 canonical type 0x2aaaae934680 precision 32 min <integer_cst 0x2aaaae925ae0 0> max <integer_cst 0x2aaaae925a80 4294967295>
        pointer_to_this <pointer_type 0x2aaaaea39d00>>
    BLK
    size <integer_cst 0x2aaaae941b40 type <integer_type 0x2aaaae9340d0 bit_size_type> constant invariant 576>
    unit size <integer_cst 0x2aaaaea4e150 type <integer_type 0x2aaaae934000 long unsigned int> constant invariant 72>
    align 32 symtab 0 alias set 2 canonical type 0x2aaaaea39ea0
    domain <integer_type 0x2aaaaea39b60
        type <integer_type 0x2aaaae934000 long unsigned int public unsigned sizetype DI
            size <integer_cst 0x2aaaae925ba0 constant invariant 64>
            unit size <integer_cst 0x2aaaae925bd0 constant invariant 8>
            align 64 symtab 0 alias set -1 canonical type 0x2aaaae942270 precision 64 min <integer_cst 0x2aaaae925c00 0> max <integer_cst 0x2aaaae9414e0 -1>>
        public DI size <integer_cst 0x2aaaae925ba0 64> unit size <integer_cst 0x2aaaae925bd0 8>
        align 64 symtab 0 alias set -1 canonical type 0x2aaaaea39b60 precision 64 min <integer_cst 0x2aaaae925750 0> max <integer_cst 0x2aaaaea4e090 17>>>
so very different uid, and nothing actually ensures that if this type has been
marked that e.g. what it sees through has been marked as well.
Comment 28 Jakub Jelinek 2007-09-11 19:26:28 UTC
Created attachment 14193 [details]
gcc43-pr33136.patch

Added further testcases.
Comment 29 Jakub Jelinek 2007-09-11 19:52:00 UTC
The new testcases show further problems.  I believe that for ARRAY_TYPE fields
we need to mark the type of its fields (recursively down), using close_type_seen
like we already do for records and unions.
On the other side, get_canon_type_uid in ipa_type_escape_field_does_not_clobber_p
going through POINTER_TYPEs to what they point to seems wrong.
This is what I have in pr33136-4.c - I can't understand why the fact that
something took address of an int field has any influence on int * fields
- if nothing took address of an int * field, then how could it alias with what
int ** pointer points to?

Could anyone with access to SPEC see what is the difference between
-O2 vs. -O2 -fno-ipa-type-escape (or -O3 vs. -O3 -fno-ipa-type-escape)
on 4.1 and/or 4.2 branches?  For those I think the safest fix is just disable
this optimization altogether, in many cases it gives wrong answers.
4.1 miscompiles at -O2 just 20070824-1.c and pr33136-5.c (and the bugs go
away with -O2 -fno-ipa-type-escape), while gcc 4.2 is much worse,
all of 20070824-1.c and pr33136-{1,3,5}.c are miscompiled at -O2, and all the pr33136-{1,3,5}.c are even miscompiled with -O2 -fno-ipa-type-escape -fno-strict-aliasing.

Mark, can this be a P1?
Comment 30 Jakub Jelinek 2007-10-15 18:30:10 UTC
Subject: Bug 33136

Author: jakub
Date: Mon Oct 15 18:29:54 2007
New Revision: 129366

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129366
Log:
	PR tree-optimization/33136
	* opts.c (decode_options): Don't enable flag_ipa_type_escape.

	* gcc.c-torture/execute/20070824-1.c: New test.
	* gcc.dg/pr33136-1.c: New test.
	* gcc.dg/pr33136-2.c: New test.
	* gcc.dg/pr33136-3.c: New test.

Added:
    trunk/gcc/testsuite/gcc.c-torture/execute/20070824-1.c
    trunk/gcc/testsuite/gcc.dg/pr33136-1.c
    trunk/gcc/testsuite/gcc.dg/pr33136-2.c
    trunk/gcc/testsuite/gcc.dg/pr33136-3.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/opts.c
    trunk/gcc/testsuite/ChangeLog

Comment 31 Jakub Jelinek 2007-10-15 18:51:59 UTC
Fixed so far on the trunk.
Comment 32 Joseph S. Myers 2008-07-04 22:15:35 UTC
Closing 4.1 branch.
Comment 33 Joseph S. Myers 2009-03-30 22:16:33 UTC
Closing 4.2 branch, fixed in 4.3.