When you compile attached testcase with 4.1.1 and 4.2.1, resulting .o modules have following sizes: $ size insmod.i-411.o insmod.i-421.o text data bss dec hex filename 288 0 0 288 120 insmod.i-411.o 543 0 0 543 21f insmod.i-421.o Testcase was built using following command line: gcc -std=gnu99 \ -Os -fno-builtin-strlen -finline-limit=0 -fomit-frame-pointer \ -ffunction-sections -fdata-sections -fno-guess-branch-probability \ -funsigned-char -static-libgcc \ -falign-functions=1 -falign-jumps=1 -falign-labels=1 -falign-loops=1 \ -march=i386 -mpreferred-stack-boundary=2 \ -fverbose-asm -S insmod.i.c
Created attachment 14086 [details] testcase
Created attachment 14087 [details] assembler output generated by 4.1.1 Sorry, don't have native gcc 4.1.1 at this machine, but have i486-linux-uclibc-gcc 4.1.1 and 4.2.1. Thus attaching asm output generated by these. Generated code should be identical, I think.
Created attachment 14088 [details] assembler output generated by 4.2.1
4.1 somehow has better alias information, you can verify this by adding -fno-strict-aliasing to 4.1 and get the same (worse) result as with 4.2.
trunk regresses even more
Actually I think this might be related to removing iterative jump threading and not aliasing.
Actually, to me this doesn't look like missed-optimization at all. And you should be happy for 4.2.1 generated bigger code, 4.1.1 optimized out something it shouldn't. Below is an stripped down testcase, which works on 3.4/4.1/4.2/4.3 with -O0 or -O2 -fno-strict-aliasing, works even with -O2 with 3.4 and 4.2, segfaults with 4.1 (x86_64 -m64, aborts with x86_64 -m32) and issues a bogus warning about unitialized common_head and aborts on the trunk. /* { dg-do run } */ /* { dg-options "-O2" } */ typedef __SIZE_TYPE__ size_t; extern void exit (int); extern void *malloc (size_t); extern void *realloc (void *, size_t); extern void abort (void); void * __attribute__ ((noinline)) xmalloc (size_t size) { void *p = malloc (size); if (p == 0) exit (0); return p; } void * __attribute__ ((noinline)) xrealloc (void *old, size_t size) { void *p = realloc (old, size); if (p == 0) exit (0); return p; } struct obj_section { unsigned int sh_type; unsigned int sh_flags; unsigned int sh_size; unsigned int sh_addralign; const char *name; char *contents; struct obj_section *load_next; int idx; }; struct obj_symbol { struct obj_symbol *next; unsigned long value; unsigned long size; int secidx; }; struct obj_file { unsigned short e_shnum; struct obj_section **sections; struct obj_symbol *symtab[521]; }; static void __attribute__((noinline)) obj_allocate_commons (struct obj_file *f) { struct common_entry { struct common_entry *next; struct obj_symbol *sym; } *common_head = (void *) 0; unsigned long i; for (i = 0; i < 521; ++i) { struct obj_symbol *sym; for (sym = f->symtab[i]; sym; sym = sym->next) if (sym->secidx == 0xfff2) { struct common_entry **p, *n; for (p = &common_head; *p; p = &(*p)->next) if (sym->size <= (*p)->sym->size) break; n = __builtin_alloca (sizeof (*n)); n->next = *p; n->sym = sym; *p = n; } } if (common_head) { for (i = 0; i < f->e_shnum; ++i) if (f->sections[i]->sh_type == 8) break; if (i == f->e_shnum) { struct obj_section *sec; f->sections = xrealloc (f->sections, (i + 1) * sizeof (sec)); f->sections[i] = sec = xmalloc (sizeof (struct obj_section)); f->e_shnum = i + 1; __builtin_memset (sec, 0, sizeof (*sec)); sec->sh_type = 1; sec->sh_flags = (1 << 0) | (1 << 1); sec->name = ".bss"; sec->idx = i; } } } int main (void) { struct obj_file of; struct obj_symbol s; struct obj_section *sec; __builtin_memset (&s, 0, sizeof (s)); s.value = 4; s.size = 4; s.secidx = 0xfff2; __builtin_memset (&of, 0, sizeof (of)); of.e_shnum = 2; of.sections = xmalloc (2 * sizeof (sec)); of.sections[0] = sec = xmalloc (sizeof (struct obj_section)); __builtin_memset (sec, 0, sizeof (*sec)); sec->sh_type = 4; sec->sh_flags = (1 << 0) | (1 << 1); sec->name = ".foo"; sec->idx = 0; of.sections[1] = sec = xmalloc (sizeof (struct obj_section)); __builtin_memset (sec, 0, sizeof (*sec)); sec->sh_type = 4; sec->sh_flags = (1 << 0) | (1 << 1); sec->name = ".bar"; sec->idx = 1; of.symtab[0] = &s; obj_allocate_commons (&of); if (of.e_shnum != 3) abort (); return 0; } Even on the original testcase, obj_allocate_commons is so short because the whole if (common_head) { ... } huge block has been completely optimized out (watch for xrealloc call e.g.). To me the code looks valid, there is no type puning involved.
Even shorter testcase: /* { dg-do run } */ /* { dg-options "-O2" } */ extern void abort (void); struct S { struct S *a; int b; }; #ifdef VAR struct S t; #endif int main (void) { struct S *s = (struct S *) 0, **p, *n; for (p = &s; *p; p = &(*p)->a); n = (struct S *) __builtin_alloca (sizeof (*n)); n->a = *p; n->b = 1; *p = n; if (!s) abort (); return 0; } This one behaves identically with 4.1/4.2/trunk, -O2 -UVAR -> abort, -O2 -DVAR -> success, -O2 -fno-strict-aliasing -UVAR -> success.
When struct S is not defined at global scope, but within main, then no matter if struct S t; is present or not 4.1/4.2/trunk aborts with -O2.
Probably related to some of the open alias issues of 4.1: http://gcc.gnu.org/bugzilla/buglist.cgi?query_format=advanced&short_desc_type=allwordssubstr&short_desc=4.1&target_milestone=4.1.3&target_milestone=4.1.4&known_to_fail_type=allwordssubstr&known_to_work_type=allwordssubstr&long_desc_type=allwordssubstr&long_desc=&bug_file_loc_type=allwordssubstr&bug_file_loc=&gcchost_type=allwordssubstr&gcchost=&gcctarget_type=allwordssubstr&gcctarget=&gccbuild_type=allwordssubstr&gccbuild=&keywords_type=allwords&keywords=wrong-code%2C+alias&bug_status=UNCONFIRMED&bug_status=NEW&bug_status=ASSIGNED&bug_status=SUSPENDED&bug_status=WAITING&bug_status=REOPENED&priority=P1&priority=P2&priority=P3&emailtype1=substring&email1=&emailtype2=substring&email2=&bugidtype=include&bug_id=&votes=&chfieldfrom=&chfieldto=Now&chfieldvalue=&cmdtype=doit&order=Reuse+same+sort+as+last+time&query_based_on=4.1+blocker&field0-0-0=noop&type0-0-0=noop&value0-0-0=
Yeah, PR30708 seems to be stripped down from the same source. But the stripped down testcase here in c#8 is 4.1/4.2/4.3 regression, not just 4.1 regression.
Indeed. <bb 2>: # s_13 = VDEF <s_12(D)> s = 0B; goto <bb 4>; <bb 4>: # p_1 = PHI <&s(2), p_5(3)> # VUSE <HEAP.5_14(D), SMT.7_15(D)> D.2019_3 = *p_1; is wrong alias info (from trunk salias dump). The load from *p_1 misses a use of s_13 - or the assignment to s should use SMT.7 instead.
At least for 4.3, ipa-type-escape is not looking into phi_nodes for address taking, so we end up returning false for may_alias_p (p, s) because we believe nobody ever takes the address of s. IE if (ipa_type_escape_field_does_not_clobber_p (var_type, TREE_TYPE (ptr))) incorrectly returns true. ipa-type-escape.c needs to be changed to scan refs in phi_nodes (it was written when we had just plain old gimple during IPA)
--- ipa-type-escape.c.jj13 2007-08-13 15:11:18.000000000 +0200 +++ ipa-type-escape.c 2007-08-22 19:21:07.000000000 +0200 @@ -1704,6 +1704,21 @@ analyze_function (struct cgraph_node *fn FOR_EACH_BB_FN (this_block, this_cfun) { block_stmt_iterator bsi; + tree phi, op; + use_operand_p use; + ssa_op_iter iter; + + /* Find the addresses taken in phi node arguments. */ + for (phi = phi_nodes (this_block); phi; phi = PHI_CHAIN (phi)) + { + FOR_EACH_PHI_ARG (use, phi, iter, SSA_OP_USE) + { + op = USE_FROM_PTR (use); + if (TREE_CODE (op) == ADDR_EXPR) + check_rhs_var (op); + } + } + for (bsi = bsi_start (this_block); !bsi_end_p (bsi); bsi_next (&bsi)) walk_tree (bsi_stmt_ptr (bsi), scan_for_refs, fn, visited_nodes); doesn't help here at all (it could have helped if one of the PHI arguments contained say &s.some_field, but as it contains just &s, it doesn't do anything. mark_interesting_addressof is only called for addresses of aggr fields and has_proper_scope_for_analysis is a nop for automatic variables.
(In reply to comment #14) > --- ipa-type-escape.c.jj13 2007-08-13 15:11:18.000000000 +0200 > +++ ipa-type-escape.c 2007-08-22 19:21:07.000000000 +0200 > @@ -1704,6 +1704,21 @@ analyze_function (struct cgraph_node *fn > FOR_EACH_BB_FN (this_block, this_cfun) > { > block_stmt_iterator bsi; > + tree phi, op; > + use_operand_p use; > + ssa_op_iter iter; > + > + /* Find the addresses taken in phi node arguments. */ > + for (phi = phi_nodes (this_block); phi; phi = PHI_CHAIN (phi)) > + { > + FOR_EACH_PHI_ARG (use, phi, iter, SSA_OP_USE) > + { > + op = USE_FROM_PTR (use); > + if (TREE_CODE (op) == ADDR_EXPR) > + check_rhs_var (op); > + } > + } > + > for (bsi = bsi_start (this_block); !bsi_end_p (bsi); bsi_next (&bsi)) > walk_tree (bsi_stmt_ptr (bsi), scan_for_refs, > fn, visited_nodes); > > doesn't help here at all (it could have helped if one of the PHI arguments > contained say &s.some_field, but as it contains just &s, it doesn't do > anything. > mark_interesting_addressof is only called for addresses of aggr fields > and has_proper_scope_for_analysis is a nop for automatic variables. > Sigh. It should be considering &s to be an address taking of the first field. Looks like look_for_address_of needs to be fixed.
But doesn't ipa_type_escape_field_does_not_clobber_p do what is documented? At least for the uses in nonoverlapping_memrefs_p where ipa_type_escape_field_does_not_clobber_p is always called on some field, first argument is a DECL_FIELD_CONTEXT of some field ans second argument is its type. Then IMHO ipa_type_escape_field_does_not_clobber_p does the right thing. If you take address of the whole struct rather than some specific field and that address doesn't escape the CU, then that still doesn't explain how could a pointer var with first field's type point to the struct. Say for struct A { int i; float f; } you still need (int *) ptrA where ptrA is struct A *, or &ptrA->i. Optimizations before ipa-type-escape will transform the former to the latter and if they don't, I believe check_cast will handle it anyway. The only problem in alias.c might be if exprx is COMPONENT_REF of the first field and expry is a var pointer to the whole struct A (or vice versa), then any taking of address of the whole struct anywhere would mean the MEMs could overlap. But in that case both MEMs will have different alias sets, don't they? The ipa_type_escape_field_does_not_clobber_p call in may_alias_p is very different though. Here we don't necessarily call it with some record (or union) type and a type of one of its fields, but rather with some record type (or pointer to it, pointer to pointer etc.) and some possibly completely unrelated other pointer type. Well, because of previous if (!alias_sets_conflict_p (mem_alias_set, var_alias_set)) return false; it shouldn't be completely unrelated. In most cases it will actually be the same type and that's something ipa_type_escape_field_does_not_clobber_p wasn't meant to answer. I have instrumented may_alias_p, so that if ipa_type_escape_field_does_not_clobber_p returned false for reasons other than !initialized or !ipa_type_escape_type_contained_p, it would abort. The only testcase in the whole make check-{gcc,g++,gfortran} testsuite triggering this was gcc.c-torture/execute/builtins/pr22237.c, where var had a union type which contained ptr's type as one of its subfields. The whole use of ipa_type_escape_field_does_not_clobber_p in may_alias_p is very much unclear to me. E.g.: else if (ptr_star_count == 0) { /* If PTR_TYPE was not really a pointer to type, it cannot alias. */ alias_stats.structnoaddress_queries++; alias_stats.structnoaddress_resolved++; alias_stats.alias_noalias++; return false; } Isn't ptr guaranteed to be have POINTER_TYPE or REFERENCE_TYPE? Both from the way how is ->pointers array populated and e.g. that PTR_IS_REF_ALL is used before may_alias_p on the p_map->var resp. p_map1->var? That implies ptr_star_count > 0, so the above listed chunk is never executed. Also, as we don't care in the code whether ptr_star_count is 37 or just 1, I don't see the point in computing ptr_star_count at all, nor the existence of ptr_type variable. If ipa_type_escape_star_count_of_interesting_type (var_type) > 0 (i.e. var is a pointer to struct/union rather than struct/union itself, how is the fact that something took address of fields within the struct itself relevant to whether some pointer may point to the pointer var or not? If ipa_type_escape_star_count_of_interesting_type (var_type) == 0, then ipa_type_escape_field_does_not_clobber_p call would make sense if ptr was a pointer to some field (field of field etc.), but then it needs to be called with TREE_TYPE (TREE_TYPE (ptr)) as second argument, otherwise it is asking a wrong question. But if ptr's type is the struct/union itself, all we care is if that record/union's address has been taken, which is not something ipa-type-escape won't answer. Isn't that what TREE_ADDRESSABLE can be used for? Please explain.
Subject: Re: [4.1/4.2/4.3 Regression] wrong code due to alias with allocation in loop On 23 Aug 2007 12:13:13 -0000, jakub at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #16 from jakub at gcc dot gnu dot org 2007-08-23 12:13 ------- > But doesn't ipa_type_escape_field_does_not_clobber_p do what is documented? I was there when it was written. It may be documented to do one thing, but the intent was to do another :) > At least for the uses in nonoverlapping_memrefs_p where > ipa_type_escape_field_does_not_clobber_p is always called on some field, first > argument is a DECL_FIELD_CONTEXT of some field ans second argument is its type. > Then IMHO ipa_type_escape_field_does_not_clobber_p does the right thing. It doesn't. > If you take address of the whole struct rather than some specific field and > that address doesn't escape the CU, then that still doesn't explain how > could a pointer var with first field's type point to the struct. > Say for > struct A { int i; float f; } you still need (int *) ptrA Uh, no. &ptrA will do just fine. You can clobber all fields through it if it escapes. > > The ipa_type_escape_field_does_not_clobber_p call in may_alias_p is very > different though. Here we don't necessarily call it with some record (or > union) > type and a type of one of its fields, but rather with some record type (or > pointer to it, pointer to pointer etc.) and some possibly completely unrelated > other pointer type. Well, because of previous > if (!alias_sets_conflict_p (mem_alias_set, var_alias_set)) return false; > it shouldn't be completely unrelated. In most cases it will actually be the > same > type and that's something ipa_type_escape_field_does_not_clobber_p wasn't > meant to answer. I have instrumented may_alias_p, so that if > ipa_type_escape_field_does_not_clobber_p returned false for reasons other than > !initialized or !ipa_type_escape_type_contained_p, it would abort. > The only testcase in the whole make check-{gcc,g++,gfortran} testsuite > triggering this was gcc.c-torture/execute/builtins/pr22237.c, where var had a > union type which contained ptr's type as one of its subfields. > > The whole use of ipa_type_escape_field_does_not_clobber_p in may_alias_p > is very much unclear to me. The whole *point* of ipa_type_escape was to be used in may_alias_p. The use in alias.c was actually an afterthought. > E.g.: > else if (ptr_star_count == 0) > { > /* If PTR_TYPE was not really a pointer to type, it cannot > alias. */ > alias_stats.structnoaddress_queries++; > alias_stats.structnoaddress_resolved++; > alias_stats.alias_noalias++; > return false; > } > Isn't ptr guaranteed to be have POINTER_TYPE or REFERENCE_TYPE? yes. > Both from the > way how is ->pointers array populated and e.g. that PTR_IS_REF_ALL is used > before may_alias_p on the p_map->var resp. p_map1->var? That implies > ptr_star_count > 0, so the above listed chunk is never executed. Also true. > Also, as > we don't care in the code whether ptr_star_count is 37 or just 1, I don't see > the point in computing ptr_star_count at all, nor the existence of ptr_type > variable. > > If ipa_type_escape_star_count_of_interesting_type (var_type) > 0 (i.e. var is > a pointer to struct/union rather than struct/union itself, how is the fact that > something took address of fields within the struct itself relevant to whether > some pointer may point to the pointer var or not? If the address was never taken anywhere, it can't be pointed to. Type-escape tries to go a little further, and see if, when the address of a field is taken, if that address is ever cast'd, incremented/decremented, or escapes. If not, then only that field is clobbbered, not the entire structure. Otherwise, it is equivalent to calculating TREE_ADDRESSABLE. The other part of type-escape was to see if the address ever actually escapes the CU, because if it does not, it could be transformed by struct-reorg. > If > ipa_type_escape_star_count_of_interesting_type (var_type) == 0, then > ipa_type_escape_field_does_not_clobber_p call would make sense if ptr was > a pointer to some field (field of field etc.), but then it needs to be called > with TREE_TYPE (TREE_TYPE (ptr)) as second argument, otherwise it is asking > a wrong question. Also possible. Kenny is not always good at knowing the intricacies of our compiler. > Please explain.
Subject: Re: [4.1/4.2/4.3 Regression] wrong code due to alias with allocation in loop On Thu, Aug 23, 2007 at 01:45:11PM -0000, dberlin at dberlin dot org wrote: > > If you take address of the whole struct rather than some specific field and > > that address doesn't escape the CU, then that still doesn't explain how > > could a pointer var with first field's type point to the struct. > > > Say for > > struct A { int i; float f; } you still need (int *) ptrA > > Uh, no. > > &ptrA will do just fine. You can clobber all fields through it if it escapes. See "that address doesn't escape the CU" above I wrote. But if so, it will be in global_types_full_escape and so ipa_type_escape_field_does_not_clobber_p will return false. > > If ipa_type_escape_star_count_of_interesting_type (var_type) > 0 (i.e. var is > > a pointer to struct/union rather than struct/union itself, how is the fact that > > something took address of fields within the struct itself relevant to whether > > some pointer may point to the pointer var or not? > > If the address was never taken anywhere, it can't be pointed to. > Type-escape tries to go a little further, and see if, when the address > of a field is taken, if that address is ever cast'd, > incremented/decremented, or escapes. If not, then only that field is > clobbbered, not the entire structure. Otherwise, it is equivalent to > calculating TREE_ADDRESSABLE. I was talking about say struct S { int i; float f; struct S **s1; struct S ***s2; struct S ****s3; }; struct S ***var; struct S ****ptr; If we are asking whether may_alias (ptr, get_alias_set (TREE_TYPE (TREE_TYPE (ptr))), var, get_alias_set (var), false) then we are only interested in whether var is TREE_ADDRESSABLE, and this is IMHO exactly the same as asking in case of: void ***var; void ****ptr; But in the former case ipa_type_escape_star_count_of_interesting_type (var_type) == 3 and so we will call ipa_type_escape_field_does_not_clobber_p, in the latter case not. How are these two different? In the c#8 testcase this is similar, ipa_type_escape_star_count_of_interesting_type (var_type) == 1, we have struct S *var; struct S **ptr; again, why does it matter if *var is a struct or not? What matters is if var is TREE_ADDRESSABLE, if there is no &var, then it can't point to it, otherwise it can. The same as for void *var; void **ptr; But if var isn't TREE_ADDRESSABLE, I'd bet nothing ever calls may_alias_p with it as third argument. So, do you agree the only case may_alias_p should handle with ipa_type_escape_field_does_not_clobber_p is ipa_type_escape_star_count_of_interesting_type (var_type) == 0 ? Jakub
Subject: Re: [4.1/4.2/4.3 Regression] wrong code due to alias with allocation in loop On 23 Aug 2007 14:49:43 -0000, jakub at redhat dot com <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #18 from jakub at redhat dot com 2007-08-23 14:49 ------- > Subject: Re: [4.1/4.2/4.3 Regression] wrong code due to alias with allocation > in loop > > On Thu, Aug 23, 2007 at 01:45:11PM -0000, dberlin at dberlin dot org wrote: > > > If you take address of the whole struct rather than some specific field and > > > that address doesn't escape the CU, then that still doesn't explain how > > > could a pointer var with first field's type point to the struct. > > > > > Say for > > > struct A { int i; float f; } you still need (int *) ptrA > > > > Uh, no. > > > > &ptrA will do just fine. You can clobber all fields through it if it escapes. > > See "that address doesn't escape the CU" above I wrote. But if so, > it will be in global_types_full_escape and so > ipa_type_escape_field_does_not_clobber_p will return false. > > > > If ipa_type_escape_star_count_of_interesting_type (var_type) > 0 (i.e. var is > > > a pointer to struct/union rather than struct/union itself, how is the fact that > > > something took address of fields within the struct itself relevant to whether > > > some pointer may point to the pointer var or not? > > > > If the address was never taken anywhere, it can't be pointed to. > > Type-escape tries to go a little further, and see if, when the address > > of a field is taken, if that address is ever cast'd, > > incremented/decremented, or escapes. If not, then only that field is > > clobbbered, not the entire structure. Otherwise, it is equivalent to > > calculating TREE_ADDRESSABLE. > > I was talking about say > struct S { int i; float f; struct S **s1; struct S ***s2; struct S ****s3; }; > struct S ***var; > struct S ****ptr; > > If we are asking whether > may_alias (ptr, get_alias_set (TREE_TYPE (TREE_TYPE (ptr))), var, get_alias_set > (var), false) > then we are only interested in whether var is TREE_ADDRESSABLE, > and this is IMHO exactly the same as asking in case of: > void ***var; > void ****ptr; > > But in the former case ipa_type_escape_star_count_of_interesting_type > (var_type) == 3 and so we will call > ipa_type_escape_field_does_not_clobber_p, in the latter case > not. How are these two different? They aren't. > > In the c#8 testcase this is similar, > ipa_type_escape_star_count_of_interesting_type (var_type) == 1, > we have > struct S *var; > struct S **ptr; > again, why does it matter if *var is a struct or not? What matters > is if var is TREE_ADDRESSABLE, if there is no &var, then it can't > point to it, otherwise it can. The same as for > void *var; > void **ptr; > > But if var isn't TREE_ADDRESSABLE, I'd bet nothing ever calls > may_alias_p with it as third argument. I'm not sure I believe that. We probably still ask for globals anyway. > > So, do you agree the only case may_alias_p should handle with > ipa_type_escape_field_does_not_clobber_p is > ipa_type_escape_star_count_of_interesting_type (var_type) == 0 ? Yes It does look, however, that we should be using field_does_not_clobber_p during call clobbering computation to avoid clobbering entire structures when fields are clobbered (if possible).
Created attachment 14103 [details] gcc43-pr33136.patch Here is what I have been playing with. But I'd really like to see some testcases where the ipa_type_escape_field_does_not_clobber_p is supposed to help. I tried to write: /* PR tree-optimization/33136 */ /* { dg-do compile } */ /* { dg-options "-O2" } */ struct S { void *a; int b; float f; }; static struct S s; static int * __attribute__((noinline, const)) foo (void) { return &s.b; } float __attribute__((noinline)) bar (float *f) { s.f = 1.0; *f = 4.0; return s.f; } int __attribute__((noinline)) baz (int *x) { s.b = 1; *x = 4; return s.b; } int t (void) { float f = 8.0; return bar (&f) + baz (foo ()); } My understanding would be this is a perfect example where this optimization should help, ipa-type-escape-var analysis says ipa_type_escape_field_does_not_clobber_p (<struct S>, <void *>) == 1 ipa_type_escape_field_does_not_clobber_p (<struct S>, <int>) == 0 ipa_type_escape_field_does_not_clobber_p (<struct S>, <float>) == 1 which is IMHO correct. In the baz function, we aren't supposed to optimize out the read from s.b, as x may point to it (and in the testcase actually does). In bar on the other side, I believe we can optimize this to float __attribute__((noinline)) bar (float *f) { s.f = 1.0; *f = 4.0; return 1.0; } because as ipa-type-escap analysis found nothing ever takes address of any float field in struct S, so f can't point to it. But may_alias_p in that case is not called with var with struct S type (nor any pointer thereof), so ipa_type_escape_star_count_of_interesting_type will always return -1. It is called just with STRUCT_FIELD_TAGs with float or int or void * type or f var_decl in function t.
Subject: Re: [4.1/4.2/4.3 Regression] wrong code due to alias with allocation in loop On 24 Aug 2007 15:38:58 -0000, jakub at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #20 from jakub at gcc dot gnu dot org 2007-08-24 15:38 ------- > Created an attachment (id=14103) > --> (http://gcc.gnu.org/bugzilla/attachment.cgi?id=14103&action=view) > gcc43-pr33136.patch > > Here is what I have been playing with. > But I'd really like to see some testcases where the > ipa_type_escape_field_does_not_clobber_p is supposed to help. > > I tried to write: > > /* PR tree-optimization/33136 */ > /* { dg-do compile } */ > /* { dg-options "-O2" } */ > > struct S > { > void *a; > int b; > float f; > }; > > static struct S s; > > static int * > __attribute__((noinline, const)) > foo (void) > { > return &s.b; > } > > float > __attribute__((noinline)) > bar (float *f) > { > s.f = 1.0; > *f = 4.0; > return s.f; > } > > int > __attribute__((noinline)) > baz (int *x) > { > s.b = 1; > *x = 4; > return s.b; > } > > int > t (void) > { > float f = 8.0; > return bar (&f) + baz (foo ()); > } > My understanding would be this is a perfect example where this optimization > should help, ipa-type-escape-var analysis says > ipa_type_escape_field_does_not_clobber_p (<struct S>, <void *>) == 1 > ipa_type_escape_field_does_not_clobber_p (<struct S>, <int>) == 0 > ipa_type_escape_field_does_not_clobber_p (<struct S>, <float>) == 1 > which is IMHO correct. In the baz function, we aren't supposed to optimize > out the read from s.b, as x may point to it (and in the testcase actually > does). > In bar on the other side, I believe we can optimize this to > float > __attribute__((noinline)) > bar (float *f) > { > s.f = 1.0; > *f = 4.0; > return 1.0; > } > because as ipa-type-escap analysis found nothing ever takes address of any > float field in struct S, so f can't point to it. > But may_alias_p in that case is not called with var with struct S type (nor any > pointer thereof), so ipa_type_escape_star_count_of_interesting_type will > always return -1. It is called just with STRUCT_FIELD_TAGs with float or int > or void * type or f var_decl in function t. The call to field_not_clobber_p needs to be in the call clobbering functions (set_initial_properties and mark_aliases_clobbered) and access_can_touch_variable to have any real effect.
Not sure how, could I pass the buck to you then? My thought was just that may_alias_p could for STRUCT_FIELD_TAG call ipa_type_escape_field_does_not_clobber (TREE_TYPE (SFT_PARENT_VAR (var)), TREE_TYPE (var)) (i.e. don't care about PTRs type, just say if that field's address wasn't ever taken that it can't alias. That would be similar to how alias.c uses this function (again, doesn't use the pointer type at all). Anyway, I found that current GCC 4.2 miscompiles following modified testcase (works with 4.1 and the trunk, though neither 4.1 nor 4.3 actually optimize bar function as they could). 4.2 optimizes bar and misoptimizes baz, by assuming *x = 4; will not clobber s.b. /* { dg-do run } */ /* { dg-options "-O2" } */ extern void abort (void); struct S { struct S *a; int b; float f; }; static struct S s; static int * __attribute__((noinline, const)) foo (void) { return &s.b; } float __attribute__((noinline)) bar (float *f) { s.f = 1.0; *f = 4.0; return s.f; } int __attribute__((noinline)) baz (int *x) { s.b = 1; *x = 4; return s.b; } int t (void) { float f = 8.0; return bar (&f) + baz (foo ()); } int main (void) { if (t () != 5) abort (); return 0; }
Subject: Re: [4.1/4.2/4.3 Regression] wrong code due to alias with allocation in loop On 24 Aug 2007 16:16:44 -0000, jakub at gcc dot gnu dot org <gcc-bugzilla@gcc.gnu.org> wrote: > > > ------- Comment #22 from jakub at gcc dot gnu dot org 2007-08-24 16:16 ------- > Not sure how, could I pass the buck to you then? > > My thought was just that may_alias_p could for STRUCT_FIELD_TAG call > ipa_type_escape_field_does_not_clobber (TREE_TYPE (SFT_PARENT_VAR (var)), > TREE_TYPE (var)) (i.e. don't care about PTRs type, just say if that field's > address wasn't ever taken that it can't alias. That would be similar to how > alias.c uses this function (again, doesn't use the pointer type at all). > This would work too. I will take care of it in about a week if you don't get their first. > Anyway, I found that current GCC 4.2 miscompiles following modified testcase > (works with 4.1 and the trunk, though neither 4.1 nor 4.3 actually optimize bar > function as they could). 4.2 optimizes bar and misoptimizes baz, by assuming > *x = 4; will not clobber s.b. For 4.2, the safest thing to do is just remove the call from may_alias_p. I serously doubt this will have any real effect on performance.
Removing that call in 4.1/4.2 would indeed cure c#8 testcase. c#22 would be still broken (ipa_type_escape_field_does_not_clobber_p isn't called at all on that testcase, neither in 4.1, nor 4.2, nor 4.3. Guess c#22 testcase should go as a separate bug.
It's not clear to me what's going on in this PR. At one point, Jakub seems to be saying that 4.2 does a better job than 4.1, which would suggest that this is just a 4.1.x PR? Can we split this into one PR for 4.1-only issues, and another one or two, for 4.2 and/or 4.3 issues?
This is just one bug, present in GCC 4.1 and onwards, no need to have several bug ids. tree-ssa-alias.c just uses ipa_type_escape_field_does_not_clobber_p incorrectly, it asks an unrelated question and based on the answer decides if things can or can't alias. There are testcases where this bug exhibits only in some gcc versions (e.g. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=33136#c7 ), because it depends on what exactly other optimization passes do, and other testcases, like #c8, which fail in all 4.1+ GCCs.
Created attachment 14191 [details] gcc43-pr33136.patch Updated patch which tries to handle SFTs as well. Unfortunately it causes a regression with gcc.dg/torture/pr26587.c at -O2 and above. That to me looks like an ipa-type-escape.c bug. In particular, may_alias_p is now called with a SFT whose type is ARRAY_TYPE. The first argument to ipa_type_escape_does_not_clobber_p is <record_type 0x2aaaaea4f1a0 type_0 BLK size <integer_cst 0x2aaaae941b40 type <integer_type 0x2aaaae9340d0 bit_size_type> constant invariant 576> unit size <integer_cst 0x2aaaaea4e150 type <integer_type 0x2aaaae934000 long unsigned int> constant invariant 72> align 32 symtab 0 alias set -1 canonical type 0x2aaaaea4f1a0 fields <field_decl 0x2aaaae9f10b0 P type <array_type 0x2aaaaea4f0d0 BF_key type <integer_type 0x2aaaaea39a90 BF_word> BLK size <integer_cst 0x2aaaae941b40 576> unit size <integer_cst 0x2aaaaea4e150 72> align 32 symtab 0 alias set -1 canonical type 0x2aaaaea39ea0 domain <integer_type 0x2aaaaea39b60> pointer_to_this <pointer_type 0x2aaaaea4f410>> BLK file pr26587.c line 8 size <integer_cst 0x2aaaae941b40 576> unit size <integer_cst 0x2aaaaea4e150 72> align 32 offset_align 128 offset <integer_cst 0x2aaaae925750 constant invariant 0> bit offset <integer_cst 0x2aaaae9413f0 constant invariant 0> context <record_type 0x2aaaaea4f1a0>> context <translation_unit_decl 0x2aaaaea4f750 D.1521> pointer_to_this <pointer_type 0x2aaaaea4f820> chain <type_decl 0x2aaaaea4f270 D.1500>> and the second is <array_type 0x2aaaaea4f0d0 BF_key type <integer_type 0x2aaaaea39a90 BF_word sizes-gimplified public unsigned SI size <integer_cst 0x2aaaae925ab0 constant invariant 32> unit size <integer_cst 0x2aaaae925720 constant invariant 4> align 32 symtab 0 alias set -1 canonical type 0x2aaaae934680 precision 32 min <integer_cst 0x2aaaae925ae0 0> max <integer_cst 0x2aaaae925a80 4294967295> pointer_to_this <pointer_type 0x2aaaaea39d00>> BLK size <integer_cst 0x2aaaae941b40 type <integer_type 0x2aaaae9340d0 bit_size_type> constant invariant 576> unit size <integer_cst 0x2aaaaea4e150 type <integer_type 0x2aaaae934000 long unsigned int> constant invariant 72> align 32 symtab 0 alias set -1 canonical type 0x2aaaaea39ea0 domain <integer_type 0x2aaaaea39b60 type <integer_type 0x2aaaae934000 long unsigned int public unsigned sizetype DI size <integer_cst 0x2aaaae925ba0 constant invariant 64> unit size <integer_cst 0x2aaaae925bd0 constant invariant 8> align 64 symtab 0 alias set -1 canonical type 0x2aaaae942270 precision 64 min <integer_cst 0x2aaaae925c00 0> max <integer_cst 0x2aaaae9414e0 -1>> public DI size <integer_cst 0x2aaaae925ba0 64> unit size <integer_cst 0x2aaaae925bd0 8> align 64 symtab 0 alias set -1 canonical type 0x2aaaaea39b60 precision 64 min <integer_cst 0x2aaaae925750 0> max <integer_cst 0x2aaaaea4e090 17>> pointer_to_this <pointer_type 0x2aaaaea4f410>> ipa_type_escape_field_does_not_clobber_p calls get_canon_type_uid (, true, true) on this, which returns TYPE_UID of an unsigned int. But if I check what actually mark_interesting_addressof marked, it is TYPE_UID of a different ARRAY_TYPE - <array_type 0x2aaaaea39c30 type <integer_type 0x2aaaaea39a90 BF_word sizes-gimplified public unsigned SI size <integer_cst 0x2aaaae925ab0 constant invariant 32> unit size <integer_cst 0x2aaaae925720 constant invariant 4> align 32 symtab 0 alias set -1 canonical type 0x2aaaae934680 precision 32 min <integer_cst 0x2aaaae925ae0 0> max <integer_cst 0x2aaaae925a80 4294967295> pointer_to_this <pointer_type 0x2aaaaea39d00>> BLK size <integer_cst 0x2aaaae941b40 type <integer_type 0x2aaaae9340d0 bit_size_type> constant invariant 576> unit size <integer_cst 0x2aaaaea4e150 type <integer_type 0x2aaaae934000 long unsigned int> constant invariant 72> align 32 symtab 0 alias set 2 canonical type 0x2aaaaea39ea0 domain <integer_type 0x2aaaaea39b60 type <integer_type 0x2aaaae934000 long unsigned int public unsigned sizetype DI size <integer_cst 0x2aaaae925ba0 constant invariant 64> unit size <integer_cst 0x2aaaae925bd0 constant invariant 8> align 64 symtab 0 alias set -1 canonical type 0x2aaaae942270 precision 64 min <integer_cst 0x2aaaae925c00 0> max <integer_cst 0x2aaaae9414e0 -1>> public DI size <integer_cst 0x2aaaae925ba0 64> unit size <integer_cst 0x2aaaae925bd0 8> align 64 symtab 0 alias set -1 canonical type 0x2aaaaea39b60 precision 64 min <integer_cst 0x2aaaae925750 0> max <integer_cst 0x2aaaaea4e090 17>>> so very different uid, and nothing actually ensures that if this type has been marked that e.g. what it sees through has been marked as well.
Created attachment 14193 [details] gcc43-pr33136.patch Added further testcases.
The new testcases show further problems. I believe that for ARRAY_TYPE fields we need to mark the type of its fields (recursively down), using close_type_seen like we already do for records and unions. On the other side, get_canon_type_uid in ipa_type_escape_field_does_not_clobber_p going through POINTER_TYPEs to what they point to seems wrong. This is what I have in pr33136-4.c - I can't understand why the fact that something took address of an int field has any influence on int * fields - if nothing took address of an int * field, then how could it alias with what int ** pointer points to? Could anyone with access to SPEC see what is the difference between -O2 vs. -O2 -fno-ipa-type-escape (or -O3 vs. -O3 -fno-ipa-type-escape) on 4.1 and/or 4.2 branches? For those I think the safest fix is just disable this optimization altogether, in many cases it gives wrong answers. 4.1 miscompiles at -O2 just 20070824-1.c and pr33136-5.c (and the bugs go away with -O2 -fno-ipa-type-escape), while gcc 4.2 is much worse, all of 20070824-1.c and pr33136-{1,3,5}.c are miscompiled at -O2, and all the pr33136-{1,3,5}.c are even miscompiled with -O2 -fno-ipa-type-escape -fno-strict-aliasing. Mark, can this be a P1?
Subject: Bug 33136 Author: jakub Date: Mon Oct 15 18:29:54 2007 New Revision: 129366 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=129366 Log: PR tree-optimization/33136 * opts.c (decode_options): Don't enable flag_ipa_type_escape. * gcc.c-torture/execute/20070824-1.c: New test. * gcc.dg/pr33136-1.c: New test. * gcc.dg/pr33136-2.c: New test. * gcc.dg/pr33136-3.c: New test. Added: trunk/gcc/testsuite/gcc.c-torture/execute/20070824-1.c trunk/gcc/testsuite/gcc.dg/pr33136-1.c trunk/gcc/testsuite/gcc.dg/pr33136-2.c trunk/gcc/testsuite/gcc.dg/pr33136-3.c Modified: trunk/gcc/ChangeLog trunk/gcc/opts.c trunk/gcc/testsuite/ChangeLog
Fixed so far on the trunk.
Closing 4.1 branch.
Closing 4.2 branch, fixed in 4.3.