Created attachment 24427 [details] Testcase (reduced automatically using multidelta) GCC is apparently producing the wrong code with Eigen2 (a template-based linear algebra library) with optimization levels -O3 and -O2 for x86_64-unknown-linux-gnu targets. A reduced test case is provided that reproduces the error. As I understand, the core of the problem is this loop (line 1132 of the submitted test case): for (; (ProcessFirstHalf ? i && i.index () < j : i); ++i) { if (LhsIsSelfAdjoint) { int a = LhsIsRowMajor ? j : i.index (); int b = LhsIsRowMajor ? i.index () : j; Scalar v = i.value (); derived ().row (b) += ei_conj (v) * product.rhs ().row (a); } } which is being translated into: movq -8(%rsp), %rsi movq (%rsi), %rbp addq %rdx, %rbp movsd 0(%rbp), %xmm1 # <- %xmm1 is initialized here and .L5: # no longer touched! leaq 0(,%rcx,8), %rsi leaq 4(%r8,%rcx,4), %r8 movl %r9d, %ecx jmp .L8 .L13: # <-Loop here!!! movl (%r8), %r10d addq $4, %r8 .L8: movsd 0(%r13,%rsi), %xmm0 movslq %r10d, %r10 addl $1, %ecx mulsd (%r12,%r10,8), %xmm0 cmpl %r11d, %ecx addsd %xmm1, %xmm0 movsd %xmm0, 0(%rbp) # <- % shouldn't %xmm1 be updated here? je .L3 addq $8, %rsi cmpl %ecx, %r9d jle .L13 # <- Loop ends the sum operation on line derived ().row (b) += ei_conj (v) * product.rhs ().row (a); is apparently being performed by the instruction addsd %xmm1, %xmm0 but the value of %xmm1 isn't being updated inside the loop!! Apparently the compiler is presuming derived ().row (b) is constant inside the loop, which is evidently *not* true. Since the value of %xmm1 is never updated, the value of derived ().row (b) at the end of the loop is equal to the last ei_conj (v) * product.rhs ().row (a) result. The bug was verified on gcc versions 4.5.2 and 4.6.0 with -O2 and -O3 switches. The compiler produces the correct code with -O0 and -O switches. It is *NOT* present on the 4.4 branch (that is, 4.4 compiles the code correctly) for -O0, -0, -02 and -O3 switches. I suppose it is a regression of the 4.5 branch. Command line (for gcc 4.6.0): /opt/gnu/gcc-4.6/bin/g++ -v -save-temps -nostdinc -O3 testcase.a.cpp Compiler output: Using built-in specs. COLLECT_GCC=/opt/gnu/gcc-4.6/bin/g++ COLLECT_LTO_WRAPPER=/opt/gnu/gcc-4.6/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/lto-wrapper Target: x86_64-unknown-linux-gnu Configured with: ../configure : (reconfigured) ../configure : (reconfigured) ../configure Thread model: posix gcc version 4.6.0 (GCC) COLLECT_GCC_OPTIONS='-v' '-save-temps' '-nostdinc' '-O3' '-shared-libgcc' '-mtune=generic' '-march=x86-64' /opt/gnu/gcc-4.6/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/cc1plus -E -quiet -nostdinc -v -iprefix /opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/ -D_GNU_SOURCE testcase.a.cpp -mtune=generic -march=x86-64 -O3 -fpch-preprocess -o testcase.a.ii #include "..." search starts here: #include <...> search starts here: End of search list. COLLECT_GCC_OPTIONS='-v' '-save-temps' '-nostdinc' '-O3' '-shared-libgcc' '-mtune=generic' '-march=x86-64' /opt/gnu/gcc-4.6/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/cc1plus -fpreprocessed testcase.a.ii -quiet -dumpbase testcase.a.cpp -mtune=generic -march=x86-64 -auxbase testcase.a -O3 -version -o testcase.a.s GNU C++ (GCC) version 4.6.0 (x86_64-unknown-linux-gnu) compiled by GNU C version 4.6.0, GMP version 4.3.2, MPFR version 3.0.0-p8, MPC version 0.9 GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 GNU C++ (GCC) version 4.6.0 (x86_64-unknown-linux-gnu) compiled by GNU C version 4.6.0, GMP version 4.3.2, MPFR version 3.0.0-p8, MPC version 0.9 GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 Compiler executable checksum: 3f3899c46d47b31a2bc0cb7f3d1408a6 COLLECT_GCC_OPTIONS='-v' '-save-temps' '-nostdinc' '-O3' '-shared-libgcc' '-mtune=generic' '-march=x86-64' as --64 -o testcase.a.o testcase.a.s COMPILER_PATH=/opt/gnu/gcc-4.6/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/:/opt/gnu/gcc-4.6/bin/../libexec/gcc/ LIBRARY_PATH=/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/:/opt/gnu/gcc-4.6/bin/../lib/gcc/:/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/usr/lib/nvidia-current/:/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../../:/lib/:/usr/lib/ COLLECT_GCC_OPTIONS='-v' '-save-temps' '-nostdinc' '-O3' '-shared-libgcc' '-mtune=generic' '-march=x86-64' /opt/gnu/gcc-4.6/bin/../libexec/gcc/x86_64-unknown-linux-gnu/4.6.0/collect2 --eh-frame-hdr -m elf_x86_64 -dynamic-linker /lib64/ld-linux-x86-64.so.2 /usr/lib/../lib64/crt1.o /usr/lib/../lib64/crti.o /opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/crtbegin.o -L/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0 -L/opt/gnu/gcc-4.6/bin/../lib/gcc -L/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../../../lib64 -L/lib/../lib64 -L/usr/lib/../lib64 -L/usr/lib/nvidia-current -L/opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/../../.. testcase.a.o -lstdc++ -lm -lgcc_s -lgcc -lc -lgcc_s -lgcc /opt/gnu/gcc-4.6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.6.0/crtend.o /usr/lib/../lib64/crtn.o A test case was produced with the preprocessed output (generated from Eigen version 2.0.15) and automatically reduced using the multidelta tool. Testcase included.
It is caused by revision 155360: http://gcc.gnu.org/ml/gcc-cvs/2009-12/msg00504.html
Passes with -fno-strict-aliasing. Presumably the alias-improvement branch merge is the real issue. Not sure if it isn't an issue with the testcase which has inline const Derived & derived () const { return *static_cast < const Derived *>(this); } inline Derived & derived () { return *static_cast < Derived * >(this); } I wasn't able to follow the template instantiation quickly ... ;) I will have a look.
Indeed PRE thinks that it is loop invariant, presumambly because of TBAA: D.7681_206 = MEM[(const double &)D.7674_199]; D.7680_208 = D.7425_99 * D.7681_206; D.7663_216 = MEM[(const double &)SR.167_105]; D.7664_217 = D.7680_208 + D.7663_216; MEM[(Scalar &)SR.167_105] = D.7664_217; and MEM[(const double &)SR.167_105] is sought to be loop-invariant, not aliasing MEM[(Scalar &)SR.167_105]. We do not see the must-alias because SR.167_105 is value-numbered to some other SSA name and value_dies_in_block_x (after fixing another bug in that function) does not do value-replacement on the stmt it queries stmt_may_clobber_ref_p_1. Then the oracle figures if (!ptr_derefs_may_alias_p (ptr1, ptr2)) return false; and both vars point to restrict tags (but different restrict tags). So this looks to me related to PR48764. Of course this case is somewhat different in that we basically have struct { double * restrict x; } m; double * restrict wrap () { return m.x; } double x; m.x = &x; for (;;) wrap (&x) = wrap (&x) + 1; and GCC thinks that the two restrict qualified pointers point to different things. See the restrict qualified m_data pointer in MapBase. As you are creating a new object via forceAligned -> force_aligned_impl.run -> _convertToForceAligned you effectively are loading from two different restrict qualified pointers. I'll fixup the bug I noticed in value_dies_in_block_x, but it won't fix this PR.
# PT = nonlocal escaped D.7613_101 = MEM[(struct ei_matrix_storage *)this_30(D)]; # PT = nonlocal escaped D.7616_104 = D.7613_101 + D.7582_90; # PT = nonlocal escaped { D.7934 } (restr) SR.167_105 = (const Scalar * restrict) D.7616_104; # PT = nonlocal escaped D.7671_127 = MEM[(struct ei_matrix_storage *)this_30(D)]; D.7672_129 = D.7554_68 * 8; # PT = nonlocal escaped D.7674_130 = D.7671_127 + D.7672_129; # PT = nonlocal escaped { D.7936 } (restr) SR.169_131 = (const Scalar * restrict) D.7674_130; so while the fix for PR48764 makes this_30 point to a single tag that would make derived pointers derived it doesn't seem to work for this case. I have a variant that does.
Looking at this (don't see how this is related to the other PR), I think the bug is either that we have these TYPE_RESTRICT casts (caused by storing a non-restricted pointer into a restricted field) in the IL at all, or that we are propagating the restrict tags through the restricted field which we have decided not to have a restrict tag for. If it is a field that can have a restrict tag safely attached to it (restrict field in a structure argument, or DECL_BY_REFERENCE/restrict REFERENCE_TYPE argument pointed structure, or global variable field, or perhaps automatic variable without address taken, the casts for store to it should either not have (restr) at all, or at least should use the same restrict tag as the field. But if the field can't have safely a restrict tag, if we make it up because of the casts for store to it, different stores might be with different tags and if two different pointers alias and PTA doesn't figure it out, we have expressions based on the same restricted field with different restrict tags and miscompile. Now the question is how to safely find out what is a cast for store from other casts. TYPE_RESTRICT casts from non-restrict pointers feeding one stmt which stores it into restricted field? Or should we just say such casts shouldn't be in the IL? Or should only TYPE_RESTRICT casts to SSA_NAMEs of non-artificial vars be considered? In the testcase, all the __restrict casts are casts for store into some field that can't have safely a restrict tag attached: # PT = nonlocal escaped { D.7761 } (restr) data.11D.7427_92 = (const ScalarD.4717 * restrict) D.7426_91; # .MEMD.7374_151 = VDEF <.MEMD.7374_52> MEM[(struct MapBaseD.4571 *)&D.4884].m_dataD.4766 = data.11D.7427_92;
Short testcase that is miscompiled: struct S { int a; int *__restrict p; }; struct S *bar (struct S *); int foo (int *p, int *q, int z) { struct S s, *t; s.a = 1; s.p = p; t = bar (&s); t->p = q; s.p[0] = 0; t->p[0] = 1; return s.p[0]; } If bar returns the parameter it has been passed, s and *t are the same object, so it must return 1 instead of 0 that 4.6 as well as trunk return.
BTW, the extra problematic casts aren't coming from the frontend, it is the gimplifier that is inserting them: /* Insert pointer conversions required by the middle-end that are not required by the frontend. This fixes middle-end type checking for for example gcc.dg/redecl-6.c. */ if (POINTER_TYPE_P (TREE_TYPE (*to_p))) { STRIP_USELESS_TYPE_CONVERSION (*from_p); if (!useless_type_conversion_p (TREE_TYPE (*to_p), TREE_TYPE (*from_p))) *from_p = fold_convert_loc (loc, TREE_TYPE (*to_p), *from_p); }
Created attachment 25419 [details] patch An incomplete patch to avoid those casts from the IL. We'd need to adjust SRA to add casts if a restrict field is scalarized and a non-restrict pointer is stored into it. It fixes this short testcase though. It is not enough though, because FRE/PRE can see through the field store and will create undesirable restrict casts: t_3->p = q_4(D); ... D.2099_6 = t_3->p; becomes t_3->p = q_4(D); ... D.2099_6 = (int * restrict) q_4(D); during fre1 (in this case it doesn't matter, as for s.p FRE can't see through the pointer and thus it doesn't have a restrict tag). So, either FRE/PRE (and whatever else) would need to give up on seeing through in these non-restrict -> restrict field cases, or it would be better instead of avoiding these casts make non-restrict -> restrict casts from user code explicit (e.g. some different kind of gimple unary assignment), treat them more carefully during optimizations (yes, it would prevent some optimizations, but hopefully restrict is used only where it actually can give more improvements from successful disambiguation) and only handle those as CAST_RESTRICT and not random other !TYPE_RESTRICT -> TYPE_RESTRICT conversions in the IL which are there just for IL consistency (we could even make such non-special !restrict -> restrict conversions useless).
On Wed, 5 Oct 2011, jakub at gcc dot gnu.org wrote: > http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49279 > > --- Comment #8 from Jakub Jelinek <jakub at gcc dot gnu.org> 2011-10-05 09:04:59 UTC --- > Created attachment 25419 [details] > --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=25419 > patch > > An incomplete patch to avoid those casts from the IL. > We'd need to adjust SRA to add casts if a restrict field is scalarized and a > non-restrict pointer is stored into it. It fixes this short testcase though. > It is not enough though, because FRE/PRE can see through the field store > and will create undesirable restrict casts: > t_3->p = q_4(D); > ... > D.2099_6 = t_3->p; > becomes > t_3->p = q_4(D); > ... > D.2099_6 = (int * restrict) q_4(D); > during fre1 (in this case it doesn't matter, as for s.p FRE can't see through > the pointer and thus it doesn't have a restrict tag). So, either FRE/PRE (and > whatever else) would need to give up on seeing through in these non-restrict -> > restrict field cases, or it would be better instead of avoiding these casts > make non-restrict -> restrict casts from user code explicit (e.g. some > different kind of gimple unary assignment), treat them more carefully during > optimizations (yes, it would prevent some optimizations, but hopefully restrict > is used only where it actually can give more improvements from successful > disambiguation) and only handle those as CAST_RESTRICT and not random other > !TYPE_RESTRICT -> TYPE_RESTRICT conversions in the IL which are there just for > IL consistency > (we could even make such non-special !restrict -> restrict conversions > useless). I think rather than trying to remove those casts from the IL we should think if preserving restrict optimizations for inlining is really important (which is what the cast handling is for). The cast handling, compared to the existing handling of arguments and global vars, is really sensitive to optimizations to be 'correct' (see the other PR where we inline a restrict argument function twice and assign different restrict tags during PTA to the two copies and do wrong cross inlined-body optimizations). So, for correctness we should consider removing the cast handling. (My proposed fix for the other PR sofar was to assign a fake tag to every pointer so propagation can see the two pointers (with different restrict tag) are still based on each other). Richard.
After talking about this for some time another idea came up. Basically, assign the restrict tags for parameters at gimplification time by gimplifying foo (int * restrict p) { to foo (int * restrict p) { p = RESTRICT <p, tag>; with tag coming from allocate_decl_uid (). We would use these copies as restrict tag sources using the specified UID. Thus every inline copy (and clone) would share them. We could even expose this as __builtin_restrict (p, tag) (with a mapping from user tags to uids) much similar to __builtin_assume_aligned. The fortran frontend could use this instead of restrict qualification for the descriptor->data loads.
Created attachment 25423 [details] CAST_RESTRICT removal Attaching a test patch that just removed CAST_RESTRICT altogether, plus IRC discussion that lead to it. The only testsuite regressions are Wobjsize-1.c and strlenopt-4gf.c which show an important security related problem - we probably shouldn't be folding builtins if DECL_INITIAL (fndecl) != NULL && DECL_DECLARED_INLINE_P (fndecl) && cfun && !cfun->after_inlining, because then we happily fold e.g. char buf[2]; strcpy (buf, "abcd"); into __builtin_memcpy even when strcpy is always_inline inline wrapper that calls __builtin___strcpy_chk and would complain about the buffer overflow resp. add runtime checking. jakub> IMHO something like RESTRICT_CAST_EXPR gimple stmts or handling as CAST_RESTRICT only non-restrict -> restrict assignments to non-artificial vars would be some form of smaller CAST_RESTRICT removal jakub> with both, even if we forward propagate or CSE the rhs, we wouldn't create the undesirable restrict tags richi> yeah richi> it would also remove that awkward single non-useless pointer cast jakub> while still restrict could be (somewhat) supported for inlined routines and even for cases where one writes { int *__restrict p = something; *p ... in the code richi> of course we'd have to handle CAST_RESTRICT everywhere ... richi> but currently it's very fragile jakub> yes, it would be some work richi> maybe try removing the cast handling first? richi> and see how many people complain? richi> it would be the solution for release branches anyway I guess jakub> would be nice to see SPEC numbers with that removed first richi> not a single restrict in SPEC ... jakub> really? Even in 2k6? richi> really. richi> test coverage is very bad for restrict :/ richi> we get some via fortran of course richi> but those are for globals and params only jakub> but with inlined fortran routines you get the restrict casts, right? richi> but you get them like the case you want to remove - descriptor->data = (restrict *)x; richi> but yes jakub> yeah, those are definitely undesirable richi> though we either get a decl in the caller (even better) or another restrict param passed throuhg in fortran jakub> anyway, if you want, I can try the RESTRICT_CAST_EXPR (or CAST_RESTRICT_EXPR, better suggestion) approach richi> so for fortran I believe we wouldn't lose anything richi> I suppose only the frontends would insert those? How would those be different in the case of inlining two copies? jakub> I'd think the FEs as well as the inliner would insert those jakub> not sure if I understood well the inlining two copies problem though jakub> you have inline with __restrict parameter(s), inline it twice richi> restrict pointers on arguments point to { restrict-tag NONLOCAL }, the disambiguator ignores NONLOCAL, but that's a problem if two restrict pointers are based on one that only points to NONLOCAL (like a non-restrict parameter from the caller) richi> foo (int *p) { bar(p); bar(p); } bar (int * restrict p) { *p = 1; } would disambiguate the two stores to p richi> that's the issue in the other PR richi> needs quite some obfuscation to trigger of course ;) jakub> and the problem was during scheduling? jakub> or something that intermixed the bodies? richi> yes, but I can imagine that you could trigger bogus CSE as well richi> the alias info is clearly wrong richi> my idea was to improve the based-on tracking by giving each pointer a distinct pointed-to tag richi> instead of just NONLOCAL if we don't know richi> of course that will increase the points-to set sizes even more richi> so I don't like it too much jakub> not sure I understand how that would help richi> ISTR that issue is also why we require two restrict-tag pointers and don't disambiguate int * restrict p and int *q richi> you'd get { TAG1 NONLOCAL ptr } { TAG2 NONLOCAL ptr } and thus no disambiguation richi> see the bitmap_intersect_p check in pt_solutions_same_restrict_base jakub> on the other side, restrict in C99 is very tight to the scope in which the cast occurs jakub> so perhaps we could disambiguate through the explicit cast restricts only if in the right BLOCK? richi> ick richi> what's the "right block"? richi> also that doesn't sound like a viable middle-end concept jakub> not sure if it would be enough to just consider TREE_BLOCK of the RESTRICT_CAST_EXPR stmts richi> we don't implement C99 restrict in the middle-end, we implement something that should capture it jakub> "right block" in the sense the same block or one block is subblock of the other (transitively) richi> you'd have to remember which RESTRICT_CAST_EXPR feeds a memory access richi> at least PTA propagation won't be able to avoid leaking the restrict tags out of the blocks jakub> so, if we have baz (int *__restrict p, int *__restrict q) { *p = 1; *q = 1; return *p; } inline, both the cast restricts would be the same block, so can be used to disambiguate richi> but at disambiguation we don't see the casts but two memory accesses jakub> if you inline it twice, we wouldn't disambiguate p' and p'' based accesses jakub> if we can from the CAST_RESTRICT artificial heap var get to the TREE_BLOCK to which it is attached, we could use that... richi> the heap var doesn't exist - it's just a UID ... jakub> perhaps as proof of concept we could start with only testing TREE_BLOCKs for equality, not walking to the parents jakub> but we can somewhere on the side map that UID to TREE_BLOCK, can't we? richi> and we can't go from random UIDs to DECLs (and they don't exist) richi> huh, sure we could richi> it sounds awkward though, tailored to stupid C99 richi> btw, you don't know which bit is the restrict tag, so you'd need to iterate over all and check them all jakub> bitmap which uids are cast restrict tags? richi> if you have p1 { R1 R2 } p2 { R3 R4 } is it enough to have one pair that has matching blocks, say R1 and R4? richi> or do all possible pairs to be useful for disambiguation? jakub> not sure, when exactly you get two or more restrict tags in points-to set? x = y ? p1 : p2 ? richi> OTOH, the simpler patch that just removes the offending four lines that handle restrict casts sounds very very much more appealing to me ;) richi> from PHIs for example, or from context-insensitive parts (when we go through memory or calls) jakub> I can try to bootstrap that, wonder what the fallout in the testsuite would be richi> nothing we can't XFAIL jakub> sure, but the question is if we don't pessimize real-world code that way too much richi> at least for fixing the two bugs on release branches that sounds best richi> well, it opens up optimizing restrict vs. non-restrict qualified pointers jakub> probably jakub> btw, do we create restrict tags for fields of automatic aggregates which we can name? richi> yes, I think so jakub> without CAST_RESTRICT, I guess in my short testcase it would be fine if s.p was given a restrict tag (but apparently it doesn't get it) richi> let me check jakub> even if we take its address, just going through the pointer (t->p) shouldn't have restrict cast and thus not miscompile it richi> ah no, we don't initialize them richi> they don't point to anything initially richi> so we assume we can compute precise points-to sets for them jakub> probably not even with s = { 1, p }; in the source richi> right, because we gimplify that richi> if you make them local static they get restrict tags jakub> well, local statics are like global jakub> they live forever richi> right jakub> though, with the automatic vars after scheduling we might hit the same issue as with cast restrict - if code from different blocks overlaps
(In reply to comment #10) > with tag coming from allocate_decl_uid (). We would use these copies > as restrict tag sources using the specified UID. Thus every inline > copy (and clone) would share them. Assuming we don't CSE over it (in particular, the LHS of such a builtin with some non-restricted later pointers not based on it), perhaps it could work. I wouldn't expose it to users, because how would users ensure uniqueness of the tag over the whole CU? __COUNTER__ or something similar? That's going to lead to bugs... The FEs should probably add that for user __restrict variable initializers too.
(In reply to comment #12) > (In reply to comment #10) > > with tag coming from allocate_decl_uid (). We would use these copies > > as restrict tag sources using the specified UID. Thus every inline > > copy (and clone) would share them. > > Assuming we don't CSE over it (in particular, the LHS of such a builtin with > some non-restricted later pointers not based on it), perhaps it could work. > I wouldn't expose it to users, because how would users ensure uniqueness of the > tag over the whole CU? __COUNTER__ or something similar? That's going to lead > to bugs... Yeah, I suppose it's not trivial. OTOH programs have to sort out global symbol space as well, so ... > The FEs should probably add that for user __restrict variable initializers too. Yes. One slight complication with the proposal arises when the restrict qualified parameter is address-taken, you'd get foo (int * restrict p) { tem = p; tem = RESTRICT <tem, tag>; p = tem; bar (&p); } which would work for the purpose of making loads from p restrict, but the load and the store above would not be identifiable as artificial (as opposed to the RESTRICT statement), and thus may persist in the final code. Not sure if we need to worry about this case though (similar issue with address-taken pointers and __builtin_assume_aligned). I'd just add the RESTRICT casts for non-address-taken pointers.
(In reply to comment #11) > Created attachment 25423 [details] > CAST_RESTRICT removal > > Attaching a test patch that just removed CAST_RESTRICT altogether, plus IRC > discussion that lead to it. The only testsuite regressions are Wobjsize-1.c > and strlenopt-4gf.c which show an important security related problem - we > probably shouldn't be folding builtins if DECL_INITIAL (fndecl) != NULL && > DECL_DECLARED_INLINE_P (fndecl) && cfun && !cfun->after_inlining, > because then we happily fold e.g. char buf[2]; strcpy (buf, "abcd"); into > __builtin_memcpy even when strcpy is always_inline inline wrapper that calls > __builtin___strcpy_chk and would complain about the buffer overflow resp. add > runtime checking. The patch looks ok to me once we solved the folding issue (we probably have to backport that as well then).
Author: jakub Date: Thu Oct 6 16:38:29 2011 New Revision: 179620 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=179620 Log: PR tree-optimization/49279 * tree-ssa-structalias.c (find_func_aliases): Don't handle CAST_RESTRICT. * tree-ssa-forwprop.c (forward_propagate_addr_expr_1): Allow restrict propagation. * tree-ssa.c (useless_type_conversion_p): Don't return false if TYPE_RESTRICT differs. * gcc.dg/tree-ssa/restrict-4.c: XFAIL. * gcc.c-torture/execute/pr49279.c: New test. Added: trunk/gcc/testsuite/gcc.c-torture/execute/pr49279.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/tree-ssa/restrict-4.c trunk/gcc/tree-ssa-forwprop.c trunk/gcc/tree-ssa-structalias.c trunk/gcc/tree-ssa.c
Author: jakub Date: Thu Oct 6 19:56:32 2011 New Revision: 179633 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=179633 Log: PR tree-optimization/49279 * tree-ssa-structalias.c (find_func_aliases): Don't handle CAST_RESTRICT. * gcc.c-torture/execute/pr49279.c: New test. Added: branches/gcc-4_6-branch/gcc/testsuite/gcc.c-torture/execute/pr49279.c Modified: branches/gcc-4_6-branch/gcc/ChangeLog branches/gcc-4_6-branch/gcc/testsuite/ChangeLog branches/gcc-4_6-branch/gcc/tree-ssa-structalias.c
Sofar fixed on trunk and the 4.6 branch.
I'm working on the ADD_RESTRICT idea implementation now.
*** Bug 48764 has been marked as a duplicate of this bug. ***
Author: rguenth Date: Tue Jan 3 13:54:44 2012 New Revision: 182846 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182846 Log: 2012-01-03 Richard Guenther <rguenther@suse.de> Backport from mainline 2011-10-06 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/49279 * tree-ssa-structalias.c (find_func_aliases): Don't handle CAST_RESTRICT. 2011-10-06 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/49279 * gcc.c-torture/execute/pr49279.c: New test. Added: branches/gcc-4_5-branch/gcc/testsuite/gcc.c-torture/execute/pr49279.c Modified: branches/gcc-4_5-branch/gcc/ChangeLog branches/gcc-4_5-branch/gcc/testsuite/ChangeLog branches/gcc-4_5-branch/gcc/tree-ssa-structalias.c
Fixed.