Created attachment 26797 [details] minimized test case The cond_store_replacement() optimization can move a memory access outside of a conditional statement that checks whether it is safe to access the memory. This can cause the program to segfault. I've attached a simplified test case that reproduces the problem. It uses mprotect to ensure the following byte is inaccessible. In practice we see segfaults simply because malloc sometimes returns a buffer at the end of a readable region. The conditional store replacement moves the memory access outside of the length check. I've verified the problem occurs in 4.6.1, 4.6.2, and the gcc-4.6.2-20120210 and gcc-4.7-20120225 snapshots. The problem doesn't occur with 4.4.5, as it doesn't perform conditional store replacement in this case.
Confirmed, I don't see why cselim is marking buf[1] as nothrow.
Started with MEM_REF merge: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=161655
I'd say the problem is that add_or_mark_expr hasn't been properly adjusted for MEM_REFs, it ignores the offset of the MEM_REFs altogether. Before MEM_REF, this function would just track INDIRECT_REFs dereferencing the same pointer SSA_NAME (i.e. all of them would be same offset, same size). Now it should take into account not just the offset, but also the access size.
Not to mention that in this exact case, even if was always non-trapping, I doubt it will ever be an optimization to "optimize" if (len_1(D) > 1) goto <bb 5>; else goto <bb 6>; <bb 5>: MEM[(char *)buf_2(D) + 1B] = 0; <bb 6>: return; into: if (len_1(D) > 1) goto <bb 6>; else goto <bb 5>; <bb 5>: cstore.2_7 = MEM[(char *)buf_2(D) + 1B]; <bb 6>: # cstore.2_9 = PHI <cstore.2_7(5), 0(4)> MEM[(char *)buf_2(D) + 1B] = cstore.2_9; return; because the latter we then expand into: jbe .L8 movb %al, 1(%rdi) ret ... .L8: movzbl 1(%rdi), %eax movb %al, 1(%rdi) ret So if the conditional bb contains just the potentionally cselim optimized store, perhaps we should punt. Plus for C++11/C11 memory model we probably need to disable cselim optimization altogether.
The quick fix that would IMHO brings us back to pre-161655 decisions would be just to store also the offset and size into the hash table entries and use them as a requirement for the lookups. I think that would essentially map to what we did before. A better, but more complicated, change, would be to only keep using ssa_name and store as lookup criteria as we do right now, add a linked list of offset/sizes and consider non-trapping stores if the [offset, offset+size) interval is subset of the non-trapping bytes. This would be able to optimize even the cases where say there are is a larger store (or several smaller stores) that cover the area. We might need to prune the chains in nt_fini_block though. Richard, do you think for 4.7.0/4.6.4 just implementing the simpler approach would be fine?
Created attachment 26800 [details] gcc47-pr52445.patch Untested fix.
Ok with + && TREE_CODE (TREE_OPERAND (exp, 0)) == SSA_NAME + && host_integerp (TREE_OPERAND (exp, 1), 0)) also checking that int_size_in_bytes does not return -1.
(In reply to comment #7) > Ok with > > + && TREE_CODE (TREE_OPERAND (exp, 0)) == SSA_NAME > + && host_integerp (TREE_OPERAND (exp, 1), 0)) > > also checking that int_size_in_bytes does not return -1. I'm doing that check later. The reason I wanted to avoid doing it in the first if, is that either it will mean int_size_in_bytes needs to be called twice, or we'd need (size = int_size_in_bytes (TREE_TYPE (exp))) > 0 (i.e. setting variables in the if condition). But if you prefer one of these, I'll adjust.
Created attachment 26801 [details] gcc47-pr52445.patch Adjusted patch.
Author: jakub Date: Thu Mar 1 14:13:06 2012 New Revision: 184743 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=184743 Log: PR tree-optimization/52445 * tree-ssa-phiopt.c (struct name_to_bb): Remove ssa_name field, add ssa_name_ver, offset and size fields and change store field to bool. (name_to_bb_hash, name_to_bb_eq): Adjust for the above changes. (add_or_mark_expr): Likewise. Only consider previous stores with the same size and offset. (nt_init_block): Only look at gimple_assign_single_p stmts, doesn't look at rhs2. * gcc.dg/pr52445.c: New test. Added: trunk/gcc/testsuite/gcc.dg/pr52445.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-phiopt.c
GCC 4.6.3 is being released.
Fixed on the trunk so far.
Could this be applied to gcc-4.6.4 please? A recently reported miscompilation of a device driver in the Linux/ARM kernel by gcc-4.6.3 was traced to this bug. Applying the trunk patch to 4.6.3 fixed that test case. FWIW, I've been using and testing this fix in my own 4.6-based branch since early March, on multiple targets, without regressions.
Author: jakub Date: Wed Apr 3 17:51:16 2013 New Revision: 197440 URL: http://gcc.gnu.org/viewcvs?rev=197440&root=gcc&view=rev Log: Backported from mainline 2012-03-01 Jakub Jelinek <jakub@redhat.com> PR tree-optimization/52445 * tree-ssa-phiopt.c (struct name_to_bb): Remove ssa_name field, add ssa_name_ver, offset and size fields and change store field to bool. (name_to_bb_hash, name_to_bb_eq): Adjust for the above changes. (add_or_mark_expr): Likewise. Only consider previous stores with the same size and offset. (nt_init_block): Only look at gimple_assign_single_p stmts, doesn't look at rhs2. * gcc.dg/pr52445.c: New test. Added: branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/pr52445.c Modified: branches/gcc-4_6-branch/gcc/ChangeLog branches/gcc-4_6-branch/gcc/testsuite/ChangeLog branches/gcc-4_6-branch/gcc/tree-ssa-phiopt.c