Bug 52445 - [4.6 Regression] conditional store replacement causes segfault in generated code
Summary: [4.6 Regression] conditional store replacement causes segfault in generated code
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.7.0
: P2 normal
Target Milestone: 4.6.4
Assignee: Jakub Jelinek
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2012-03-01 04:47 UTC by Adam Simpkins
Modified: 2013-04-03 18:15 UTC (History)
6 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.4.5
Known to fail: 4.6.0, 4.7.0
Last reconfirmed: 2012-03-01 00:00:00


Attachments
minimized test case (582 bytes, text/x-csrc)
2012-03-01 04:47 UTC, Adam Simpkins
Details
gcc47-pr52445.patch (1.74 KB, patch)
2012-03-01 10:37 UTC, Jakub Jelinek
Details | Diff
gcc47-pr52445.patch (1.75 KB, patch)
2012-03-01 12:31 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Adam Simpkins 2012-03-01 04:47:11 UTC
Created attachment 26797 [details]
minimized test case

The cond_store_replacement() optimization can move a memory access outside of a conditional statement that checks whether it is safe to access the memory.  This can cause the program to segfault.

I've attached a simplified test case that reproduces the problem.  It uses mprotect to ensure the following byte is inaccessible.  In practice we see segfaults simply because malloc sometimes returns a buffer at the end of a readable region.  The conditional store replacement moves the memory access outside of the length check.

I've verified the problem occurs in 4.6.1, 4.6.2, and the gcc-4.6.2-20120210 and gcc-4.7-20120225 snapshots.  The problem doesn't occur with 4.4.5, as it doesn't perform conditional store replacement in this case.
Comment 1 Andrew Pinski 2012-03-01 04:55:39 UTC
Confirmed, I don't see why cselim is marking buf[1] as nothrow.
Comment 2 Jakub Jelinek 2012-03-01 07:58:46 UTC
Started with MEM_REF merge: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=161655
Comment 3 Jakub Jelinek 2012-03-01 08:54:51 UTC
I'd say the problem is that add_or_mark_expr hasn't been properly adjusted for MEM_REFs, it ignores the offset of the MEM_REFs altogether.
Before MEM_REF, this function would just track INDIRECT_REFs dereferencing the same pointer SSA_NAME (i.e. all of them would be same offset, same size).
Now it should take into account not just the offset, but also the access size.
Comment 4 Jakub Jelinek 2012-03-01 09:01:59 UTC
Not to mention that in this exact case, even if was always non-trapping, I doubt it will ever be an optimization to "optimize"
  if (len_1(D) > 1)
    goto <bb 5>;
  else
    goto <bb 6>;

<bb 5>:
  MEM[(char *)buf_2(D) + 1B] = 0;

<bb 6>:
  return;
into:
  if (len_1(D) > 1)
    goto <bb 6>;
  else
    goto <bb 5>;

<bb 5>:
  cstore.2_7 = MEM[(char *)buf_2(D) + 1B];

<bb 6>:
  # cstore.2_9 = PHI <cstore.2_7(5), 0(4)>
  MEM[(char *)buf_2(D) + 1B] = cstore.2_9;
  return;

because the latter we then expand into:
        jbe     .L8
        movb    %al, 1(%rdi)
        ret
...
.L8:
        movzbl  1(%rdi), %eax
        movb    %al, 1(%rdi)
        ret

So if the conditional bb contains just the potentionally cselim optimized store, perhaps we should punt.

Plus for C++11/C11 memory model we probably need to disable cselim optimization altogether.
Comment 5 Jakub Jelinek 2012-03-01 09:14:42 UTC
The quick fix that would IMHO brings us back to pre-161655 decisions would be just to store also the offset and size into the hash table entries and use them as a requirement for the lookups.  I think that would essentially map to what we did before.  A better, but more complicated, change, would be to only keep using ssa_name and store as lookup criteria as we do right now, add a linked list of offset/sizes and consider non-trapping stores if the [offset, offset+size) interval is subset of the non-trapping bytes.  This would be able to optimize even the cases where say there are is a larger store (or several smaller stores) that cover the area.  We might need to prune the chains in nt_fini_block though.

Richard, do you think for 4.7.0/4.6.4 just implementing the simpler approach would be fine?
Comment 6 Jakub Jelinek 2012-03-01 10:37:18 UTC
Created attachment 26800 [details]
gcc47-pr52445.patch

Untested fix.
Comment 7 Richard Biener 2012-03-01 10:47:25 UTC
Ok with

+      && TREE_CODE (TREE_OPERAND (exp, 0)) == SSA_NAME
+      && host_integerp (TREE_OPERAND (exp, 1), 0))

also checking that int_size_in_bytes does not return -1.
Comment 8 Jakub Jelinek 2012-03-01 10:50:50 UTC
(In reply to comment #7)
> Ok with
> 
> +      && TREE_CODE (TREE_OPERAND (exp, 0)) == SSA_NAME
> +      && host_integerp (TREE_OPERAND (exp, 1), 0))
> 
> also checking that int_size_in_bytes does not return -1.

I'm doing that check later.  The reason I wanted to avoid doing it in the first if, is that either it will mean int_size_in_bytes needs to be called twice, or we'd need
  (size = int_size_in_bytes (TREE_TYPE (exp))) > 0
(i.e. setting variables in the if condition).  But if you prefer one of these, I'll adjust.
Comment 9 Jakub Jelinek 2012-03-01 12:31:53 UTC
Created attachment 26801 [details]
gcc47-pr52445.patch

Adjusted patch.
Comment 10 Jakub Jelinek 2012-03-01 14:13:13 UTC
Author: jakub
Date: Thu Mar  1 14:13:06 2012
New Revision: 184743

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=184743
Log:
	PR tree-optimization/52445
	* tree-ssa-phiopt.c (struct name_to_bb): Remove ssa_name field,
	add ssa_name_ver, offset and size fields and change store field
	to bool.
	(name_to_bb_hash, name_to_bb_eq): Adjust for the above changes.
	(add_or_mark_expr): Likewise.  Only consider previous stores
	with the same size and offset.
	(nt_init_block): Only look at gimple_assign_single_p stmts,
	doesn't look at rhs2.

	* gcc.dg/pr52445.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/pr52445.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree-ssa-phiopt.c
Comment 11 Jakub Jelinek 2012-03-01 14:38:57 UTC
GCC 4.6.3 is being released.
Comment 12 Jakub Jelinek 2012-03-01 14:42:13 UTC
Fixed on the trunk so far.
Comment 13 Mikael Pettersson 2012-09-11 14:21:09 UTC
Could this be applied to gcc-4.6.4 please?  A recently reported miscompilation of a device driver in the Linux/ARM kernel by gcc-4.6.3 was traced to this bug.  Applying the trunk patch to 4.6.3 fixed that test case.

FWIW, I've been using and testing this fix in my own 4.6-based branch since early March, on multiple targets, without regressions.
Comment 14 Jakub Jelinek 2013-04-03 18:15:53 UTC
Author: jakub
Date: Wed Apr  3 17:51:16 2013
New Revision: 197440

URL: http://gcc.gnu.org/viewcvs?rev=197440&root=gcc&view=rev
Log:
	Backported from mainline
	2012-03-01  Jakub Jelinek  <jakub@redhat.com>

	PR tree-optimization/52445
	* tree-ssa-phiopt.c (struct name_to_bb): Remove ssa_name field,
	add ssa_name_ver, offset and size fields and change store field
	to bool.
	(name_to_bb_hash, name_to_bb_eq): Adjust for the above changes.
	(add_or_mark_expr): Likewise.  Only consider previous stores
	with the same size and offset.
	(nt_init_block): Only look at gimple_assign_single_p stmts,
	doesn't look at rhs2.

	* gcc.dg/pr52445.c: New test.

Added:
    branches/gcc-4_6-branch/gcc/testsuite/gcc.dg/pr52445.c
Modified:
    branches/gcc-4_6-branch/gcc/ChangeLog
    branches/gcc-4_6-branch/gcc/testsuite/ChangeLog
    branches/gcc-4_6-branch/gcc/tree-ssa-phiopt.c