Summary: | Stores to bitfields introduce a store-data-race on adjacent data | ||
---|---|---|---|
Product: | gcc | Reporter: | Richard Biener <rguenth> |
Component: | target | Assignee: | Richard Biener <rguenth> |
Status: | RESOLVED FIXED | ||
Severity: | normal | CC: | bergner, ebotcazou, gcc-bugs, hp, kirill, matz, toolchain, vr5 |
Priority: | P3 | Keywords: | wrong-code |
Version: | 4.7.0 | ||
Target Milestone: | 4.8.0 | ||
Host: | Target: | ia64-*-linux, sparc64-*-linux, powerpc64-*-linux | |
Build: | Known to work: | 4.8.0 | |
Known to fail: | 4.3.3, 4.6.2, 4.7.0 | Last reconfirmed: | 2012-02-01 00:00:00 |
Bug Depends on: | 48124 | ||
Bug Blocks: | |||
Attachments: | candidate patch |
Description
Richard Biener
2012-02-01 10:06:55 UTC
SPARC64 is also affected. ;; ptr_1(D)->full = 1; (insn 6 5 7 (set (reg:DI 110) (mem/j:DI (plus:DI (reg/v/f:DI 109 [ ptr ]) (const_int 8 [0x8])) [0+8 S8 A64])) t.c:10 -1 (nil)) (insn 7 6 8 (set (reg:DI 112) (const_int 2147483648 [0x80000000])) t.c:10 -1 (nil)) (insn 8 7 9 (set (reg:DI 111) (ior:DI (reg:DI 110) (reg:DI 112))) t.c:10 -1 (nil)) (insn 9 8 0 (set (mem/j:DI (plus:DI (reg/v/f:DI 109 [ ptr ]) (const_int 8 [0x8])) [0+8 S8 A64]) (reg:DI 111)) t.c:10 -1 (nil)) wrong: ldx [%o0+8], %g2 sethi %hi(2147483648), %g1 or %g2, %g1, %g1 jmp %o7+8 stx %g1, [%o0+8] At least IA64 also can do 4-byte loads/stores (but not sure the HW wouldn't re-introduce the data race). SPARC64 also can do 32bit loads/stores as the following testcase shows: struct x { long a; unsigned int lock; unsigned int full; }; void wrong(struct x *ptr) { ptr->full = 1; } here we simply use a 32bit store. For SPARC64 optimize_bitfield_assignment_op fails so we fall into the store_field path with mode1 == VOIDmode (what get_inner_refrence says). to_rtx is (mem/j:BLK (reg/v/f:DI 109 [ ptr ]) [2 *ptr_1(D)+0 S1 A64]) Related bug: PR48124. Btw, offset = bitnum / unit; bitpos = bitnum % unit; byte_offset = (bitnum % BITS_PER_WORD) / BITS_PER_UNIT + (offset * UNITS_PER_WORD); byte_offset is bollocks (or has a really poor name). On SPARC64 I see (gdb) p unit $11 = 8 (gdb) p offset $12 = 12 (gdb) p bitpos $13 = 0 (gdb) p byte_offset $14 = 100 Other than that we are falling into the generic store_fixed_bit_field routine which at the top does unsigned int total_bits = BITS_PER_WORD; and mode = GET_MODE (op0); if (GET_MODE_BITSIZE (mode) == 0 || GET_MODE_BITSIZE (mode) > GET_MODE_BITSIZE (word_mode)) mode = word_mode; (now, why we use a BLKmode mem here and not a QImode mem may be surprising) get_best_mode still returns DImode because that's the largest mode the given alignment supports. After that we're lost. So it seems that to fix this case we'd need to figure out some other largest mode we can pass to get_best_mode. The only hint would be from providing a different mode for the initial MEM we create, like with Index: gcc/expr.c =================================================================== --- gcc/expr.c (revision 183791) +++ gcc/expr.c (working copy) @@ -4705,6 +4705,12 @@ expand_assignment (tree to, tree from, b to_rtx = adjust_address (to_rtx, mode1, 0); else if (GET_MODE (to_rtx) == VOIDmode) to_rtx = adjust_address (to_rtx, BLKmode, 0); + else if (TREE_CODE (to) == COMPONENT_REF + && DECL_BIT_FIELD (TREE_OPERAND (to, 1)) + && DECL_MODE (TREE_OPERAND (to, 1)) != BLKmode) + to_rtx = adjust_address (to_rtx, + TYPE_MODE (DECL_BIT_FIELD_TYPE + (TREE_OPERAND (to, 1))), 0); } if (offset != 0) That avoids the use of QImode we have on the field-decl but also adjusts MEM_SIZE ... You probably need to add a 'volatile' somewhere to really have wrong code. get_best_mode returns DImode because of: /* Nonzero if access to memory by bytes is slow and undesirable. For RISC chips, it means that access to memory by bytes is no better than access by words when possible, so grab a whole word and maybe make use of that. */ #define SLOW_BYTE_ACCESS 1 Yeah. Which is okay for reading, but doing the same when writing is problematic. This fails on powerpc64-linux as well (-m64), where we generate: ld 9,8(3) li 10,1 rldimi 9,10,31,32 std 9,8(3) blr OK, my minimal test case removed the "volatile" keyword by mistake. The real code in BTRFS has the volatile for the lock value which precedes the bitfield, so the corresponding structure would be: struct x { long a; volatile unsigned int lock; /* <- note the "volatile" here */ unsigned int full : 1; }; Now, GCC should honour that the value of "lock" can change any time, so it must not assume that writing back the same value a few cycles later is safe. (In reply to comment #9) > OK, my minimal test case removed the "volatile" keyword by mistake. > > The real code in BTRFS has the volatile for the lock value which precedes the > bitfield, so the corresponding structure would be: > > struct x { > long a; > volatile unsigned int lock; /* <- note the "volatile" here */ > unsigned int full : 1; > }; > > Now, GCC should honour that the value of "lock" can change any time, so it must > not assume that writing back the same value a few cycles later is safe. volatiles on single structure members is of course under- (or even un-)specified. Consider struct x { int i : 1; volatile int j : 1; }; Where we clearly cannot access i without modifying j (but it's still valid C). So I don't think that a volatile member inside a non-volatile struct guarantees anything. Now, with struct x { long a; volatile unsigned int lock; unsigned int full : 1; }; void wrong(volatile struct x *ptr) { ptr->full = 1; } IA64 uses .mmi ld8.acq r14 = [r32] ;; nop 0 dep r14 = r15, r14, 32, 1 ;; .mib st8.rel [r32] = r14 which seems to be an attempt to work around this issue (albeit a possibly very slow one). (In reply to comment #10) > IA64 uses > > .mmi > ld8.acq r14 = [r32] > ;; > nop 0 > dep r14 = r15, r14, 32, 1 > ;; > .mib > st8.rel [r32] = r14 > > which seems to be an attempt to work around this issue (albeit a > possibly very slow one). Are you referring to the ".acq" and ".rel" forms? That doesn't change the situation at all. All it does is ensure correct memory ordering with respect to external visibility, but it does nothing to avoid the race condition. Created attachment 26712 [details] candidate patch Here is a candidate patch (it ICEs one Ada testcase, see PR52134). It enforces the C++11 memory model (as far as bitfields are concerned) upon everyone and builds on the machinery that was implemented for it (changing the implementation for when we compute "an underlying object" and permanently store it, to make it possible to use this information for optimization purposes). It fixes the related PR48124 as well. It's an invasive change that will change code generation on STRICT_ALIGNMENT platforms quite severely for bitfield accesses. So it is not an appropriate fix for GCC 4.7 at this point of the development cycyle, nor would it be appropriate to backport it. Queued for GCC 4.8. Author: rguenth Date: Wed Mar 14 10:55:09 2012 New Revision: 185379 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=185379 Log: 2012-03-14 Richard Guenther <rguenther@suse.de> * tree.h (DECL_BIT_FIELD_REPRESENTATIVE): New define. * stor-layout.c (start_bitfield_representative): New function. (finish_bitfield_representative): Likewise. (finish_bitfield_layout): Likewise. (finish_record_layout): Call finish_bitfield_layout. * tree.c (free_lang_data_in_decl): Only free DECL_QUALIFIER for QUAL_UNION_TYPE fields. * tree-streamer-in.c (lto_input_ts_field_decl_tree_pointers): Stream DECL_BIT_FIELD_REPRESENTATIVE. * tree-streamer-out.c (write_ts_field_decl_tree_pointers): Likewise. PR middle-end/52080 PR middle-end/52097 PR middle-end/48124 * expr.c (get_bit_range): Unconditionally extract bitrange from DECL_BIT_FIELD_REPRESENTATIVE. (expand_assignment): Adjust call to get_bit_range. * gcc.dg/torture/pr48124-1.c: New testcase. * gcc.dg/torture/pr48124-2.c: Likewise. * gcc.dg/torture/pr48124-3.c: Likewise. * gcc.dg/torture/pr48124-4.c: Likewise. Added: trunk/gcc/testsuite/gcc.dg/torture/pr48124-1.c trunk/gcc/testsuite/gcc.dg/torture/pr48124-2.c trunk/gcc/testsuite/gcc.dg/torture/pr48124-3.c trunk/gcc/testsuite/gcc.dg/torture/pr48124-4.c Modified: trunk/gcc/ChangeLog trunk/gcc/expr.c trunk/gcc/stor-layout.c trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-streamer-in.c trunk/gcc/tree-streamer-out.c trunk/gcc/tree.c trunk/gcc/tree.h Fixed for 4.8. Author: rguenth Date: Mon Jun 4 08:43:23 2012 New Revision: 188167 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=188167 Log: 2012-06-04 Richard Guenther <rguenther@suse.de> Eric Botcazou <ebotcazou@adacore.com> Backport from mainline 2012-04-03 Eric Botcazou <ebotcazou@adacore.com> * expr.c (get_bit_range): Add OFFSET parameter and adjust BITPOS. Change type of BITOFFSET to signed. Make sure the lower bound of the computed range is non-negative by adjusting OFFSET and BITPOS. (expand_assignment): Adjust call to get_bit_range. 2012-03-27 Eric Botcazou <ebotcazou@adacore.com> * expr.c (get_bit_range): Return the null range if the enclosing record is part of a larger bit field. 2012-03-20 Richard Guenther <rguenther@suse.de> * stor-layout.c (finish_bitfield_representative): Fallback to conservative maximum size if the padding up to the next field cannot be computed as a constant. (finish_bitfield_layout): If we cannot compute the distance between the start of the bitfield representative and the bitfield member start a new representative. * expr.c (get_bit_range): The distance between the start of the bitfield representative and the bitfield member is zero if the field offsets are not constants. 2012-03-16 Richard Guenther <rguenther@suse.de> * stor-layout.c (finish_bitfield_representative): Fall back to the conservative maximum size if we cannot compute the size of the tail padding. 2012-03-14 Richard Guenther <rguenther@suse.de> * tree.h (DECL_BIT_FIELD_REPRESENTATIVE): New define. * stor-layout.c (start_bitfield_representative): New function. (finish_bitfield_representative): Likewise. (finish_bitfield_layout): Likewise. (finish_record_layout): Call finish_bitfield_layout. * tree.c (free_lang_data_in_decl): Only free DECL_QUALIFIER for QUAL_UNION_TYPE fields. * tree-streamer-in.c (lto_input_ts_field_decl_tree_pointers): Stream DECL_BIT_FIELD_REPRESENTATIVE. * tree-streamer-out.c (write_ts_field_decl_tree_pointers): Likewise. PR middle-end/52080 PR middle-end/52097 PR middle-end/48124 * expr.c (get_bit_range): Unconditionally extract bitrange from DECL_BIT_FIELD_REPRESENTATIVE. (expand_assignment): Adjust call to get_bit_range. * gcc.dg/torture/pr48124-1.c: New testcase. * gcc.dg/torture/pr48124-2.c: Likewise. * gcc.dg/torture/pr48124-3.c: Likewise. * gcc.dg/torture/pr48124-4.c: Likewise. * gnat.dg/pack16.adb: Likewise. * gnat.dg/pack16_pkg.ads: Likewise. * gnat.dg/pack17.adb: Likewise. * gnat.dg/specs/pack7.ads: Likewise. * gnat.dg/specs/pack8.ads: Likewise. * gnat.dg/specs/pack8_pkg.ads: Likewise. Added: branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/torture/pr48124-1.c branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/torture/pr48124-2.c branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/torture/pr48124-3.c branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/torture/pr48124-4.c branches/gcc-4_7-branch/gcc/testsuite/gnat.dg/pack16.adb branches/gcc-4_7-branch/gcc/testsuite/gnat.dg/pack16_pkg.ads branches/gcc-4_7-branch/gcc/testsuite/gnat.dg/pack17.adb branches/gcc-4_7-branch/gcc/testsuite/gnat.dg/specs/pack7.ads branches/gcc-4_7-branch/gcc/testsuite/gnat.dg/specs/pack8.ads branches/gcc-4_7-branch/gcc/testsuite/gnat.dg/specs/pack8_pkg.ads Modified: branches/gcc-4_7-branch/gcc/ChangeLog branches/gcc-4_7-branch/gcc/expr.c branches/gcc-4_7-branch/gcc/stor-layout.c branches/gcc-4_7-branch/gcc/testsuite/ChangeLog branches/gcc-4_7-branch/gcc/tree-streamer-in.c branches/gcc-4_7-branch/gcc/tree-streamer-out.c branches/gcc-4_7-branch/gcc/tree.c branches/gcc-4_7-branch/gcc/tree.h *** Bug 260998 has been marked as a duplicate of this bug. *** Seen from the domain http://volichat.com Page where seen: http://volichat.com/adult-chat-rooms Marked for reference. Resolved as fixed @bugzilla. |