This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Fix C++0x memory model for -fno-strict-volatile-bitfields on ARM
- From: Richard Biener <richard.guenther@gmail.com>
- To: Bernd Edlinger <bernd.edlinger@hotmail.de>
- Cc: Jason Merrill <jason@redhat.com>, "Joseph S. Myers" <joseph@codesourcery.com>, "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org>, Sandra Loosemore <sandra@codesourcery.com>
- Date: Fri, 15 Nov 2013 11:12:39 +0100
- Subject: Re: [PATCH] Fix C++0x memory model for -fno-strict-volatile-bitfields on ARM
On Thu, Oct 31, 2013 at 11:27 AM, Bernd Edlinger
<bernd.edlinger@hotmail.de> wrote:
> Hello,
>
> meanwhile, I have added a test case to that patch.
>
> Boot-strapped and regression-tested as usual.
>
> OK for trunk?
Err, well. This just means that the generic C++ memory model
handling isn't complete. We do

  if (TREE_CODE (to) == COMPONENT_REF
      && DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
    get_bit_range (&bitregion_start, &bitregion_end, to, &bitpos, &offset);

and thus only initialize the region we may touch for bit-field destinations.
This just lacks computing bitregion_start / bitregion_end for other
fields.
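At the source level the constraint for a non-bit-field member looks like
this (my own illustration, not something from the patch):

struct S { char x; short a; } __attribute__((packed)) s;

/* Under the C++11 memory model s.x and s.a are distinct memory locations,
   so this store may not be expanded as a wider read-modify-write that also
   rewrites s.x, even though s.a is misaligned because of the packing.  */
void store_a (short v) { s.a = v; }

/* This may legitimately run concurrently in another thread.  */
void store_x (char v) { s.x = v; }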
Btw, what does the C++ memory model say for
struct { char x; short a; short b; } a __attribute__((packed));
short *p = &a.a;
*p = 1;
I suppose '*p' may not touch bits outside of a.a either? Or are
memory accesses via pointers less constrained? What about
array element accesses?
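For example (just sketching the kinds of accesses I mean, nothing taken
from the testcase):

struct A { char x; short a; short b; } __attribute__((packed));
struct A arr[4];

void
f (short *q, int i)
{
  *q = 1;        /* store through a pointer that may point into a packed object */
  arr[i].a = 2;  /* array-element access to a packed member */
}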
That is, don't we need
  else
    {
      bitregion_start = 0;
      bitregion_end = bitsize;
    }
?
Richard.
> Bernd.
>
>> Hi,
>>
>> On Fri, 25 Oct 2013 11:26:20, Richard Biener wrote:
>>>
>>> On Fri, Oct 25, 2013 at 10:40 AM, Bernd Edlinger
>>> <bernd.edlinger@hotmail.de> wrote:
>>>> Hello,
>>>>
>>>> this patch fixes the recently discovered data store race on arm-eabi-gcc with -fno-strict-volatile-bitfields
>>>> for structures like this:
>>>>
>>>> #define test_type unsigned short
>>>>
>>>> typedef struct s {
>>>>   unsigned char Prefix[1];
>>>>   test_type Type;
>>>> } __attribute__((__packed__, __aligned__(4))) ss;
>>>>
>>>> volatile ss v;
>>>>
>>>> void __attribute__((noinline))
>>>> foo (test_type u)
>>>> {
>>>>   v.Type = u;
>>>> }
>>>>
>>>> test_type __attribute__((noinline))
>>>> bar (void)
>>>> {
>>>>   return v.Type;
>>>> }
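>>>>
>>>> To make the race explicit: on a little-endian target the kind of
>>>> expansion of foo() that has to be avoided is roughly the following
>>>> (hand-written illustration, not compiler output):
>>>>
>>>> void
>>>> foo_racy (test_type u)
>>>> {
>>>>   /* A whole-struct read-modify-write also rewrites v.Prefix[0] and the
>>>>      padding byte, so a concurrent store to v.Prefix[0] in another
>>>>      thread can be lost.  */
>>>>   unsigned int tmp = *(volatile unsigned int *) &v;
>>>>   tmp = (tmp & ~(0xffffu << 8)) | ((unsigned int) u << 8);
>>>>   *(volatile unsigned int *) &v = tmp;
>>>> }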
>>>>
>>>>
>>>> I've manually confirmed the correct code generation using variations of the
>>>> example above on an ARM cross-compiler for -fno-strict-volatile-bitfields.
>>>>
>>>> Note that this example still causes ICEs for -fstrict-volatile-bitfields,
>>>> but I'd like to fix that separately.
>>>>
>>>> Boot-strapped and regression-tested on x86_64-linux-gnu.
>>>>
>>>> Ok for trunk?
>>>
>>> Isn't it more appropriate to fix it here:
>>>
>>> if (TREE_CODE (to) == COMPONENT_REF
>>>     && DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
>>>   get_bit_range (&bitregion_start, &bitregion_end, to, &bitpos, &offset);
>>>
>>> ?
>>>
>>
>> Honestly, I'd call this a work-around, not a design.
>>
>> Therefore I would not move that workaround to expr.c.
>>
>> Also the block below is only a work-around IMHO.
>>
>> if (MEM_P (str_rtx) && bitregion_start > 0)
>>   {
>>     enum machine_mode bestmode;
>>     HOST_WIDE_INT offset, size;
>>
>>     gcc_assert ((bitregion_start % BITS_PER_UNIT) == 0);
>>
>>     offset = bitregion_start / BITS_PER_UNIT;
>>     bitnum -= bitregion_start;
>>     size = (bitnum + bitsize + BITS_PER_UNIT - 1) / BITS_PER_UNIT;
>>     bitregion_end -= bitregion_start;
>>     bitregion_start = 0;
>>     bestmode = get_best_mode (bitsize, bitnum,
>>                               bitregion_start, bitregion_end,
>>                               MEM_ALIGN (str_rtx), VOIDmode,
>>                               MEM_VOLATILE_P (str_rtx));
>>     str_rtx = adjust_bitfield_address_size (str_rtx, bestmode, offset, size);
>>   }
>>
>> Here, if bitregion_start = 8, we have a 4 byte aligned memory context,
>> and whoops, now it is only 1 byte aligned.
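>>
>> For concreteness (my numbers, assuming ss is 4-byte aligned and the field
>> is the int b:24 from the example below): ss.b = b gives bitsize = 24,
>> bitnum = 8, bitregion_start = 8 and bitregion_end = 31, and the block then
>> computes
>>
>>   offset          = 8 / 8                = 1 byte
>>   bitnum          = 8 - 8                = 0
>>   size            = (0 + 24 + 8 - 1) / 8 = 3 bytes
>>   bitregion_end   = 31 - 8               = 23
>>   bitregion_start = 0
>>
>> so str_rtx gets rebased to &ss + 1, which is only 1-byte aligned although
>> ss itself is 4-byte aligned.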
>>
>> this example:
>>
>> struct s
>> {
>>   char a;
>>   int b:24;
>> };
>>
>> struct s ss;
>>
>> void foo(int b)
>> {
>>   ss.b = b;
>> }
>>
>>
>> gets compiled (at -O3) to:
>>
>> foo:
>> @ Function supports interworking.
>> @ args = 0, pretend = 0, frame = 0
>> @ frame_needed = 0, uses_anonymous_args = 0
>> @ link register save eliminated.
>> ldr r3, .L2
>> mov r1, r0, lsr #8
>> mov r2, r0, lsr #16
>> strb r1, [r3, #2]
>> strb r0, [r3, #1]
>> strb r2, [r3, #3]
>> bx lr
>>
>> while...
>>
>> struct s
>> {
>>   char a;
>>   int b:24;
>> };
>>
>> struct s ss;
>>
>> void foo(int b)
>> {
>>   ss.b = b;
>> }
>>
>>
>> gets compiled (at -O3) to:
>>
>> foo:
>> @ Function supports interworking.
>> @ args = 0, pretend = 0, frame = 0
>> @ frame_needed = 0, uses_anonymous_args = 0
>> @ link register save eliminated.
>> ldr r3, .L2
>> mov r2, r0, lsr #16
>> strb r2, [r3, #2]
>> strh r0, [r3] @ movhi
>> bx lr
>>
>> which is more efficient, but only because the memory context is still
>> aligned in this case.
>>
>>> Btw, the C++ standard doesn't cover packed or aligned attributes, so
>>> we could declare this a non-issue. Any opinion on that?
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Thanks
>>>> Bernd.