This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
[PING] [PATCH] Fix C++0x memory model for -fno-strict-volatile-bitfields on ARM
- From: Bernd Edlinger <bernd dot edlinger at hotmail dot de>
- To: Richard Biener <richard dot guenther at gmail dot com>, Jason Merrill <jason at redhat dot com>, "Joseph S. Myers" <joseph at codesourcery dot com>
- Cc: "gcc-patches at gcc dot gnu dot org" <gcc-patches at gcc dot gnu dot org>, Sandra Loosemore <sandra at codesourcery dot com>
- Date: Fri, 15 Nov 2013 08:45:15 +0100
- Subject: [PING] [PATCH] Fix C++0x memory model for -fno-strict-volatile-bitfields on ARM
- Authentication-results: sourceware.org; auth=none
- References: <DUB122-W15E51EED0BDD4E4BBB7280E40D0 at phx dot gbl>,<CAFiYyc3B19C+KzWADqMuFftHBsf9R-BVmhx3byk4HMaHDZVCgQ at mail dot gmail dot com>,<DUB122-W349FBC26ACB80FE7DB42E7E40D0 at phx dot gbl>,<DUB122-W34629C286171333E30C282E40B0 at phx dot gbl>
Hello,
could you please approve this patch: http://gcc.gnu.org/ml/gcc-patches/2013-10/msg02664.html
As it looks like, it fixes the problem reported under PR59134,
which is the expander enters infinite recursion while it tries to violate the C++ memory model.
The memory context has still the structure mode, which is QImode, and due to this patch the
memory accesses are no longer tried in SImode but in QImode, which is fixes the ICE here.
Maybe I should add the test case from PR59134 ?
Thanks
Bernd.
>
> Hello,
>
> meanwhile, I have added a test case to that patch.
>
> Boot-strapped and regression-tested as usual.
>
> OK for trunk?
>
> Bernd.
>
>> Hi,
>>
>> On Fri, 25 Oct 2013 11:26:20, Richard Biener wrote:
>>>
>>> On Fri, Oct 25, 2013 at 10:40 AM, Bernd Edlinger
>>> <bernd.edlinger@hotmail.de> wrote:
>>>> Hello,
>>>>
>>>> this patch fixes the recently discovered data store race on arm-eabi-gcc with -fno-strict-volatile-bitfields
>>>> for structures like this:
>>>>
>>>> #define test_type unsigned short
>>>>
>>>> typedef struct s{
>>>> unsigned char Prefix[1];
>>>> test_type Type;
>>>> }__attribute((__packed__,__aligned__(4))) ss;
>>>>
>>>> volatile ss v;
>>>>
>>>> void __attribute__((noinline))
>>>> foo (test_type u)
>>>> {
>>>> v.Type = u;
>>>> }
>>>>
>>>> test_type __attribute__((noinline))
>>>> bar (void)
>>>> {
>>>> return v.Type;
>>>> }
>>>>
>>>>
>>>> I've manually confirmed the correct code generation using variations of the
>>>> example above on an ARM cross-compiler for -fno-strict-volatile-bitfields.
>>>>
>>>> Note, that this example is still causes ICE's for -fstrict-volatile-bitfields,
>>>> but I'd like to fix that separately.
>>>>
>>>> Boot-strapped and regression-tested on x86_64-linux-gnu.
>>>>
>>>> Ok for trunk?
>>>
>>> Isn't it more appropriate to fix it here:
>>>
>>> if (TREE_CODE (to) == COMPONENT_REF
>>> && DECL_BIT_FIELD_TYPE (TREE_OPERAND (to, 1)))
>>> get_bit_range (&bitregion_start, &bitregion_end, to, &bitpos, &offset);
>>>
>>> ?
>>>
>>
>> Honestly, I'd call this is a work-around, not a design.
>>
>> Therefore I would not move that workaround to expr.c.
>>
>> Also the block below is only a work-around IMHO.
>>
>> if (MEM_P (str_rtx) && bitregion_start> 0)
>> {
>> enum machine_mode bestmode;
>> HOST_WIDE_INT offset, size;
>>
>> gcc_assert ((bitregion_start % BITS_PER_UNIT) == 0);
>>
>> offset = bitregion_start / BITS_PER_UNIT;
>> bitnum -= bitregion_start;
>> size = (bitnum + bitsize + BITS_PER_UNIT - 1) / BITS_PER_UNIT;
>> bitregion_end -= bitregion_start;
>> bitregion_start = 0;
>> bestmode = get_best_mode (bitsize, bitnum,
>> bitregion_start, bitregion_end,
>> MEM_ALIGN (str_rtx), VOIDmode,
>> MEM_VOLATILE_P (str_rtx));
>> str_rtx = adjust_bitfield_address_size (str_rtx, bestmode, offset, size);
>> }
>>
>> Here, if bitregion_start = 8, we have a 4 byte aligned memory context,
>> and whoops, now it is only 1 byte aligned.
>>
>> this example:
>>
>> struct s
>> {
>> char a;
>> int b:24;
>> };
>>
>> struct s ss;
>>
>> void foo(int b)
>> {
>> ss.b = b;
>> }
>>
>>
>> gets compiled (at -O3) to:
>>
>> foo:
>> @ Function supports interworking.
>> @ args = 0, pretend = 0, frame = 0
>> @ frame_needed = 0, uses_anonymous_args = 0
>> @ link register save eliminated.
>> ldr r3, .L2
>> mov r1, r0, lsr #8
>> mov r2, r0, lsr #16
>> strb r1, [r3, #2]
>> strb r0, [r3, #1]
>> strb r2, [r3, #3]
>> bx lr
>>
>> while...
>>
>> struct s
>> {
>> char a;
>> int b:24;
>> };
>>
>> struct s ss;
>>
>> void foo(int b)
>> {
>> ss.b = b;
>> }
>>
>>
>> gets compiled (at -O3) to
>>
>> foo:
>> @ Function supports interworking.
>> @ args = 0, pretend = 0, frame = 0
>> @ frame_needed = 0, uses_anonymous_args = 0
>> @ link register save eliminated.
>> ldr r3, .L2
>> mov r2, r0, lsr #16
>> strb r2, [r3, #2]
>> strh r0, [r3] @ movhi
>> bx lr
>>
>> which is more efficient, but only because the memory context is still
>> aligned in this case.
>>
>>> Btw, the C++ standard doesn't cover packed or aligned attributes so
>>> we could declare this a non-issue. Any opinion on that?
>>>
>>> Thanks,
>>> Richard.
>>>
>>>> Thanks
>>>> Bernd.