Obscure crashes due to gcc 4.9 -O2 => -fisolate-erroneous-paths-dereference

Chris Johns chrisj@rtems.org
Thu Feb 19 21:56:00 GMT 2015


On 20/02/2015 8:23 am, Joel Sherrill wrote:
>
> On 2/19/2015 2:56 PM, Sandra Loosemore wrote:
>> Jakub Jelinek wrote:
>>> On Wed, Feb 18, 2015 at 11:21:56AM -0800, Jeff Prothero wrote:
>>>> Starting with gcc 4.9, -O2 implicitly invokes
>>>>
>>>>      -fisolate-erroneous-paths-dereference:
>>>>
>>>> which
>>>>
>>>>      https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
>>>>
>>>> documents as
>>>>
>>>>      Detect paths that trigger erroneous or undefined behavior due to
>>>>      dereferencing a null pointer. Isolate those paths from the main control
>>>>      flow and turn the statement with erroneous or undefined behavior into a
>>>>      trap. This flag is enabled by default at -O2 and higher.
>>>>
>>>> This results in a sizable number of previously working embedded programs mysteriously
>>>> crashing when recompiled under gcc 4.9.  The problem is that embedded
>>>> programs will often have ram starting at address zero (think hardware-defined
>>>> interrupt vectors, say) which gets initialized by code which the
>>>> -fisolate-erroneous-paths-dereference logic can recognize as reading and/or
>>>> writing address zero.
>>> If you have some pages mapped at address 0, you really should compile your
>>> code with -fno-delete-null-pointer-checks, otherwise you can run into tons
>>> of other issues.
>> Hmmmm,  Passing the additional option in user code would be one thing,
>> but what about library code?  E.g., using memcpy (either explicitly or
>> implicitly for a structure copy)?
>>
>> It looks to me like cr16 and avr are currently the only architectures
>> that disable flag_delete_null_pointer_checks entirely, but I am sure
>> that this issue affects other embedded targets besides nios2, too.  E.g.
>> scanning Mentor's ARM board support library, I see a whole pile of
>> devices that have memory mapped at address zero (TI Stellaris/Tiva,
>> Energy Micro EFM32Gxxx,  Atmel AT91SAMxxx, ....).  Plus our simulator
>> BSPs assume a flat address space starting at address 0.
> I forwarded this to the RTEMS list and was promptly pointed to a patch
> on a Coldfire BSP where someone worked around this behavior.
>
> We are discussing how to deal with this. It is likely OK in user code but
> horrible in BSP and driver code. We don't have a solution ourselves. We
> just recognize it impacts a number of targets.
>

My main concern is not knowing that a trap has been added to the code.
If I could build an application and then audit it somehow, I could
manage the problem. We have a similar issue with the compiler possibly
using FP registers in general code (the ISR save/restore trade-off).
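
For reference, the kind of BSP code at risk is a loop that installs a
vector table at address zero. Below is a rough sketch of one way to keep
the compiler from proving that the destination is null. It assumes GCC
or Clang extended asm; hide_pointer and install_vectors are names made
up for this example, not existing RTEMS or GCC interfaces. It only hides
this one access from the optimizer, so it does not replace the
-fno-delete-null-pointer-checks advice above and does nothing for
library calls such as memcpy.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: the empty asm statement claims to read and modify
 * p, so the optimizer can no longer assume anything about its value,
 * including that it is a constant zero.
 */
static inline void *hide_pointer(void *p)
{
    __asm__ volatile("" : "+r"(p));
    return p;
}

/*
 * Hypothetical helper: copy count 32-bit vector entries to dst.
 * On a target with RAM at zero, dst would be (void *)0.
 */
static void install_vectors(void *dst, const uint32_t *src, size_t count)
{
    volatile uint32_t *out = hide_pointer(dst);

    for (size_t i = 0; i < count; i++)
        out[i] = src[i];
}
```

On such a target the call would look like
install_vectors((void *)0, table, entries), with table and entries
supplied by the BSP.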

Can the ELF be annotated in some GCC-specific way that survives into the
final executable and flags that this is happening? We could then create
tools to audit the executables.
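
Even without an annotation, a crude audit could scan the extracted code
section for the target's trap encoding. This is a minimal sketch that
assumes an x86 build, where __builtin_trap() is emitted as ud2 (bytes
0x0f 0x0b). count_pattern and X86_UD2 are names invented for this
example, and other targets use different encodings. A raw byte scan can
also match data or the middle of another instruction, so a hit is only a
hint to go and read the disassembly.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* x86 encoding of ud2, which GCC emits for __builtin_trap() on x86. */
static const uint8_t X86_UD2[] = { 0x0f, 0x0b };

/*
 * Hypothetical helper: count how many times the byte pattern pat occurs
 * in buf. Overlapping matches are counted separately.
 */
static size_t count_pattern(const uint8_t *buf, size_t len,
                            const uint8_t *pat, size_t pat_len)
{
    size_t hits = 0;

    if (pat_len == 0 || len < pat_len)
        return 0;

    for (size_t i = 0; i + pat_len <= len; i++) {
        if (memcmp(buf + i, pat, pat_len) == 0)
            hits++;
    }
    return hits;
}
```

Comparing the count for the same source built with and without
-fisolate-erroneous-paths-dereference would show whether the pass added
traps, at least to a first approximation.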

Chris


