Bug 81142 - Segmentation fault when using static __thread variables
Summary: Segmentation fault when using static __thread variables
Status: UNCONFIRMED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 6.3.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-06-20 18:08 UTC by tomas_paukrt
Modified: 2019-08-02 06:12 UTC

See Also:
Host:
Target: arm
Build:
Known to work:
Known to fail: 4.9.4, 5.4.0, 6.3.0
Last reconfirmed:


Attachments
C source file (150 bytes, text/x-csrc)
2017-06-20 18:08 UTC, tomas_paukrt
Preprocessed source file (3.47 KB, text/plain)
2017-06-20 18:09 UTC, tomas_paukrt
Output of gcc -v (1.04 KB, text/plain)
2017-06-20 18:10 UTC, tomas_paukrt
C source file without snprintf (121 bytes, text/plain)
2017-06-21 11:04 UTC, tomas_paukrt
the reproducer created by osmocom (1.89 KB, application/x-bzip)
2019-08-02 06:10 UTC, Harald Welte

Description tomas_paukrt 2017-06-20 18:08:51 UTC
Created attachment 41591 [details]
C source file

If the attached C source file is cross-compiled with GCC 4.9.4 for an AM335x CPU (BeagleBone), the resulting program crashes with a segmentation fault.

Command line to compile: gcc -O2 -fPIC -march=armv7-a -mtune=cortex-a8 -mfloat-abi=softfp -mfpu=vfpv3 -mtls-dialect=gnu crash.c

The size of the array in the first function is key to triggering the error.
Comment 1 tomas_paukrt 2017-06-20 18:09:52 UTC
Created attachment 41592 [details]
Preprocessed source file
Comment 2 tomas_paukrt 2017-06-20 18:10:34 UTC
Created attachment 41593 [details]
Output of gcc -v
Comment 3 Andrew Pinski 2017-06-20 18:17:57 UTC
This could be a glibc, binutils or gcc issue.  You have more than one page of TLS variables here.  I don't remember the limits but it might be you are hitting that limit.
Comment 4 Richard Biener 2017-06-21 08:09:38 UTC
Also GCC 4.9 is no longer supported.
Comment 5 tomas_paukrt 2017-06-21 08:32:15 UTC
I was able to reproduce this bug with GCC 5.4.0 and 6.3.0.
GCC 7.1.0 generates slightly different assembler code and I have not been able to trigger this bug yet.
Comment 6 tomas_paukrt 2017-06-21 11:04:46 UTC
Created attachment 41595 [details]
C source file without snprintf
Comment 7 tomas_paukrt 2017-06-21 11:10:48 UTC
I attached another C source file that is even simpler.

The compiled program crashes with a segmentation fault on AM335X (Cortex-A8) as well as on SPEAr320S-2 (ARM926EJ-S).

Using the option -ftls-model=initial-exec or -mtls-dialect=gnu2 makes GCC generate different assembler code that does not cause a segmentation fault.
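
Where changing the global compiler options is not practical, GCC also accepts the TLS model per variable through the tls_model attribute. The following is only a sketch of that idea. It assumes the per-variable attribute affects code generation in the same way as -ftls-model=initial-exec, and it has not been verified against this bug. The names big_buf, counter and bump_counter and the 8192-byte size are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define BIG_SIZE 8192  /* assumed size, larger than a 4 KiB page */

/* Per-variable counterpart of -ftls-model=initial-exec (GCC variable attribute). */
static __thread char big_buf[BIG_SIZE] __attribute__((tls_model("initial-exec")));
static __thread int  counter           __attribute__((tls_model("initial-exec")));

/* Touches both TLS variables and returns the counter after n increments. */
int bump_counter(int n)
{
    assert(n >= 0);
    memset(big_buf, 0, sizeof big_buf);
    counter = 0;
    for (int i = 0; i < n; i++)
        counter++;
    return counter;
}
```

Note that initial-exec variables in a shared library consume static TLS space, which can make the library fail to load through dlopen() if that space runs out, so this is a trade-off rather than a fix.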
Comment 8 tomas_paukrt 2017-06-22 08:26:24 UTC
Conditions for reproducing the bug:
- ARM architecture
- optimization level -O1 or higher
- TLS model global-dynamic or local-dynamic
- TLS dialect gnu
- at least two static thread-local variables, the first of which is larger than the page size

If all these conditions are met, the address of the second variable seems to be miscalculated (the difference between the two addresses is much larger than the size of the first variable).
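
The conditions above can be sketched as a small test file. This is an illustrative reconstruction, not the attached crash.c. The names big_buf, counter, tls_gap, tls_roundtrip and tls_report are hypothetical, and the 8192-byte size is an assumption chosen only to make the first variable larger than a page, assuming 4 KiB pages.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define BIG_SIZE 8192  /* assumed size, larger than a 4 KiB page */

static __thread char big_buf[BIG_SIZE];  /* first TLS variable, larger than a page */
static __thread int  counter;            /* second TLS variable */

/* Absolute distance in bytes between the two TLS objects. */
size_t tls_gap(void)
{
    uintptr_t a = (uintptr_t)(void *)big_buf;
    uintptr_t b = (uintptr_t)(void *)&counter;
    assert(a != b);
    return (size_t)(a > b ? a - b : b - a);
}

/* Writes both variables and reads them back; returns 1 if the values survive. */
int tls_roundtrip(void)
{
    memset(big_buf, 0x5A, sizeof big_buf);
    counter = 42;
    return counter == 42 && big_buf[0] == 0x5A && big_buf[BIG_SIZE - 1] == 0x5A;
}

/* Prints the gap so it can be compared with the size of the first variable. */
void tls_report(void)
{
    printf("gap between TLS variables: %zu bytes (first variable: %d bytes)\n",
           tls_gap(), BIG_SIZE);
}
```

Calling tls_report() from main() in a file compiled with the options from the description (-O2 -fPIC -mtls-dialect=gnu) on an ARM target should show whether the gap is close to 8192 bytes or far larger. On an affected toolchain the program may simply crash instead.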
Comment 9 Ramana Radhakrishnan 2017-06-22 15:30:30 UTC
(In reply to tomas_paukrt from comment #0)
> Created attachment 41591 [details]
> C source file
> 
> If the attached C source file is cross-compiled using GCC 4.9.4 for AM335x
> CPU (BeagleBone) then segmentation fault occurs.
> 
> Command line to compile: gcc -O2 -fPIC -march=armv7-a -mtune=cortex-a8
> -mfloat-abi=softfp -mfpu=vfpv3 -mtls-dialect=gnu crash.c
> 


I'm not sure that you can expect executables created with -fPIC to work properly with TLS.

What happens if you use -fPIE, since that is designed to create executables that are position independent and would do the right thing as far as various initializations go?

Or is this a crash you've observed in a shared library that has been reduced to a testcase in this form?

regards
Ramana
Comment 10 tomas_paukrt 2017-06-22 16:01:05 UTC
> What happens if you use -fPIE since that is designed to create executables
> that are position independent and would do the right thing as far as various
> initializations go ? 

It does not crash, because the initial-exec or local-exec TLS model seems to be used in this case.

> Or is this a crash you've observed in a shared library and has been reduced
> to a testcase in this form ? 

Yes, the crash was observed in a shared library with static thread-local variables.

Best regards

Tomas
Comment 11 Harald Welte 2019-08-01 21:01:26 UTC
More than two years later, we can reproduce and observe this bug on a variety of ARM32 platforms, including Raspbian 9 (with any gcc version shipped there, up to 6.3.x) and Raspbian 10 (up to gcc 6.5, but not with gcc-7.3 or gcc-8.3).

You can find the lengthy journey of investing several person-days of work at https://osmocom.org/issues/4062

At Osmocom, we're not compiler experts, but it seems to relate to what kind of ELF relocations gcc emits during code generation.

IIRC, TLS was introduced in 2003; I'm surprised it still appears to be such an under-tested/under-used feature.

What would be important is to fully understand this issue in order to design a proper work-around.  Dropping support for 32bit ARM systems or for platforms using gcc lower than 7.x is unfortunately not an option :/
Comment 12 tomas_paukrt 2019-08-01 22:51:06 UTC
Hi Harald,

Have you tried cross-compiling the library with -mtls-dialect=gnu2? We are using this option as a workaround for this bug because no one seems to be interested in fixing it. I have exactly the same findings as you, and using a different TLS dialect is the best solution that I have found so far.

Best regards

Tomas
Comment 13 Harald Welte 2019-08-02 06:10:32 UTC
Created attachment 46657 [details]
the reproducer created by osmocom
Comment 14 Harald Welte 2019-08-02 06:12:35 UTC
Hi Tomas, thanks a lot for your suggested workaround.  

Indeed, -mtls-dialect=gnu2 also seems to work in our case.

-fPIE as suggested earlier is not an option, as the __thread variables are used in a shared library (libosmocore from http://git.osmocom.org/libosmocore/ in our case).  Originally the reproducer also built a separate .so file, but it turned out this is not needed, i.e. even when statically linking a -fPIC-built .o file into the executable, the problem can be seen very clearly.