This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug c/69702] New: excessive stack usage with -fprofile-arcs
- From: "arnd at linaro dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: Fri, 05 Feb 2016 22:45:07 +0000
- Subject: [Bug c/69702] New: excessive stack usage with -fprofile-arcs
- Auto-submitted: auto-generated
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69702
Bug ID: 69702
Summary: excessive stack usage with -fprofile-arcs
Product: gcc
Version: 5.3.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: arnd at linaro dot org
Target Milestone: ---
Created attachment 37604
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=37604&action=edit
standalone test case extracted from Linux kernel
With gcc versions 4.9 or higher, the stack usage of some functions in the Linux
kernel has grown to the point where we risk a stack overflow, given that only
8 KB or 16 KB of stack is available per thread.
When building an ARM kernel, I get at least these warnings in some
configurations when using "gcc -fprofile-arcs -Wframe-larger-than=1024", and
don't get them without -fprofile-arcs:
drivers/isdn/isdnhdlc.c:629:1: error: the frame size of 1152 bytes is larger than 1024 bytes
drivers/media/common/saa7146/saa7146_hlp.c:464:1: error: the frame size of 1040 bytes is larger than 1024 bytes
drivers/mtd/chips/cfi_cmdset_0020.c:651:1: error: the frame size of 1040 bytes is larger than 1024 bytes
drivers/net/wireless/ath/ath6kl/main.c:495:1: error: the frame size of 1200 bytes is larger than 1024 bytes
drivers/net/wireless/ath/ath9k/ar9003_aic.c:434:1: error: the frame size of 1208 bytes is larger than 1024 bytes
drivers/video/fbdev/riva/riva_hw.c:426:1: error: the frame size of 1248 bytes is larger than 1024 bytes
lib/lz4/lz4hc_compress.c:514:1: error: the frame size of 2464 bytes is larger than 1024 bytes
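To rank such diagnostics by frame size across a large build log, something like the following sketch may help. It is not part of the original report: the function name `parse_frame_warnings` and the `SAMPLE` text (two of the messages quoted above) are only illustrative, and the regular expression assumes the message wording shown here.

```python
import re

# Matches the frame-size diagnostic once whitespace has been normalized.
# Assumption: the wording is exactly as quoted in this report.
FRAME_RE = re.compile(
    r"(?P<file>\S+):(?P<line>\d+):(?P<col>\d+): (?:error|warning): "
    r"the frame size of (?P<size>\d+) bytes is larger than (?P<limit>\d+) bytes"
)

def parse_frame_warnings(log):
    """Return (frame_size, file, line) tuples, largest frame first."""
    flat = re.sub(r"\s+", " ", log)  # undo mail-style line wrapping
    hits = [
        (int(m["size"]), m["file"], int(m["line"]))
        for m in FRAME_RE.finditer(flat)
    ]
    return sorted(hits, reverse=True)

# Two of the diagnostics from the report, wrapped as they appear in the mail.
SAMPLE = """\
drivers/isdn/isdnhdlc.c:629:1: error: the frame size of 1152 bytes is larger
than 1024 bytes
lib/lz4/lz4hc_compress.c:514:1: error: the frame size of 2464 bytes is larger
than 1024 bytes
"""

for size, path, line in parse_frame_warnings(SAMPLE):
    print(f"{size:5d}  {path}:{line}")
```

Collapsing whitespace first means the same code works whether or not the log lines were wrapped in transit.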
The lz4hc_compress.c file is a good example, as it has the worst stack usage
and is usable as a working test case outside of the kernel. I have reduced this
file to a standalone .c file that can optionally compile into an executable
program (lz4 compression from stdin to stdout). The code is originally from
www.lz4.org, but has been adapted for use in Linux.
Compile with:
gcc -O2 -Wall -Wno-pointer-sign -Wframe-larger-than=200 -fprofile-arcs -c lz4hc_compress.c
The same problem happens on all architectures, e.g. with gcc-4.9.3 (stack usage
in bytes):
Target                  -fprofile-arcs  normal
aarch64-linux-gcc                 1136     112
alpha-linux-gcc                   1008     304
am33_2.0-linux-gcc                1280      84
arm-linux-gnueabi-gcc             1080     112
cris-linux-gcc                     828     100
frv-linux-gcc                      904     104
hppa64-linux-gcc                   944     248
hppa-linux-gcc                     824      92
i386-linux-gcc                     824     108
m32r-linux-gcc                     908     136
microblaze-linux-gcc               832      88
mips64-linux-gcc                   864     192
mips-linux-gcc                     792     120
powerpc64-linux-gcc                800      96
powerpc-linux-gcc                  808      56
s390-linux-gcc                     832     112
sh3-linux-gcc                      824     128
sparc64-linux-gcc                  896     192
sparc-linux-gcc                    824     104
x86_64-linux-gcc                   912     192
xtensa-linux-gcc                   816     128
With gcc-4.8.1, the numbers are much lower:
arm-linux-gnueabi-gcc              184     104
x86_64-linux-gcc                   224     192
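To make the regression easier to see, the extra stack attributable to -fprofile-arcs can be computed for the two targets that appear in both tables. This is only a sketch of the arithmetic: the numbers are copied from the tables above, and the names `USAGE` and `profile_overhead` are illustrative.

```python
# (with -fprofile-arcs, normal) stack usage in bytes, copied from the
# gcc-4.8.1 and gcc-4.9.3 tables in this report.
USAGE = {
    "arm-linux-gnueabi-gcc": {"4.8.1": (184, 104), "4.9.3": (1080, 112)},
    "x86_64-linux-gcc": {"4.8.1": (224, 192), "4.9.3": (912, 192)},
}

def profile_overhead(with_arcs, normal):
    """Extra stack bytes used when building with -fprofile-arcs."""
    return with_arcs - normal

for target, versions in USAGE.items():
    for version, (with_arcs, normal) in versions.items():
        extra = profile_overhead(with_arcs, normal)
        ratio = with_arcs / normal
        print(f"{target} gcc-{version}: +{extra} bytes ({ratio:.1f}x normal)")
```

On these figures the overhead goes from tens of bytes with gcc-4.8.1 to several hundred bytes with gcc-4.9.3, while the normal build stays roughly constant.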
The size of the binary object has also grown noticeably. Without -fprofile-arcs
it is around 3000 bytes on any version; with -fprofile-arcs it is 10300 bytes
with gcc-5.3.1 but only 6941 bytes with gcc-4.8. Runtime speed does not appear
to be affected much (less than 20% overhead for -fprofile-arcs, which seems
reasonable).
I have tested ARM cross-compiler versions 4.9.3 through 5.3.1, which all show
similarly problematic behavior, while versions 4.6 through 4.8.3 are OK.