How to control GCC builtin functions optimization
Fri Jan 11 11:04:00 GMT 2019
On 1/11/19 5:12 PM, Jakub Jelinek wrote:
> Removing gcc@ , as this is not relevant to GCC development.
> On Fri, Jan 11, 2019 at 11:03:35AM +0800, Cao jin wrote:
>> (Please CC me when replying, as I am not a subscriber.)
>> I ran into an interesting phenomenon while looking into Linux kernel
>> compilation. It can be summarized as follows: in
>> arch/x86/boot/compressed, memcpy is defined as __builtin_memcpy, and is
>> also implemented as a function. But where memcpy is used, in some cases GCC
>> optimizes it to inline code, while in other cases GCC just emits a call to
>> the self-defined memcpy function. This can be confirmed from the
>> symbol table via `nm bluh.o`.
>> The compile flags are, for example:
>> cmd_arch/x86/boot/compressed/pgtable_64.o := gcc
>> -Wp,-MD,arch/x86/boot/compressed/.pgtable_64.o.d -nostdinc -isystem
>> /usr/lib/gcc/x86_64-redhat-linux/8/include -I./arch/x86/include
>> -I./arch/x86/include/generated -I./include
>> -I./arch/x86/include/uapi -I./arch/x86/include/generated/uapi
>> -I./include/uapi -I./include/generated/uapi -include
>> ./include/linux/kconfig.h -include ./include/linux/compiler_types.h
>> -D__KERNEL__ -DCONFIG_CC_STACKPROTECTOR -m64 -O2 -fno-strict-aliasing
>> -fPIE -DDISABLE_BRANCH_PROFILING -mcmodel=small -mno-mmx -mno-sse
>> -ffreestanding -fno-stack-protector -DKBUILD_BASENAME='"pgtable_64"'
>> -DKBUILD_MODNAME='"pgtable_64"' -c -o
>> arch/x86/boot/compressed/pgtable_64.o arch/x86/boot/compressed/pgtable_64.c
>> Now the question is: from reading the code, this is rather non-intuitive.
>> Is there any explicit way to control this optimization behavior precisely?
> memcpy and __builtin_memcpy are the same thing, ditto for other builtins
> that have a library counterpart. The differences between them are:
> 1) __builtin_* doesn't need to be prototyped
> 2) __builtin_* works even with -fno-builtin-memcpy or -fno-builtin
> Otherwise, if memcpy acts as a builtin, they do the same thing.
> You can control how memcpy is expanded through various command line
> switches, -Os affects it, on x86 e.g. -mstringop-strategy=,
> -mmemcpy-strategy=, -mtune= etc., on various other architectures other
Thanks very much! I tried -mmemcpy-strategy=byte_loop, and it seems to
have worked! The .o that used to emit a call to the self-defined memcpy()
now has no memcpy entry in the nm output!
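
For anyone who wants to reproduce the behavior outside the kernel tree, here is a minimal sketch. The struct and function names are made up for illustration and do not come from arch/x86/boot/compressed. It assumes that a small copy of constant size is the kind GCC tends to expand inline, while a copy whose length is only known at run time is the kind that may become a call to memcpy; which one you actually get depends on the target, the GCC version and the flags.

```c
#include <stddef.h>

/* Hypothetical struct, for illustration only (not from the kernel). */
struct params_sketch {
    unsigned long a;
    unsigned long b;
};

/*
 * Small size known at compile time: at -O2 GCC will typically expand
 * this inline, so no memcpy symbol is needed for it.
 */
void copy_fixed(struct params_sketch *dst, const struct params_sketch *src)
{
    __builtin_memcpy(dst, src, sizeof(*dst));
}

/*
 * Length known only at run time: depending on -mtune=, -Os and the
 * string-operation strategy options, GCC may emit a call to memcpy
 * here instead of inline code.
 */
void copy_variable(void *dst, const void *src, size_t len)
{
    __builtin_memcpy(dst, src, len);
}
```

Compiling this with something like `gcc -O2 -c sketch.c` and then running `nm sketch.o` shows whether an undefined `memcpy` symbol was emitted. Rebuilding with one of the switches mentioned above (for instance `-mstringop-strategy=byte_loop` on x86) and comparing the nm output is a quick way to see the effect.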