Bug 31241 - Post Increment opportunity missed
Status: NEW
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.2.0
Importance: P3 enhancement
Target Milestone: ---
Assigned To: Not yet assigned to anyone
Keywords: missed-optimization
Depends on:
Blocks:
Reported: 2007-03-17 14:10 UTC by Pranav Bhandarkar
Modified: 2010-03-22 21:16 UTC
CC List: 5 users

See Also:
Host:
Target: arm-none-eabi
Build:
Known to work:
Known to fail:
Last reconfirmed: 2009-04-30 10:30:18


Attachments
Source that exposes the mentioned deficiency in the compiler (188 bytes, application/octet-stream)
2007-03-17 14:13 UTC, Pranav Bhandarkar

Description Pranav Bhandarkar 2007-03-17 14:10:36 UTC
A simple loop that ORs a 'value' into all the elements of an array should generate post-increments while loading/storing values from/into the array. The code looks something like this:
  
for (i = 0; i < 10; i++) {
    *(intArray++) |= value;
}

However, post-increments are not generated at -O3, which causes the tree optimizer to unroll the loop.
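
For readers without the attachment, a minimal self-contained reconstruction of the test case might look like the sketch below. The function name ShouldUsePostModify is taken from the assembly output in this report; the signature and parameter types are assumptions (inferred from r0 holding the pointer and r1 holding the value), since the attached enhance.c is not reproduced here.

```c
/* Hypothetical reconstruction of the test case. The signature is assumed:
   the assembly suggests a pointer in r0 and the value in r1. */
void ShouldUsePostModify(int *intArray, int value)
{
    int i;

    /* Each iteration ORs value into one element and then advances the
       pointer, which maps naturally onto a post-indexed access on ARM. */
    for (i = 0; i < 10; i++) {
        *(intArray++) |= value;
    }
}
```

Compiling a file like this with the command line shown in this report should be enough to inspect how the loads and stores are addressed, although the exact output will depend on the compiler revision.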

Here is the information of the toolchain and the code produced.

$>arm-none-eabi-gcc -v -O3 -S enhance.c --save-temps -o-
Using built-in specs.
Target: arm-none-eabi
Configured with: /mnt/tools/fsf/build/combined-arm-none-eabi-gcc-4.2-branch-2007-03-16/configure --target=arm-none-eabi --prefix=/mnt/tools/fsf/install/arm-none-eabi-gcc-4.2-branch-2007-03-16 --enable-languages=c,c++ --disable-nls --with-newlib --disable-gdbtk --disable-libssp
Thread model: single
gcc version 4.2.0 20070315 (prerelease)
 /mnt/tools/fsf/install/arm-none-eabi-gcc-4.2-branch-2007-03-16/libexec/gcc/arm-none-eabi/4.2.0/cc1 -E -quiet -v -D__USES_INITFINI__ enhance.c -O3 -fpch-preprocess -o enhance.i
ignoring nonexistent directory "/mnt/tools/fsf/install/arm-none-eabi-gcc-4.2-branch-2007-03-16/lib/gcc/arm-none-eabi/4.2.0/../../../../arm-none-eabi/sys-include"
ignoring nonexistent directory "/mnt/tools/fsf/install/arm-none-eabi-gcc-4.2-branch-2007-03-16/lib/gcc/arm-none-eabi/4.2.0/../../../../arm-none-eabi/include"
#include "..." search starts here:
#include <...> search starts here:
 /mnt/tools/fsf/install/arm-none-eabi-gcc-4.2-branch-2007-03-16/lib/gcc/arm-none-eabi/4.2.0/include
End of search list.
 /mnt/tools/fsf/install/arm-none-eabi-gcc-4.2-branch-2007-03-16/libexec/gcc/arm-none-eabi/4.2.0/cc1 -fpreprocessed enhance.i -quiet -dumpbase enhance.c -auxbase-strip - -O3 -version -o-
GNU C version 4.2.0 20070315 (prerelease) (arm-none-eabi)
        compiled by GNU C version 4.0.3 (Ubuntu 4.0.3-1ubuntu5).
GGC heuristics: --param ggc-min-expand=97 --param ggc-min-heapsize=127206
Compiler executable checksum: 31464fade10aeea055a352aa873c9729
        .file   "enhance.c"
        .text
        .align  2
        .global ShouldUsePostModify
        .type   ShouldUsePostModify, %function
ShouldUsePostModify:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldr     ip, [r0, #0]
        mov     r3, r0
        orr     ip, ip, r1
        str     ip, [r3], #4
        ldr     r2, [r0, #4]
        orr     r2, r2, r1
        str     r2, [r0, #4]
        ldr     r0, [r3, #4]
        orr     r0, r0, r1
        str     r0, [r3, #4]
        add     r3, r3, #4
        ldr     r2, [r3, #4]
        orr     r2, r2, r1
        str     r2, [r3, #4]
        add     r3, r3, #4
        ldr     r2, [r3, #4]
        orr     r2, r2, r1
        str     r2, [r3, #4]
        add     r3, r3, #4
        ldr     r2, [r3, #4]
        orr     r2, r2, r1
        str     r2, [r3, #4]
        add     r3, r3, #4
        ldr     r2, [r3, #4]
        orr     r2, r2, r1
        str     r2, [r3, #4]
        add     r3, r3, #4
        ldr     r2, [r3, #4]
        orr     r2, r2, r1
        str     r2, [r3, #4]
        add     r3, r3, #4
        ldr     r2, [r3, #4]
        orr     r2, r2, r1
        str     r2, [r3, #4]
        add     r3, r3, #4
        ldr     r2, [r3, #4]
        orr     r2, r2, r1
        @ lr needed for prologue
        str     r2, [r3, #4]
        bx      lr
        .size   ShouldUsePostModify, .-ShouldUsePostModify
        .ident  "GCC: (GNU) 4.2.0 20070315 (prerelease)"

However, this problem vanishes when I use -fno-tree-lrs. This is because each copy of intArray created by the unroller then gets combined, so the load/store can be combined with the increment of intArray. However, I don't think that using -fno-tree-lrs is the way to go.
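
As a source-level analogy only (not actual compiler dumps), the difference can be pictured as follows. The function names and the unroll factor of 4 are hypothetical and chosen for brevity. In the first form every unrolled copy of the pointer has its own name, so each access looks like an independent address; in the second a single pointer is updated in place, so each access can absorb the increment.

```c
/* Source-level analogy only; the names and the unroll factor of 4 are
   hypothetical and do not come from the compiler's own dumps. */

/* Every unrolled copy of the pointer has its own name, so each access
   is addressed separately and no increment is left to fold in. */
void unrolled_split(int *p0, int value)
{
    int *p1 = p0 + 1;
    int *p2 = p1 + 1;
    int *p3 = p2 + 1;

    *p0 |= value;
    *p1 |= value;
    *p2 |= value;
    *p3 |= value;
}

/* A single pointer is updated in place, so each read-modify-write is
   directly followed by the increment it could be combined with. */
void unrolled_single(int *p, int value)
{
    *p++ |= value;
    *p++ |= value;
    *p++ |= value;
    *p++ |= value;
}
```

Both functions compute the same result; they differ only in how the address of each element is expressed.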
Comment 1 Pranav Bhandarkar 2007-03-17 14:13:07 UTC
Created attachment 13218
Source that exposes the mentioned deficiency in the compiler
Comment 2 Ramana Radhakrishnan 2007-03-19 02:52:50 UTC
A similar problem also exists on the dataflow branch. Adding Kenneth Zadeck to the CC.

However, -fno-tree-lrs has no impact on the df branch as of revision 122955.
Comment 3 Hans-Peter Nilsson 2008-05-23 23:05:09 UTC
This could be a duplicate of PR20211.
Comment 4 Ramana Radhakrishnan 2009-04-30 10:30:18 UTC
Not sure if this is related to PR31849 as well.
Comment 5 Ramana Radhakrishnan 2009-05-30 09:18:15 UTC
This is improved by http://gcc.gnu.org/ml/gcc-patches/2009-05/msg01622.html. With the patch we get the following code generated. 

	.cpu cortex-a8
	.eabi_attribute 27, 3
	.fpu neon
	.eabi_attribute 20, 1
	.eabi_attribute 21, 1
	.eabi_attribute 23, 3
	.eabi_attribute 24, 1
	.eabi_attribute 25, 1
	.eabi_attribute 26, 1
	.eabi_attribute 30, 2
	.eabi_attribute 18, 4
	.file	"31241.c"
	.text
	.align	2
	.global	foo
	.type	foo, %function
foo:
	@ args = 0, pretend = 0, frame = 0
	@ frame_needed = 0, uses_anonymous_args = 0
	@ link register save eliminated.
	add	r2, r0, #40
.L2:
	ldr	r3, [r0, #0]
	orr	r3, r1, r3
	str	r3, [r0], #4
	cmp	r0, r2
	bne	.L2
	bx	lr
	.size	foo, .-foo
	.ident	"GCC: (GNU) 4.5.0 20090527 (experimental)"
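
Read back into C, the loop above corresponds roughly to the sketch below. The name foo comes from the listing; the signature is an assumption, and the comments map each statement to the instructions it appears to match, assuming a 4-byte int.

```c
/* Rough C equivalent of the assembly above. The name foo is from the
   listing; the signature is assumed. */
void foo(int *p, int value)
{
    int *end = p + 10;      /* add r2, r0, #40 (10 elements of 4 bytes) */

    do {
        *p = value | *p;    /* ldr, orr, and the store part of str */
        p++;                /* post-increment folded into str r3, [r0], #4 */
    } while (p != end);     /* cmp r0, r2 ; bne .L2 */
}
```

The loop is kept rolled here, with the pointer itself serving as the induction variable, which is what lets the store carry the increment.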
Comment 6 Ramana Radhakrishnan 2010-03-22 21:16:04 UTC
Unrolling the loop still doesn't show better use of auto-incs.