Same problem with -Os/-O2, version 4.1.0, etc.

[Code]
typedef unsigned * ptr_t;

void f (void)
{
  ptr_t p = (ptr_t)0xFED0;
  p[0] = 0xDEAD;
  p[2] = 0xDEAD;
  p[4] = 0xDEAD;
  p[6] = 0xDEAD;
}

[Assembly generated by version gcc-4.3-20071005]
00000000 <f>:
   0:   3404dead    li    a0,0xdead
   4:   3402fee8    li    v0,0xfee8
   8:   3403fed0    li    v1,0xfed0
   c:   ac440000    sw    a0,0(v0)
  10:   ac640000    sw    a0,0(v1)
  14:   3402fed8    li    v0,0xfed8
  18:   3403fee0    li    v1,0xfee0
  1c:   ac440000    sw    a0,0(v0)
  20:   03e00008    jr    ra
  24:   ac640000    sw    a0,0(v1)

[Assembly generated by version 3.4.5 (seems better)]
00000000 <f>:
   0:   3403fed0    li    v1,0xfed0
   4:   3402dead    li    v0,0xdead
   8:   ac620018    sw    v0,24(v1)
   c:   ac620000    sw    v0,0(v1)
  10:   ac620008    sw    v0,8(v1)
  14:   03e00008    jr    ra
  18:   ac620010    sw    v0,16(v1)
  1c:   00000000    nop

[Version]
Using built-in specs.
Target: mips-elf
Configured with: ../gcc-4.3-20071005/configure --enable-languages=c,c++ --prefix=/auto/mipaproj/fshvaige/apps/Linux/gcc-4.3-20071005 --target=mips-elf --program-suffix=.mips --without-headers --with-newlib
Thread model: single
gcc version 4.3.0 20071005 (experimental) (GCC)

[Command line options]
gcc.mips -c -o main.o -v -save-temps -O3 -march=mips64 -mabi=eabi -mexplicit-relocs main.c
The issue here is that we treat constants as free in the first place and are then unable to decompose them later on, at the RTL level.
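To make the missed opportunity concrete, here is a minimal sketch in C. It is not GCC code: the helper names fits_disp16 and bases_needed are hypothetical, and the signed 16-bit displacement range is an assumption modelled on the MIPS sw instruction. It counts how many base registers a greedy scheme needs for a list of store addresses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if `addr` can be reached from `base` with a
   signed 16-bit displacement, as in a MIPS `sw reg, disp(base)`. */
static bool fits_disp16(uint32_t base, uint32_t addr)
{
    int64_t disp = (int64_t)addr - (int64_t)base;
    return disp >= -32768 && disp <= 32767;
}

/* Hypothetical helper: count the base-register loads needed when a new
   base is started only if the current base cannot reach the address. */
static int bases_needed(const uint32_t *addrs, int n)
{
    assert(n > 0);
    int count = 1;
    uint32_t base = addrs[0];
    for (int i = 1; i < n; i++) {
        if (!fits_disp16(base, addrs[i])) {
            base = addrs[i];   /* start a new base at this address */
            count++;
        }
    }
    return count;
}
```

For the four addresses in the test case (0xFED0, 0xFED8, 0xFEE0, 0xFEE8), bases_needed returns 1, which matches the single li of 0xfed0 in the 3.4.5 output. The 4.3 output loads every address as a separate constant.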
This is related to some work done in the past for auto-increment addressing modes (even though there are no auto-inc/dec modes in the reporter's assembly). See one of Joern's old patches:

http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01612.html

Look at the comment before optimize_related_value() to understand what this patch is supposed to achieve. Let's not talk about how it achieved this (it suffices to say that the patch is not in the trunk), but we really do need a pass over RTL to optimize this kind of thing.
Closing 4.1 branch.
(In reply to comment #2)
> This is related to some work done in the past for auto-increment addressing
> modes

Actually, the problem with constants that are loaded into registers (and in the same basic block, at that) is much simpler. If the target's rtx_cost works properly, then reload_cse_move2add should fix up this code.

We need, however, some way to deal with the case where constants are expensive addresses; this is completely broken at the moment. Complete unrolling of loops accessing static arrays can create oodles of constant addresses. I've managed to split these up with LEGITIMIZE_ADDRESS, the movsi expander, and a patch to memory_address; however, gcse just recombines the costly constants, irrespective of what rtx_cost and address_cost say. And the havoc that gcse can wreak transcends basic blocks, so any attempt to clean up after it with lesser scope is bound to be inferior.
Closing 4.2 branch.
Subject: Bug 33699

Author: nemet
Date: Thu May 28 07:42:52 2009
New Revision: 147944

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=147944
Log:
        PR middle-end/33699
        * target.h (struct gcc_target): Fix indentation.  Add const_anchor.
        * target-def.h (TARGET_CONST_ANCHOR): New macro.
        (TARGET_INITIALIZER): Use it.
        * cse.c (CHEAPER): Move it up to the other macros.
        (insert): Rename this ...
        (insert_with_costs): ... to this.  Add cost parameters.  Update
        function comment.
        (insert): New function.  Call insert_with_costs.
        (compute_const_anchors, insert_const_anchor, insert_const_anchors,
        find_reg_offset_for_const, try_const_anchors): New functions.
        (cse_insn): Call try_const_anchors.  Adjust cost of src_related
        when using a const-anchor.  Call insert_const_anchors.
        * config/mips/mips.c (mips_set_mips16_mode): Set
        targetm.const_anchor.
        * doc/tm.texi (Misc): Document TARGET_CONST_ANCHOR.

testsuite/
        * gcc.target/mips/const-anchor-1.c: New test.
        * gcc.target/mips/const-anchor-2.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/mips/const-anchor-1.c
    trunk/gcc/testsuite/gcc.target/mips/const-anchor-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/mips/mips.c
    trunk/gcc/cse.c
    trunk/gcc/doc/tm.texi
    trunk/gcc/target-def.h
    trunk/gcc/target.h
    trunk/gcc/testsuite/ChangeLog
Note that the above patch does not yet fix the testcase. Besides this patch we need some more cost adjustments and also some changes in fwprop to propagate into the address expression.
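For readers unfamiliar with the const-anchor idea, the following is an illustrative sketch in C of how a constant can be split into an anchor-aligned base plus a small offset. It is not the code from the commit: the struct anchor_split type and the split_const function are hypothetical names, and the anchor value 0x8000 used in the note after the block is an assumption chosen to match a signed 16-bit immediate.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: the two anchors surrounding a constant and
   the offset of the constant from each of them. */
struct anchor_split {
    int64_t lower_base, lower_offs;
    int64_t upper_base, upper_offs;
};

/* Split `n` around multiples of `anchor`, which must be a power of two.
   Returns false when `n` is itself a multiple of `anchor`, in which case
   there is nothing to split and only lower_base has been written. */
static bool split_const(int64_t n, int64_t anchor, struct anchor_split *out)
{
    assert(anchor > 0 && (anchor & (anchor - 1)) == 0);

    out->lower_base = n & ~(anchor - 1);
    if (out->lower_base == n)
        return false;

    out->upper_base = (n + (anchor - 1)) & ~(anchor - 1);
    out->lower_offs = n - out->lower_base;   /* non-negative */
    out->upper_offs = n - out->upper_base;   /* negative */
    return true;
}
```

With anchor 0x8000, the constant 0xFED8 splits into lower base 0x8000 with offset 0x7ED8 and upper base 0x10000 with offset -296. If a register already holds one of the bases, the constant can be formed with a single add instead of a fresh load.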
GCC 4.3.4 is being released, adjusting target milestone.
PowerPC has the same issue. x86 does not, because its move instruction can take a constant address.
GCC 4.3.5 is being released, adjusting target milestone.
Even on x86 it's smaller to not replicate 0xDEAD and to use offset addressing, like

0000000000000000 <f>:
   0:   b8 d0 fe 00 00          mov    $0xfed0,%eax
   5:   bb ad de 00 00          mov    $0xdead,%ebx
   a:   89 18                   mov    %ebx,(%rax)
   c:   89 58 08                mov    %ebx,0x8(%rax)
   f:   89 58 10                mov    %ebx,0x10(%rax)
  12:   89 58 18                mov    %ebx,0x18(%rax)
  15:   c3                      retq

instead of the generated

   0:   c7 04 25 d0 fe 00 00    movl   $0xdead,0xfed0
   7:   ad de 00 00
   b:   c7 04 25 d8 fe 00 00    movl   $0xdead,0xfed8
  12:   ad de 00 00
  16:   c7 04 25 e0 fe 00 00    movl   $0xdead,0xfee0
  1d:   ad de 00 00
  21:   c7 04 25 e8 fe 00 00    movl   $0xdead,0xfee8
  28:   ad de 00 00
  2c:   c3                      retq
Smaller perhaps, but it uses two registers, where it originally used none. For x86 that's the better tradeoff.
4.3 branch is being closed, moving to 4.4.7 target.
4.4 branch is being closed, moving to 4.5.4 target.
The 4.5 branch is being closed, adjusting target milestone.
(In reply to comment #12)
> Smaller perhaps, but it uses two registers, where it originally used none.
> For x86 that's the better tradeoff.

Except for the obvious -Os.
GCC 4.6.4 has been released and the branch has been closed.
The 4.7 branch is being closed, moving target milestone to 4.8.4.
GCC 4.8.4 has been released.
The gcc-4_8-branch is being closed, re-targeting regressions to 4.9.3.
GCC 4.9.3 has been released.
GCC 4.9 branch is being closed
No progress in GCC 7.0 which emits the following code for powerpc64le at -O2 (-Os is slightly different but the same size):

0000000000000000 <f>:
   0:   00 00 20 39     li      r9,0
   4:   00 00 c0 38     li      r6,0
   8:   00 00 e0 38     li      r7,0
   c:   00 00 00 39     li      r8,0
  10:   00 00 40 39     li      r10,0
  14:   ad de 29 61     ori     r9,r9,57005
  18:   d0 fe c6 60     ori     r6,r6,65232
  1c:   d8 fe e7 60     ori     r7,r7,65240
  20:   e0 fe 08 61     ori     r8,r8,65248
  24:   e8 fe 4a 61     ori     r10,r10,65256
  28:   00 00 26 91     stw     r9,0(r6)
  2c:   00 00 27 91     stw     r9,0(r7)
  30:   00 00 28 91     stw     r9,0(r8)
  34:   00 00 2a 91     stw     r9,0(r10)
  38:   20 00 80 4e     blr

Clang in contrast emits the following more compact code:

0000000000000000 <f>:
   0:   00 00 60 3c     lis     r3,0
   4:   00 00 80 38     li      r4,0
   8:   01 00 a0 3c     lis     r5,1
   c:   ad de 63 60     ori     r3,r3,57005
  10:   d0 fe 84 60     ori     r4,r4,65232
  14:   d0 fe 65 90     stw     r3,-304(r5)
  18:   08 00 64 90     stw     r3,8(r4)
  1c:   10 00 64 90     stw     r3,16(r4)
  20:   18 00 64 90     stw     r3,24(r4)
  24:   20 00 80 4e     blr
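The displacements in the Clang listing can be checked with a few lines of C. This is only a sketch of the address arithmetic; the helper name effective_address is hypothetical, and it models a base register plus a sign-extended 16-bit displacement.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: effective address of a store of the form
   `stw rS, disp(rA)`, i.e. base register plus signed 16-bit displacement. */
static uint64_t effective_address(uint64_t base, int16_t disp)
{
    /* Unsigned arithmetic wraps, so a negative disp subtracts correctly. */
    return base + (uint64_t)(int64_t)disp;
}
```

In the Clang output, r5 holds 0x10000 (lis r5,1) and r4 holds 0xFED0 (65232). So effective_address(0x10000, -304) gives 0xFED0, and displacements 8, 16 and 24 from r4 give 0xFED8, 0xFEE0 and 0xFEE8. All four stores therefore hit the same addresses as the GCC output, using two address registers instead of four.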
GCC 5 branch is being closed
(In reply to Martin Sebor from comment #23)
> No progress in GCC 7.0 which emits the following code for powerpc64le at -O2
> (-Os is slightly different but the same size):

Same thing on mainline still.
(In reply to Michael Matz from comment #12)
> Smaller perhaps, but it uses two registers, where it originally used none.
> For x86 that's the better tradeoff.

That can be handled by doing it in some very late post-RA pass, and only doing it if we can find a usable register for that.
GCC 6 branch is being closed
The GCC 7 branch is being closed, re-targeting to GCC 8.4.
GCC 8.4.0 has been released, adjusting target milestone.
GCC 8 branch is being closed.
GCC 9.4 is being released, retargeting bugs to GCC 9.5.
I looked at adding the following powerpc patch that was proposed in March, 2021:

https://gcc.gnu.org/pipermail/gcc-patches/2021-March/566744.html

There are two parts to the patch, which are more or less unrelated.

The first part is to add minimum and maximum section anchor offset values and use -fsection-anchors. I ran a spec 2017 benchmark on a pre-production power10 system, comparing my normal run times to run times with -fsection-anchors and setting the minimum/maximum section anchor offsets. Two benchmarks improved and two benchmarks regressed:

    xalancbmk_r:   1.75% regression
    cactuBSSN_r:   4.24% improvement
    blender_r:     1.92% regression
    roms_r:        1.05% improvement

I then built spec 2017 with just the part that sets const_anchor, but not the section anchor minimum/maximum offsets. Eight benchmarks did not build due to assertion failures in cse.c:

    gcc_r
    exchange2_r
    cactuBSSN_r
    wrf_r
    blender_r
    cam4_r
    fotonik3d_r
    roms_r

If I specify the section anchor minimum/maximum offsets, add -fsection-anchors, and set the const_anchor, all 23 INT+FP benchmarks build, but wrf_r does not run correctly.

So without more debugging, I don't recommend setting const_anchor. It is probably useful to set the minimum/maximum section anchor offsets in case people use -fsection-anchors.

As an aside, if we wanted to accept using constant addresses in the PowerPC, we would need to recognize a constant address as being legitimate. This may be useful in some embedded environments where you have devices at certain memory locations. But somebody would need to add the support.
GCC 9 branch is being closed
GCC 10.4 is being released, retargeting bugs to GCC 10.5.
GCC 10 branch is being closed.
GCC 11 branch is being closed.