Bug 33699 - [4.8/4.9/5/6 regression] missing optimization on const addr area store
Status: NEW
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 4.3.0
Importance: P2 normal
Target Milestone: 4.8.5
Assigned To: Not yet assigned to anyone
Keywords: missed-optimization
Depends on:
Blocks:
Reported: 2007-10-08 16:54 UTC by fshvaige
Modified: 2014-12-19 13:36 UTC

See Also:
Host:
Target: mips*-* powerpc*-*-* x86_64-*-*
Build:
Known to work: 3.4.0
Known to fail: 4.0.0, 4.1.3, 4.2.2, 4.3.0, 4.6.0
Last reconfirmed: 2012-01-04 00:00:00


Description fshvaige 2007-10-08 16:54:26 UTC
The same problem occurs with -Os/-O2, with version 4.1.0, etc.


[Code]

typedef unsigned * ptr_t;
void f (void) {
    ptr_t p = (ptr_t)0xFED0;
    p[0] = 0xDEAD;
    p[2] = 0xDEAD;
    p[4] = 0xDEAD;
    p[6] = 0xDEAD;
}


[Assembly generated by version gcc-4.3-20071005]

00000000 <f>:
   0:	3404dead 	li	a0,0xdead
   4:	3402fee8 	li	v0,0xfee8
   8:	3403fed0 	li	v1,0xfed0
   c:	ac440000 	sw	a0,0(v0)
  10:	ac640000 	sw	a0,0(v1)
  14:	3402fed8 	li	v0,0xfed8
  18:	3403fee0 	li	v1,0xfee0
  1c:	ac440000 	sw	a0,0(v0)
  20:	03e00008 	jr	ra
  24:	ac640000 	sw	a0,0(v1)


[Assembly generated by version 3.4.5 (seems better)]

00000000 <f>:
   0:	3403fed0 	li	v1,0xfed0
   4:	3402dead 	li	v0,0xdead
   8:	ac620018 	sw	v0,24(v1)
   c:	ac620000 	sw	v0,0(v1)
  10:	ac620008 	sw	v0,8(v1)
  14:	03e00008 	jr	ra
  18:	ac620010 	sw	v0,16(v1)
  1c:	00000000 	nop
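
For illustration only: the 3.4.5 sequence keeps a single base address in a register and reaches each store through a small displacement. The sketch below does just that address arithmetic for the test case. `BASE_ADDR`, `store_offset` and `fits_simm16` are hypothetical names introduced here, and a 4-byte `unsigned` is assumed. The signed 16-bit range is the displacement width of the MIPS `sw` instruction.

```c
#include <stdbool.h>
#include <stdint.h>

/* Base address from the test case: (ptr_t)0xFED0. */
#define BASE_ADDR 0xFED0

/* Hypothetical helper: byte offset of p[index] from BASE_ADDR,
   assuming a 4-byte unsigned. */
int64_t store_offset(unsigned index)
{
    return (int64_t)index * (int64_t)sizeof(uint32_t);
}

/* Hypothetical helper: true if off fits a signed 16-bit displacement. */
bool fits_simm16(int64_t off)
{
    return off >= -32768 && off <= 32767;
}
```

With these helpers, indices 0, 2, 4 and 6 give offsets 0, 8, 16 and 24, which match the displacements in the 3.4.5 listing and all fit the displacement field, so one `li` of the base is enough.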


[Version]

Using built-in specs.
Target: mips-elf
Configured with: ../gcc-4.3-20071005/configure --enable-languages=c,c++ --prefix=/auto/mipaproj/fshvaige/apps/Linux/gcc-4.3-20071005 --target=mips-elf --program-suffix=.mips --without-headers --with-newlib
Thread model: single
gcc version 4.3.0 20071005 (experimental) (GCC) 


[Command line options]

gcc.mips -c -o main.o -v -save-temps -O3 -march=mips64 -mabi=eabi -mexplicit-relocs main.c
Comment 1 Andrew Pinski 2007-12-26 01:33:50 UTC
The issue here is that we treat constants as free in the first place and are then unable to decompose them later on, at the RTL level.
Comment 2 Steven Bosscher 2008-01-07 18:24:00 UTC
This is related to some work done in the past for auto-increment addressing modes (even though there are no auto-inc/dec modes in the reporter's assembly).  See one of Joern's old patches: http://gcc.gnu.org/ml/gcc-patches/2005-02/msg01612.html

Look at the comment before optimize_related_value() to understand what this patch is supposed to achieve.  Let's not talk about how it achieved this -- it suffices to say that the patch is not in the trunk -- but we really do need a pass over RTL to optimize this kind of thing.
Comment 3 Joseph S. Myers 2008-07-04 22:18:43 UTC
Closing 4.1 branch.
Comment 4 Jorn Wolfgang Rennecke 2009-02-08 12:49:33 UTC
(In reply to comment #2)
> This is related to some work done in the past for auto-increment addressing
> modes

Actually, the problem with constants that are loaded into registers -
and in the same basic block, at that - is much simpler.
If the target's rtx_cost works properly, then reload_cse_move2add should
fix up this code.

We need, however, some way to deal with the case where constants are expensive
addresses; this is completely broken at the moment.  Complete unrolling of
loops accessing static arrays can create oodles of constant addresses; I've
managed to split these up with LEGITIMIZE_ADDRESS, the movsi expander, and
a patch to memory_address; however, gcse just recombines the costly constants,
irrespective of what rtx_cost and address_cost say.
And the havoc that gcse can wreak transcends basic blocks, so any attempt to
clean up after it with lesser scope is bound to be inferior.
Comment 5 Joseph S. Myers 2009-03-31 20:12:29 UTC
Closing 4.2 branch.
Comment 6 Adam Nemet 2009-05-28 07:43:13 UTC
Subject: Bug 33699

Author: nemet
Date: Thu May 28 07:42:52 2009
New Revision: 147944

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=147944
Log:
	PR middle-end/33699
	* target.h (struct gcc_target): Fix indentation.  Add
	const_anchor.
	* target-def.h (TARGET_CONST_ANCHOR): New macro.
	(TARGET_INITIALIZER): Use it.
	* cse.c (CHEAPER): Move it up to the other macros.
	(insert): Rename this ...
	(insert_with_costs): ... to this.  Add cost parameters.  Update
	function comment.
	(insert): New function.  Call insert_with_costs.
	(compute_const_anchors, insert_const_anchor, insert_const_anchors,
	find_reg_offset_for_const, try_const_anchors): New functions.
	(cse_insn): Call try_const_anchors.  Adjust cost of src_related
	when using a const-anchor.  Call insert_const_anchors.
	* config/mips/mips.c (mips_set_mips16_mode): Set
	targetm.const_anchor.
	* doc/tm.texi (Misc): Document TARGET_CONST_ANCHOR.

testsuite/
	* gcc.target/mips/const-anchor-1.c: New test.
	* gcc.target/mips/const-anchor-2.c: New test.

Added:
    trunk/gcc/testsuite/gcc.target/mips/const-anchor-1.c
    trunk/gcc/testsuite/gcc.target/mips/const-anchor-2.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/mips/mips.c
    trunk/gcc/cse.c
    trunk/gcc/doc/tm.texi
    trunk/gcc/target-def.h
    trunk/gcc/target.h
    trunk/gcc/testsuite/ChangeLog
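
The patch itself is not reproduced in this report. As a rough sketch of the idea behind a constant anchor (an assumption about the general approach, not the code committed to cse.c): a constant is split into a nearby aligned base plus a small offset, so that a later constant can be formed from a register that already holds the base. `ANCHOR`, `struct anchor_split` and `split_const` are hypothetical names, and the anchor spacing of 0x8000 is assumed for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define ANCHOR 0x8000  /* assumed anchor spacing; must be a power of two */

struct anchor_split {
    int64_t lower_base, lower_offs;  /* n == lower_base + lower_offs */
    int64_t upper_base, upper_offs;  /* n == upper_base + upper_offs */
};

/* Split n around the two nearest multiples of ANCHOR.
   Returns false when n is itself a multiple of ANCHOR, in which
   case no split is needed (only lower_base is written). */
bool split_const(int64_t n, struct anchor_split *out)
{
    const int64_t mask = ~(int64_t)(ANCHOR - 1);

    out->lower_base = n & mask;
    if (out->lower_base == n)
        return false;

    out->upper_base = (n + (ANCHOR - 1)) & mask;
    out->lower_offs = n - out->lower_base;
    out->upper_offs = n - out->upper_base;
    return true;
}
```

Under these assumptions, 0xFED0 splits into base 0x8000 with offset 0x7ED0, or base 0x10000 with offset -0x130. Either offset fits a signed 16-bit immediate.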

Comment 7 Adam Nemet 2009-05-28 07:49:14 UTC
Note that the above patch does not yet fix the testcase.  Besides this patch we need some more cost adjustments and also some changes in fwprop to propagate into the address expression.
Comment 8 Richard Biener 2009-08-04 12:28:24 UTC
GCC 4.3.4 is being released, adjusting target milestone.
Comment 9 Andrew Pinski 2010-03-12 23:56:39 UTC
PowerPC has the same issue.  X86 does not because its move instruction can take a constant address.
Comment 10 Richard Biener 2010-05-22 18:11:42 UTC
GCC 4.3.5 is being released, adjusting target milestone.
Comment 11 Richard Biener 2011-03-04 12:12:27 UTC
Even on x86 it's smaller to not replicate 0xDEAD and to use
offset addressing, like

0000000000000000 <f>:
   0: b8 d0 fe 00 00            mov    $0xfed0,%eax
   5: bb ad de 00 00            mov    $0xdead,%ebx
   a: 89 18                     mov    %ebx,(%rax)
   c: 89 58 08                  mov    %ebx,0x8(%rax)
   f: 89 58 10                  mov    %ebx,0x10(%rax)
  12: 89 58 18                  mov    %ebx,0x18(%rax)
  15: c3                        retq   

instead of the generated

   0: c7 04 25 d0 fe 00 00      movl   $0xdead,0xfed0
   7: ad de 00 00 
   b: c7 04 25 d8 fe 00 00      movl   $0xdead,0xfed8
  12: ad de 00 00 
  16: c7 04 25 e0 fe 00 00      movl   $0xdead,0xfee0
  1d: ad de 00 00 
  21: c7 04 25 e8 fe 00 00      movl   $0xdead,0xfee8
  28: ad de 00 00 
  2c: c3                        retq
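
As an illustrative aside, the size difference can be checked by summing the encoded instruction lengths shown in the two listings above. The byte counts are read off the hex dumps in this comment. `offset_form`, `absolute_form` and `total_bytes` are hypothetical names.

```c
#include <stddef.h>

/* Instruction lengths in bytes, read off the two listings above. */
const unsigned offset_form[]   = { 5, 5, 2, 3, 3, 3, 1 };  /* base + offset stores */
const unsigned absolute_form[] = { 11, 11, 11, 11, 1 };    /* absolute-address stores */

/* Hypothetical helper: sum n instruction lengths. */
unsigned total_bytes(const unsigned *len, size_t n)
{
    unsigned sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += len[i];
    return sum;
}
```

Summing the two arrays gives 22 bytes for the offset-addressing form against 45 bytes for the generated code, consistent with the final offsets 0x15 and 0x2c of the one-byte `retq` in each listing.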
Comment 12 Michael Matz 2011-03-04 15:33:04 UTC
Smaller perhaps, but it uses two registers, where it originally used none.
For x86 that's the better tradeoff.
Comment 13 Richard Biener 2011-06-27 12:14:26 UTC
4.3 branch is being closed, moving to 4.4.7 target.
Comment 14 Jakub Jelinek 2012-03-13 12:47:52 UTC
4.4 branch is being closed, moving to 4.5.4 target.
Comment 15 Richard Biener 2012-07-02 11:48:47 UTC
The 4.5 branch is being closed, adjusting target milestone.
Comment 16 Andrew Pinski 2012-12-31 10:09:07 UTC
(In reply to comment #12)
> Smaller perhaps, but it uses two registers, where it originally used none.
> For x86 that's the better tradeoff.

Except for the obvious -Os.
Comment 17 Jakub Jelinek 2013-04-12 15:17:00 UTC
GCC 4.6.4 has been released and the branch has been closed.
Comment 18 Richard Biener 2014-06-12 13:46:53 UTC
The 4.7 branch is being closed, moving target milestone to 4.8.4.
Comment 19 Jakub Jelinek 2014-12-19 13:36:14 UTC
GCC 4.8.4 has been released.