Bug 9703 - [arm] Accessing data through constant pool more times could be solved in less instructions
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 3.3
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization, patch
Depends on:
Blocks:
 
Reported: 2003-02-14 09:46 UTC by lac
Modified: 2009-03-13 09:39 UTC
CC List: 1 user

See Also:
Host:
Target: arm-*-elf
Build:
Known to work:
Known to fail:
Last reconfirmed: 2006-02-02 17:14:49


Attachments
data-acc-t-cp.tar.gz (730 bytes, application/x-gzip )
2003-05-21 15:17 UTC, lac

Description lac 2003-02-14 09:46:01 UTC
Each data access through the constant pool requires two instructions: (1) one to load the constant pool element (the data address) and (2) one to access the data itself. Keeping the constant pool base address in a register could save one instruction for each subsequent data access. (See attached files.)

Release:
gcc version 3.3 20030210 (prerelease)

Environment:
BUILD & HOST: Linux 2.4.20 i686 unknown
TARGET: arm-unknown-elf

How-To-Repeat:
Use the command line below to compile the specified source code:
arm-elf-gcc -S -g0 -Os -o 02.s ./02.c

Attached is the original assembly output. As the manually edited assembly file shows, the access to globvar1 can be done with one instruction fewer.

// 01.c:

# 1 "01.c"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "01.c"
unsigned int globvar1=0;
unsigned int globvar2=0;
int f1()
{
    globvar1=11;
    globvar2=12;
    return globvar1;
}

int main ()
{
  return f1();
}
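
The saving asked for above can be imitated at the source level. The following is only a rough sketch of the idea, not the attached test case: the names `struct globs` and `g` are hypothetical, and whether a particular GCC version then emits a single constant pool load is not guaranteed by the C code itself. Placing both variables in one object means that both stores are expressed relative to a single base address:

```c
/* Sketch only: `struct globs` and `g` are hypothetical names that do not
   appear in the original test case. */
struct globs {
    unsigned int var1;   /* plays the role of globvar1 */
    unsigned int var2;   /* plays the role of globvar2 */
};

struct globs g = { 0, 0 };

int f1(void)
{
    struct globs *base = &g;   /* one address, ideally one constant pool load */

    base->var1 = 11;           /* store at base + 0 */
    base->var2 = 12;           /* store at base + offset of var2 */
    return (int)base->var1;
}
```

Calling `f1()` here returns 11 and leaves `g.var1 == 11` and `g.var2 == 12`, matching the behaviour of the original `f1`.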
Comment 1 Dara Hazeghi 2003-05-26 20:59:30 UTC
Hello,

with gcc 3.3 branch and mainline (20030509) I get the following code. Could you confirm whether 
this still exhibits the problem you noted? Thanks,

Dara

        .file   "01.i"
        .global globvar1
        .bss
        .global globvar1
        .align  2
        .type   globvar1, %object
        .size   globvar1, 4
globvar1:
        .space  4
        .global globvar2
        .global globvar2
        .align  2
        .type   globvar2, %object
        .size   globvar2, 4
globvar2:
        .space  4
        .text
        .align  2
        .global f1
        .type   f1, %function
f1:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        ldr     r3, .L2
        mov     r2, #12
        str     r2, [r3, #0]
        ldr     r3, .L2+4
        mov     r0, #11
        @ lr needed for prologue
        str     r0, [r3, #0]
        mov     pc, lr
.L3:
        .align  2
.L2:
        .word   globvar2
        .word   globvar1
        .size   f1, .-f1
        .align  2
        .global main
        .type   main, %function
main:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        @ lr needed for prologue
        b       f1
        .size   main, .-main
        .ident  "GCC: (GNU) 3.3 20030508 (prerelease)"
Comment 2 Andrew Pinski 2003-05-26 21:05:15 UTC
See Dara's question.
Comment 3 lac 2003-05-27 14:05:36 UTC
Hello!

Thanks for your question! The problem still exists. I have annotated your
code below; the relevant part is function f1. The ldr insn at (*) is
unnecessary if the str at (**) uses an offset instead of #0. With
sequential constant pool accesses, one insn per access could be dropped.
Regards,
Laszlo

On Mon, 26 May 2003, dhazeghi@yahoo.com wrote:

> f1:
>         @ args = 0, pretend = 0, frame = 0
>         @ frame_needed = 0, uses_anonymous_args = 0
>         @ link register save eliminated.
>         ldr     r3, .L2
>         mov     r2, #12
>         str     r2, [r3, #0]
>         ldr     r3, .L2+4		@ (*) can be removed
>         mov     r0, #11
>         @ lr needed for prologue
>         str     r0, [r3, #0] 		@ (**) with offset instead of #0
>         mov     pc, lr
> .L3:
>         .align  2
> .L2:
>         .word   globvar2
>         .word   globvar1
>         .size   f1, .-f1
Comment 4 Philip Blundell 2003-05-28 13:13:39 UTC
I was working on some code to help with this sometime last year.  I'll try to
look at it again.
Comment 5 Khem Raj 2005-09-27 03:20:57 UTC
This is still exhibited on mainline gcc.
Comment 6 Andrew Pinski 2006-02-02 16:54:06 UTC
Should be helped or almost ready to be fixed by:
http://gcc.gnu.org/wiki/Section%20Anchor%20Optimisations
Comment 7 Richard Sandiford 2006-02-02 17:14:49 UTC
Patch posted here:

http://gcc.gnu.org/ml/gcc-patches/2006-02/msg00133.html
Comment 8 Richard Sandiford 2006-02-18 22:07:01 UTC
Author: rsandifo
Date: Sat Feb 18 22:06:53 2006
New Revision: 111254

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=111254
Log:
	* cselib.c (cselib_init): Change RTX_SIZE to RTX_CODE_SIZE.
	* emit-rtl.c (copy_rtx_if_shared_1): Use shallow_copy_rtx.
	(copy_insn_1): Likewise.  Don't copy each field individually.
	Reindent.
	* read-rtl.c (apply_macro_to_rtx): Use RTX_CODE_SIZE instead
	of RTX_SIZE.
	* reload1.c (eliminate_regs): Use shallow_copy_rtx.
	* rtl.c (rtx_size): Rename variable to...
	(rtx_code_size): ...this.
	(rtx_size): New function.
	(rtx_alloc_stat): Use RTX_CODE_SIZE instead of RTX_SIZE.
	(copy_rtx): Use shallow_copy_rtx.  Don't copy each field individually.
	Reindent.
	(shallow_copy_rtx_stat): Use rtx_size instead of RTX_SIZE.
	* rtl.h (rtx_code_size): New variable.
	(rtx_size): Change from a variable to a function.
	(RTX_SIZE): Rename to...
	(RTX_CODE_SIZE): ...this.

	PR target/9703
	PR tree-optimization/17106
	* doc/tm.texi (TARGET_USE_BLOCKS_FOR_CONSTANT_P): Document.
	(Anchored Addresses): New section.
	* doc/invoke.texi (-fsection-anchors): Document.
	* doc/rtl.texi (SYMBOL_REF_IN_BLOCK_P, SYMBOL_FLAG_IN_BLOCK): Likewise.
	(SYMBOL_REF_ANCHOR_P, SYMBOL_FLAG_ANCHOR): Likewise.
	(SYMBOL_REF_BLOCK, SYMBOL_REF_BLOCK_OFFSET): Likewise.
	* hooks.c (hook_bool_mode_rtx_false): New function.
	* hooks.h (hook_bool_mode_rtx_false): Declare.
	* gengtype.c (create_optional_field): New function.
	(adjust_field_rtx_def): Add the "block_sym" field for SYMBOL_REFs when
	SYMBOL_REF_IN_BLOCK_P is true.
	* target.h (output_anchor, use_blocks_for_constant_p): New hooks.
	(min_anchor_offset, max_anchor_offset): Likewise.
	(use_anchors_for_symbol_p): New hook.
	* toplev.c (compile_file): Call output_object_blocks.
	(target_supports_section_anchors_p): New function.
	(process_options): Check that -fsection-anchors is only used on
	targets that support it and when -funit-at-a-time is in effect.
	* tree-ssa-loop-ivopts.c (prepare_decl_rtl): Only create DECL_RTL
	if the decl doesn't have one.
	* dwarf2out.c: Remove instantiations of VEC(rtx,gc).
	* expr.c (emit_move_multi_word, emit_move_insn): Pass the result
	of force_const_mem through use_anchored_address.
	(expand_expr_constant): New function.
	(expand_expr_addr_expr_1): Call it.  Use the same modifier when
	calling expand_expr for INDIRECT_REF.
	(expand_expr_real_1): Pass DECL_RTL through use_anchored_address
	for all modifiers except EXPAND_INITIALIZER.  Use expand_expr_constant.
	* expr.h (use_anchored_address): Declare.
	* loop-unroll.c: Don't declare rtx vectors here.
	* explow.c: Include output.h.
	(validize_mem): Call use_anchored_address.
	(use_anchored_address): New function.
	* common.opt (-fsection-anchors): New switch.
	* varasm.c (object_block_htab, anchor_labelno): New variables.
	(hash_section, object_block_entry_eq, object_block_entry_hash)
	(use_object_blocks_p, get_block_for_section, create_block_symbol)
	(use_blocks_for_decl_p, change_symbol_section): New functions.
	(get_variable_section): New function, split out from assemble_variable.
	(make_decl_rtl): Create a block symbol if use_object_blocks_p and
	use_blocks_for_decl_p say so.  Use change_symbol_section if the
	symbol has already been created.
	(assemble_variable_contents): New function, split out from...
	(assemble_variable): ...here.  Don't output any code for
	block symbols; just pass them to place_block_symbol.
	Use get_variable_section and assemble_variable_contents.
	(get_constant_alignment, get_constant_section, get_constant_size): New
	functions, split from output_constant_def_contents.
	(build_constant_desc): Create a block symbol if use_object_blocks_p
	says so.  Or into SYMBOL_REF_FLAGS.
	(assemble_constant_contents): New function, split from...
	(output_constant_def_contents): ...here.  Don't output any code
	for block symbols; just pass them to place_section_symbol.
	Use get_constant_section and get_constant_alignment.
	(force_const_mem): Create a block symbol if use_object_blocks_p and
	use_blocks_for_constant_p say so.  Or into SYMBOL_REF_FLAGS.
	(output_constant_pool_1): Add an explicit alignment argument.
	Don't switch sections here.
	(output_constant_pool): Adjust call to output_constant_pool_1.
	Switch sections here instead.  Don't output anything for block symbols;
	just pass them to place_block_symbol.
	(init_varasm_once): Initialize object_block_htab.
	(default_encode_section_info): Keep the old SYMBOL_FLAG_IN_BLOCK.
	(default_asm_output_anchor, default_use_anchors_for_symbol_p)
	(place_block_symbol, get_section_anchor, output_object_block)
	(output_object_block_htab, output_object_blocks): New functions.
	* target-def.h (TARGET_ASM_OUTPUT_ANCHOR): New macro.
	(TARGET_ASM_OUT): Include it.
	(TARGET_USE_BLOCKS_FOR_CONSTANT_P): New macro.
	(TARGET_MIN_ANCHOR_OFFSET, TARGET_MAX_ANCHOR_OFFSET): New macros.
	(TARGET_USE_ANCHORS_FOR_SYMBOL_P): New macro.
	(TARGET_INITIALIZER): Include them.
	* rtl.c (rtl_check_failed_block_symbol): New function.
	* rtl.h: Include vec.h.  Declare heap and gc rtx vectors.
	(block_symbol, object_block): New structures.
	(rtx_def): Add a block_symbol field to the union.
	(BLOCK_SYMBOL_CHECK): New macro.
	(rtl_check_failed_block_symbol): Declare.
	(SYMBOL_FLAG_IN_BLOCK, SYMBOL_FLAG_ANCHOR): New SYMBOL_REF flags.
	(SYMBOL_REF_IN_BLOCK_P, SYMBOL_REF_ANCHOR_P): New predicates.
	(SYMBOL_FLAG_MACH_DEP_SHIFT): Bump by 2.
	(SYMBOL_REF_BLOCK, SYMBOL_REF_BLOCK_OFFSET): New accessors.
	* output.h (output_section_symbols): Declare.
	(object_block): Name structure.
	(place_section_symbol, get_section_anchor, default_asm_output_anchor)
	(default_use_anchors_for_symbol_p): Declare.
	* Makefile.in (RTL_BASE_H): Add vec.h.
	(explow.o): Depend on output.h.
	* config/rs6000/rs6000.c (TARGET_MIN_ANCHOR_OFFSET): Override default.
	(TARGET_MAX_ANCHOR_OFFSET): Likewise.
	(TARGET_USE_BLOCKS_FOR_CONSTANT_P): Likewise.
	(rs6000_use_blocks_for_constant_p): New function.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/Makefile.in
    trunk/gcc/common.opt
    trunk/gcc/config/rs6000/rs6000.c
    trunk/gcc/cselib.c
    trunk/gcc/doc/invoke.texi
    trunk/gcc/doc/rtl.texi
    trunk/gcc/doc/tm.texi
    trunk/gcc/dwarf2out.c
    trunk/gcc/emit-rtl.c
    trunk/gcc/explow.c
    trunk/gcc/expr.c
    trunk/gcc/expr.h
    trunk/gcc/gengtype.c
    trunk/gcc/hooks.c
    trunk/gcc/hooks.h
    trunk/gcc/loop-unroll.c
    trunk/gcc/output.h
    trunk/gcc/read-rtl.c
    trunk/gcc/reload1.c
    trunk/gcc/rtl.c
    trunk/gcc/rtl.h
    trunk/gcc/target-def.h
    trunk/gcc/target.h
    trunk/gcc/toplev.c
    trunk/gcc/tree-ssa-loop-ivopts.c
    trunk/gcc/varasm.c
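
To make the role of the TARGET_MIN_ANCHOR_OFFSET and TARGET_MAX_ANCHOR_OFFSET hooks named in the log a little more concrete, here is a toy model. It is not the code from varasm.c or explow.c: the helper names `anchor_for_offset` and `reachable` are hypothetical, and the fixed-window placement scheme is an assumption made purely for illustration. The only point it tries to show is that a symbol at a given offset within a block can be addressed from an anchor when the difference lies within the target's [min, max] range.

```c
/* Toy model, not GCC's implementation.  All names are hypothetical. */

/* Pick an anchor position for a symbol at SYM_OFFSET within its block,
   assuming the block is cut into fixed windows of (max - min + 1) bytes. */
long anchor_for_offset(long sym_offset, long min_off, long max_off)
{
    long range = max_off - min_off + 1;   /* offsets covered by one anchor */
    long window;

    if (sym_offset < 0 || range <= 0)
        return 0;                         /* degenerate input: use block start */

    window = sym_offset / range;          /* window containing the symbol */
    return window * range - min_off;      /* window maps onto [min, max] */
}

/* Nonzero if SYM_OFFSET can be addressed as ANCHOR plus an in-range offset. */
int reachable(long sym_offset, long anchor, long min_off, long max_off)
{
    long delta = sym_offset - anchor;

    return delta >= min_off && delta <= max_off;
}
```

With an illustrative range of 0 to 4095, a symbol at offset 4 would use an anchor at 0, while a symbol at offset 5000 would need a second anchor at 4096, because it is out of reach of the first one.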

Comment 9 Richard Sandiford 2006-02-18 22:22:53 UTC
The patch I committed should provide the general infrastructure,
but an ARM patch will be needed to make use of it.  ARM code
would also benefit if we tried to reuse addresses that the
function had to calculate anyway, rather than use arbitrary
anchors for everything.  (I wrote a message about this that
I was supposed to send to gcc-patches@.  However, it isn't
in either the web archives or gmane, so I suspect I sent it
privately by accident.)

I'm not intending to do the ARM bits myself right now,
so I'll return this PR to unassigned.
Comment 10 Andrew Pinski 2008-12-20 00:26:46 UTC
This might have been implemented for 4.4 already.  Section anchors now have been enabled for ARM.
Comment 11 Ramana Radhakrishnan 2009-02-08 05:23:56 UTC
(In reply to comment #10)
> This might have been implemented for 4.4 already.  Section anchors now have
> been enabled for ARM.
> 

With section anchors turned on, 4.4 seems to handle this. Here is the code generated for the reported function:

	ldr	r3, .L3
	mov	r0, #11
	mov	r2, #12
	stmia	r3, {r0, r2}	@ phole stm
	bx	lr


I suspect this can now be closed. 

Comment 12 Richard Earnshaw 2009-03-13 09:39:49 UTC
Fixed.