Bug 49857 - Put constant switch-tables into flash
Summary: Put constant switch-tables into flash
Status: RESOLVED WONTFIX
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization (show other bugs)
Version: 4.6.1
: P4 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: addr-space, missed-optimization
Depends on: 49868
Blocks:
  Show dependency treegraph
 
Reported: 2011-07-26 20:49 UTC by Georg-Johann Lay
Modified: 2017-07-31 08:34 UTC (History)
1 user (show)

See Also:
Host:
Target: avr
Build:
Known to work:
Known to fail: 4.6.1, 4.7.0, 6.2.0
Last reconfirmed: 2011-12-01 00:00:00


Attachments
C source with constant switch statement (142 bytes, text/plain)
2011-07-26 20:49 UTC, Georg-Johann Lay
Details
pr49857-v2-all.diff: Proposed patch (6.57 KB, patch)
2017-07-25 12:15 UTC, Georg-Johann Lay
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Georg-Johann Lay 2011-07-26 20:49:40 UTC
Created attachment 24837 [details]
C source with constant switch statement

For constant switches like

int sw_2 (char x)
{
    switch(x) 
    { 
        case '0': return -1;
        case '1': return  2;
        case '2': return  3;
        case '3': return  5;
        case '4': return  7;
        case '5': return 11;
        case '6': return 13;
        case '7': return 17;
        case '8': return 19;
        case '9': return 23;
        case 'a':return 29;
        case 'A':return 29;
    }
    
    return -1;
}

avr-gcc 4.6.1 generates a lookup table (-Os -mmcu=atmega8 -S -fverbose-asm)

sw_2:
	subi r24,lo8(-(-49))	 ;  csui.0,
	cpi r24,lo8(49)	 ;  csui.0,
	brsh .L3	 ; ,
	mov r30,r24	 ;  tmp50, csui.0
	ldi r31,lo8(0)	 ; ,
	subi r30,lo8(-(CSWTCH.1))	 ;  tmp50,
	sbci r31,hi8(-(CSWTCH.1))	 ;  tmp50,
	ld r24,Z	 ;  tmp51, CSWTCH.1
	clr r25	 ;  D.1929
	sbrc r24,7	 ;  D.1929
	com r25	 ;  D.1929
	ret
.L3:
	ldi r24,lo8(-1)	 ;  D.1929,
	ldi r25,hi8(-1)	 ;  D.1929,
	ret
	.size	sw_2, .-sw_2
	.data
	.type	CSWTCH.1, @object
	.size	CSWTCH.1, 49
CSWTCH.1:
	.byte	2
	.byte	3
	.byte	5
	.byte	7
	.byte	11
	.byte	13
	.byte	17
	.byte	19
	.byte	23
	.byte	-1
	.byte	-1
	.byte	-1
...

CSWTCH.1 is put in .data (.rodata would not be better) so that the lookup table ends up in RAM.  

The table is constant and could go into flash memory, i.e. .progmem.data.

===================================

avr-gcc -v -Os -fverbose-asm -mmcu=atmega8 -W -Wall -S foo.c # dp -mmcu=atmega128 foo.c -save-temps -Os -c
Using built-in specs.
COLLECT_GCC=e:\WinAVR\4.6.1\bin\avr-gcc.exe
COLLECT_LTO_WRAPPER=e:/winavr/4.6.1/bin/../libexec/gcc/avr/4.6.1/lto-wrapper.exe
Target: avr
Configured with: ../../gcc.gnu.org/gcc-4_6-branch/configure --target=avr --prefix=/local/gnu/install/gcc-4.6-mingw32 --host=i586-mingw32 --build=i686-linux-gnu --enable-languages=c,c++ --disable-nls --disable-shared --with-dwarf2
Thread model: single
gcc version 4.6.1 20110620 (prerelease) (GCC) 

GNU C (GCC) version 4.6.1 20110620 (prerelease) (avr)
	compiled by GNU C version 3.3.1 (mingw special 20030804-1), GMP version 4.3.2, MPFR version 2.4.2, MPC version 0.8.2
GGC heuristics: --param ggc-min-expand=47 --param ggc-min-heapsize=32702
Comment 1 Georg-Johann Lay 2012-01-29 14:55:48 UTC
Changed component from "target" to "tree-optimization" because:

The code in tree-switch-conversion.c needs to be extended by, say, a target hook that queries the backend for the right address space for the generated lookup table, i.e. the CSWTCH.num objects.

This is needed for two reasons:

1) The data (CSWTCH.num) has to be located in the preferred section,
   i.e. in Flash and not in RAM.

2) The accesses have to use the right instructions, i.e. instructions to read from non-generic address space (Flash) and not instructions to access generic address space (RAM).
Comment 2 Steven Bosscher 2012-01-29 17:24:32 UTC
Jump tables are generated in stmt.c, as part of expanding to RTL.
Comment 3 Steven Bosscher 2012-01-29 17:51:04 UTC
An, no, tree-switch-conversion.c:build_one_array() must build its value_type for elements with a specified address space.
Comment 4 Jakub Jelinek 2013-03-22 14:44:30 UTC
GCC 4.8.0 is being released, adjusting target milestone.
Comment 5 Jakub Jelinek 2013-05-31 10:58:37 UTC
GCC 4.8.1 has been released.
Comment 6 Georg-Johann Lay 2013-06-01 13:41:30 UTC
Bumped the milestone to 4.9
Comment 7 Jakub Jelinek 2014-04-22 11:36:54 UTC
GCC 4.9.0 has been released
Comment 8 Jakub Jelinek 2015-04-22 11:59:34 UTC
GCC 5.1 has been released.
Comment 9 Richard Biener 2015-07-16 09:13:58 UTC
GCC 5.2 is being released, adjusting target milestone to 5.3.
Comment 10 Jakub Jelinek 2016-04-27 10:58:37 UTC
GCC 6.1 has been released.
Comment 11 Richard Biener 2016-08-22 08:36:58 UTC
GCC 6.2 is being released, adjusting target milestone
Comment 12 Jakub Jelinek 2016-12-21 10:59:18 UTC
GCC 6.3 is being released, adjusting target milestone.
Comment 13 Richard Biener 2017-07-04 08:50:21 UTC
GCC 6.4 is being released, adjusting target milestone.
Comment 14 Georg-Johann Lay 2017-07-05 10:15:26 UTC
Please, this is (not only) a target issue.

As already mentioned, in order for the target to be able to generate correct code, this needs a new target hook, hence byond "target".  Just putting the tables in flash won't adjust the address spaces which is needed for correct accesses.
Comment 15 Georg-Johann Lay 2017-07-25 12:15:39 UTC
Created attachment 41829 [details]
pr49857-v2-all.diff: Proposed patch

	PR 49857
	* target.def (TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA): New hook.
	* targhooks.c (default_addr_space_for_artificial_rodata): New function.
	* targhooks.h (default_addr_space_for_artificial_rodata): New proto.
	* tree-switch-conversion.c (target.h): Include it.
	(build_one_array): Set address space of array_type according to
	targetm.addr_space.for_artificial_rodata().
	* doc/tm.texi.in (Named Address Spaces)
	[TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA]: Add hook anchor.
	* doc/tm.texi: Regenerate.

	* config/avr/avr-opts.h: New file.
	* config/avr/avr.opt: Include it.
	(-maddr-space-for-lookup=): New option and...
	(avr_opt_addr_space_for_lookup): ...associated Var.
	(avr_aspace_for_lookup): New option enums used by above.
	* config/avr/avr-protos.h (avr_out_load_flashx): New proto.
	* config/avr/avr.c (avr_out_load_flashx): New function.
	* avr_adjust_insn_length [ADJUST_LEN_LOAD_FLASHX]: Handle it.
	* avr_rtx_costs_1 [ZERO_EXTEND, SIGN_EXTEND]: Handle
	shift-and-extend-by-1 of HI -> PSI.
	[ASHIFT,PSImode]: Describe cost of extend-and-shift-by-1.
	(TARGET_ADDR_SPACE_FOR_ARTIFICIAL_RODATA): Define to...
	(avr_addr_space_for_artificial_rodata): ...this new static function.
	* config/avr/avr.md (unspec): Add UNSPEC_LOAD_FLASHX.
	(adjust_len): Add load_flashx.
	(*ashiftpsi.1_sign_extend.hi, *ashiftpsi.1_zero_extend.hi)
	(*extendpsi.ashift.1.uqi, *load<mode>-flashx): New insns.
	(*split_xload<mode>-cswtch): New insn-and-split.
	* doc/invoke.texi (AVR Options) <-maddr-space-for-lookup=>: Document.
Comment 16 Georg-Johann Lay 2017-07-25 12:22:05 UTC
(In reply to Georg-Johann Lay from comment #15)
> pr49857-v2-all.diff: Proposed patch

Problems so far:

* ivopts may shred address-space which results in wrong code.  This is is the
  reason for why the patch doesn't -maddr-space-for-lookup=memx

$ make -k check-gcc RUNTESTFLAGS="--target_board=atmega128-sim execute.exp='20120808-1.c pr35800.c' --tool_opts=-maddr-space-for-lookup=memx"

* tree-swich-conversion might bloat the code, cf. PR81540 so it may even
  be better to disable tree-switch-conversion per default.
Comment 17 Georg-Johann Lay 2017-07-31 08:34:38 UTC
Giving up on this.  A working solution as patch series is here, so anyone can pick it up.

http://gcc.gnu.org/ml/gcc-patches/2017-07/msg01804.html
http://gcc.gnu.org/ml/gcc-patches/2017-07/msg01808.html
http://gcc.gnu.org/ml/gcc-patches/2017-07/msg01810.html

Note that it only can be done for artificial data -- or you have to make sure that the data is not accessed by inline asm by different means (and of course, no pointers must escape, and a respective hook must not be called from within the C++ front-end).

avr-part approval:
http://gcc.gnu.org/ml/gcc-patches/2017-07/msg01835.html

gcc-part rejection as "too specific":
http://gcc.gnu.org/ml/gcc-patches/2017-07/msg01879.html