Bug List: (This bug is not in your last search results)   Show last search results      Search page      Enter new bug
Bug#: 34789
Product:  
Component:  
Status: NEW
Resolution:
Assigned To: Not yet assigned to anyone <unassigned@gcc.gnu.org>
Host:
Reported against  
Priority:  
Severity:  
Target Milestone:  
 
 
Target:
Reporter: David Brown <david@westcontrol.com>
Add CC:
CC:
Remove selected CCs
Build:
URL:
Summary:
Keywords:
Known to work:
Known to fail:

Attachment Description Type Created Size Actions
Create a New Attachment (proposed patch, testcase, etc.) View All

Bug 34789 depends on: Show dependency tree
Show dependency graph
Bug 34789 blocks:

Additional Comments:





Mark bug as waiting for feedback
Mark bug as suspended




View Bug Activity   |   Format For Printing   |   Clone This Bug


Description:   Last confirmed: 2008-04-04 21:31 Opened: 2008-01-15 08:23
On RISC cpus with many registers, it is often helpful to keep constants (such
as addresses) in registers for later use, since loading a register with a
constant takes longer than moving data between registers, and since
register-constant operations are seldom available.  On the AVR, however,
register-constant operations are just as small and fast as register-register
operations (excluding the movw operation), so it is often better to use the
constants directly - it saves a few instructions (time and code space), and
avoids tying up a register pair unnecessarily.

Example:

extern uint16_t data[64];
extern uint16_t foo(uint16_t x);
extern uint16_t a;

uint16_t bar(void) {
        uint16_t x;
        x = foo(data[a]);
        return foo(data[x]);
}

In this case, the compiler caches the address of "data" in r16:r17 :

  59                    bar:
  60                    /* prologue: frame size=0 */
  61 001a 0F93                  push r16
  62 001c 1F93                  push r17
  63                    /* prologue end (size=2) */
  64 001e E091 0000             lds r30,a        ;  a, a
  65 0022 F091 0000             lds r31,(a)+1    ;  a, a
  66 0026 00E0                  ldi r16,lo8(data)        ;  tmp45,
  67 0028 10E0                  ldi r17,hi8(data)        ;  tmp45,
  68 002a EE0F                  lsl r30  ;  a
  69 002c FF1F                  rol r31  ;  a
  70 002e E00F                  add r30,r16      ;  a, tmp45
  71 0030 F11F                  adc r31,r17      ;  a, tmp45
  72 0032 8081                  ld r24,Z         ; , data
  73 0034 9181                  ldd r25,Z+1      ; , data
  74 0036 0E94 0000             call foo         ; 
  75 003a 880F                  lsl r24  ;  tmp50
  76 003c 991F                  rol r25  ;  tmp50
  77 003e 080F                  add r16,r24      ;  tmp45, tmp50
  78 0040 191F                  adc r17,r25      ;  tmp45, tmp50
  79 0042 F801                  movw r30,r16     ; , tmp45
  80 0044 8081                  ld r24,Z         ; , data
  81 0046 9181                  ldd r25,Z+1      ; , data
  82 0048 0E94 0000             call foo         ; 
  83                    /* epilogue: frame size=0 */
  84 004c 1F91                  pop r17
  85 004e 0F91                  pop r16
  86 0050 0895                  ret

Better code could be generated by using "data" directly:

                        ; Prologue avoided - saved 2 words code
                                lds r30, a
                                lds r31, (a) + 1
                        ; Load of r16:r17 avoided - saved 2 words code
                                lsl r30 ; a
                                rol r31 ; a
                                subi r30, lo8(-(data))
                                sbci r31, hi8(-(data))
                                ld r24, z
                                ldd r25, Z+1
                                call foo
                                lsl r24 ; x
                                rol r25 ; x
                        ; Using constant for data is just as small as register
                                subi r24, lo8(-(data))
                                sbci r25, hi8(-(data))
                                movw r30, r16
                                ld r24, z
                                ldd r25, Z+1
                                call foo
                        ; Epilogue avoided - saved 2 words code
                                ret

This saves 6 code words, and corresponding time.

------- Comment #1 From Andy Hutchinson 2008-02-22 01:22 -------
This appears to be due to avr_rtx_costs not assigning same cost to SYMBOL_REF
and CONST_INT. So SYMBOL_REF looks expensive - so is held in register to avoid
"recalculating" it.

Quick change to make SYMBOL_REF same cost as CONST_INT in both avr_rtx_cost and
avr_operand_rtx_cost gave the desired result 

  22                    /* prologue: function */
  23                    /* frame size = 0 */
  24                    .LM2:
  25 0000 E091 0000             lds r30,a
  26 0004 F091 0000             lds r31,(a)+1
  27 0008 EE0F                  lsl r30
  28 000a FF1F                  rol r31
  29 000c E050                  subi r30,lo8(-(data))
  30 000e F040                  sbci r31,hi8(-(data))
  31 0010 8081                  ld r24,Z
  32 0012 9181                  ldd r25,Z+1
  33 0014 0E94 0000             call foo
  34                    .LM3:
  35 0018 FC01                  movw r30,r24
  36 001a EE0F                  lsl r30
  37 001c FF1F                  rol r31
  38 001e E050                  subi r30,lo8(-(data))
  39 0020 F040                  sbci r31,hi8(-(data))
  40 0022 8081                  ld r24,Z
  41 0024 9181                  ldd r25,Z+1
  42 0026 0E94 0000             call foo
  43                    /* epilogue start */
  44                    .LM4:
  45 002a 0895                  ret

CONST and LABEL_REF might also have same problem as they cost same as MEM.

But note, if SYMBOL_REF were part of memory address, then it might be better
held in register (like Y or Z) - this should be done by checking outer code.
With outer code of "MEM" SYMBOL_REF would be more expensive than register.
(which might same a few LDS/STS that appear in code)

Avr_rtx_cost needs some serious work done to correct these are other anomalies
in cost assumptions and the recursion on operand costs.

Bug List: (This bug is not in your last search results)   Show last search results      Search page      Enter new bug