This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Help: Register allocator sets up frame at low register pressure (PR 50775)


2011/10/25 Georg-Johann Lay <avr@gjlay.de>:
> With the following, small C test program
>
>
> typedef struct
> {
>   unsigned char a, b, c, d;
> } s_t;
>
> unsigned char func1 (s_t *x, s_t *y, s_t *z)
> {
>   unsigned char s = 0;
>   s += x->a;
>   s += y->a;
>   s += z->a;
>
>   s += x->b;
>   s += y->b;
>   s += z->b;
>
>   s += x->c;
>   s += y->c;
>   s += z->c;
>
>   return s;
> }
>
> there is a frame pointer set up for no apparent reason.
>
> The machine for which this code is compiled (AVR) has just a few pointer
> registers, and taking away one of them to use it as frame pointer leads to
> severe performance degradation in many real-world programs: moving from/to
> memory is more expensive than moving around registers, setting up a frame is
> expensive, and taking away 1 of 2 address registers is expensive.
>
> What I tried and what did not fix it:
>
> - increase targetm.memory_move_cost (up to insane values)
> - play around with targetm.class_likely_spilled_p
>
> The program is compiled with
>
> $ avr-gcc in.c -S -Os -fdump-rtl-ira-details -fdump-rtl-postreload-details
> -mmcu=avr4 -mstrict-X
>
> with avr-gcc from current trunk SVN r180399.
>
>
> The issue is that AVR has only 3 pointer registers X, Y, and Z with the
> following addressing capabilities:
>
>  *X, *X++, *--X             (R27:R26, call-clobbered)
>  *Y, *Y++, *--Y, *(Y+const) (R29:R28, call-saved, frame pointer)
>  *Z, *Z++, *--Z, *(Z+const) (R31:R30, call-clobbered)
>
> Older versions of the compiler prior to 4.7 trunk r179993 allowed a fake
> addressing mode *(X+const) and emulated it by emitting an appropriate
> instruction sequence like
>
>  X = X + const
>  r = *X
>  X = X - const
>
> which was only a rare corner case in the old register allocator, but in the new
> allocator this sequence is seen very often leading to code bloat of +50% for
> some real-world functions.
>
> This is the reason why the command line option -mstrict-X has been added to the
> AVR backend, see PR46278.
>
> This option denies fake *(X+const) addressing but leads to the mentioned spills
> from the register allocator and to code that is even worse than without
> -mstrict-X, i.e. the register allocator sabotages a smart usage of the address
> registers.
>
> All I see is that reload1.c:alter_reg() generates the spill because
> ira_conflicts_p is true.
>
> With the option -morder1 turned on (affects ADJUST_REG_ALLOC_ORDER) there is
> still a frame set up even though it is never accessed.
>
> Can anyone give me some advice how to proceed with this issue?
>
> Can it be said whether this is a target issue or an IRA/reload flaw?

It's not a cost-related problem.
I think I can explain the problem.
I think it's an IRA bug.

> Spilling for insn 11.
> Using reg 26 for reload 0
> Spilling for insn 17.
> Using reg 30 for reload 0
> Spilling for insn 23.
> Using reg 30 for reload 0
>       Try Assign 60(a6), cost=16000

The wrong behaviour starts here:
ira-color.c:4120 allocno_reload_assign (a, forbidden_regs);

> changing reg in insn 2
> changing reg in insn 9
> changing reg in insn 13
> changing reg in insn 19
>      Assigning 60(freq=4000) a new slot 0
> Register 60 now on stack.

Call trace:
allocno_reload_assign() -> assign_hard_reg() -> get_conflict_profitable_regs()

`get_conflict_profitable_regs' calculates a wrong `profitable_regs[1]'.

(Background for Vladimir)
AVR is an 8-bit microcontroller.
The AVR has only 3 pointer registers X, Y, and Z with the
following addressing capabilities:
 *X, *X++, *--X             (R27:R26, call-clobbered)
 *Y, *Y++, *--Y, *(Y+const) (R29:R28, call-saved, frame pointer)
 *Z, *Z++, *--Z, *(Z+const) (R31:R30, call-clobbered)
Also, all modes larger than 8 bits must start in an even register.

So `get_conflict_profitable_regs' tries to calculate two arrays:
  - profitable_regs[0] for the first word of register 60(a6)
  - profitable_regs[1] for the second word of register 60(a6)

Values of `profitable_regs':
(gdb) p print_hard_reg_set (stderr,profitable_regs[0] , 01)
 0-2 4 6 8 10 12 14 16 18 20 22 24 26 28 30
$63 = void
(gdb) p print_hard_reg_set (stderr,profitable_regs[1] , 01)
 0-2 4 6 8 10 12 14 16 18 20 22 24 26 28 30

They are equal!
That's wrong because the second word of register 60(a6) must be allocated
to an odd register.


This is the wrong place in `get_conflict_profitable_regs':
...
  nwords = ALLOCNO_NUM_OBJECTS (a);
  for (i = 0; i < nwords; i++)
    {
      obj = ALLOCNO_OBJECT (a, i);
      COPY_HARD_REG_SET (conflict_regs[i],
			 OBJECT_TOTAL_CONFLICT_HARD_REGS (obj));
      if (retry_p)
	{
	  COPY_HARD_REG_SET (profitable_regs[i],
			     reg_class_contents[ALLOCNO_CLASS (a)]);
	  AND_COMPL_HARD_REG_SET (profitable_regs[i],
				  ira_prohibited_class_mode_regs
				  [ALLOCNO_CLASS (a)][ALLOCNO_MODE (a)]);
-------------------------------------------------------------^^^^^^^^^^^^^^^^^^^^^^^^^
	}

ALLOCNO_MODE (a) is the right mode for the first word (word = 8-bit register),
but it's the wrong mode for the second word of the allocno.
Moreover, ALLOCNO_MODE (a) is the right mode only for the whole allocno.
If we want to spill/load/store separate parts (IRA objects) of an allocno,
we must use the mode of each part (object).

`ira_prohibited_class_mode_regs' is derived only from HARD_REGNO_MODE_OK.
So the second word of 60(a6) should be permitted in any register that
directly follows a register permitted for the first word of 60(a6).
For AVR: profitable_regs[1] = profitable_regs[0] << 1

Also, I have a question about the following fields of `ira_allocno':
  /* The number of objects tracked in the following array.  */
  int num_objects;
  /* An array of structures describing conflict information and live
     ranges for each object associated with the allocno.  There may be
     more than one such object in cases where the allocno represents a
     multi-word register.  */
  ira_object_t objects[2];
--------------------------^^^^^
SImode on AVR consists of 4 words, but there are only 2 objects in the
allocno structure.
Is this right?

Denis.

