Bug 88209 - Inefficient array initialization
Summary: Inefficient array initialization
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end
Version: 9.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2018-11-26 20:38 UTC by Berni
Modified: 2018-11-28 08:40 UTC (History)
0 users

See Also:
Host:
Target: avr
Build:
Known to work:
Known to fail:
Last reconfirmed: 2018-11-28 00:00:00


Description Berni 2018-11-26 20:38:28 UTC
The following was tested on avr-gcc, but this behavior should not be different on other platforms.

Consider the following array declaration with initialization of all elements to 0:

int main(void)
{
   char arr[256] = {0};
   return 0;
}

In avr-gcc 8.2.0 and earlier, the following asm code is generated:

  9a:	de 01       	movw	r26, r28
  9c:	11 96       	adiw	r26, 0x01	; 1
  9e:	80 e0       	ldi	r24, 0x00	; 0
  a0:	91 e0       	ldi	r25, 0x01	; 1
  a2:	fd 01       	movw	r30, r26
  a4:	9c 01       	movw	r18, r24
  a6:	11 92       	st	Z+, r1
  a8:	21 50       	subi	r18, 0x01	; 1
  aa:	30 40       	sbci	r19, 0x00	; 0
  ac:	e1 f7       	brne	.-8      	; 0xa6 <main+0x1e>

This is the expected behavior: 256 bytes of 0 (register r1) are stored to the array on the stack.
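
Read as C, the loop above appears to amount to the following sketch. The function name `zero_fill_gcc8` and the explicit counter variable are illustrative assumptions, not compiler output; the comments map each step back to the disassembly.

```c
#include <stdint.h>

/* Hypothetical C rendering of the avr-gcc 8.2.0 loop; names are illustrative. */
void zero_fill_gcc8(char *arr)
{
    char *z = arr;            /* movw r30, r26: Z points at arr[0] */
    uint16_t count = 0x0100;  /* ldi r24, 0x00 / ldi r25, 0x01: 256 iterations */

    do {
        *z++ = 0;             /* st Z+, r1 (r1 is the fixed zero register in avr-gcc) */
        count--;              /* subi / sbci: 16-bit decrement */
    } while (count != 0);     /* brne */
}
```

No initializer data is read here; every byte written comes from the zero register.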

In avr-gcc 9.0.0 (20181118) the following code is generated instead. Here a 256-byte data object is additionally generated in section .rodata.
I don't fully understand the code, because in the last three lines 0 (register r1) is still stored to the array on the stack!

  b0:	80 91 00 01 	lds	r24, 0x0100	; 0x800100 <__data_start>
  b4:	90 91 01 01 	lds	r25, 0x0101	; 0x800101 <__data_start+0x1>
  b8:	9a 83       	std	Y+2, r25	; 0x02
  ba:	89 83       	std	Y+1, r24	; 0x01
  bc:	fe 01       	movw	r30, r28
  be:	33 96       	adiw	r30, 0x03	; 3
  c0:	8e ef       	ldi	r24, 0xFE	; 254
  c2:	df 01       	movw	r26, r30
  c4:	1d 92       	st	X+, r1
  c6:	8a 95       	dec	r24
  c8:	e9 f7       	brne	.-6      	; 0xc4 <main+0x26>
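
One possible reading of this listing, written as C: the first two bytes are copied from the static object, and the remaining 254 bytes are zeroed in a loop. This is only a sketch of what the disassembly seems to do. The names `init_image` and `init_gcc9` are hypothetical, and it assumes the array starts at Y+1 as in the 8.2.0 code.

```c
#include <stdint.h>

/* Stand-in for the zero-filled object the report describes in .rodata
   (loaded from 0x800100, i.e. __data_start). Hypothetical name. */
static const char init_image[256] = {0};

/* Hypothetical C rendering of the avr-gcc 9.0.0 sequence. */
void init_gcc9(char *arr)
{
    arr[0] = init_image[0];   /* lds r24, 0x0100 ... std Y+1, r24 */
    arr[1] = init_image[1];   /* lds r25, 0x0101 ... std Y+2, r25 */

    char *x = arr + 2;        /* adiw r30, 0x03: Y+3 is assumed to be arr[2] */
    uint8_t count = 0xFE;     /* ldi r24, 0xFE: 254 iterations */

    do {
        *x++ = 0;             /* st X+, r1 */
        count--;              /* dec r24 */
    } while (count != 0);     /* brne */
}
```

Under this reading only two bytes of the static object are ever read, yet the report states that the full 256 bytes are emitted.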

In this case RAM is wasted because of the 256 bytes reserved in .rodata.
The examples were compiled with -Os. At first I thought this new behavior was the result of some runtime optimization, but with -Os the focus should be on code size and memory consumption. So I consider this a bug!
Comment 1 Andrew Pinski 2018-11-26 20:48:27 UTC
My bet is that this is really a target-specific issue, as the middle end usually uses the target cost model to figure out whether it should do a copy loop or a zeroing loop.
Comment 2 Richard Biener 2018-11-28 08:40:08 UTC
Confirmed.  There are IIRC several related PRs.