Bug 36467 - [avr] Missed optimization with pointer arithmetic and mul*
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.3.0
Importance: P3 normal
Target Milestone: 4.7.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2008-06-08 17:56 UTC by Eric Weddington
Modified: 2011-08-11 21:43 UTC

See Also:
Host:
Target: avr-*-*
Build:
Known to work:
Known to fail: 4.3.0, 4.6.1
Last reconfirmed: 2011-07-09 18:20:52


Attachments
Test case with structure size == 16. (169 bytes, text/plain)
2008-06-08 17:57 UTC, Eric Weddington
Test case with structure size == 17. (172 bytes, text/plain)
2008-06-08 17:58 UTC, Eric Weddington
Extended test case with size = 15, 16, 17, 18 (196 bytes, text/plain)
2011-07-09 09:10 UTC, Georg-Johann Lay

Description Eric Weddington 2008-06-08 17:56:37 UTC
From AVR Freaks forum:
http://www.avrfreaks.net/index.php?name=PNphpBB2&file=viewtopic&p=453770#453770

The test case adds an offset to a pointer to a structure. If the structure size is 16, the generated code is a loop of shifts. If the structure size is 17, the generated code uses MUL* instructions and is shorter than the code generated for size 16.
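
A minimal sketch of what such a test case might look like. It is reconstructed from the assembly listings in this report, not copied from the attachments. The names qq, head and funct appear in the generated code; the struct name, its field names, the array length and the ITEM_SIZE macro are assumptions made for illustration.

```c
#include <assert.h>

/* Hypothetical reconstruction; the real test.c / test2.c attachments are
   not reproduced here. Change ITEM_SIZE to 17 for the second case. */
#define ITEM_SIZE 16

struct item {
    unsigned char flags;               /* byte 0: tested with "sbrc r24,1" */
    unsigned char value;               /* byte 1: cleared with "std Z+1" */
    unsigned char pad[ITEM_SIZE - 2];  /* filler to reach the wanted size */
};

/* C11 compile-time check that the struct really has the intended size. */
static_assert(sizeof(struct item) == ITEM_SIZE, "unexpected struct size");

struct item qq[8];   /* array length is an arbitrary choice */
signed char head;    /* signed index, matching the sign extension in the asm */

void funct(void)
{
    /* The address computation is qq + head * sizeof(struct item);
       this multiplication by 16 or 17 is what the report is about. */
    struct item *p = qq + head;

    if (p->flags & 2)
        p->value = 0;
}
```

With ITEM_SIZE set to 16 the scaling is a power of two, so the compiler can use either a shift by 4 or a multiply; with 17 only a multiply (or a shift-and-add sequence) is possible.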

Command lines:
avr-gcc -save-temps -Os -mmcu=atmega32 -c test.c -o test.o
avr-gcc -save-temps -Os -mmcu=atmega32 -c test2.c -o test2.o

See the code generation of funct().

Test cases to follow...
Comment 1 Eric Weddington 2008-06-08 17:57:45 UTC
Created attachment 15734
Test case with structure size == 16.
Comment 2 Eric Weddington 2008-06-08 17:58:52 UTC
Created attachment 15735
Test case with structure size == 17.
Comment 3 Eric Weddington 2008-06-08 18:08:34 UTC
Generated code when structure size is 16 (test.i):

funct:
/* prologue: function */
/* frame size = 0 */
	lds r24,head
	mov r30,r24
	clr r31
	sbrc r30,7
	com r31
	ldi r24,4
1:	lsl r30
	rol r31
	dec r24
	brne 1b
	subi r30,lo8(-(qq))
	sbci r31,hi8(-(qq))
	ld r24,Z
	sbrc r24,1
	std Z+1,__zero_reg__
.L3:
	ret


Generated code when structure size is 17 (test2.i):

funct:
/* prologue: function */
/* frame size = 0 */
	lds r24,head
	ldi r25,lo8(17)
	muls r24,r25
	movw r30,r0
	clr r1
	subi r30,lo8(-(qq))
	sbci r31,hi8(-(qq))
	ld r24,Z
	sbrc r24,1
	std Z+1,__zero_reg__
.L3:
	ret
Comment 4 Andy Hutchinson 2008-06-08 18:20:51 UTC
It makes sense in one respect: we don't have a fast shift by 4 bits, and the code defaults to a loop for -Os. It seems we should be selective, as MUL is indeed shorter.

Though I think gcc may be confused by our poor cost data and perhaps was also misled into using a shift instead of MUL.
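
For reference, between the lds and the subi the size-16 listing uses nine single-word instructions (mov, clr, sbrc, com, ldi, lsl, rol, dec, brne), while the size-17 listing uses four (ldi, muls, movw, clr). The following sketch models both offset computations in portable C to illustrate that a signed multiply by 16 yields the same 16-bit offset as the shift loop. The function names are made up for this example, and the code is a host-side model, not what the compiler emits.

```c
#include <assert.h>
#include <stdint.h>

/* Offset as the size-16 code computes it: sign-extend the index, then
   shift left four times. uint16_t keeps the wraparound well defined. */
uint16_t offset_by_shift_loop(int8_t head)
{
    uint16_t z = (uint16_t)(int16_t)head;  /* mov / clr / sbrc / com */
    for (uint8_t n = 4; n != 0; n--)       /* ldi r24,4 ... dec / brne */
        z = (uint16_t)(z << 1);            /* lsl r30 / rol r31 */
    return z;
}

/* Offset as one signed 8x8 -> 16 bit multiply would compute it (MULS). */
uint16_t offset_by_muls(int8_t head, int8_t size)
{
    return (uint16_t)(int16_t)((int16_t)head * (int16_t)size);
}

/* Exhaustive check over every possible signed 8-bit index. */
void check_offsets_agree(void)
{
    for (int h = -128; h <= 127; h++)
        assert(offset_by_shift_loop((int8_t)h) ==
               offset_by_muls((int8_t)h, 16));
}
```

Since both forms give identical results for every index, choosing between them is purely a cost decision, which is why the cost data matters here.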

Comment 5 Georg-Johann Lay 2011-07-09 09:10:56 UTC
Created attachment 24723
Extended test case with size = 15, 16, 17, 18
Comment 6 Georg-Johann Lay 2011-07-20 17:23:31 UTC
Author: gjl
Date: Wed Jul 20 17:23:28 2011
New Revision: 176527

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=176527
Log:

	PR target/36467
	PR target/49687
	* config/avr/avr.md (mulhi3): Use register_or_s9_operand for
	operand2 and expand appropriately if there is a CONST_INT in
	operand2.
	(usmulqihi3): New insn.
	(*sumulqihi3): New insn.
	(*osmulqihi3): New insn.
	(*oumulqihi3): New insn.
	(*muluqihi3.uconst): New insn_and_split.
	(*muluqihi3.sconst): New insn_and_split.
	(*mulsqihi3.sconst): New insn_and_split.
	(*mulsqihi3.uconst): New insn_and_split.
	(*mulsqihi3.oconst): New insn_and_split.
	(*ashifthi3.signx.const): New insn_and_split.
	(*ashifthi3.signx.const7): New insn_and_split.
	(*ashifthi3.zerox.const): New insn_and_split.
	(mulsqihi3): New insn.
	(muluqihi3): New insn.
	(muloqihi3): New insn.
	* config/avr/predicates.md (const_2_to_7_operand): New.
	(const_2_to_6_operand): New.
	(u8_operand): New.
	(s8_operand): New.
	(o8_operand): New.
	(s9_operand): New.
	(register_or_s9_operand): New.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/avr/avr.c
    trunk/gcc/config/avr/avr.md
    trunk/gcc/config/avr/predicates.md
Comment 7 Georg-Johann Lay 2011-07-20 17:26:03 UTC
Closed as FIXED: Reworked (widening) 16-bit multiply.