Bug 11259 - [avr] gcc Double 'andi' missed optimization
Summary: [avr] gcc Double 'andi' missed optimization
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 3.3
: P2 enhancement
Target Milestone: 4.4.0
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2003-06-19 23:15 UTC by Chris Candreva
Modified: 2008-08-22 23:06 UTC (History)
4 users (show)

See Also:
Host:
Target: avr
Build:
Known to work:
Known to fail: 4.2.1
Last reconfirmed: 2007-07-25 00:04:05


Attachments
example patch (526 bytes, patch)
2007-11-28 22:07 UTC, Rask Ingemann Lambertsen
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Chris Candreva 2003-06-19 23:15:39 UTC
I've found what looks like a missed optimization in avr-gcc  3.3 20030512

        uint8_t Encoder;

        Encoder = (PORTD >> 4) & 12;
   0:   82 b3           in      r24, 0x12       ; 18
   2:   82 95           swap    r24
   4:   8f 70           andi    r24, 0x0F       ; 15
   6:   8c 70           andi    r24, 0x0C       ; 12

The 4 bit shift is implemented as a swap/andi , but with an explicit andi
afterwards, it seems like an obvious optimization to combine them.

Trivial example code below, compiled with:
avr-gcc -c e.c -g -O2 -mmcu=at90s8535 -o e.o
#include <avr/io.h>
#include <inttypes.h>


int8_t  test ()
{

        uint8_t Encoder;

        Encoder = (PORTD >> 4) & 12;
        return(Encoder);

}
Comment 1 Andrew Pinski 2003-07-06 01:32:25 UTC
Can you provide the preprocessed source?
Comment 2 Chris Candreva 2003-07-06 02:13:33 UTC
Subject: Re:  [avr] gcc Double 'andi' missed optimization

On Sat, 6 Jul 2003, pinskia at physics dot uc dot edu wrote:

> ------- Additional Comments From pinskia at physics dot uc dot edu  2003-07-06 01:32 -------
> Can you provide the preprocessed source?

Sure -- how do I do that ?


==========================================================
Chris Candreva  -- chris@westnet.com -- (914) 967-7816
WestNet Internet Services of Westchester
http://www.westnet.com/
Comment 3 Andrew Pinski 2003-07-06 02:16:31 UTC
Read <http://gcc.gnu.org/bugs.html>. Also you can get the preprocessed source by adding -
save-temps.
Comment 4 Chris Candreva 2003-07-06 03:03:00 UTC
Subject: Re:  [avr] gcc Double 'andi' missed optimization


e.i

 --------------------------------

# 1 "e.c"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "e.c"
# 1 "/usr/local/avr/include/avr/io.h" 1 3
# 81 "/usr/local/avr/include/avr/io.h" 3
# 1 "/usr/local/avr/include/avr/sfr_defs.h" 1 3
# 82 "/usr/local/avr/include/avr/io.h" 2 3
# 189 "/usr/local/avr/include/avr/io.h" 3
# 1 "/usr/local/avr/include/avr/io8535.h" 1 3
# 190 "/usr/local/avr/include/avr/io.h" 2 3
# 2 "e.c" 2
# 1 "/usr/local/avr/include/inttypes.h" 1 3
# 60 "/usr/local/avr/include/inttypes.h" 3
typedef signed char int8_t;




typedef unsigned char uint8_t;
# 83 "/usr/local/avr/include/inttypes.h" 3
typedef int int16_t;




typedef unsigned int uint16_t;
# 99 "/usr/local/avr/include/inttypes.h" 3
typedef long int32_t;




typedef unsigned long uint32_t;
# 117 "/usr/local/avr/include/inttypes.h" 3
typedef long long int64_t;




typedef unsigned long long uint64_t;
# 134 "/usr/local/avr/include/inttypes.h" 3
typedef int16_t intptr_t;




typedef uint16_t uintptr_t;
# 3 "e.c" 2


int8_t test ()
{

        uint8_t Encoder;

        Encoder = ((*(volatile unsigned char *)((0x12) + 0x20)) >> 4) & 12;
        return(Encoder);

}



Comment 5 Andrew Pinski 2003-07-06 03:11:03 UTC
I can confirm this on the mainline (20030705):
        in r24,50-0x20
        swap r24
        andi r24,0x0f
        andi r24,lo8(12)
Comment 6 roger 2003-11-11 04:23:28 UTC
The AVR backend defines its lshrqi3 patterns as opaque "macro" sequences,
hence the fact that "x >> 4" is implemented as a swap/rotate followed by
an and instruction is never exposed to GCC's RTL optimizers.  If it were,
the two ands would be optimized away by CSE, combine or peephole, as they
are for other targets.

As there's nothing the middle-end can do unless the AVR backend is rewritten
to use RTL expanders, I'm changing this PR's component to target/11259.
Comment 7 Eric Weddington 2007-07-25 00:04:05 UTC
Bug still exists on 4.2.1.
Comment 8 Rask Ingemann Lambertsen 2007-11-28 22:07:38 UTC
Created attachment 14658 [details]
example patch

This patch is an example of the suggestion in comment #6. When compiling with -S -dp, it is clear why the code isn't optimized: "swap+andi" is one instruction.

test:
/* prologue: function */
/* frame size = 0 */
	in r24,50-0x20		 ;  6	*movqi/4	[length = 1]
	swap r24		 ;  7	lshrqi3/5	[length = 2]
	andi r24,0x0f
	andi r24,lo8(12)	 ;  13	andqi3/2	[length = 1]
/* epilogue start */
	ret			 ;  26	return		[length = 1]

The patch makes two instructions out of "swap"+"andi" of which "andi" is optimized away:

test:
/* prologue: function */
/* frame size = 0 */
	in r24,50-0x20		 ;  6	*movqi/4	[length = 1]
	swap r24		 ;  7	_rotlqi3_const4	[length = 1]
	andi r24,lo8(12)	 ;  14	andqi3/2	[length = 1]
/* epilogue start */
	ret			 ;  27	return		[length = 1]
Comment 9 aesok 2008-08-22 21:26:05 UTC
Subject: Bug 11259

Author: aesok
Date: Fri Aug 22 21:24:56 2008
New Revision: 139502

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=139502
Log:
	PR target/11259
	* config/avr/avr.md (UNSPEC_SWAP): New constants.
	(*swap): New insn pattern.
	(*ashlqi3): Rename from ashlqi3 insn pattern.
	(ashlqi3): New expanders.
	(*lshrqi3): Rename from lshrqi3 insn pattern.
	(lshrqi3): New expanders.	
	(ashlqi3_const4, ashlqi3_const5, ashlqi3_const6, lshrqi3_const4,
	lshrqi3_const5, lshrqi3_const6): New splitters.
	(andi, ashlqi3_l_const4, ashlqi3_l_const5, ashlqi3_l_const6,
	lshrqi3_l_const4, lshrqi3_l_const5, lshrqi3_l_const6): Define
	peephole2 patterns.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/avr/avr.md

Comment 10 Eric Weddington 2008-08-22 23:06:17 UTC
Fixed on 4.4.