Bug 33049 - [avr] bit extraction non optimal, inversing logic solves problem
Summary: [avr] bit extraction non optimal, inversing logic solves problem
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: target (show other bugs)
Version: 4.1.2
: P3 normal
Target Milestone: 4.7.0
Assignee: Georg-Johann Lay
URL:
Keywords: missed-optimization
: 49446 (view as bug list)
Depends on:
Blocks:
 
Reported: 2007-08-11 17:53 UTC by Wouter van Gulik
Modified: 2011-09-18 12:22 UTC (History)
6 users (show)

See Also:
Host:
Target: avr-*-*
Build:
Known to work:
Known to fail: 4.1.2, 4.2.1, 4.3.0
Last reconfirmed: 2007-08-24 20:35:53


Attachments
examples of good extraction and bad extraction (798 bytes, text/plain)
2007-08-11 17:56 UTC, Wouter van Gulik
Details
Assembler output of testcase using 4.1.2 (328 bytes, text/plain)
2007-08-24 18:52 UTC, Wouter van Gulik
Details
Correct assembler output of test case for 4.1.2. (888 bytes, text/plain)
2007-08-24 20:25 UTC, Eric Weddington
Details
Test case assembler output for 4.2.1. (877 bytes, text/plain)
2007-08-24 20:30 UTC, Eric Weddington
Details
Test case assembler output for 4.3.0 20070817 snapshot. (711 bytes, text/plain)
2007-08-24 20:35 UTC, Eric Weddington
Details
Test case assembler output for 4.5.0. (655 bytes, application/octet-stream)
2010-09-14 06:23 UTC, abnikant
Details
Proposed patch. (548 bytes, patch)
2011-05-17 18:57 UTC, Georg-Johann Lay
Details | Diff
Assembler output from 4.7.0 (r173832) with patch applied. (845 bytes, text/plain)
2011-05-17 19:00 UTC, Georg-Johann Lay
Details
Proposed patch (549 bytes, patch)
2011-05-18 08:06 UTC, Georg-Johann Lay
Details | Diff
pr33049.diff (660 bytes, patch)
2011-06-20 10:58 UTC, Georg-Johann Lay
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Wouter van Gulik 2007-08-11 17:53:29 UTC
Using this version/config:

~~~~~~~~~~~~~~~~~
Using built-in specs.
Target: avr
Configured with: ../gcc-4.1.2/configure --prefix=/c/WinAVR --target=avr --enable
-languages=c,c++ --with-dwarf2 --enable-win32-registry=WinAVR-20070525 --disable
-nls --with-gmp=/usr/local --with-mpfr=/usr/local --enable-doc --disable-libssp
Thread model: single
gcc version 4.1.2 (WinAVR 20070525)

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I compiled the source using this command line:

avr-gcc -S -Os test.c -mmcu=atmega16

These examples also suffers from double and instruction missed (bugID 11259)

The key point is that writing an expression like:
tmp = 0; if(var&(1<<N)) tmp|=1;
results in an N shift (done by loop if N larger then 3)

While rewriting to:
tmp = 0; if(var>>N)&1) tmp|=1;
This uses a swap instruction and then shifts. It's optimal (except from the same loss as bugID 12259 for N >= 4)

Maybe this gives a hint on where the shifting is generally going wrong. I tried 3 approaches in the test.c file.

Another interresting thing is the removal off the extra and instructions in the last examples in the file.
Comment 1 Wouter van Gulik 2007-08-11 17:56:29 UTC
Created attachment 14053 [details]
examples of good extraction and bad extraction

Adding the test file showing the missed optimization
Comment 2 Andrew Pinski 2007-08-11 17:58:23 UTC
It might be interesting if you tried 4.2.1.
Comment 3 Eric Weddington 2007-08-22 16:57:51 UTC
Wouter, please attach the assembly output that you are getting for your test.c file using 4.1.2. That way we can compare it to other compiler versions.

Thanks,
Eric
Comment 4 Wouter van Gulik 2007-08-24 18:52:11 UTC
Created attachment 14105 [details]
Assembler output of testcase using 4.1.2

This is the requested assembler output that Eric asked for
Comment 5 Eric Weddington 2007-08-24 20:25:37 UTC
Created attachment 14106 [details]
Correct assembler output of test case for 4.1.2.
Comment 6 Eric Weddington 2007-08-24 20:30:34 UTC
Created attachment 14107 [details]
Test case assembler output for 4.2.1.

Not really any better than 4.1.2.
Comment 7 Eric Weddington 2007-08-24 20:35:09 UTC
Created attachment 14108 [details]
Test case assembler output for 4.3.0 20070817 snapshot.

Again, only marginally better.
Comment 8 abnikant 2010-09-14 06:23:41 UTC
Created attachment 21787 [details]
Test case assembler output for 4.5.0.
Comment 9 abnikant 2010-09-14 06:25:46 UTC
Lot better code size in gcc-4.5.0 and above [head]. See the attachment in comment #8.
Comment 10 Georg-Johann Lay 2011-05-17 18:57:39 UTC
Created attachment 24264 [details]
Proposed patch.

Proposed Patch (also compatible with older versions of GCC).

The insn needs at most 3 instructions and does not put pressure on d-regs.

* config/avr/avr.md ("*extzv"): New insn.
Comment 11 Georg-Johann Lay 2011-05-17 19:00:05 UTC
Created attachment 24265 [details]
Assembler output from 4.7.0 (r173832) with patch applied.

Assembler output from 4.7.0 (r173832) with patch applied.

There's not much room for improvement left now.
Comment 12 Georg-Johann Lay 2011-05-18 08:06:53 UTC
Created attachment 24277 [details]
Proposed patch

Patch that is less fuzzy in its attributes.
Comment 13 Wouter van Gulik 2011-05-19 10:38:38 UTC
Yep it looks a lot better now.
The if statements could be optimized into the equivalent shift instructions but that is not a AVR backend problem I guess.
I noticed that the disassembly shows wrong lengths for some outputs of extzv. Is that a problem?
Comment 14 Georg-Johann Lay 2011-05-30 14:33:31 UTC
(In reply to comment #13)
> Yep it looks a lot better now.
> The if statements could be optimized into the equivalent shift instructions but
> that is not a AVR backend problem I guess.

Shifting generally is more expensive instead of bit extracting; at least if the bit offset is known at compile time.

> I noticed that the disassembly shows wrong lengths for some outputs of extzv.
> Is that a problem?

It's not a problem if the sequence actually printed is not greater than the instruction length reported. If the reported instruction length was strinct greater, an assembler error could occur because relative jump offets migh be out of scope.

The only case where the instruction length is smaller than reported is if IN_REG=OUT_REG and BITPOS=4 (sequence is swap + andi 1). The instruction length for lsr + andi 1 (IN_REG=OUT_REG, BITPOS=1) is already corrected in the patch.
Comment 15 HotHead 2011-06-17 21:38:20 UTC
*** Bug 49446 has been marked as a duplicate of this bug. ***
Comment 16 Georg-Johann Lay 2011-06-20 10:50:27 UTC
*** Bug 49446 has been marked as a duplicate of this bug. ***
Comment 17 Georg-Johann Lay 2011-06-20 10:58:59 UTC
Created attachment 24563 [details]
pr33049.diff

Patch also covering PR49446.

Tested without regression against SVN 175201

	PR target/33049
	* config/avr/avr.md (extzv): New expander.
	(*extzv, *extzv.qihi1, *extzv.qihi2): New insn-and-split.
Comment 18 Georg-Johann Lay 2011-06-21 17:30:57 UTC
Author: gjl
Date: Tue Jun 21 17:30:54 2011
New Revision: 175269

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=175269
Log:
	PR target/33049
	* config/avr/avr.md (extzv): New expander.
	(*extzv): New insn.
	(*extzv.qihi1, *extzv.qihi2): New insn-and-split.
	* config/avr/constraints.md (C04): New constraint.
	* doc/md.texi (Machine Constraints): Document it.


Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/config/avr/avr.md
    trunk/gcc/config/avr/constraints.md
    trunk/gcc/doc/md.texi
Comment 19 Georg-Johann Lay 2011-06-21 17:36:35 UTC
Closing as resolved+fixed according to applied patch.
Comment 20 Georg-Johann Lay 2011-09-18 12:22:58 UTC
See also http://lists.gnu.org/archive/html/avr-gcc-list/2008-12/msg00009.html