This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.
[Bug rtl-optimization/34072] unoptimal byte extraction.
- From: "rask at gcc dot gnu dot org" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 14 Nov 2007 01:44:03 -0000
- Subject: [Bug rtl-optimization/34072] unoptimal byte extraction.
- References: <bug-34072-7667@http.gcc.gnu.org/bugzilla/>
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
------- Comment #1 from rask at gcc dot gnu dot org 2007-11-14 01:44 -------
With -S -dp it is clear that only byte0 is optimized:
byte0:
        movzbl  4(%esp), %eax           # 11    *movqi_1/3
byte1:
        movl    4(%esp), %eax           # 24    *movsi_1/1
        movl    8(%esp), %edx           # 25    *movsi_1/1
        shrdl   $8, %edx, %eax          # 30    x86_shrd_1/1
byte6:
        movzwl  10(%esp), %eax          # 24    *zero_extendhisi2_movzwl
byte7:
        movzbl  11(%esp), %eax          # 28    *zero_extendqisi2_movzbw
They should all be optimized to use movqi. The first part of the problem is
that any of cse, cse2, gcse and fwprop will combine these instructions:
(insn 7 6 8 2 /tmp/pr34072.c:3 (set (reg:QI 60)
        (subreg:QI (reg:SI 64) 0)) 62 {*movqi_1} (nil))
(insn 8 7 12 2 /tmp/pr34072.c:3 (set (reg:QI 58 [ <result> ])
        (reg:QI 60)) 62 {*movqi_1} (nil))
(insn 12 8 18 2 /tmp/pr34072.c:3 (set (reg/i:QI 0 ax)
        (reg:QI 58 [ <result> ])) 62 {*movqi_1} (nil))
into
(insn 12 8 18 2 /tmp/pr34072.c:3 (set (reg/i:QI 0 ax [ <result> ])
        (subreg:QI (reg:SI 64) 0)) 62 {*movqi_1} (nil))
and then combine won't touch it because of the hard register (ax) and
SMALL_REGISTER_CLASSES and/or CLASS_LIKELY_SPILLED. The fix is to teach
these passes to not combine these insns, as demonstrated using
-fno-forward-propagate -fno-gcse -fno-rerun-cse-after-loop -fno-cse[1]:
byte6:
        movzbl  10(%esp), %eax          # 8     *movqi_1/3
byte7:
        movzbl  11(%esp), %eax          # 8     *movqi_1/3
Byte1 is still not optimized because we're failing to simplify this
instruction in combine:
(set (reg:QI 60)
    (subreg:QI (lshiftrt:DI (mem/c/i:DI (reg/f:SI 16 argp) [2 x+0 S8 A32])
            (const_int 8 [0x8])) 0))
It should be entirely possible to simplify it to this:
(set (reg:QI 60) (mem/c/i:QI (plus:SI (reg/f:SI 16 argp) (const_int 1))))
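
To illustrate why that simplification is valid, here is a rough C sketch of
the two forms. The test case itself is not quoted in this comment, so
byte_by_shift() is only an assumed stand-in for it, and all the names below
are made up for illustration. On a little-endian target such as x86, shifting
a 64-bit value right by 8*n and truncating to a byte gives the same result as
loading the single byte at offset n, which is what the QImode mem above does
for n = 1.

```c
#include <stdint.h>
#include <string.h>

/* Assumed shape of the byteN functions: shift, then truncate to a byte.
   This corresponds to (subreg:QI (lshiftrt:DI ...) 0) in the RTL. */
unsigned char byte_by_shift(uint64_t x, unsigned n)
{
    return (unsigned char)(x >> (8 * n));
}

/* The simplified form: one byte load at offset n from the start of x.
   Only equivalent to the shift form on a little-endian target. */
unsigned char byte_by_load(const uint64_t *x, unsigned n)
{
    unsigned char b;
    memcpy(&b, (const unsigned char *)x + n, 1);
    return b;
}

/* Returns 1 if the least significant byte is stored first. */
int is_little_endian(void)
{
    const uint16_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Returns 1 if both forms agree for all eight bytes of x. */
int forms_agree(uint64_t x)
{
    for (unsigned n = 0; n < 8; n++)
        if (byte_by_shift(x, n) != byte_by_load(&x, n))
            return 0;
    return 1;
}
```

On x86, forms_agree() should return 1 for any input. On a big-endian target
the byte offset would have to be 7 - n instead.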
[1] An option I hacked in to debug this problem.
--
rask at gcc dot gnu dot org changed:
              What|Removed                     |Added
----------------------------------------------------------------------------
            Status|UNCONFIRMED                 |NEW
         Component|target                      |rtl-optimization
    Ever Confirmed|0                           |1
          Keywords|                            |missed-optimization
     Known to fail|                            |4.3.0
  Last reconfirmed|0000-00-00 00:00:00         |2007-11-14 01:44:03
              date|                            |
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34072