This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



[Bug rtl-optimization/34072] unoptimal byte extraction.



------- Comment #1 from rask at gcc dot gnu dot org  2007-11-14 01:44 -------
With -S -dp it is clear that only byte0 is optimized:

byte0:
        movzbl  4(%esp), %eax   # 11    *movqi_1/3
byte1:
        movl    4(%esp), %eax   # 24    *movsi_1/1
        movl    8(%esp), %edx   # 25    *movsi_1/1
        shrdl   $8, %edx, %eax  # 30    x86_shrd_1/1
byte6:
        movzwl  10(%esp), %eax  # 24    *zero_extendhisi2_movzwl
byte7:
        movzbl  11(%esp), %eax  # 28    *zero_extendqisi2_movzbw
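For readers without the test case at hand: the source of pr34072.c is not quoted in this comment, so the following is only a plausible reconstruction, assuming each function takes a 64-bit argument x (as the 8-byte argp memory reference further down suggests) and returns one byte of it. The exact types and shift expressions are assumptions, and check_bytes is a hypothetical helper added for illustration.

```c
#include <assert.h>

/* Hypothetical reconstruction of the test case; NOT quoted from pr34072.c.
   Each function returns one byte of a 64-bit argument. */
unsigned char byte0(unsigned long long x) { return x; }
unsigned char byte1(unsigned long long x) { return x >> 8; }
unsigned char byte6(unsigned long long x) { return x >> 48; }
unsigned char byte7(unsigned long long x) { return x >> 56; }

/* Sanity check: each function returns the byte its name refers to. */
void check_bytes(void)
{
    unsigned long long x = 0x0807060504030201ULL;
    assert(byte0(x) == 0x01);
    assert(byte1(x) == 0x02);
    assert(byte6(x) == 0x07);
    assert(byte7(x) == 0x08);
}
```

Compiling something like this with -O2 -S -dp on a 32-bit x86 target should give output comparable to the listing above, although the exact insn numbers will depend on the real source.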

They should all be optimized to use movqi. The first part of the problem is
that any of cse, cse2, gcse and fwprop will combine these instructions:

(insn 7 6 8 2 /tmp/pr34072.c:3 (set (reg:QI 60)
        (subreg:QI (reg:SI 64) 0)) 62 {*movqi_1} (nil))

(insn 8 7 12 2 /tmp/pr34072.c:3 (set (reg:QI 58 [ <result> ])
        (reg:QI 60)) 62 {*movqi_1} (nil))

(insn 12 8 18 2 /tmp/pr34072.c:3 (set (reg/i:QI 0 ax)
        (reg:QI 58 [ <result> ])) 62 {*movqi_1} (nil))

into

(insn 12 8 18 2 /tmp/pr34072.c:3 (set (reg/i:QI 0 ax [ <result> ])
        (subreg:QI (reg:SI 64) 0)) 62 {*movqi_1} (nil))

and then combine won't touch it because of the hard register (ax) and
SMALL_REGISTER_CLASSES and/or CLASS_LIKELY_SPILLED. The fix is to teach
these passes to not combine these insns, as demonstrated using
-fno-forward-propagate -fno-gcse -fno-rerun-cse-after-loop -fno-cse[1]:

byte6:
        movzbl  10(%esp), %eax  # 8     *movqi_1/3
byte7:
        movzbl  11(%esp), %eax  # 8     *movqi_1/3

Byte1 is still not optimized because we're failing to simplify this
instruction in combine:

(set (reg:QI 60)
    (subreg:QI (lshiftrt:DI (mem/c/i:DI (reg/f:SI 16 argp) [2 x+0 S8 A32])
            (const_int 8 [0x8])) 0))

It should be entirely possible to simplify it to this:

(set (reg:QI 60) (mem/c/i:QI (plus:SI (reg/f:SI 16 argp) (const_int 1))))
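The equivalence behind that simplification can be checked at the C level. This is a minimal sketch, not compiler code: byte1_shift mirrors the shift-then-truncate form, byte1_load mirrors the single byte load at offset 1, and is_little_endian is a helper added here because the two only agree on a little-endian target such as x86. All three names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Mirrors the unsimplified RTL: shift the 64-bit value right by 8
   and keep the low byte. */
static uint8_t byte1_shift(const uint64_t *p)
{
    return (uint8_t)(*p >> 8);
}

/* Mirrors the simplified RTL: load a single byte from offset 1.
   Equivalent to byte1_shift only on a little-endian target. */
static uint8_t byte1_load(const uint64_t *p)
{
    uint8_t b;
    memcpy(&b, (const unsigned char *)p + 1, 1);
    return b;
}

/* Hypothetical helper: detect byte order at run time. */
static int is_little_endian(void)
{
    uint16_t v = 1;
    unsigned char c;
    memcpy(&c, &v, 1);
    return c == 1;
}
```

On a big-endian target the same byte would live at offset 6 instead, so the offset used by any such simplification has to depend on the target's byte order.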

[1] An option I hacked in to debug this problem.


-- 

rask at gcc dot gnu dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
          Component|target                      |rtl-optimization
     Ever Confirmed|0                           |1
           Keywords|                            |missed-optimization
      Known to fail|                            |4.3.0
   Last reconfirmed|0000-00-00 00:00:00         |2007-11-14 01:44:03
               date|                            |


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=34072

