Bug 27856 - With -Os, loading a constant to a register can use another register
Status: RESOLVED DUPLICATE of bug 22072
Alias: None
Product: gcc
Classification: Unclassified
Component: target
Version: 4.1.1
Importance: P3 enhancement
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: missed-optimization, ra
Depends on: 18427 22072
Blocks:
Reported: 2006-06-01 11:35 UTC by etienne_lorrain
Modified: 2024-02-22 22:41 UTC (History)
CC List: 5 users

See Also:
Host: i686-pc-linux-gnu
Target: i686-pc-linux-gnu
Build: i686-pc-linux-gnu
Known to work:
Known to fail:
Last reconfirmed: 2006-06-01 17:21:06


Description etienne_lorrain 2006-06-01 11:35:27 UTC
$ cat tmp.c
unsigned athird (unsigned val)
  {
  return val / 3;
  }
$ /home/etienne/projet/toolchain/bin/gcc -S -Os -o tmp.s -fomit-frame-pointer -fverbose-asm tmp.c
$ cat tmp.s
        .file   "tmp.c"
# GNU C version 4.1.1 (i686-pc-linux-gnu)
#       compiled by GNU C version 4.1.1.
# GGC heuristics: --param ggc-min-expand=62 --param ggc-min-heapsize=60570
# options passed:  -mtune=pentiumpro -auxbase-strip -Os
# -fomit-frame-pointer -fverbose-asm
# options enabled:  -falign-loops -fargument-alias -fbranch-count-reg
# -fcaller-saves -fcommon -fcprop-registers -fcrossjumping
# -fcse-follow-jumps -fcse-skip-blocks -fdefer-pop
# -fdelete-null-pointer-checks -fearly-inlining
# -feliminate-unused-debug-types -fexpensive-optimizations -ffunction-cse
# -fgcse -fgcse-lm -fguess-branch-probability -fident -fif-conversion
# -fif-conversion2 -finline-functions -finline-functions-called-once
# -fipa-pure-const -fipa-reference -fipa-type-escape -fivopts
# -fkeep-static-consts -fleading-underscore -floop-optimize
# -floop-optimize2 -fmath-errno -fmerge-constants -fomit-frame-pointer
# -foptimize-register-move -foptimize-sibling-calls -fpcc-struct-return
# -fpeephole -fpeephole2 -fregmove -freorder-functions
# -frerun-cse-after-loop -frerun-loop-opt -fsched-interblock -fsched-spec
# -fsched-stalled-insns-dep -fschedule-insns2 -fshow-column
# -fsplit-ivs-in-unroller -fstrength-reduce -fstrict-aliasing
# -fthread-jumps -ftrapping-math -ftree-ccp -ftree-copy-prop
# -ftree-copyrename -ftree-dce -ftree-dominator-opts -ftree-dse -ftree-fre
# -ftree-loop-im -ftree-loop-ivcanon -ftree-loop-optimize -ftree-lrs
# -ftree-salias -ftree-sink -ftree-sra -ftree-store-ccp
# -ftree-store-copy-prop -ftree-ter -ftree-vect-loop-version -ftree-vrp
# -funit-at-a-time -fverbose-asm -fzero-initialized-in-bss -m32 -m80387
# -m96bit-long-double -malign-stringops -mfancy-math-387 -mfp-ret-in-387
# -mieee-fp -mno-red-zone -mpush-args -mtls-direct-seg-refs

# Compiler executable checksum: acc0f3237f8807740daa75cf2b5b2d98

        .text
.globl athird
        .type   athird, @function
athird:
        movl    4(%esp), %eax   # val, val
        movl    $3, %edx        #, tmp63
        movl    %edx, %ecx      # tmp63,
        xorl    %edx, %edx      # tmp62
        divl    %ecx    #
        ret
        .size   athird, .-athird
        .ident  "GCC: (GNU) 4.1.1"
        .section        .note.GNU-stack,"",@progbits

  Here tmp63 does not need to pass through %edx, and the two lines:
        movl    $3, %edx        #, tmp63
        movl    %edx, %ecx      # tmp63,
  should be replaced by:
        movl    $3, %ecx
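
  For illustration only (not part of the original report): a minimal C sketch that models the emitted sequence and the proposed one, to show that they compute the same result. It assumes the usual x86 semantics of divl (the 64-bit value %edx:%eax is divided by the operand, the quotient goes to %eax and the remainder to %edx). The names struct regs, divl_model, athird_emitted and athird_proposed are hypothetical and exist only in this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register file: only the three registers the listing uses. */
struct regs {
    uint32_t eax, ecx, edx;
};

/* Model of `divl src`: divide EDX:EAX by src, quotient to EAX,
   remainder to EDX. The real instruction faults when the quotient does
   not fit in 32 bits; with EDX cleared beforehand that cannot happen. */
static void divl_model(struct regs *r, uint32_t src)
{
    uint64_t dividend = ((uint64_t)r->edx << 32) | r->eax;
    r->eax = (uint32_t)(dividend / src);
    r->edx = (uint32_t)(dividend % src);
}

/* The sequence GCC 4.1.1 emits: the constant goes through %edx first. */
unsigned athird_emitted(unsigned val)
{
    struct regs r = {0};
    r.eax = val;      /* movl 4(%esp), %eax */
    r.edx = 3;        /* movl $3, %edx      */
    r.ecx = r.edx;    /* movl %edx, %ecx    */
    r.edx = 0;        /* xorl %edx, %edx    */
    divl_model(&r, r.ecx);
    return r.eax;
}

/* The proposed sequence: the constant is loaded straight into %ecx. */
unsigned athird_proposed(unsigned val)
{
    struct regs r = {0};
    r.eax = val;      /* movl 4(%esp), %eax */
    r.ecx = 3;        /* movl $3, %ecx      */
    r.edx = 0;        /* xorl %edx, %edx    */
    divl_model(&r, r.ecx);
    return r.eax;
}
```

  Both functions return val / 3 for every input. The divisor cannot stay in %edx because divl uses %edx as the high half of the dividend, so it has to be cleared before the division. That is why a second register is involved at all; the complaint is only about the extra copy.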
Comment 1 Andrew Pinski 2006-06-01 17:21:05 UTC
Confirmed, this is a register allocation (RA) issue.
Before register allocation:
(insn:HI 10 7 11 2 (set (reg:SI 63)
        (const_int 3 [0x3])) 34 {*movsi_1} (nil)
    (expr_list:REG_EQUIV (const_int 3 [0x3])
        (nil)))

(insn:HI 11 10 15 2 (parallel [
            (set (reg:SI 61)
                (udiv:SI (reg/v:SI 59 [ val ])
                    (reg:SI 63)))
            (set (reg:SI 62)
                (umod:SI (reg/v:SI 59 [ val ])
                    (reg:SI 63)))
            (clobber (reg:CC 17 flags))
        ]) 197 {udivmodsi4} (insn_list:REG_DEP_TRUE 6 (insn_list:REG_DEP_TRUE 10 (nil)))
    (expr_list:REG_UNUSED (reg:CC 17 flags)
        (expr_list:REG_UNUSED (reg:SI 62)
            (expr_list:REG_DEAD (reg/v:SI 59 [ val ])
                (expr_list:REG_DEAD (reg:SI 63)
                    (expr_list:REG_UNUSED (reg:CC 17 flags)
                        (expr_list:REG_UNUSED (reg:SI 62)
                            (nil))))))))

-----------
After:
(insn:HI 10 7 30 2 (set (reg:SI 1 dx [63])
        (const_int 3 [0x3])) 34 {*movsi_1} (nil)
    (expr_list:REG_EQUIV (const_int 3 [0x3])
        (nil)))

(insn 30 10 11 2 (set (reg:SI 2 cx)
        (reg:SI 1 dx [63])) 34 {*movsi_1} (nil)
    (nil))

(insn:HI 11 30 15 2 (parallel [
            (set (reg:SI 0 ax [61])
                (udiv:SI (reg/v:SI 0 ax [orig:59 val ] [59])
                    (reg:SI 2 cx)))
            (set (reg:SI 1 dx [62])
                (umod:SI (reg/v:SI 0 ax [orig:59 val ] [59])
                    (reg:SI 2 cx)))
            (clobber (reg:CC 17 flags))
        ]) 197 {udivmodsi4} (insn_list:REG_DEP_TRUE 6 (insn_list:REG_DEP_TRUE 10 (nil)))
    (nil))
Comment 2 Andrew Pinski 2006-06-01 17:24:16 UTC
And "yara" gets this correct.
Comment 3 etienne_lorrain 2006-09-05 11:32:22 UTC
 Just for info, does that mean we need to wait for YARA to be included, which, considering
http://gcc.gnu.org/ml/gcc/2006-08/msg00164.html
 will probably happen after 4.2?

 I am seeing a lot of these, including patterns like this one to clear %eax:
xor %edx,%edx
...
mov %edx,%eax
... use %eax ...
mov $10,%edx

 Thanks.
Comment 4 Steven Bosscher 2007-04-04 12:19:39 UTC
Still see this.
Comment 5 Andrew Pinski 2008-09-14 04:30:46 UTC
This is really the same issue as bug 22072.

*** This bug has been marked as a duplicate of 22072 ***