This is the mail archive of the gcc-bugs@gcc.gnu.org mailing list for the GCC project.



target/8871: Inefficient zero_extendsidi2 for MMX


>Number:         8871
>Category:       target
>Synopsis:       Inefficient zero_extendsidi2 for MMX
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    unassigned
>State:          open
>Class:          pessimizes-code
>Submitter-Id:   net
>Arrival-Date:   Sat Dec 07 22:06:01 PST 2002
>Closed-Date:
>Last-Modified:
>Originator:     otaylor@redhat.com
>Release:        CVS Head, 7 December 2002
>Organization:
>Environment:
Linux/ia32
>Description:
When moving a 32-bit quantity into an MMX register,
GCC first zero-extends it as if doing 64-bit arithmetic
emulation, then uses movq to move it into the register.
So it generates code like:

===
        xorl    %edx, %edx
        movl    %eax, -16(%ebp)
        movl    %edx, -12(%ebp)
        movq    -16(%ebp), %mm1
===

Instead of simply:

===
        movd    %eax, %mm1
===

This (and the associated overhead) causes a
significant performance hit for typical uses of MMX.
The attached demonstration patch improved one
alpha-compositing routine from 29 million pixels/sec
to 51 million pixels/sec. (With the patch, results
for a range of routines were comparable to
hand-written assembly.)

The attached patch just replaces the existing
patterns for zero_extendsidi2 with a single pattern
using movd. This is clearly wrong as a general fix
(it drops the patterns for general registers and
for 64-bit targets), but my minimal GCC hacking
skills proved unequal to integrating it properly.
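
For readers who want the equivalence of the two
instruction sequences spelled out, here is a minimal
sketch in portable C. It is not part of the original
report; the function names are hypothetical, and the
comments only map each step loosely onto the assembly
shown above. It relies on the documented behaviour that
movd from a 32-bit source clears the upper 32 bits of
the MMX destination.

```c
#include <stdint.h>

/* Mirrors the four-instruction sequence: zero a second word, store both
   halves, then read the pair back as one 64-bit value (the movq). */
uint64_t zext_via_two_words(uint32_t a)
{
    uint32_t lo = a;   /* movl %eax, -16(%ebp) */
    uint32_t hi = 0;   /* xorl %edx, %edx ; movl %edx, -12(%ebp) */
    return ((uint64_t)hi << 32) | lo;   /* movq -16(%ebp), %mm1 */
}

/* Mirrors the single movd: the 32-bit source is zero-extended
   into the 64-bit destination in one step. */
uint64_t zext_direct(uint32_t a)
{
    return (uint64_t)a;
}
```

Both functions return the same value for every input,
which is why the single movd is a valid replacement.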
>How-To-Repeat:
A simple example demonstrating the code generation
is:

===
typedef int di __attribute__ ((mode(DI)));

di foo (unsigned int a, unsigned int b)
{
  return __builtin_ia32_por (a, b);
}
===
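
As a rough companion to the example above, the same
operation can be sketched with the <mmintrin.h>
intrinsics. This is an assumption-laden illustration,
not the reporter's test case: foo_intrin is a
hypothetical name, the plain-C fallback exists only so
the file builds on targets without MMX, and because the
widening is requested explicitly here it may not go
through the zero_extendsidi2 pattern the report is
about.

```c
#include <stdint.h>
#include <string.h>

#if defined(__MMX__)
#include <mmintrin.h>
#endif

/* Hypothetical counterpart of foo() from the report: OR two 32-bit
   values after widening each one to 64 bits. */
uint64_t foo_intrin(unsigned int a, unsigned int b)
{
#if defined(__MMX__)
    __m64 ma = _mm_cvtsi32_si64((int)a);  /* movd: upper 32 bits are zero */
    __m64 mb = _mm_cvtsi32_si64((int)b);
    __m64 mr = _mm_or_si64(ma, mb);       /* por */
    uint64_t out;
    memcpy(&out, &mr, sizeof out);        /* copy the 64-bit result out */
    _mm_empty();                          /* emms: leave MMX state */
    return out;
#else
    /* Fallback for targets without MMX: same arithmetic in plain C. */
    return (uint64_t)a | (uint64_t)b;
#endif
}
```

Compiling either version with something like
"gcc -O2 -mmmx -S" and reading the generated assembly
shows which instruction sequence the compiler picked.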
>Fix:

>Release-Note:
>Audit-Trail:
>Unformatted:
----gnatsweb-attachment----
Content-Type: application/octet-stream; name="zero_extend.patch"
Content-Disposition: attachment; filename="zero_extend.patch"

Index: i386.md
===================================================================
RCS file: /cvsroot/gcc/gcc/gcc/config/i386/i386.md,v
retrieving revision 1.404
diff -u -p -r1.404 i386.md
--- i386.md	19 Nov 2002 22:52:40 -0000	1.404
+++ i386.md	8 Dec 2002 05:42:34 -0000
@@ -3031,60 +3031,12 @@
 	      (clobber (reg:CC 17))])]
   "")
 
-;; %%% Kill me once multi-word ops are sane.
-(define_expand "zero_extendsidi2"
-  [(set (match_operand:DI 0 "register_operand" "=r")
-     (zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "rm")))]
-  ""
-  "if (!TARGET_64BIT)
-     {
-       emit_insn (gen_zero_extendsidi2_32 (operands[0], operands[1]));
-       DONE;
-     }
-  ")
-
-(define_insn "zero_extendsidi2_32"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,?r,?*o")
-	(zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "0,rm,r")))
-   (clobber (reg:CC 17))]
-  "!TARGET_64BIT"
-  "#"
-  [(set_attr "mode" "SI")])
-
-(define_insn "zero_extendsidi2_rex64"
-  [(set (match_operand:DI 0 "nonimmediate_operand" "=r,o")
-     (zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "rm,0")))]
-  "TARGET_64BIT"
-  "@
-   mov\t{%k1, %k0|%k0, %k1}
-   #"
-  [(set_attr "type" "imovx,imov")
-   (set_attr "mode" "SI,DI")])
-
-(define_split
-  [(set (match_operand:DI 0 "memory_operand" "")
-     (zero_extend:DI (match_dup 0)))]
-  "TARGET_64BIT"
-  [(set (match_dup 4) (const_int 0))]
-  "split_di (&operands[0], 1, &operands[3], &operands[4]);")
-
-(define_split 
-  [(set (match_operand:DI 0 "register_operand" "")
-	(zero_extend:DI (match_operand:SI 1 "register_operand" "")))
-   (clobber (reg:CC 17))]
-  "!TARGET_64BIT && reload_completed
-   && true_regnum (operands[0]) == true_regnum (operands[1])"
-  [(set (match_dup 4) (const_int 0))]
-  "split_di (&operands[0], 1, &operands[3], &operands[4]);")
-
-(define_split 
-  [(set (match_operand:DI 0 "nonimmediate_operand" "")
-	(zero_extend:DI (match_operand:SI 1 "general_operand" "")))
-   (clobber (reg:CC 17))]
-  "!TARGET_64BIT && reload_completed"
-  [(set (match_dup 3) (match_dup 1))
-   (set (match_dup 4) (const_int 0))]
-  "split_di (&operands[0], 1, &operands[3], &operands[4]);")
+(define_insn "zero_extendsidi2"
+  [(set (match_operand:DI 0 "nonimmediate_operand" "=y")
+	(zero_extend:DI (match_operand:SI 1 "nonimmediate_operand" "rm")))]
+  "TARGET_MMX"
+  "movd\t{%1, %0|%0, %1}"
+  [(set_attr "mode" "DI")])
 
 (define_insn "zero_extendhidi2"
   [(set (match_operand:DI 0 "register_operand" "=r,r")

