This is the mail archive of the
gcc-bugs@gcc.gnu.org
mailing list for the GCC project.
[Bug middle-end/40893] New: ARM and PPC truncate intermediate operations unnecessarily
- From: "lessen42+gcc at gmail dot com" <gcc-bugzilla at gcc dot gnu dot org>
- To: gcc-bugs at gcc dot gnu dot org
- Date: 28 Jul 2009 16:28:11 -0000
- Subject: [Bug middle-end/40893] New: ARM and PPC truncate intermediate operations unnecessarily
- Reply-to: gcc-bugzilla at gcc dot gnu dot org
Consider the following C code:
#include <inttypes.h>
void dct2x2dc_dconly( int16_t d[2][2] )
{
    int d0 = d[0][0] + d[0][1];
    int d1 = d[1][0] + d[1][1];
    d[0][0] = d0 + d1;
    d[0][1] = d0 - d1;
}
The following is generated with arm-none-linux-gnueabi-gcc-4.4.0 -O3
-mcpu=cortex-a8 -S
dct2x2dc_dconly:
ldrsh ip, [r0, #2]
ldrsh r3, [r0, #0]
ldrsh r1, [r0, #6]
ldrsh r2, [r0, #4]
add r3, ip, r3
add r2, r1, r2
uxth r3, r3
uxth r2, r2
rsb r1, r2, r3
add r3, r2, r3
strh r1, [r0, #2] @ movhi
strh r3, [r0, #0] @ movhi
bx lr
(With pre-ARMv6 targets, the two uxth are replaced by asl #16, lsr #16 pairs.)
The following is generated with powerpc-unknown-linux-gnu-gcc-4.4.0 -O3
-mcpu=G4 -S
dct2x2dc_dconly:
lha 10,2(3)
lha 0,0(3)
lha 11,6(3)
lha 9,4(3)
add 0,10,0
rlwinm 0,0,0,0xffff
add 9,11,9
rlwinm 9,9,0,0xffff
subf 11,9,0
add 0,9,0
sth 11,2(3)
sth 0,0(3)
blr
The two uxth in the ARM version and the two rlwinm in the PPC version are
completely unnecessary: the sum and difference are only ever written back with
strh/sth, which keep just the low 16 bits, so letting the store truncate gives
equivalent results. x86 does not exhibit this behaviour, and on both ARM and
PPC, removing either d0 + d1 or d0 - d1 stops d0 and d1 from being truncated
to 16 bits.
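The equivalence claimed above can be checked with a small sketch. The function
names below (dc_store_truncates, dc_truncates_early, variants_agree,
check_corner_cases) are hypothetical and not part of the original test case.
The second variant is only a C-level model of what uxth/rlwinm do to the
intermediates, not the code GCC actually emits. The sketch also assumes GCC's
documented modulo behaviour when an out-of-range value is converted to
int16_t.

```c
#include <assert.h>
#include <stdint.h>

/* Variant A: full-width intermediates; the final 16-bit store is the
 * only truncation (what the report argues is sufficient). */
void dc_store_truncates(int16_t d[2][2])
{
    int d0 = d[0][0] + d[0][1];
    int d1 = d[1][0] + d[1][1];
    d[0][0] = (int16_t)(d0 + d1);  /* assumes modulo-2^16 conversion, as in GCC */
    d[0][1] = (int16_t)(d0 - d1);
}

/* Variant B: models the extra uxth/rlwinm by zero-extending the low
 * 16 bits of each intermediate before the final add/subtract. */
void dc_truncates_early(int16_t d[2][2])
{
    unsigned d0 = (uint16_t)(d[0][0] + d[0][1]);
    unsigned d1 = (uint16_t)(d[1][0] + d[1][1]);
    d[0][0] = (int16_t)(uint16_t)(d0 + d1);
    d[0][1] = (int16_t)(uint16_t)(d0 - d1);
}

/* Returns 1 when both variants store the same two 16-bit results. */
int variants_agree(int16_t a, int16_t b, int16_t c, int16_t e)
{
    int16_t x[2][2] = { { a, b }, { c, e } };
    int16_t y[2][2] = { { a, b }, { c, e } };
    dc_store_truncates(x);
    dc_truncates_early(y);
    return x[0][0] == y[0][0] && x[0][1] == y[0][1];
}

/* Exercises inputs whose intermediates overflow 16 bits. */
void check_corner_cases(void)
{
    assert(variants_agree(1, 2, 3, 4));
    assert(variants_agree(32767, 32767, -32768, -32768));
    assert(variants_agree(-32768, -32768, 32767, 32767));
}
```

Calling check_corner_cases() from a small main() should pass silently. The
reason is that addition and subtraction modulo 2^16 depend only on the low 16
bits of their operands, so masking d0 and d1 beforehand cannot change what
strh/sth finally writes.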
powerpc-unknown-linux-gnu-gcc-4.4.0 -v
Using built-in specs.
Target: powerpc-unknown-linux-gnu
Configured with: /var/tmp/portage/sys-devel/gcc-4.4.0/work/gcc-4.4.0/configure
--prefix=/usr --bindir=/usr/powerpc-unknown-linux-gnu/gcc-bin/4.4.0
--includedir=/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.0/include
--datadir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/4.4.0
--mandir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/4.4.0/man
--infodir=/usr/share/gcc-data/powerpc-unknown-linux-gnu/4.4.0/info
--with-gxx-include-dir=/usr/lib/gcc/powerpc-unknown-linux-gnu/4.4.0/include/g++-v4
--host=powerpc-unknown-linux-gnu --build=powerpc-unknown-linux-gnu
--enable-altivec --disable-fixed-point --without-ppl --without-cloog
--disable-nls --with-system-zlib --disable-checking --disable-werror
--enable-secureplt --disable-multilib --disable-libmudflap --disable-libssp
--enable-libgomp --enable-cld --disable-libgcj --enable-languages=c,c++,fortran
--enable-shared --enable-threads=posix --enable-__cxa_atexit
--enable-clocale=gnu --with-bugurl=http://bugs.gentoo.org/
--with-pkgversion='Gentoo 4.4.0 p1.1'
Thread model: posix
gcc version 4.4.0 (Gentoo 4.4.0 p1.1)
arm-none-linux-gnueabi-gcc-4.4.0 -v
Using built-in specs.
Target: arm-none-linux-gnueabi
Configured with: ../gcc-4.4.0/configure --target=arm-none-linux-gnueabi
--prefix=/usr/local/arm --enable-threads
--with-sysroot=/usr/local/arm/arm-none-linux-gnueabi/libc
Thread model: posix
gcc version 4.4.0 (GCC)
--
Summary: ARM and PPC truncate intermediate operations unnecessarily
Product: gcc
Version: 4.4.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
AssignedTo: unassigned at gcc dot gnu dot org
ReportedBy: lessen42+gcc at gmail dot com
GCC host triplet: i386-apple-darwin
GCC target triplet: arm-none-linux-gnueabi
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=40893