Bug 46615 - [4.6 regression] possibly-invalid x86-64 inline asm miscompilation
Status: RESOLVED INVALID
Alias: None
Product: gcc
Classification: Unclassified
Component: inline-asm
Version: 4.6.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
Reported: 2010-11-23 02:59 UTC by astrange+gcc@gmail.com
Modified: 2011-05-13 13:10 UTC



Description astrange+gcc@gmail.com 2010-11-23 02:59:08 UTC
gcc 4.6 miscompiles this source from ffmpeg on x86-64-apple-darwin10, whereas previous compilers worked. I'm not sure if the asm is legal, but it's existed in the wild for a long time.

const unsigned long long __attribute__((aligned(8))) ff_bgr24toUV[2][4] =
{
    {0x38380000DAC83838ULL, 0xECFFDAC80000ECFFULL, 0xF6E40000D0E3F6E4ULL, 0x3838D0E300003838ULL},
    {0xECFF0000DAC8ECFFULL, 0x3838DAC800003838ULL , 0x38380000D0E33838ULL, 0xF6E4D0E30000F6E4ULL},
};

static void 
bgr24ToUV_mmx_MMX2(int f)
{
	__asm__ volatile(
	"movq 24+%0, %%mm6 \n\t"
	:: "m"(ff_bgr24toUV[f == 0][0]));
}

void 
rgb24ToUV_MMX2()
{
	bgr24ToUV_mmx_MMX2(1);
}

> gcc -v
Using built-in specs.
COLLECT_GCC=/usr/local/gcc46/bin/gcc
COLLECT_LTO_WRAPPER=/usr/local/gcc46/libexec/gcc/x86_64-apple-darwin10.5.0/4.6.0/lto-wrapper
Target: x86_64-apple-darwin10.5.0
Configured with: ../../src/gcc/configure --prefix=/usr/local/gcc46 --with-arch=native --with-tune=native --disable-nls --with-gmp=/sw --disable-bootstrap --enable-checking --enable-languages=c,c++,lto,objc,obj-c++
Thread model: posix
gcc version 4.6.0 20101122 (experimental) (GCC) 
> gcc -O -o swscale-fails.s -S swscale.i 
swscale.i: In function 'rgb24ToUV_MMX2':
swscale.i:10:2: warning: use of memory input without lvalue in asm operand 0 is deprecated [enabled by default]

Working asm (4.2):
_rgb24ToUV_MMX2:
	pushq	%rbp
	movq	%rsp, %rbp
	movq 24+_ff_bgr24toUV(%rip), %mm6 
	leave
	ret
.globl _ff_bgr24toUV
	.const
	.align 3
_ff_bgr24toUV:
	.quad	4050987868490315832
	.quad	-1369135209168966401
	.quad	-656399642184648988
	.quad	4051217538195929144
	.quad	-1369375758026740481
	.quad	4051228417348089912
	.quad	4050987868324313144
	.quad	-656169972313032988
	.section __TEXT,__eh_frame,coalesced,no_toc+strip_static_syms+live_support

Non-working asm (4.6):
_rgb24ToUV_MMX2:
	movq 24+LC0(%rip), %mm6 	
	ret
	.globl _ff_bgr24toUV
	.const
	.align 3
_ff_bgr24toUV:
	.quad	4050987868490315832
	.quad	-1369135209168966401
	.quad	-656399642184648988
	.quad	4051217538195929144
	.quad	-1369375758026740481
	.quad	4051228417348089912
	.quad	4050987868324313144
	.quad	-656169972313032988
	.literal8
	.align 3
LC0:
	.quad	4050987868490315832
	.section __TEXT,__eh_frame,coalesced,no_toc+strip_static_syms+live_support

24+_ff_bgr24toUV(%rip) is fine, but LC0 is a single 8-byte literal, so 24+LC0(%rip) points past its end at nothing, and ld breaks:

ld: in /var/folders/MY/MYkVh2TwHgKZhNFIG8M3wU+++TI/-Tmp-//cc9dJIWa.o, in section __TEXT,__text reloc 0: local relocation for address 0x0000000C in section __text does not target section __literal8

I'm going to fix the asm since it looks fragile anyway, but that won't fix existing releases of ffmpeg.

Note that creating LC0 is not even an optimization since it doesn't save any space (because the array is __attribute__((used))).
Comment 1 Jakub Jelinek 2010-11-23 08:28:16 UTC
That is really invalid.  The "m" operand is just long long, so there are no guarantees that you get anything around it.
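
One way to make the access well-defined is to tell the compiler that the asm reads the whole 32-byte row rather than a single element. The following is only a sketch under assumptions, not the change that went into ffmpeg: the helper name load_fourth is hypothetical, and it loads into a general-purpose register instead of %mm6 so the result can be checked from C. The row address is passed in a register, and an extra "m" operand covering the entire row tells the compiler which memory is read.

```c
#include <stdint.h>

static const uint64_t __attribute__((aligned(8))) ff_bgr24toUV[2][4] =
{
    {0x38380000DAC83838ULL, 0xECFFDAC80000ECFFULL, 0xF6E40000D0E3F6E4ULL, 0x3838D0E300003838ULL},
    {0xECFF0000DAC8ECFFULL, 0x3838DAC800003838ULL, 0x38380000D0E33838ULL, 0xF6E4D0E30000F6E4ULL},
};

/* Hypothetical helper (not from ffmpeg): returns the element 24 bytes
 * into the selected row, i.e. row[3]. */
uint64_t load_fourth(int f)
{
    /* Pointer to the whole 4-element row, not to one element. */
    const uint64_t (*row)[4] = &ff_bgr24toUV[f == 0];
    uint64_t out;
#if defined(__GNUC__) && defined(__x86_64__)
    __asm__ ("movq 24(%1), %0"
             : "=r"(out)
             : "r"(row),   /* base address, so the offset is applied to a real pointer */
               "m"(*row)); /* all 32 bytes of the row are declared as read */
#else
    /* Assumption: plain C fallback for targets other than x86-64. */
    out = (*row)[3];
#endif
    return out;
}
```

Because the memory operand now has the type of the full row, the compiler has no license to substitute a copy of a single element, which is what produced LC0 above.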
Comment 2 Paulo César Pereira de Andrade 2011-05-13 13:10:59 UTC
I added a patch to mpeg2dec in Mandriva that removes
the const qualifier from the vectors.

It was basic hacking with some trial and error to fix
https://bugzilla.gnome.org/show_bug.cgi?id=649930
More information at:
https://qa.mandriva.com/show_bug.cgi?id=63279

The patch being used is:
http://svn.mandriva.com/cgi-bin/viewvc.cgi/packages/cooker/mpeg2dec/current/SOURCES/libmpeg2-0.5.1-gcc4.6.patch?view=markup&pathrev=674312

I am commenting here because there is probably more
code out there that will fail for the same
reason.