Bug 86924 - tree-slp-vectorize may create unaligned memory access, causing segmentation fault
Status: WAITING
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 8.2.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords: wrong-code
Depends on:
Blocks:
 
Reported: 2018-08-12 18:18 UTC by Mario Rohkrämer
Modified: 2023-09-27 12:43 UTC
CC List: 2 users

See Also:
Host:
Target: x86_64-w64-mingw32
Build:
Known to work:
Known to fail:
Last reconfirmed: 2018-08-21 00:00:00


Attachments
Zipped temp output encoder.i by lupo... (311.18 KB, application/zip)
2018-08-21 15:13 UTC, Mario Rohkrämer

Description Mario Rohkrämer 2018-08-12 18:18:33 UTC
Compiler version: 8.2.0 for Windows 64 bit, as released in MSYS2 / MinGW64
Windows 7 SP1, 64 bit


$ gcc -v
Using built-in specs.
COLLECT_GCC=H:\development\media-autobuild_suite-master\msys64\mingw64\bin\gcc.exe
COLLECT_LTO_WRAPPER=H:/development/media-autobuild_suite-master/msys64/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/8.2.0/lto-wrapper.exe
Target: x86_64-w64-mingw32
Configured with: ../gcc-8.2.0/configure --prefix=/mingw64 --with-local-prefix=/mingw64/local --build=x86_64-w64-mingw32 --host=x86_64-w64-mingw32 --target=x86_64-w64-mingw32 --with-native-system-header-dir=/mingw64/x86_64-w64-mingw32/include --libexecdir=/mingw64/lib --enable-bootstrap --with-arch=x86-64 --with-tune=generic --enable-languages=ada,c,lto,c++,objc,obj-c++,fortran --enable-shared --enable-static --enable-libatomic --enable-threads=posix --enable-graphite --enable-fully-dynamic-string --enable-libstdcxx-filesystem-ts=yes --enable-libstdcxx-time=yes --disable-libstdcxx-pch --disable-libstdcxx-debug --disable-isl-version-check --enable-lto --enable-libgomp --disable-multilib --enable-checking=release --disable-rpath --disable-win32-registry --disable-nls --disable-werror --disable-symvers --with-libiconv --with-system-zlib --with-gmp=/mingw64 --with-mpfr=/mingw64 --with-mpc=/mingw64 --with-isl=/mingw64 --with-pkgversion='Rev1, Built by MSYS2 project' --with-bugurl=https://sourceforge.net/projects/msys2 --with-gnu-as --with-gnu-ld
Thread model: posix
gcc version 8.2.0 (Rev1, Built by MSYS2 project)


The AOMedia AV1 video encoder crashes while encoding when it is compiled with this version (the problem is probably independent of the operating system). The following bug report in the Chromium bug tracker analyzes the problem; comment 7 in particular goes down to the disassembly:

https://bugs.chromium.org/p/aomedia/issues/detail?id=2055#c7

Summary by lupo...:

+----
Bug appears in the compilation of https://aomedia.googlesource.com/aom/+/da17065690c185ae678d5db9466cf0a402ca6b6d/av1/encoder/encoder.c#3415
More precisely in the optimized and inlined lshift_bwd_ref_frames(cpi) inside update_reference_frames

Disassembly listings to follow:
cmake -G "MSYS Makefiles" -DCONFIG_LOWBITDEPTH=1 -DENABLE_DOCS=0 -DENABLE_TESTS=off ../aom
loc_4D5CD2:
mov     edx, [rcx+35624Ch]
movdqa  xmm3, xmmword ptr [rcx+478E38h]
mov     [rcx+356248h], edx
mov     edx, [rcx+356254h]
movaps  xmmword ptr [rcx+478E28h], xmm3
movdqa  xmm3, xmmword ptr [rcx+478E58h]
mov     [rcx+35624Ch], edx
movaps  xmmword ptr [rcx+478E38h], xmm3
mov     [rcx+356254h], r11d
jmp     loc_4D58A0

cmake -G "MSYS Makefiles" -DCONFIG_LOWBITDEPTH=1 -DENABLE_DOCS=0 -DENABLE_TESTS=off -DAOM_EXTRA_C_FLAGS="-fno-tree-slp-vectorize" -DAOM_EXTRA_CXX_FLAGS="-fno-tree-slp-vectorize" ../aom
loc_4D5DC2:
mov     edx, [rcx+35624Ch]
movdqu  xmm3, xmmword ptr [rcx+478E38h]
movdqu  xmm5, xmmword ptr [rcx+478E58h]
mov     [rcx+356248h], edx
mov     edx, [rcx+356254h]
movups  xmmword ptr [rcx+478E28h], xmm3
mov     [rcx+35624Ch], edx
movups  xmmword ptr [rcx+478E38h], xmm5
mov     [rcx+356254h], r11d
jmp     loc_4D5993

It all reduces to aligned vs unaligned memory access. By manually patching the faulty executable, changing movdqa to movdqu and movaps to movups, I have been able to finish an encode without problems.
+----
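
To make the "aligned vs unaligned" point concrete, here is a minimal sketch of the address arithmetic. It relies on one architectural fact: movdqa and movaps fault unless their memory operand is 16-byte aligned, while movdqu and movups accept any address. The three displacements are copied from the listings above. The base values passed in stand in for rcx and are purely hypothetical, since the report does not state the actual register contents.

```c
#include <stddef.h>
#include <stdint.h>

/* movdqa / movaps require a 16-byte-aligned memory operand. */
enum { SSE_ALIGN = 16 };

/* Displacements from rcx, copied from the disassembly listings. */
const uint64_t listing_offsets[] = { 0x478E28, 0x478E38, 0x478E58 };

/* Distance of (base + offset) past the previous 16-byte boundary.
   A result of 0 means the address is usable with an aligned move. */
unsigned misalignment(uint64_t base, uint64_t offset)
{
    return (unsigned)((base + offset) % SSE_ALIGN);
}

/* Returns 1 if every listed access is 16-byte aligned for this base,
   0 otherwise. The base is a hypothetical value for rcx. */
int all_accesses_aligned(uint64_t base)
{
    size_t n = sizeof listing_offsets / sizeof listing_offsets[0];
    for (size_t i = 0; i < n; ++i) {
        if (misalignment(base, listing_offsets[i]) != 0)
            return 0;
    }
    return 1;
}
```

All three displacements end in 8 modulo 16, so under these assumptions the aligned moves can only succeed when the base itself is 8 modulo 16. For example, all_accesses_aligned(0x1008) returns 1, while all_accesses_aligned(0x1000) returns 0. This says nothing about which value rcx really holds at the crash site.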


Please excuse me for not providing all the details requested in the "Reporting Bugs" guide. I believe the linked bug report in the Chromium tracker is detailed enough to understand the issue.
Comment 1 Richard Biener 2018-08-21 13:16:20 UTC
The question is where rcx comes from and why it isn't suitably aligned.  This is very likely an issue in chromium, not gcc.  The workaround patch shows the things to look at, namely the type declarations of *cpi and ordered_bwd and where they are allocated.

Please attach preprocessed source for encoder.c as you use it when compiling
for x86_64-w64-mingw32 (just add -save-temps to the compile that reproduces the failure).
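
The kind of check suggested here can be sketched in C11 as follows. The struct is a hypothetical stand-in, not the real type behind cpi, and its member names are invented for illustration. The sketch only shows the general rule: a member declared with 16-byte alignment raises the alignment of the whole struct, the compiler may then assume every object of that type is 16-byte aligned, and an allocation that does not honor this makes aligned vector moves fault.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an encoder context; NOT the real type
   behind cpi. Only the alignment behaviour matters here. */
struct demo_ctx {
    int refs[4];
    _Alignas(16) int ordered[8];  /* member requesting 16-byte alignment */
};

/* Alignment the compiler is allowed to assume for any struct demo_ctx.
   The over-aligned member propagates to the enclosing struct. */
size_t demo_ctx_required_alignment(void)
{
    return _Alignof(struct demo_ctx);
}

/* Returns 1 if p satisfies that alignment, 0 otherwise. A pointer that
   fails this check must not be used as a struct demo_ctx pointer. */
int demo_ctx_pointer_ok(const void *p)
{
    return ((uintptr_t)p % _Alignof(struct demo_ctx)) == 0;
}
```

Calling demo_ctx_pointer_ok on the pointer returned by the allocator, right after allocation, is one way to test whether the allocation site or the type declaration is the part to look at.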
Comment 2 Mario Rohkrämer 2018-08-21 14:40:08 UTC
Unfortunately, I do not have much experience in running a compile manually. I only run the "media-autobuild suite" batch script.

https://github.com/jb-alvarado/media-autobuild_suite/

I would not know for sure where to manipulate these batch/shell files to add the requested argument, or how to manually run the compilation for one specific file. It's all automated. But I will ask around and try to get advice.
Comment 3 Mario Rohkrämer 2018-08-21 15:13:48 UTC
Created attachment 44567
Zipped temp output encoder.i by lupo...

This is the "-save-temps" output which user lupo... attached in comment 12 to the Chromium bug report I linked above.