Using the gcc-4.9 version 4.9-20140411-2 on Debian GNU/Linux amd64, I'm getting incorrect code for ffmpeg flac format decoding (same source code compiled with gcc-4.8 version 4.8.2-19 works fine). The problem only affects some flac files that have nonetheless been passed as valid by flac version 1.3.0. The error returned is similar to: [flac @ 0x7f3b540092e0] invalid subframe paddingB sq= 0B f=0/0 [flac @ 0x7f3b540092e0] decode_frame() failed The bug report against ffmpeg is at: https://trac.ffmpeg.org/ticket/3559 I am *NOT* a C programmer and am not familiar with the flac format decoding process but am posting this in case anyone who is might be prepared to investigate further. Thanks for any help or suggestions.
What options are you compiling ffmpeg with?
Created attachment 32637 [details] config.mak produced by ./configure --cc=gcc-4.9 --host-cc=gcc-4.9 --dep-cc=gcc-4.9
I've attached the config.mak produced by ./configure --cc=gcc-4.9 --host-cc=gcc-4.9 --dep-cc=gcc-4.9 The CFLAGS from config.mak: CFLAGS= -std=c99 -fomit-frame-pointer -pthread -D_GNU_SOURCE=1 -D_REENTRANT -I /usr/include/SDL -g -Wdeclaration-after-statement -Wall -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wwrite-strings -Wtype-limits -Wundef -missing-prototypes -Wno-pointer-to-int-cast -Wstrict-prototypes -Wempty-body -Wno-parentheses -Wno-switch -Wno-format-zero-length -Wno-pointer-sign -O3 -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -Werror=implicit-function-declaration -Werror=missing-prototypes -Werror=return-type -Werror=vla -Wno-maybe-uninitialized
Can you try adding -fno-strict-aliasing -fwrapv ? Next is does ffmpeg have inline-asm for the flac codec? This requires some C code knowledge though.
(In reply to Andrew Pinski from comment #5) > Can you try adding -fno-strict-aliasing -fwrapv ? And -fno-aggressive-loop-optimizations. I see the vectorizer is already turned off.
OK, I modified the CFLAGS statement in config.mak to what is below, re-ran make clean and make, and still experienced the same problems. CFLAGS= -std=c99 -fomit-frame-pointer -pthread -D_GNU_SOURCE=1 -D_REENTRANT -I/usr/include/SDL -g -Wdeclaration-after-statement -Wall -Wdisabled-optimization -Wpointer-arith -Wredundant-decls -Wwrite-strings -Wtype-limits -Wundef -Wmissing-prototypes -Wno-pointer-to-int-cast -Wstrict-prototypes -Wempty-body -Wno-parentheses -Wno-switch -Wno-format-zero-length -Wno-pointer-sign -O3 -fno-math-errno -fno-signed-zeros -fno-tree-vectorize -Werror=implicit-function-declaration -Werror=missing-prototypes -Werror=return-type -Werror=vla -Wno-maybe-uninitialized -fno-strict-aliasing -fwrapv -fno-aggressive-loop-optimizations Thanks for your interest.
PS, no assembly language in the flac decoding process as far as I could see.
Compiling ffmpeg with -fsanitize=undefined shows: % ffplay -v 9 -loglevel 99 -i 10.innocent_starter.tiny.flac ... Input #0, flac, from '10.innocent_starter.tiny.flac': Metadata: ARTIST : 水樹奈々 ALBUM : THE MUSEUM TITLE : innocent starter DATE : 2007 GENRE : JPop track : 10 CDDB : e111e110 Duration: 00:04:40.64, bitrate: 59 kb/s Stream #0:0, 56, 1/44100: Audio: flac, 44100 Hz, stereo, s16 detected 4 logical cores [ffplay_abuffer @ 0x7fc9940066a0] Setting 'sample_rate' to value '44100' [ffplay_abuffer @ 0x7fc9940066a0] Setting 'sample_fmt' to value 's16' [ffplay_abuffer @ 0x7fc9940066a0] Setting 'channels' to value '2' [ffplay_abuffer @ 0x7fc9940066a0] Setting 'time_base' to value '1/44100' [ffplay_abuffer @ 0x7fc9940066a0] Setting 'channel_layout' to value '0x3' [ffplay_abuffer @ 0x7fc9940066a0] tb:1/44100 samplefmt:s16 samplerate:44100 chlayout:0x3 [AVFilterGraph @ 0x7fc994001790] query_formats: 2 queried, 3 merged, 0 already done, 0 delayed Audio frame changed from rate:44100 ch:2 fmt:s16 layout:stereo serial:-1 to rate:44100 ch:2 fmt:s16 layout:stereo serial:1 [ffplay_abuffer @ 0x7fc988000f00] Setting 'sample_rate' to value '44100' [ffplay_abuffer @ 0x7fc988000f00] Setting 'sample_fmt' to value 's16' [ffplay_abuffer @ 0x7fc988000f00] Setting 'channels' to value '2' [ffplay_abuffer @ 0x7fc988000f00] Setting 'time_base' to value '1/44100' [ffplay_abuffer @ 0x7fc988000f00] Setting 'channel_layout' to value '0x3' [ffplay_abuffer @ 0x7fc988000f00] tb:1/44100 samplefmt:s16 samplerate:44100 chlayout:0x3 [AVFilterGraph @ 0x7fc988000be0] query_formats: 2 queried, 3 merged, 0 already done, 0 delayed /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/golomb.h:332:28: runtime error: left shift of negative value -1 /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/flacdsp_template.c:85:36: runtime error: left shift of negative value -4 /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/flacdsp_template.c:86:36: runtime error: left shift of negative value -4 /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/flacdsp_template.c:100:36: runtime error: left shift of negative value -5 /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/flacdsp_template.c:101:36: runtime error: left shift of negative value -2 [flac @ 0x7fc994003a20] invalid subframe paddingB sq= 0B f=0/0 [flac @ 0x7fc994003a20] decode_frame() failed /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/flacdsp.c:63:16: runtime error: signed integer overflow: 1246 * -2064943 cannot be represented in type 'int' /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/flacdsp.c:63:12: runtime error: signed integer overflow: 1722048318 + 1035287866 cannot be represented in type 'int' /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/flacdsp.c:58:20: runtime error: signed integer overflow: 1777 * -2064943 cannot be represented in type 'int' /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/flacdsp.c:56:20: runtime error: signed integer overflow: -1668 * -2064943 cannot be represented in type 'int' /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/flacdsp.c:58:16: runtime error: signed integer overflow: 1024856828 + 1571300099 cannot be represented in type 'int' /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/flacdsp.c:61:16: runtime error: signed integer overflow: 1246 * -3189343 cannot be represented in type 'int' /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/flacdsp.c:56:16: runtime error: signed integer overflow: -1372495215 + -1772004530 cannot be represente d in type 'int' /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/flacdsp.c:61:12: runtime error: signed integer overflow: -1496662804 + -1546546990 cannot be represente d in type 'int' [flac @ 0x7fc994003a20] invalid subframe padding [flac @ 0x7fc994003a20] decode_frame() failed /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/flacdsp_template.c:71:36: runtime error: left shift of negative value -2349 /var/tmp/portage/media-video/ffmpeg-2.2.1/work/ffmpeg-2.2.1/libavcodec/flacdsp_template.c:72:36: runtime error: left shift of negative value -5961 [flac @ 0x7fc994004c20] invalid subframe paddingB sq= 0B f=0/0 [flac @ 0x7fc994004c20] decode_frame() failed [flac @ 0x7fc994004c20] invalid subframe paddingB sq= 0B f=0/0 [flac @ 0x7fc994004c20] decode_frame() failed [flac @ 0x7fc994003480] underread: 6275 orig size: 9378 0B f=0/0 [flac @ 0x7fc9940012b0] sample/frame number mismatch in adjacent frames Last message repeated 1 times [flac @ 0x7fc9940012b0] sample/frame number mismatch in adjacent frames Last message repeated 1 times [flac @ 0x7fc9940012b0] sample/frame number mismatch in adjacent frames Last message repeated 1 times [flac @ 0x7fc9940012b0] sample/frame number mismatch in adjacent frames Last message repeated 1 times [flac @ 0x7fc9940012b0] sample/frame number mismatch in adjacent frames Last message repeated 1 times [flac @ 0x7fc9940012b0] sample/frame number mismatch in adjacent frames Last message repeated 1 times [flac @ 0x7fc9940012b0] sample/frame number mismatch in adjacent frames Last message repeated 1 times
I haven't been able to reproduce what was shown in comment 9, but appreciate the effort taken to reproduce the problem.
OK I'm down to one function from ./libavcodec/flacdec.c: 208 __attribute__ ((optimize(0))) 209 static int decode_residuals(FLACContext *s, int32_t *decoded, int pred_order) 210 { 211 int i, tmp, partition, method_type, rice_order; 212 int rice_bits, rice_esc; 213 int samples; 214 215 method_type = get_bits(&s->gb, 2); 216 if (method_type > 1) { 217 av_log(s->avctx, AV_LOG_ERROR, "illegal residual coding method %d\n", 218 method_type); 219 return AVERROR_INVALIDDATA; 220 } 221 222 rice_order = get_bits(&s->gb, 4); 223 224 samples= s->blocksize >> rice_order; 225 if (samples << rice_order != s->blocksize) { 226 av_log(s->avctx, AV_LOG_ERROR, "invalid rice order: %i blocksize %i\n", 227 rice_order, s->blocksize); 228 return AVERROR_INVALIDDATA; 229 } 230 231 if (pred_order > samples) { 232 av_log(s->avctx, AV_LOG_ERROR, "invalid predictor order: %i > %i\n", 233 pred_order, samples); 234 return AVERROR_INVALIDDATA; 235 } 236 237 rice_bits = 4 + method_type; 238 rice_esc = (1 << rice_bits) - 1; 239 240 decoded += pred_order; 241 i= pred_order; 242 for (partition = 0; partition < (1 << rice_order); partition++) { 243 tmp = get_bits(&s->gb, rice_bits); 244 if (tmp == rice_esc) { 245 tmp = get_bits(&s->gb, 5); 246 for (; i < samples; i++) 247 *decoded++ = get_sbits_long(&s->gb, tmp); 248 } else { 249 for (; i < samples; i++) { 250 *decoded++ = get_sr_golomb_flac(&s->gb, tmp, INT_MAX, 0); 251 } 252 } 253 i= 0; 254 } 255 256 return 0; 257 } 258 Without __attribute__ ((optimize(0))) it gets miscompiled.
Created attachment 32638 [details] Unreduced testcase
Started with r205279.
Created attachment 32639 [details] good assembler output r205278
Created attachment 32640 [details] bad assembler output r205279
*** Bug 60904 has been marked as a duplicate of this bug. ***
Created attachment 32641 [details] Testcase from PR60904
*** Bug 60905 has been marked as a duplicate of this bug. ***
Created attachment 32642 [details] Testcase from PR60905
All three testcases use the same inline function libavcodec/golomb.h: 314 /** 315 * read unsigned golomb rice code (jpegls). 316 */ 317 static inline int get_ur_golomb_jpegls(GetBitContext *gb, int k, int limit, 318 int esc_len) 319 { 320 unsigned int buf; 321 int log; 322 323 OPEN_READER(re, gb); 324 UPDATE_CACHE(re, gb); 325 buf = GET_CACHE(re, gb); 326 327 log = av_log2(buf); 328 329 if (log - k >= 32 - MIN_CACHE_BITS + (MIN_CACHE_BITS == 32) && 330 32 - log < limit) { 331 buf >>= log - k; 332 buf += (30 - log) << k; 333 LAST_SKIP_BITS(re, gb, 32 + k - log); 334 CLOSE_READER(re, gb); 335 336 return buf; 337 } else { 338 int i; 339 for (i = 0; i < limit && SHOW_UBITS(re, gb, 1) == 0; i++) { 340 if (gb->size_in_bits <= re_index) 341 return -1; 342 LAST_SKIP_BITS(re, gb, 1); 343 UPDATE_CACHE(re, gb); 344 } 345 SKIP_BITS(re, gb, 1); 346 347 if (i < limit - 1) { 348 if (k) { 349 buf = SHOW_UBITS(re, gb, k); 350 LAST_SKIP_BITS(re, gb, k); 351 } else { 352 buf = 0; 353 } 354 355 CLOSE_READER(re, gb); 356 return buf + (i << k); 357 } else if (i == limit - 1) { 358 buf = SHOW_UBITS(re, gb, esc_len); 359 LAST_SKIP_BITS(re, gb, esc_len); 360 CLOSE_READER(re, gb); 361 362 return buf + 1; 363 } else 364 return -1; 365 } 366 } And indeed adding __attribute__ ((optimize(0))) to it "fixes" all three testcases. The function looks suspicious and I'm not sure if the code is valid.
Given c#13, likely mine.
I'm still working my way through the code, but it looks like we're mucking up the value for "buf" and "k". Not really sure if that's the root cause is just a downstream effect. Anyway, once the buf computation goes awry bad the callers start to see unexpected results and the bogus errors.
I think I see what's going on... Definitely mine.
So the issue here is we have an ASM which feeds an eq/neq condition found inside a loop. The eq/neq condition creates an equivalency that is (mis) used when we thread through the loop backedge. The code which records temporary equivalences for jump threading ignores GIMPLE_ASMs, so as we reprocess the block, the old equivalence isn't replaced with anything new. ie, the old equivalence doesn't get invalidated. As a result we re-use the implied value from the conditional in a previous iteration of the loop to incorrectly simplify a test and all hell breaks loose. In the past we were so conservative when traversing loop backedges that it was safe to totally ignore GIMPLE_ASMs in this code. That's no longer the safe thing to do.
Patch in testing.
Testing is good. Just trying to build a nice little testcase for the regression suite.
(In reply to Jeffrey A. Law from comment #26) > Testing is good. Just trying to build a nice little testcase for the > regression suite. Thanks. Another thing I've noticed is that ac3 audio doesn't work (no sound at all) when ffmpeg is compiled by gcc-4.9. I haven't figured it out yet, but it appears to be a different issue, because it also happens with r205279 reverted.
(In reply to Markus Trippelsdorf from comment #27) > (In reply to Jeffrey A. Law from comment #26) > > Testing is good. Just trying to build a nice little testcase for the > > regression suite. > > Another thing I've noticed is that ac3 audio doesn't work > (no sound at all) when ffmpeg is compiled by gcc-4.9. I haven't figured > it out yet, but it appears to be a different issue, because it also > happens with r205279 reverted. Luckily this issue is already fixed on the gcc-4.9 branch.
(In reply to Markus Trippelsdorf from comment #28) > (In reply to Markus Trippelsdorf from comment #27) > > (In reply to Jeffrey A. Law from comment #26) > > > Testing is good. Just trying to build a nice little testcase for the > > > regression suite. > > > > Another thing I've noticed is that ac3 audio doesn't work > > (no sound at all) when ffmpeg is compiled by gcc-4.9. I haven't figured > > it out yet, but it appears to be a different issue, because it also > > happens with r205279 reverted. > > Luckily this issue is already fixed on the gcc-4.9 branch. Ah, it only happens with r205279 reverted. It could be a latent issue. Would be good to check if it happens with your fix, too. Could you attach your patch?
Sorry. I expect to check the fix & testcase onto the trunk today. For various reasons my work time is limited right now. For the 4.9.x releases, final call will be with jakub, jsm and richi as the releaes managers.
Author: law Date: Wed Apr 23 18:04:46 2014 New Revision: 209716 URL: http://gcc.gnu.org/viewcvs?rev=209716&root=gcc&view=rev Log: PR tree-optimization/60902 * tree-ssa-threadedge.c (record_temporary_equivalences_from_stmts_at_dest): Make sure to invalidate outputs from statements that do not produce useful outputs for threading. PR tree-optimization/60902 * gcc.target/i386/pr60902.c: New test. Added: trunk/gcc/testsuite/gcc.target/i386/pr60902.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-threadedge.c
Please, don't forget to backport the fix to the 4.9 branch. gcc 4.9.0 is almost unusable for ffmpeg because this didn't make it on time. Thanks.
James, The whole point behind leaving the 4.9 regression marker is to ensure this BZ gets reviewed as part of the 4.9.1 process. I'd consider this high priority for 4.9.1 and fully expect the fix will be included in 4.9.1.
Author: law Date: Mon Apr 28 13:38:19 2014 New Revision: 209860 URL: http://gcc.gnu.org/viewcvs?rev=209860&root=gcc&view=rev Log: PR tree-optimization/60902 * tree-ssa-threadedge.c (record_temporary_equivalences_from_stmts_at_dest): Only iterate over real defs when invalidating outputs from statements that do not produce useful outputs for threading. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-ssa-threadedge.c
Author: law Date: Tue May 13 20:26:30 2014 New Revision: 210398 URL: http://gcc.gnu.org/viewcvs?rev=210398&root=gcc&view=rev Log: PR tree-optimization/60902 * tree-ssa-threadedge.c (record_temporary_equivalences_from_stmts_at_dest): Make sure to invalidate outputs from statements that do not produce useful outputs for threading. PR tree-optimization/60902 * gcc.target/i386/pr60902.c: New test. Added: branches/gcc-4_9-branch/gcc/testsuite/gcc.target/i386/pr60902.c Modified: branches/gcc-4_9-branch/gcc/ChangeLog branches/gcc-4_9-branch/gcc/testsuite/ChangeLog branches/gcc-4_9-branch/gcc/tree-ssa-threadedge.c
Author: law Date: Tue May 13 20:26:41 2014 New Revision: 210399 URL: http://gcc.gnu.org/viewcvs?rev=210399&root=gcc&view=rev Log: PR tree-optimization/60902 * tree-ssa-threadedge.c (record_temporary_equivalences_from_stmts_at_dest): Only iterate over real defs when invalidating outputs from statements that do not produce useful outputs for threading. Modified: branches/gcc-4_9-branch/gcc/ChangeLog branches/gcc-4_9-branch/gcc/tree-ssa-threadedge.c
Fix backported to branch