It's likely caused by a violation of the strict aliasing rules, but I can't verify that:

$ wget http://loop-aes.sourceforge.net/aespipe/aespipe-v2.4f.tar.bz2
$ cd aespipe-v2.4f/
$ export CFLAGS="-O2 -flto -flto-partition=one" && ./configure && make tests
...
./aespipe -v -p 3 -e AES128 -K ./gpgkey2.asc -G test-dir1 <test-file3 >test-file1 3<test-file4
./aespipe: C-language AES, 128 key bits, encrypting, multi-key-v2 mode, RAM not locked
make test-part3
make[2]: Entering directory '/tmp/aespipe-v2.4f'
md5sum test-file1 >test-file2
echo "f9825b79873f5c439ae9371c1a929a6c test-file1" >test-file5
make[2]: Leaving directory '/tmp/aespipe-v2.4f'
cmp test-file2 test-file5
test-file2 test-file5 differ: byte 1, line 1
make[1]: *** [Makefile:120: test-part2] Error 1
make[1]: Leaving directory '/tmp/aespipe-v2.4f'
make: *** [Makefile:87: tests] Error 2

Adding -fno-strict-aliasing fixes that. And the following dbg counter shows that:

$ gcc -o aespipe aespipe.o aes.o md5.o sha512.o rmd160.o -fdbg-cnt=ipa_mod_ref:385-385 -flto-partition=one -fdump-tree-optimized-lineno=bad -fdump-ipa-modref-details && make tests

optimized dump diff is then:

;; Function compute_sector_iv (compute_sector_iv, funcdef_no=0, decl_uid=4504, cgraph_uid=12, symbol_order=57)
...
   [./aespipe.c:775:20] _13 = MEM[(u_int64_t *)bfp_22 + 16B];
   [./aespipe.c:775:26] _14 = MEM[(u_int64_t *)bfp_22];
   [./aespipe.c:775:20] _15 = _13 ^ _14;
   [./aespipe.c:775:20] MEM[(u_int64_t *)bfp_22 + 16B] = _15;
   [./aespipe.c:776:20] _16 = MEM[(u_int64_t *)bfp_22 + 24B];
-  [./aespipe.c:776:26] _17 = MEM[(u_int64_t *)bfp_22 + 8B];
-  [./aespipe.c:776:20] _18 = _16 ^ _17;
+  [./aespipe.c:776:20] _18 = _12 ^ _16;
   [./aespipe.c:776:20] MEM[(u_int64_t *)bfp_22 + 24B] = _18;

So one load is optimized out:

769	do {
770	    bfp[0] ^= dip[0];
771	    bfp[1] ^= dip[1];
772	    aes_encrypt(acpa[0], (unsigned char *)bfp, (unsigned char *)bfp);
773	    dip = bfp;
774	    bfp += 2;
775	    bfp[0] ^= dip[0];
776	    bfp[1] ^= dip[1];
777	    aes_encrypt(acpa[0], (unsigned char *)bfp, (unsigned char *)bfp);
778	    dip = bfp;
779	    bfp += 2;
780	} while(--x >= 0);
781	size -= 512;

@Honza: Can you please take a look?
mine.
Created attachment 51165 [details]
Hash reference function test case

Fails on gcc11 with -O2, passes with -O1.

# gcc -O2 -o tst md5_reference.c && ./tst
Comment on attachment 51165 [details]
Hash reference function test case

edit: Fails test case on -O2, passes on -O1
(In reply to Gregory Tucker from comment #2)
> Created attachment 51165 [details]

I think there are some aliasing violations here. We store to buf (which is an unsigned char array) as a mixture of unsigned char and uint64_t in md5_ref, BUT then do loads in the uint32_t type in md5_single.

I think:
uint32_t *w = (uint32_t *) data;

Should need the may_alias attribute added to it.
(In reply to Andrew Pinski from comment #4)
> (In reply to Gregory Tucker from comment #2)
> > Created attachment 51165 [details]
>
> I think:
> uint32_t *w = (uint32_t *) data;
>
> Should need the may_alias attribute added to it.

Thanks for looking at the test case. Indeed, adding may_alias to w fixes it.
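For reference, a minimal sketch of what the may_alias approach looks like in isolation. This is not the attachment's code: u32_alias, sum_words and sum_filled_block are hypothetical names, and it assumes GCC or Clang (may_alias is a compiler extension) plus C11 for _Alignas. The buffer is written as unsigned char and read back as 32-bit words through a may_alias pointer, so the compiler has to assume the loads can see the byte stores.

```c
#include <stdint.h>

/* Hypothetical typedef: a uint32_t whose accesses may alias any other type
 * (GCC/Clang extension, not standard C). */
typedef uint32_t __attribute__((may_alias)) u32_alias;

/* Sum nwords 32-bit words read from a byte buffer.
 * The buffer must be suitably aligned for uint32_t. */
uint32_t sum_words(const unsigned char *data, unsigned nwords)
{
    const u32_alias *w = (const u32_alias *)data;
    uint32_t sum = 0;

    for (unsigned i = 0; i < nwords; i++)
        sum += w[i];
    return sum;
}

/* Fill a 16-byte block with byte stores, then read it back as words. */
uint32_t sum_filled_block(unsigned char fill)
{
    _Alignas(uint32_t) unsigned char buf[16];

    for (unsigned i = 0; i < sizeof buf; i++)
        buf[i] = fill;
    return sum_words(buf, sizeof buf / sizeof(uint32_t));
}
```

Filling with the same byte keeps the result independent of endianness. Without the attribute, the cast to a plain uint32_t * is the pattern comment #4 points at.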
GCC 11.2 is being released, retargeting bugs to GCC 11.3
Thanks for the testcase. This is indeed an aliasing violation. We do:

ipa-modref: call stmt md5_single (&buf, digest_18(D));
ipa-modref: call to md5_single/11 does not use ref: MEM[(uint64_t *)_8] alias sets: 3->3

which makes us optimize it away. This is the uint64_t store from

*((uint64_t *) & buf[i - 8]) = (uint64_t) len *8;

and md5_single does uint32_t loads. So I am marking this as invalid.
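An alternative to may_alias that stays within standard C is to go through memcpy for both the 64-bit length store and the 32-bit loads. A rough sketch under that assumption follows; store_bit_length, load_word and round_trip are hypothetical helpers, not functions from the attachment, and the 64-byte block size is only illustrative.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: write the message length in bits into the last
 * 8 bytes of a block ending at offset end, without a uint64_t lvalue store. */
void store_bit_length(unsigned char *buf, size_t end, uint64_t len_bytes)
{
    uint64_t bits = len_bytes * 8;

    memcpy(&buf[end - 8], &bits, sizeof bits);
}

/* Hypothetical helper: read the index-th 32-bit word of the block
 * without dereferencing a uint32_t pointer into the byte buffer. */
uint32_t load_word(const unsigned char *data, size_t index)
{
    uint32_t w;

    memcpy(&w, data + index * sizeof w, sizeof w);
    return w;
}

/* Store a length, read it back as two 32-bit words, and reassemble it. */
uint64_t round_trip(uint64_t len_bytes)
{
    unsigned char buf[64] = {0};
    uint32_t halves[2];
    uint64_t bits;

    store_bit_length(buf, sizeof buf, len_bytes);
    halves[0] = load_word(buf, 14);
    halves[1] = load_word(buf, 15);
    memcpy(&bits, halves, sizeof bits);  /* same byte order as stored */
    return bits;
}
```

Compilers typically turn these fixed-size memcpy calls into plain loads and stores, so this should not cost anything at -O2, though that is worth checking for the target in question.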