This testcase does not return 0 on powerpc64-suse-linux and s390x-suse-linux with compiled with -O1 but it does with -O0. I tried all the new opts from -O0 to -O1, but adding them all to -O0 does not fix it. No idea why. flags=' -O1' gcc -Wall -c -o /tmp/main.o /tmp/main.c $flags gcc -Wall -c -o /tmp/matroxfb_probe.o /tmp/matroxfb_probe.c $flags gcc -Wall -o /tmp/matroxfb_probe /tmp/main.o /tmp/matroxfb_probe.o $flags /tmp/matroxfb_probe
Created attachment 11021 [details] /tmp/main.c
Created attachment 11022 [details] /tmp/matroxfb_probe.c
generated asm is: 00000000100006d0 <.matroxfb_probe>: 100006d0: 7c 08 02 a6 mflr r0 100006d4: fb e1 ff f8 std r31,-8(r1) 100006d8: 7c 7f 1b 78 mr r31,r3 100006dc: 38 a0 00 08 li r5,8 100006e0: f8 01 00 10 std r0,16(r1) 100006e4: f8 21 ff 71 stdu r1,-144(r1) 100006e8: 60 00 00 00 nop 100006ec: e8 63 00 20 ld r3,32(r3) 100006f0: 80 9f 00 40 lwz r4,64(r31) 100006f4: 38 c1 00 70 addi r6,r1,112 100006f8: 4b ff ff 49 bl 10000640 <.pci_bus_read_config_byte> 100006fc: 60 00 00 00 nop 10000700: e9 22 80 58 ld r9,-32680(r2) 10000704: a0 09 00 00 lhz r0,0(r9) 10000708: 2f 80 00 00 cmpwi cr7,r0,0 1000070c: 41 9e 00 54 beq- cr7,10000760 <.matroxfb_probe+0x90> 10000710: 80 7f 00 44 lwz r3,68(r31) 10000714: 89 61 00 70 lbz r11,112(r1) 10000718: 48 00 00 18 b 10000730 <.matroxfb_probe+0x60> 1000071c: 60 00 00 00 nop 10000720: 39 29 00 28 addi r9,r9,40 10000724: a0 09 00 00 lhz r0,0(r9) 10000728: 2f 80 00 00 cmpwi cr7,r0,0 1000072c: 41 9e 00 34 beq- cr7,10000760 <.matroxfb_probe+0x90> 10000730: 80 09 00 04 lwz r0,4(r9) 10000734: 7f a0 18 00 cmpd cr7,r0,r3 10000738: 40 9e ff e8 bne+ cr7,10000720 <.matroxfb_probe+0x50> 1000073c: a0 09 00 04 lhz r0,4(r9) 10000740: 7f 80 58 40 cmplw cr7,r0,r11 10000744: 41 9c ff dc blt+ cr7,10000720 <.matroxfb_probe+0x50> 10000748: 38 21 00 90 addi r1,r1,144 1000074c: 38 60 00 00 li r3,0 10000750: e8 01 00 10 ld r0,16(r1) 10000754: eb e1 ff f8 ld r31,-8(r1) 10000758: 7c 08 03 a6 mtlr r0 1000075c: 4e 80 00 20 blr 10000760: 38 21 00 90 addi r1,r1,144 10000764: 38 60 ff ed li r3,-19 10000768: e8 01 00 10 ld r0,16(r1) 1000076c: eb e1 ff f8 ld r31,-8(r1) 10000770: 7c 08 03 a6 mtlr r0 10000774: 4e 80 00 20 blr
(In reply to comment #0) > I tried all the new opts from -O0 to -O1, but adding them all to -O0 does not > fix it. No idea why. Because you did not read the documention that says that would not work. Now if you instead used -fno-* to -O1 you might get a better idea.
flags=' -m64 -g -O1 -fno-cprop-registers -fno-defer-pop -fno-guess-branch-probability -fno-if-conversion -fno-if-conversion2 -fno-ipa-pure-const -fno-ipa-reference -fno-loop-optimize -fno-merge-constants -fno-omit-frame-pointer -fno-tree-ccp -fno-tree-ch -fno-tree-copy-prop -fno-tree-copyrename -fno-tree-dce -fno-tree-dominator-opts -fno-tree-dse -fno-tree-fre -fno-tree-lrs -fno-tree-salias -fno-tree-sink -fno-tree-sra -fno-tree-ter -fno-unit-at-a-time -fno-var-tracking ' ; gcc -Wall -c -o ~olh/main.o ~olh/main.c $flags && gcc -Wall -c -o ~olh/matroxfb_probe.o ~olh/matroxfb_probe.c $flags && gcc -Wall -o ~olh/matroxfb_probe ~olh/main.o ~olh/matroxfb_probe.o $flags && ~olh/matroxfb_probe pci_bus_read_config_byte called (nil) 0 8 0xfffffc24c10 p 0x10011448 -19 doesnt seem to help.
This works on the mainline on powerpc-darwin and on the 4.1 branch on x86_64-linux-gnu and the mainline on x86_64.
Can you try "-O1 -fno-ivopts"?
yes.
-fno-ivopts fixes it. The problem is how bitfield refs are dealt with it seems. With -fno-ivopts the final_cleanup pass looks like so at the interesting point: D.2180 = BIT_FIELD_REF <*pdev, 32, 544> & 4294967295; ... if ((BIT_FIELD_REF <*b, 32, 0> & 4294967295) != D.2180) goto <L3>; else goto <L1>; ivopts lead to this code at that point: D.2180 = BIT_FIELD_REF <*pdev, 32, 544> & 4294967295; ... if ((MEM[base: (long unsigned int *) b] & 4294967295) != D.2180) goto <L3>; else goto <L1>; Now BIT_FIELD_REF<*b,32,0> extract exactly the 32 bit at address 'b'. But MEM[base: (long unsigned int *) b] extracts the 64 bit at that address. The masking afterwards selects the lower 32bit from that, but ppc being a big endian target this extracts the wrong half. Let's CC Zdenek for this.
The following patch should fix the problem. I am fairly sure a similar check used to be somewhere in ivopts, but it probably got lost when MEM_REFs were introduced (I am somewhat surprised ivopts do not ICE and produce something that actually resembles a correct code in this case...) Index: tree-ssa-loop-ivopts.c =================================================================== *** tree-ssa-loop-ivopts.c (revision 111993) --- tree-ssa-loop-ivopts.c (working copy) *************** find_interesting_uses_address (struct iv *** 1482,1489 **** /* Ignore bitfields for now. Not really something terribly complicated to handle. TODO. */ ! if (TREE_CODE (base) == COMPONENT_REF ! && DECL_NONADDRESSABLE_P (TREE_OPERAND (base, 1))) goto fail; if (STRICT_ALIGNMENT --- 1482,1490 ---- /* Ignore bitfields for now. Not really something terribly complicated to handle. TODO. */ ! if (TREE_CODE (base) == BIT_FIELD_REF ! || (TREE_CODE (base) == COMPONENT_REF ! && DECL_NONADDRESSABLE_P (TREE_OPERAND (base, 1)))) goto fail;
Patch: http://gcc.gnu.org/ml/gcc-patches/2006-03/msg00925.html
Subject: Bug 26643 Author: rakdver Date: Wed Mar 29 01:34:51 2006 New Revision: 112483 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=112483 Log: PR tree-optimization/26643 * tree-ssa-loop-ivopts.c (find_interesting_uses_address): Do not handle bit_field_refs. Modified: trunk/gcc/ChangeLog trunk/gcc/tree-ssa-loop-ivopts.c
Fixed at least on the mainline but still needs to fixed on the 4.1 branch.
Would some kind person please attempt a backport to 4.1?
running a 4.1 bootstrap.
Subject: Re: [4.1 Regression] Linux matroxfb_probe miscompiled > ------- Comment #15 from bonzini at gnu dot org 2006-04-18 14:12 ------- > running a 4.1 bootstrap. It's been in our SUSE tree for some while and so survived bootstrapping and testing on x86_64, i686, ppc, ppc64, s390, s390x and ia64.
Subject: Bug 26643 Author: bonzini Date: Tue Apr 18 14:22:59 2006 New Revision: 113041 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=113041 Log: 2006-04-18 Paolo Bonzini <bonzini@gnu.org> PR tree-optimization/26643 Backport from mainline: 2006-03-29 Zdenek Dvorak <dvorakz@suse.cz> PR tree-optimization/26643 * tree-ssa-loop-ivopts.c (find_interesting_uses_address): Do not handle bit_field_refs. Modified: branches/gcc-4_1-branch/gcc/ChangeLog branches/gcc-4_1-branch/gcc/tree-ssa-loop-ivopts.c
patch committed to 4.1 too.