Curse Intel and their modal register sets. Today I caught the x86-64 compiler using %mm0 just because the datatype it wanted to move fit nicely in that register. Except that after an MMX register is touched, one must leave MMX mode before (1) the next FPU instruction or (2) a call boundary, since the ABI requires that we be in FPU mode. Not bothering to add a test case, since I'm planning to hack around this specific example with changed register preferences, but the point remains that we have nothing in place to prevent the badness. I suspect that what we'll need for a complete solution may include dynamic register class letters. At some point, perhaps during rtl expansion, we record whether or not there are any *operations* that require either MMX or FPU. If we have MMX but not FPU operations, we set 'f' to NOREGS; if we have FPU but not MMX, we set 'y' to NOREGS. If we have both, then we'll need an optimize_mode_switching pass to swap between modes. The exceedingly tricky bit there will be tricking reload into not making both kinds of registers live behind our backs.
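The "dynamic register class letters" idea above can be sketched as a small decision function. This is only an illustration of the proposal, not GCC code: the enum, the struct, and choose_fpu_classes are hypothetical names, and the behaviour when a function has neither MMX nor FPU operations (leave both letters enabled) is an assumption, since the comment does not cover that case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for GCC's register classes; not the real enum.  */
enum reg_class_sketch { SK_NO_REGS, SK_FLOAT_REGS, SK_MMX_REGS };

/* What the constraint letters would resolve to for one function.  */
struct fpu_class_map
{
  enum reg_class_sketch f_class;  /* class behind constraint letter 'f' */
  enum reg_class_sketch y_class;  /* class behind constraint letter 'y' */
  bool need_mode_switching;       /* both kinds of operation are present */
};

/* Decide the mapping from the two facts recorded (e.g. during rtl
   expansion): does the function contain MMX operations, and does it
   contain x87 FPU operations?  */
static struct fpu_class_map
choose_fpu_classes (bool has_mmx_ops, bool has_x87_ops)
{
  /* Assumed default: both letters keep their usual class.  */
  struct fpu_class_map m = { SK_FLOAT_REGS, SK_MMX_REGS, false };

  if (has_mmx_ops && !has_x87_ops)
    m.f_class = SK_NO_REGS;        /* MMX only: disable 'f' */
  else if (has_x87_ops && !has_mmx_ops)
    m.y_class = SK_NO_REGS;        /* FPU only: disable 'y' */
  else if (has_mmx_ops && has_x87_ops)
    m.need_mode_switching = true;  /* both: a mode-switching pass is needed */

  return m;
}
```

For example, choose_fpu_classes (true, false) leaves 'y' mapped to the MMX class and maps 'f' to no registers at all, so the allocator could never pick an x87 register in a function that only performs MMX operations.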
Confirmed, there is a dup of this filed already.
*** Bug 14801 has been marked as a duplicate of this bug. ***
Found the bug finally; see PR 14801 for an example. PR 16872 is another example.
*** Bug 17415 has been marked as a duplicate of this bug. ***
Uros, also for you it seems... (http://gcc.gnu.org/ml/gcc-patches/2005-06/msg01724.html)
(In reply to comment #5)
> Uros, also for you it seems...
> (http://gcc.gnu.org/ml/gcc-patches/2005-06/msg01724.html)
with this patch I get an ice on amd64 bootstrap:
(..) -c ../../gcc/unwind-dw2.c -o libgcc/./unwind-dw2.o
In file included from ../../gcc/unwind-dw2.c:256:
../../gcc/config/i386/linux-unwind.h: In function 'x86_64_fallback_frame_state':
../../gcc/config/i386/linux-unwind.h:55: warning: dereferencing type-punned pointer will break strict-aliasing rules
../../gcc/unwind.inc: In function '_Unwind_ForcedUnwind':
../../gcc/unwind.inc:215: internal compiler error: in create_pre_exit, at mode-switching.c:350
(In reply to comment #6)
> ../../gcc/unwind.inc: In function '_Unwind_ForcedUnwind':
> ../../gcc/unwind.inc:215: internal compiler error: in create_pre_exit,
> at mode-switching.c:350
This is a known problem, with a hack to mode-switching.c at http://gcc.gnu.org/ml/gcc-patches/2005-06/msg01434.html. Although this will fix the ICE you got, the real problem is that this function (like the __builtin_apply case) tries to handle a returned %mm register together with %st, and this confuses mode switching in the exit block. Could you please apply the mode-switching.c part of the patch and see whether it fixes the ICE for you? However, I think that __builtin_apply should process only an x87 output register, and should be limited to functions that return in FPU_X87 mode.
(In reply to comment #7) > (in reply to comment #6) > > This is a known problem, with a hack to mode-switching.c at > http://gcc.gnu.org/ml/gcc-patches/2005-06/msg01434.html. > > Please, could you try to apply the mode-switching.c part of the patch > and see if it fix an ice for you. with this hack bootstrap still ices. ../../gcc/unwind.inc: In function '_Unwind_ForcedUnwind': ../../gcc/unwind.inc:215: internal compiler error: in create_pre_exit, at mode-switching.c:362
(In reply to comment #8) > > This is a known problem, with a hack to mode-switching.c at > > http://gcc.gnu.org/ml/gcc-patches/2005-06/msg01434.html. > > > > Please, could you try to apply the mode-switching.c part of the patch > > and see if it fix an ice for you. > > with this hack bootstrap still ices. > > ../../gcc/unwind.inc: In function '_Unwind_ForcedUnwind': > ../../gcc/unwind.inc:215: internal compiler error: in create_pre_exit, > at mode-switching.c:362 It was a hack anyway :---( Thanks for the report, I'll try to find a proper fix in the next week. (BTW: It fails for x86-64, because this target enables mmx by default.)
(In reply to comment #6)
> with this patch I get an ice on amd64 bootstrap:
> In file included from ../../gcc/unwind-dw2.c:256:
> ../../gcc/config/i386/linux-unwind.h:
> In function 'x86_64_fallback_frame_state':
> ../../gcc/config/i386/linux-unwind.h:55: warning: dereferencing type-punned
> pointer will break strict-aliasing rules
> ../../gcc/unwind.inc: In function '_Unwind_ForcedUnwind':
> ../../gcc/unwind.inc:215: internal compiler error: in create_pre_exit,
> at mode-switching.c:350
Pawel, could you check whether the patch at http://gcc.gnu.org/ml/gcc-patches/2005-07/msg01128.html fixes the bootstrap problems on AMD64? The patch works for me with BOOT_CFLAGS="-O2 -msse2" on pentium4, and this is as far as I can test...
Whee, it looks like the x86_64 breakage has gone. I have successfully compiled unwind-dw2.c with a patched x86_64 cross-compiler.
(In reply to comment #10)
> (In reply to comment #6)
> > with this patch I get an ice on amd64 bootstrap:
> > In file included from ../../gcc/unwind-dw2.c:256:
> > ../../gcc/config/i386/linux-unwind.h:
> > In function 'x86_64_fallback_frame_state':
> > ../../gcc/config/i386/linux-unwind.h:55: warning: dereferencing type-punned
> > pointer will break strict-aliasing rules
> > ../../gcc/unwind.inc: In function '_Unwind_ForcedUnwind':
> > ../../gcc/unwind.inc:215: internal compiler error: in create_pre_exit,
> > at mode-switching.c:350
> Pawel, could you check the patch at
> http://gcc.gnu.org/ml/gcc-patches/2005-07/msg01128.html if it fixes bootstrap
> problems on AMD64? Patch works for me with BOOT_CFLAGS="-O2 -msse2" on
> pentium4, and this is as far as I can test...
I'm checking this right now :) I was busy with PR22584 earlier :|
current mainline bootstrap still fails. (...) ./xgcc -B./ -B/usr/x86_64-pld-linux/bin/ -isystem /usr/x86_64-pld-linux/include -isystem /usr/x86_64-pld-linux/sys-include -L/home/users/pluto/rpm/BUILD/gcc-4.1-20050723T1611UTC/obj-x86_64-pld-linux/gcc/../ld -O2 -DIN_GCC -W -Wall -Wwrite-strings -Wstrict-prototypes -Wmissing-prototypes -Wold-style-definition -isystem ./include -fPIC -DHAVE_GTHR_DEFAULT -DIN_LIBGCC2 -D__GCC_FLOAT_NOT_NEEDED -I. -I. -I../../gcc -I../../gcc/. -I../../gcc/../include -I../../gcc/../libcpp/include -fvisibility=hidden -DHIDE_EXPORTS -fexceptions -c ../../gcc/unwind-dw2.c -o libgcc/./unwind-dw2.o ../../gcc/unwind.inc: In function '_Unwind_ForcedUnwind': ../../gcc/unwind.inc:215: internal compiler error: in create_pre_exit, at mode-switching.c:352 (...) make[3]: *** [libgcc/./unwind-dw2.o] Error 1 make[3]: Leaving directory `/home/users/pluto/rpm/BUILD/gcc-4.1-20050723T1611UTC/obj-x86_64-pld-linux/gcc' make[2]: *** [stmp-multilib] Error 2 make[2]: Leaving directory `/home/users/pluto/rpm/BUILD/gcc-4.1-20050723T1611UTC/obj-x86_64-pld-linux/gcc' make[1]: *** [stage1_build] Error 2 make[1]: Leaving directory `/home/users/pluto/rpm/BUILD/gcc-4.1-20050723T1611UTC/obj-x86_64-pld-linux/gcc' make: *** [bootstrap] Error 2
(In reply to comment #13) > current mainline bootstrap still fails. > ../../gcc/unwind.inc: In function '_Unwind_ForcedUnwind': > ../../gcc/unwind.inc:215: internal compiler error: in create_pre_exit, at > mode-switching.c:352 The patch at http://gcc.gnu.org/ml/gcc-patches/2005-08/msg01142.html fixes this problem.
Doing the code review. I've got a local patch for the create_pre_exit ice. I'm going to work to see this in 4.1.
So, I fixed another case in which we could die in create_pre_exit having to do with complex return values. But past that, there are failures that are completely within optimize_mode_switching, e.g. execute/20050604-1.c.
$ ./cc1 -m32 -march=pentium4 z.c
 foo
z.c: In function ‘foo’:
z.c:28: error: unable to find a register to spill in class ‘MMX_REGS’
z.c:28: error: this is the insn:
(insn 14 63 15 2 (set (reg:V4HI 61 [ D.1620 ])
        (mem/s/j:V4HI (symbol_ref:SI ("u") <var_decl 0x2aaaadaff160 u>) [0 u.v+0 S8 A64])) 994 {*movv4hi_internal} (nil)
    (nil))
The problem is that we have a CFG like
   +--+
   v  |
1->2->3->4
and we place the efpu insn in block 2, but the emms insn in block 4. Aside from being Less Than Ideal, this results in BOTH mmx and fpu registers live around the loop, which means we can't allocate anything.
Uros, you should bootstrap i386 with --with-arch=foo, where foo is whatever machine you have that supports at least mmx. Otherwise, you're not actually testing this new code on i386 except for the few test cases that force an -march or -mmx option.
I'll keep looking at it for a bit to see if it's something simple, but we're not going to overhaul optimize_mode_switching for 4.1 if it's something complicated.
Actually, I lied about the CFG. It's actually 1->3 with 2-3 still forming the loop. So LCM did the right thing, technically: for the case in which the loop trip count is zero, we avoid the efpu insn. The problem is, the model we have wrt efpu/emms requires that they be used in balanced pairs. And, really, we'd prefer that these insns be pushed out of loops when possible. But I'm not sure how to address this at the moment.
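The balanced-pair requirement described above can be illustrated with a toy path checker. This is a sketch under stated assumptions, not anything from mode-switching.c: the placement table simply encodes "efpu in block 2, emms in block 4" as reported in the thread, path_is_balanced is a hypothetical helper, and treating efpu as the opening half of a pair and emms as the closing half is an assumption made only for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of the mode-switching insns placed in each basic block.  */
enum switch_insn { SW_NONE, SW_EFPU, SW_EMMS };

/* Placement reported in the thread: efpu in block 2, emms in block 4.
   Index 0 is unused so that block numbers match the CFG.  */
static const enum switch_insn placement[5] =
  { SW_NONE, SW_NONE, SW_EFPU, SW_NONE, SW_EMMS };

/* Walk one execution path (a sequence of block numbers in 1..4) and
   report whether efpu/emms occur as balanced pairs along it.
   Assumption for this sketch: efpu opens a pair, emms closes it.  */
static bool
path_is_balanced (const int *path, size_t len)
{
  int open = 0;

  for (size_t i = 0; i < len; i++)
    {
      enum switch_insn insn = placement[path[i]];

      if (insn == SW_EFPU)
        open++;
      else if (insn == SW_EMMS && --open < 0)
        return false;   /* emms reached with no efpu before it */
    }
  return open == 0;
}
```

With the corrected CFG, the zero-trip path 1, 3, 4 reaches the emms in block 4 without ever passing the efpu in block 2, so the checker rejects it, while the single-iteration path 1, 3, 2, 3, 4 is accepted.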
There is another bug in ix86_mode_needed() that causes timeouts for pr20314-1.c. The problem is in the asm operand parsing code, which gets into an infinite loop. The correct code should increment the variable c instead of cc when a comma is found:
config/i386/i386.c (ix86_mode_needed):
  ...
  for (i = 0; i < noperands; i++)
    {
      const char *c = constraints[i];
      enum reg_class class;
      if (c[0] == '%')
        c++;
      if (ISDIGIT ((unsigned char) c[0]) && c[1] == '\0')
        c = constraints[c[0] - '0'];
      while (*c)
        {
          char cc = *c;
          int len;
          switch (cc)
            {
            case ',':
              c++;        <<<<< here!!
              continue;
            case '=': case '+': case '*': case '%':
            case '!': case '#': case '&': case '?':
              break;
  ...
Regarding emms/efpu instructions in the loop: I have made some experiments by inserting a mode switching insn before NOTE_INSN_LOOP_BEGIN. The failure in 20050604-1.c is fixed if this mode is set to FPU_MODE_MMX.
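The pointer-versus-copy mistake in the constraint scan above is easy to reproduce in isolation. The following is a minimal standalone sketch, not the GCC function: count_alternatives is a hypothetical helper, and it steps one character at a time, whereas the real code may advance by a per-constraint length.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not GCC code: count the comma-separated
   alternatives in a single constraint string such as "=y,m".  */
static size_t
count_alternatives (const char *constraint)
{
  const char *c = constraint;
  size_t n = 1;

  while (*c)
    {
      char cc = *c;   /* local copy of the current character */

      if (cc == ',')
        {
          n++;
          c++;        /* advance the pointer; "cc++" would only modify
                         the copy, so *c would stay ',' forever */
          continue;
        }
      c++;            /* simplification: every other constraint
                         character is assumed to be one byte long */
    }
  return n;
}
```

Because the loop condition tests *c, only a change to c itself can terminate the scan once a comma has been reached, which is why incrementing cc hangs on any multi-alternative constraint.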
Uros,
The mode switching patch ICEs current mainline on ix86.
gcc fbmmx.i -msse -O0,-O1 fails with different insn-errors.
[ -msse -O0 ]
fbmmx.c: In function ‘_cairo_pixman_composite_src_add_8000x8000mmx’:
fbmmx.c:2169: error: unable to find a register to spill in class ‘MMX_REGS’
fbmmx.c:2169: error: this is the insn:
(insn 174 172 175 7 (set (reg:V8QI 59 [ D.8903 ])
        (mem/c/i:V8QI (plus:SI (reg/f:SI 20 frame)
                (const_int -16 [0xfffffff0])) [0 __m2+0 S8 A32])) 776 {*movv8qi_internal} (nil)
    (nil))
fbmmx.c:2169: internal compiler error: in spill_failure, at reload1.c:1890
[ -msse -O1 ]
fbmmx.c: In function ‘_cairo_pixman_composite_src_add_8000x8000mmx’:
fbmmx.c:2169: error: unable to find a register to spill in class ‘MMX_REGS’
fbmmx.c:2169: error: this is the insn:
(insn 166 165 169 9 (set (reg:V8QI 167)
        (us_plus:V8QI (mem:V8QI (reg/v/f:SI 4 si [orig:120 src ] [120]) [0 S8 A64])
            (mem:V8QI (reg/v/f:SI 2 cx [orig:122 dst ] [122]) [0 S8 A64]))) 812 {mmx_usaddv8qi3} (nil)
    (nil))
fbmmx.c:2169: internal compiler error: in spill_failure, at reload1.c:1890
Created attachment 9791 [details] testcase for c#19
I'm no longer actively working on this.
I have seen this bug in the vector-2 testcases shipped in gcc/gcc/testsuite/gcc.dg/compat when compiling for i386 with -msse2. In vector-2_y.c and vector-2_x.c we end up using both mmx and x87 registers in the same function without any intervening EMMS instruction. This fails in 4.1.2, 3.4.6, and 3.3.1.
As stated in comments #16 and #17, the LCM infrastructure doesn't support mode switching in a way that would be usable for emms. Additionally, MANY problems are expected when sharing x87 and MMX registers (e.g. handling of uninitialized x87 registers at the beginning of the function; this is the reason we don't implement an x87 register passing ABI). Automatic MMX vectorization is not a particularly useful feature nowadays (we have SSE, which works quite well here). Thanks to recent changes in the MMX register allocation area, excellent code is produced when using MMX intrinsics, so I'm closing this bug as WONTFIX.