The kernel for s390 (and ppc) currently drags in __cmpdi2 from libgcc with the following reduced testcase, even though s390 is supposed to have an 8 byte comparison opcode and archs such as i686 generate bit-twiddling for this unsupported case (admittedly not from ifcvt, but from some define-expand hackery), i.e. (ll[0] | ll[1]) == 0.

void foo(void);
int dcache_readdir(long long ll)
{
  switch(ll) { case 0: foo(); }
}
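For illustration, the bit-twiddling described above can be written out in plain C. This is only a sketch of the idea, not what any backend actually emits: the helper name is_zero_by_parts is hypothetical, and the two 32-bit halves are extracted with shifts instead of indexing the value as ll[0] and ll[1].

```c
#include <stdint.h>

/* Hypothetical helper (not part of GCC or the kernel): test a 64-bit
   value for zero using only 32-bit operations, so no libgcc call such
   as __cmpdi2 is needed on a 32-bit target.  */
int is_zero_by_parts(long long ll)
{
    uint64_t bits = (uint64_t)ll;
    uint32_t lo = (uint32_t)bits;          /* low word  */
    uint32_t hi = (uint32_t)(bits >> 32);  /* high word */

    /* The value is zero exactly when no bit is set in either word.  */
    return (lo | hi) == 0;
}
```

The parentheses around the OR matter: in C, == binds tighter than |, so lo | hi == 0 would compute lo | (hi == 0) instead.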
This is in no way critical.
Confirmed, this is just a missed optimization and not a regression. If the kernel is using long long on a 32bit target, it needs all support functions including __cmpdi2.
Using

void foo(void);
int dcache_readdir(long long ll)
{
  if (ll == 0)
    foo();
}

the correct bit-twiddling is generated... So it looks like a generic switch expand issue.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21237
(In reply to comment #4)
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21237

Oops, I accidentally posted this link :(
As far as I know the kernel guys rely on the fact that gcc can handle DImode operations without calling libgcc. As Richard pointed out this only fails in this case because the conditional jump is emitted differently for case nodes. A normal DImode compare (on 32bit) is split into SImode compares before emit_cmp_and_jump_insns is called. This is done by do_jump_by_parts_equality. emit_case_nodes in turn calls do_jump_if_equal which calls emit_cmp_and_jump_insns with DImode operands. So I think using the dojump.c machinery in emit_case_nodes should be the way to go - right?!
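As a rough picture of what splitting a DImode compare into SImode compares means, the following C sketch compares two 64-bit values one 32-bit word at a time. It is an assumption-laden illustration at the source level, not the RTL that dojump.c generates; the name equal_by_parts is hypothetical.

```c
#include <stdint.h>

/* Hypothetical illustration (not GCC code): decide a == b for 64-bit
   operands using only 32-bit compares, one word at a time, the way a
   multi-word equality test can be expanded on a 32-bit target.  */
int equal_by_parts(long long a, long long b)
{
    uint64_t ua = (uint64_t)a;
    uint64_t ub = (uint64_t)b;

    /* Compare the low words; a mismatch settles the result early.  */
    if ((uint32_t)ua != (uint32_t)ub)
        return 0;

    /* Compare the high words.  */
    if ((uint32_t)(ua >> 32) != (uint32_t)(ub >> 32))
        return 0;

    return 1;
}
```

A case label such as case 0 in the testcase corresponds to calling this with b == 0, where the two compares can collapse into a single OR of the words followed by one test against zero.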
Yes, this sounds like the way to go.
Subject: Bug 25724

Author: sayle
Date: Mon Feb 13 01:55:37 2006
New Revision: 110906

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=110906
Log:
	PR middle-end/25724
	* dojump.c (do_jump): Call do_compare_rtx_and_jump.
	(do_jump_parts_zero_rtx): New function renamed from
	do_jump_parts_equality_rtx.  Made static.  Add a mode argument.
	(do_jump_parts_equality_rtx): New function split out from
	do_jump_parts_equality.  Old implementation renamed as above.
	Call do_jump_parts_zero_rtx if either operand is zero.
	(do_jump_parts_equality): Call do_jump_parts_equality_rtx to
	do all of the heavy lifting.
	(do_compare_rtx_and_jump): Handle multi-word comparisons by
	calling either do_jump_by_parts_greater_rtx or
	do_jump_by_parts_equality_rtx.
	* expr.h (do_jump_by_parts_equality_rtx): Remove prototype.
	* expmed.c (do_cmp_and_jump): Now multi-word optimization has
	moved to do_compare_rtx_and_jump, call it directly.
	* stmt.c (do_jump_if_equal): Remove static prototype.  Add a
	mode argument.  Call do_compare_rtx_and_jump.
	(emit_case_nodes): Update calls to do_jump_if_equal.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/dojump.c
    trunk/gcc/expmed.c
    trunk/gcc/expr.h
    trunk/gcc/stmt.c
This has now been fixed on mainline.