Bug 42233 - [4.3 regression] c++ builtin_expect code generation regression
Summary: [4.3 regression] c++ builtin_expect code generation regression
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: middle-end (show other bugs)
Version: 4.4.3
: P3 normal
Target Milestone: 4.4.4
Assignee: Jakub Jelinek
URL:
Keywords: missed-optimization
Depends on:
Blocks:
 
Reported: 2009-11-30 16:43 UTC by Guillaume Morin
Modified: 2010-03-08 11:59 UTC (History)
5 users (show)

See Also:
Host:
Target:
Build:
Known to work: 4.1.2 4.4.4 4.5.0
Known to fail: 4.4.3
Last reconfirmed: 2010-02-19 10:03:15


Attachments
gcc45-pr42233.patch (615 bytes, patch)
2010-02-18 12:47 UTC, Jakub Jelinek
Details | Diff
gcc45-pr42233-2.patch (7.66 KB, patch)
2010-02-18 15:04 UTC, Jakub Jelinek
Details | Diff
gcc44-pr42233.patch (8.27 KB, patch)
2010-02-19 18:28 UTC, Jakub Jelinek
Details | Diff

Note You need to log in before you can comment on or make changes to this bug.
Description Guillaume Morin 2009-11-30 16:43:18 UTC
The following c/++ code generates assembly that favors the unlikely case with gcc 4.3 and 4.4:
extern void likely();
extern void unlikely();
void test_expect(char * a, char *b, char *c, char *d) {
    if (__builtin_expect(!!(a == b && c == d), 1)) {
        likely();
    }
    else {
        unlikely();
    }
}
Compiled with g++44 -O2 -S -fverbose-asm -m32. (or -m64).

A few notes: gcc 4.1 seems to generate better assembly.  If this code is code is compiled as c code and not C++ with gcc 4.4, the likely case is properly favored.

Changing the if statement into "if (__builtin_expect(!!(a == b),1) && __builtin_expect(!!(c == d), 1))" fixes the problem as well.  With this code, gcc generates better code for the likely case.
Comment 1 Guillaume Morin 2009-11-30 17:00:51 UTC
I can see this problem with 4.3.2 (Debian 4.3.2-1.1), 4.2.4 (Debian 4.2.4-6), 4.4.0 20090514 (Red Hat 4.4.0-6).  I do not see it with 4.1.2 20080704 (Red Hat 4.1.2-46) and 4.1.3 20080704 (prerelease) (Debian 4.1.2-25).
Comment 2 Jan Hubicka 2009-11-30 17:29:32 UTC
With 4.5 we seem to be all fine here:
jh@gcc17:~/trunk/build/gcc$ ./g++ -B ./ -O2 tt.c -fdump-tree-all-details -S -fdump-rtl-all-details-blocks
jh@gcc17:~/trunk/build/gcc$ more tt.s
        .file   "tt.c"
        .text
        .p2align 4,,15
.globl _Z11test_expectPcS_S_S_
        .type   _Z11test_expectPcS_S_S_, @function
_Z11test_expectPcS_S_S_:
.LFB0:
        cmpq    %rcx, %rdx
        sete    %dl
        cmpq    %rsi, %rdi
        sete    %al
        testb   %al, %dl
        je      .L2
        jmp     _Z6likelyv
.L2:
        jmp     _Z8unlikelyv
.LFE0:

and the conditional jump is predicted in the proper direction in greg dumps as having 1% probability to reach unlikely.
Of course we miss here the conditional tailcall case.

Honza
Comment 3 Guillaume Morin 2009-11-30 22:06:51 UTC
The trunk g++ output looks good.  As I said, there is a simple workaround.  But since this is a regression from 4.1, a fix in 4.4 would be nice.
Additionally, it would be great if you could document the full scope of the bug (like if all conditions can trigger this etc) and if there is any simpler workaround.
Comment 4 Guillaume Morin 2010-01-06 17:51:19 UTC
My original tests were done on x86/x86_64.  I observed the same problem with gcc 4.3.2 when compiling for PowerPC on AIX.  For some reason, the output is properly optimized with the same version (different build though) when compiling for Sparc on Solaris.
Comment 5 Jakub Jelinek 2010-02-17 15:34:38 UTC
I can't reproduce this with 4.4 either, since http://gcc.gnu.org/ml/gcc-patches/2008-08/msg01924.html
gcc output looks like what Jan pasted here.
Comment 6 Guillaume Morin 2010-02-17 15:40:13 UTC
Are you compiling with g++ and -O2?  On the only 4.4 version I have access to right now (gcc version 4.4.0 20090514 (Red Hat 4.4.0-6), I get:
$ g++44 -O2 -S test_expect.c
$ cat test_expect.s
(snip)
.globl _Z11test_expectPcS_S_S_
        .type   _Z11test_expectPcS_S_S_, @function
_Z11test_expectPcS_S_S_:
.LFB0:
        cmpq    %rcx, %rdx
        je      .L5
.L2:
        jmp     _Z8unlikelyv
        .p2align 4,,10
        .p2align 3
.L5:
        cmpq    %rsi, %rdi
        jne     .L2
        .p2align 4,,6
        jmp     _Z6likelyv

I get a similar output for the versions I listed in the report.
Comment 7 Jakub Jelinek 2010-02-18 12:30:18 UTC
On the 4.4 branch actually cc1 behaves differently from cc1plus.
In any case, I believe the two setne's plus testb code is actually the worst
of the options, and e.g. with -Os it is even much larger.
Just compile the testcase on 4.5 with -Os and -Os -D'__builtin_expect(a,b)=(a)'
In the former case (i.e. with __builtin_expect) if generates:
        cmpq    %rcx, %rdx
        sete    %dl
        xorl    %eax, %eax
        cmpq    %rsi, %rdi
        movzbl  %dl, %edx
        sete    %al
        testl   %eax, %edx
        je      .L2
        xorl    %eax, %eax
        jmp     likely
.L2:
        xorl    %eax, %eax
        jmp     unlikely
while in the latter case generates:
        cmpq    %rcx, %rdx
        jne     .L2
        cmpq    %rsi, %rdi
        jne     .L2
        xorl    %eax, %eax
        jmp     likely
.L2:
        xorl    %eax, %eax
        jmp     unlikely
(what would be actually the best code for -O2 when __builtin_expect is used and tells that likely () is likely).
Comment 8 Jakub Jelinek 2010-02-18 12:47:03 UTC
Created attachment 19906 [details]
gcc45-pr42233.patch

The reason why it behaves differently with __builtin_expect from normal condition is that in the latter case gimple_boolify changes the types of the comparisons etc. to _Bool, while for __builtin_expect it doesn't recurse into its first argument.
The attached patch fixes that.
Then we are back to the 4.3 generated code, i.e. the same as is generated without __builtin_expect.
At this point the problem is described in a comment above add_reg_br_prob_note:
/* Verify that there is exactly single jump instruction since last and attach
   REG_BR_PROB note specifying probability.
   ??? We really ought to pass the probability down to RTL expanders and let it
   re-distribute it when the conditional expands into multiple conditionals.
   This is however difficult to do.  */
Since here there are two jumps, not one, no REG_BR_PROB note is added.
Comment 9 Paolo Bonzini 2010-02-18 13:17:47 UTC
I'm relieved that cond-optab is in no way related to this. :-)

I'm not expert of the gimplifier but the patch makes sense and looks good.
Comment 10 Jakub Jelinek 2010-02-18 15:04:55 UTC
Created attachment 19909 [details]
gcc45-pr42233-2.patch

Trunk patch that seems to fix this together with the previous patch.
Going to bootstrap/regtest both now.
Comment 11 Jakub Jelinek 2010-02-19 09:50:46 UTC
Subject: Bug 42233

Author: jakub
Date: Fri Feb 19 09:50:30 2010
New Revision: 156888

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=156888
Log:
	PR middle-end/42233
	* gimplify.c (gimple_boolify): For __builtin_expect call
	gimple_boolify also on its first argument.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/gimplify.c

Comment 12 Jakub Jelinek 2010-02-19 09:54:14 UTC
Subject: Bug 42233

Author: jakub
Date: Fri Feb 19 09:53:51 2010
New Revision: 156889

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=156889
Log:
	PR middle-end/42233
	* expr.h (jumpifnot, jumpifnot_1, jumpif, jumpif_1, do_jump,
	do_jump_1, do_compare_rtx_and_jump): Add PROB argument.
	* dojump.c: Include output.h.
	(inv): New inline function.
	(jumpifnot, jumpifnot_1, jumpif, jumpif_1, do_jump_1, do_jump,
	do_jump_by_parts_greater_rtx, do_jump_by_parts_greater,
	do_jump_by_parts_zero_rtx, do_jump_by_parts_equality_rtx,
	do_jump_by_parts_equality, do_compare_and_jump): Add PROB
	argument, pass it down to other calls.
	(do_compare_rtx_and_jump): Likewise.  If PROB is not -1,
	add REG_BR_PROB note to the conditional jump.
	* cfgexpand.c (add_reg_br_prob_note): Removed.
	(expand_gimple_cond): Don't call it, add the probability
	as last argument to jumpif_1/jumpifnot_1.
	* Makefile.in (dojump.o): Depend on output.h.
	* builtins.c (expand_errno_check): Adjust do_compare_rtx_and_jump
	callers.
	* expmed.c (emit_store_flag_force, do_cmp_and_jump): Likewise.
	* stmt.c (do_jump_if_equal): Likewise.
	* cfgrtl.c (rtl_lv_add_condition_to_bb): Likewise.
	* loop-unswitch.c (compare_and_jump_seq): Likewise.
	* config/rs6000/rs6000.c (rs6000_aix_emit_builtin_unwind_init):
	Likewise.
	* optabs.c (expand_doubleword_shift, expand_abs): Likewise.
	* expr.c (expand_expr_real_1): Adjust do_jump, jumpifnot and
	jumpifnot_1 callers.
	(expand_expr_real_2): Adjust jumpifnot_1 and do_compare_rtx_and_jump
	callers.
	(store_expr): Adjust jumpifnot caller.
	(store_constructor): Adjust jumpif caller.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/Makefile.in
    trunk/gcc/builtins.c
    trunk/gcc/cfgexpand.c
    trunk/gcc/cfgrtl.c
    trunk/gcc/config/rs6000/rs6000.c
    trunk/gcc/dojump.c
    trunk/gcc/expmed.c
    trunk/gcc/expr.c
    trunk/gcc/expr.h
    trunk/gcc/loop-unswitch.c
    trunk/gcc/optabs.c
    trunk/gcc/stmt.c

Comment 13 Jakub Jelinek 2010-02-19 10:03:14 UTC
Fixed on the trunk so far.  I might try to do a backport in a few days to 4.4 branch, though the second patch won't be trivial, as dojump.c changed massively with SSA expansion changes.
Comment 14 Jakub Jelinek 2010-02-19 12:47:34 UTC
Subject: Bug 42233

Author: jakub
Date: Fri Feb 19 12:47:18 2010
New Revision: 156893

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=156893
Log:
	PR middle-end/42233
	* loop-doloop.c (add_test): Adjust do_compare_rtx_and_jump caller.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/loop-doloop.c

Comment 15 Jakub Jelinek 2010-02-19 18:19:22 UTC
Subject: Bug 42233

Author: jakub
Date: Fri Feb 19 18:19:06 2010
New Revision: 156903

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=156903
Log:
	PR middle-end/42233
	* dojump.c (do_jump) <case TRUTH_NOT_EXPR>: Invert priority.

Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/dojump.c

Comment 16 Jakub Jelinek 2010-02-19 18:28:40 UTC
Created attachment 19924 [details]
gcc44-pr42233.patch

Backported patch for 4.4 branch, so far only compile tested.
Comment 17 Jakub Jelinek 2010-03-08 11:46:42 UTC
Subject: Bug 42233

Author: jakub
Date: Mon Mar  8 11:46:28 2010
New Revision: 157274

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=157274
Log:
	PR middle-end/42233
	* dojump.c (do_jump) <case TRUTH_NOT_EXPR>: Invert priority.

	PR bootstrap/43121
	* except.c (sjlj_emit_function_enter): Don't call
	add_reg_br_prob_note, instead add REG_BR_PROB note to the last insn
	directly.
	* rtl.h (add_reg_br_prob_note): Remove prototype.

	PR middle-end/42233
	* loop-doloop.c (add_test): Adjust do_compare_rtx_and_jump caller.

	PR middle-end/42233
	* expr.h (jumpifnot, jumpifnot_1, jumpif, jumpif_1, do_jump,
	do_jump_1, do_compare_rtx_and_jump): Add PROB argument.
	* dojump.c: Include output.h.
	(inv): New inline function.
	(jumpifnot, jumpifnot_1, jumpif, jumpif_1, do_jump_1, do_jump,
	do_jump_by_parts_greater_rtx, do_jump_by_parts_greater,
	do_jump_by_parts_zero_rtx, do_jump_by_parts_equality_rtx,
	do_jump_by_parts_equality, do_compare_and_jump): Add PROB
	argument, pass it down to other calls.
	(do_compare_rtx_and_jump): Likewise.  If PROB is not -1,
	add REG_BR_PROB note to the conditional jump.
	* cfgexpand.c (add_reg_br_prob_note): Removed.
	(expand_gimple_cond): Don't call it, add the probability
	as last argument to jumpif_1/jumpifnot_1.
	* Makefile.in (dojump.o): Depend on output.h.
	* builtins.c (expand_errno_check): Adjust do_compare_rtx_and_jump
	callers.
	* expmed.c (emit_store_flag_force, do_cmp_and_jump): Likewise.
	* stmt.c (do_jump_if_equal): Likewise.
	* cfgrtl.c (rtl_lv_add_condition_to_bb): Likewise.
	* loop-unswitch.c (compare_and_jump_seq): Likewise.
	* config/rs6000/rs6000.c (rs6000_aix_emit_builtin_unwind_init):
	Likewise.
	* optabs.c (expand_doubleword_shift, expand_abs): Likewise.
	* expr.c (expand_expr_real_1): Adjust do_jump, jumpifnot and
	jumpifnot_1 callers.
	(expand_expr_real_2): Adjust jumpifnot_1 and do_compare_rtx_and_jump
	callers.
	(store_expr): Adjust jumpifnot caller.
	(store_constructor): Adjust jumpif caller.

	PR middle-end/42233
	* gimplify.c (gimple_boolify): For __builtin_expect call
	gimple_boolify also on its first argument.

Modified:
    branches/gcc-4_4-branch/gcc/ChangeLog
    branches/gcc-4_4-branch/gcc/Makefile.in
    branches/gcc-4_4-branch/gcc/cfgexpand.c
    branches/gcc-4_4-branch/gcc/cfgrtl.c
    branches/gcc-4_4-branch/gcc/config/rs6000/rs6000.c
    branches/gcc-4_4-branch/gcc/dojump.c
    branches/gcc-4_4-branch/gcc/except.c
    branches/gcc-4_4-branch/gcc/expmed.c
    branches/gcc-4_4-branch/gcc/expr.c
    branches/gcc-4_4-branch/gcc/expr.h
    branches/gcc-4_4-branch/gcc/gimplify.c
    branches/gcc-4_4-branch/gcc/loop-doloop.c
    branches/gcc-4_4-branch/gcc/loop-unswitch.c
    branches/gcc-4_4-branch/gcc/optabs.c
    branches/gcc-4_4-branch/gcc/rtl.h
    branches/gcc-4_4-branch/gcc/stmt.c

Comment 18 Jakub Jelinek 2010-03-08 11:59:55 UTC
Fixed for 4.4.3 too.  Don't plan to backport this to 4.3.