Bug 83610

Summary: Come up with __builtin_expect_with_probabilty
Product: gcc Reporter: Daniel Fruzynski <bugzilla>
Component: targetAssignee: Martin Liška <marxin>
Status: RESOLVED FIXED    
Severity: normal CC: acahalan
Priority: P3 Keywords: missed-optimization
Version: 4.9.0   
Target Milestone: ---   
Host: Target: x86_64-linux-gnu
Build: Known to work:
Known to fail: Last reconfirmed: 2017-12-29 00:00:00
Bug Depends on:    
Bug Blocks: 85559    
Attachments: Benchmark

Description Daniel Fruzynski 2017-12-28 12:24:21 UTC
[code]
void f1();
void f2();

void test(int a, int b, int c, int d, int n, int k)
{
  int val = a & b;
  if (__builtin_expect(!!(n == k), 0))
    val &= c;
  if (__builtin_expect(!!(n == 10 - k), 0))
    val &= d;
  if (val)
    f1();
  else
    f2();
}
[/code]

This code compiled with gcc 4.8.5 generates branches as expected:

[asm]
test(int, int, int, int, int, int):
  and edi, esi
  cmp r8d, r9d
  je .L6
.L2:
  mov eax, 10
  sub eax, r9d
  cmp r8d, eax
  je .L7
.L3:
  test edi, edi
  jne .L8
  jmp f2()
.L8:
  jmp f1()
.L7:
  and edi, ecx
  jmp .L3
.L6:
  and edi, edx
  jmp .L2
[/asm]

When this code is compiled with gcc 4.9.0 or higher, it generates branchless code like below. In my case it is slower than version with branches. I wanted to   convince compiler to generate this version of code by using __builtin_expect, but for some reason it does not work.

[asm]
test(int, int, int, int, int, int):
  and esi, edi
  mov eax, 10
  and edx, esi
  cmp r8d, r9d
  cmove esi, edx
  sub eax, r9d
  and ecx, esi
  cmp r8d, eax
  cmove esi, ecx
  test esi, esi
  jne .L6
  jmp f2()
.L6:
  jmp f1()
[/asm]
Comment 1 Daniel Fruzynski 2017-12-28 12:30:10 UTC
Code was compiled with "-O3 -march=core2 -mtune=generic"
Comment 2 Martin Liška 2017-12-29 11:31:37 UTC
Thanks for report. I can confirm that it changed in GCC 4.9.0 and newer versions.
Note that version without branching (using cmove) is actually also conditional code and is transformed by 'If-conversion pass'.

What you can do is to change default probability of expect (which is by default 90%):

--param builtin-expect-probability=99

Doing that I see original version w/o cmove. Is it helpful for you?
Do you have a microbenchmark you can attach that compares speed of both versions?
Comment 3 Daniel Fruzynski 2017-12-29 19:28:51 UTC
Created attachment 42980 [details]
Benchmark

Here is benchmark for this case. With unlikely() execution time decreases from 20.5sec to 20.3sec - about 1%. For my real app change it was a bit more than 2%.

Thanks for information about this parameter, I will give it a try. So far I noticed that gcc uses CMOV when values are stored in registers. When they are in memory as a class fields, it generates code with branches. I am still playing with this code, so maybe I will need it later.

BTW, what do you thing about adding 3rd param to __builtin_expect, which will specify probability? It may be helpful in cases like mine.
Comment 4 Andrew Pinski 2017-12-29 19:32:34 UTC
Note also this is a target issue because on some targets the conditional move is just as cheap (or slightly cheaper) as a branch followed by a mov.
Comment 5 Martin Liška 2018-01-02 09:01:57 UTC
(In reply to Daniel Fruzynski from comment #3)
> Created attachment 42980 [details]
> Benchmark
> 
> Here is benchmark for this case. With unlikely() execution time decreases
> from 20.5sec to 20.3sec - about 1%. For my real app change it was a bit more
> than 2%.

Thanks for the micro-benchmark.

> 
> Thanks for information about this parameter, I will give it a try. So far I
> noticed that gcc uses CMOV when values are stored in registers. When they
> are in memory as a class fields, it generates code with branches. I am still
> playing with this code, so maybe I will need it later.
> 
> BTW, what do you thing about adding 3rd param to __builtin_expect, which
> will specify probability? It may be helpful in cases like mine.

Well I haven't thought about it. From implementation point of view it will be simple. But usage can be then a bit complicated.

Thoughts?
Comment 6 Martin Liška 2018-07-26 12:37:29 UTC
Patch candidate was sent to gcc-patches ML:
https://gcc.gnu.org/ml/gcc-patches/2018-07/msg01296.html
Comment 7 Martin Liška 2018-08-10 09:43:37 UTC
Author: marxin
Date: Fri Aug 10 09:43:06 2018
New Revision: 263466

URL: https://gcc.gnu.org/viewcvs?rev=263466&root=gcc&view=rev
Log:
Introduce __builtin_expect_with_probability (PR target/83610).

2018-08-10  Martin Liska  <mliska@suse.cz>

        PR target/83610
	* builtin-types.def (BT_FN_LONG_LONG_LONG_DOUBLE): Add new
        function type.
	* builtins.c (expand_builtin_expect_with_probability):
        New function.
	(expand_builtin_expect_with_probability): New function.
	(build_builtin_expect_predicate): Add new argumnet probability
        for BUILT_IN_EXPECT_WITH_PROBABILITY.
	(fold_builtin_expect):
	(fold_builtin_2):
	(fold_builtin_3):
	* builtins.def (BUILT_IN_EXPECT_WITH_PROBABILITY):
	* builtins.h (fold_builtin_expect): Set new argument.
	* doc/extend.texi: Document __builtin_expect_with_probability.
	* doc/invoke.texi: Likewise.
	* gimple-fold.c (gimple_fold_call): Pass new argument.
	* ipa-fnsummary.c (find_foldable_builtin_expect): Handle
        also BUILT_IN_EXPECT_WITH_PROBABILITY.
	* predict.c (get_predictor_value): New function.
	(expr_expected_value): Add new argument probability. Assume
        that predictor and probability are always non-null.
	(expr_expected_value_1): Likewise.  For __builtin_expect and
        __builtin_expect_with_probability set probability.  Handle
        combination in binary expressions.
	(tree_predict_by_opcode): Simplify code by simply calling
        get_predictor_value.
	(pass_strip_predict_hints::execute): Add handling of
        BUILT_IN_EXPECT_WITH_PROBABILITY.
	* predict.def (PRED_BUILTIN_EXPECT_WITH_PROBABILITY): Add
        new predictor.
	* tree.h (DECL_BUILT_IN_P): New function.
2018-08-10  Martin Liska  <mliska@suse.cz>

        PR target/83610
	* gcc.dg/predict-17.c: New test.
	* gcc.dg/predict-18.c: New test.
	* gcc.dg/predict-19.c: New test.

Added:
    trunk/gcc/testsuite/gcc.dg/predict-17.c
    trunk/gcc/testsuite/gcc.dg/predict-18.c
    trunk/gcc/testsuite/gcc.dg/predict-19.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/builtin-types.def
    trunk/gcc/builtins.c
    trunk/gcc/builtins.def
    trunk/gcc/builtins.h
    trunk/gcc/doc/extend.texi
    trunk/gcc/doc/invoke.texi
    trunk/gcc/gimple-fold.c
    trunk/gcc/ipa-fnsummary.c
    trunk/gcc/predict.c
    trunk/gcc/predict.def
    trunk/gcc/testsuite/ChangeLog
    trunk/gcc/tree.h
Comment 8 Martin Liška 2018-08-10 09:44:23 UTC
Implemented.
Comment 9 Andrew Pinski 2021-07-17 20:10:26 UTC
*** Bug 26367 has been marked as a duplicate of this bug. ***