[PATCH] dojump: Fix up probabilities splitting in dojump.c comparison splitting [PR98212]

Thu Dec 10 10:03:28 GMT 2020

Hi!

When compiling:
void foo (void);
void bar (float a, float b) { if (__builtin_expect (a != b, 1)) foo (); }
void baz (float a, float b) { if (__builtin_expect (a == b, 1)) foo (); }
void qux (float a, float b) { if (__builtin_expect (a != b, 0)) foo (); }
void corge (float a, float b) { if (__builtin_expect (a == b, 0)) foo (); }
on x86_64, we get (unimportant cruft removed):
bar:    ucomiss %xmm1, %xmm0
        jp      .L4
        je      .L1
.L4:    jmp     foo
.L1:    ret
baz:    ucomiss %xmm1, %xmm0
        jp      .L6
        jne     .L6
        jmp     foo
.L6:    ret
qux:    ucomiss %xmm1, %xmm0
        jp      .L13
        jne     .L13
        ret
.L13:   jmp     foo
corge:  ucomiss %xmm1, %xmm0
        jnp     .L18
.L14:   ret
.L18:   jne     .L14
        jmp     foo
(note for bar and qux that changed with a patch I've posted earlier today).
This is all reasonable, except the last function, the overall jump to
the tail call is predicted unlikely (10%), so it is good jmp foo isn't on
the straight line path, but NaNs are (or should be) considered very unlikely
in the programs, so IMHO the right code (and one emitted with the following
patch) is:
corge:  ucomiss %xmm1, %xmm0
        jp      .L14
        je      .L18
.L14:   ret
.L18:   jmp     foo

When splitting conditions, we have previous original prob as probability of
jumping to true label and then use a cprob set to 99% for ORDERED and 1%
for UNORDERED.  For !and_them, we end up with splitting
if (x) goto true; // prob
goto false;
into:
if (y) goto true; // prob * cprob (== first_prob)
if (z) goto true; // adjusted prob
goto false;
with first_prob = prob.split (cprob); being the computation.

For and_them, we instead split:
if (x) goto true; // prob
goto false;
into:
if (y) goto false; // 1 - first_prob
if (z) goto true; // adjusted prob
goto false;
and first_prob being computed as:
prob = prob.invert ();
first_prob = prob.split (cprob).invert ();
prob = prob.invert ();
This sort of works if the initial probability is likely (larger than even),
but with original prob being e.g. 10%, we compute first_prob as 10.9%
- 1 - ((1 - 0.1) * 0.99) - as EQ splits into ORDERED && something, but
we use first_prob  on recursive call where we pass ORDERED and the false
label and NULL true label.  That means we predict the case where none of
the arguments is NaN is only 10.9%, which is very unlikely.

The following patch uses a different computation, taking into account
that for the and_them case, if first_code is ORDERED that
first_prob.invert () should be the probability of the inverted comparison
and UNORDERED should be considered unlikely, so it uses
first_prob = prob.split (cprob.invert ()).invert ();
without the two prob = prob.invert (); around it and that seems to work
fine for all the cases, but I'm not 100% sure if that is right.
Or in the PR there is yet another variant that also seems to work.

Bootstrapped/regtested on x86_64-linux and i686-linux.

2020-12-10  Jakub Jelinek  <jakub@redhat.com>

	PR rtl-optimization/98212
	* dojump.c (do_compare_rtx_and_jump): Change computation of
	first_prob for and_them and don't invert prob around it.

	* gcc.dg/predict-8.c: Adjust expected probability.

--- gcc/dojump.c.jj	2020-12-09 15:11:17.042888002 +0100
+++ gcc/dojump.c	2020-12-09 20:05:59.535234206 +0100
@@ -1148,9 +1148,8 @@ do_compare_rtx_and_jump (rtx op0, rtx op
 	      if (and_them)
 		{
 		  rtx_code_label *dest_label;
-		  prob = prob.invert ();
-		  profile_probability first_prob = prob.split (cprob).invert ();
-		  prob = prob.invert ();
+		  profile_probability first_prob
+		    = prob.split (cprob.invert ()).invert ();
 		  /* If we only jump if true, just bypass the second jump.  */
 		  if (! if_false_label)
 		    {
--- gcc/testsuite/gcc.dg/predict-8.c.jj	2020-01-12 11:54:37.506396918 +0100
+++ gcc/testsuite/gcc.dg/predict-8.c	2020-12-10 10:34:00.632123682 +0100
@@ -8,4 +8,4 @@ int foo(float a, float b) {
     return 2;
 }
 
-/* { dg-final { scan-rtl-dump-times "65.\[34]. .guessed" 2 "expand"} } */
+/* { dg-final { scan-rtl-dump-times "99.\[345]. .guessed" 2 "expand"} } */

	Jakub