Bug 46939 - http://blog.regehr.org/archives/320 example 6
Summary: http://blog.regehr.org/archives/320 example 6
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: tree-optimization
Version: 4.6.0
Importance: P3 normal
Target Milestone: ---
Assignee: Jan Hubicka
URL: http://blog.regehr.org/archives/320
Keywords:
Depends on:
Blocks:
 
Reported: 2010-12-14 14:31 UTC by Jakub Jelinek
Modified: 2010-12-17 17:11 UTC
CC List: 3 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2010-12-15 12:24:34


Attachments
patch I am testing (1021 bytes, patch)
2010-12-15 13:28 UTC, Jan Hubicka

Description Jakub Jelinek 2010-12-14 14:31:37 UTC
The reason we don't expand / 10 using a multiplication is because gcc thinks it happens in cold code.
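
For readers unfamiliar with the transformation, here is a minimal sketch of what "expanding / 10 using a multiplication" means. It uses the unsigned 32-bit case because the constant is easy to check by hand; the signed 64-bit division in the dump below needs a different constant and a sign fix-up, which the sketch does not cover. The function name div10_by_multiply is made up for illustration and is not a GCC internal.

```c
#include <stdint.h>

/* Illustrative only: divide an unsigned 32-bit value by 10 without a
   division instruction.  0xCCCCCCCD is ceil(2^35 / 10), so the high
   bits of the 64-bit product hold the quotient n / 10. */
static uint32_t div10_by_multiply(uint32_t n)
{
    uint64_t product = (uint64_t) n * 0xCCCCCCCDu;
    return (uint32_t) (product >> 35);
}
```

When the division is considered cold and is optimized for size, the compiler can emit a plain divide instruction instead of this multiply-and-shift sequence, which is the symptom discussed here.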

In *.optimized we have:
  # BLOCK 5 freq:9999
  # PRED: 4 [100.0%]  (fallthru,exec) 2 [33.3%]  (exec)
  # PT = nonlocal 
  
  # strD.1584_1 = PHI <strD.1584_17(4), strD.1584_12(D)(2)>
  # signD.1590_5 = PHI <signD.1590_4(4), 0(2)>
<L39>:
  str.0D.2703_18 = (long unsigned intD.4) strD.1584_1;
  end.1D.2704_19 = (long unsigned intD.4) endD.1592_13;
  if (str.0D.2703_18 < end.1D.2704_19)
    goto <bb 6>;
  else
    goto <bb 22>;
  # SUCC: 6 [4.0%]  (true,exec) 22 [96.0%]  (false,exec)

  # BLOCK 6 freq:400
  # PRED: 5 [4.0%]  (true,exec)
  # VUSE <.MEMD.2753_65(D)>
  D.2701_20 = *strD.1584_1;
  if (D.2701_20 > 48)
    goto <bb 7>;
  else
    goto <bb 22>;
  # SUCC: 7 [4.0%]  (true,exec) 22 [96.0%]  (false,exec)

  # BLOCK 7 freq:16
  # PRED: 6 [4.0%]  (true,exec)
  if (D.2701_20 <= 57)
    goto <bb 8>;
  else
    goto <bb 22>;
  # SUCC: 8 [4.0%]  (true,exec) 22 [96.0%]  (false,exec)
...
  # BLOCK 15 freq:6
  # PRED: 14 [96.0%]  (true,exec)
  D.2735_44 = (long intD.2) digitD.1591_43;
  D.2736_45 = 9223372036854775807 - D.2735_44;
  D.2737_46 = D.2736_45 / 10;
  if (ctx_valueD.1589_3 <= D.2737_46)
    goto <bb 16>;
  else
    goto <bb 22>;
  # SUCC: 16 [96.0%]  (true,exec) 22 [4.0%]  (false,exec)

while ((unsigned long) str < (unsigned long) end) is a loop, so I am not sure why we predict the loop header to terminate immediately. Both the >= 48 and <= 57 tests have return -1; in the other branch, so it is also strange to see them predicted as so unlikely.  Honza?
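
A hypothetical sketch of the source shape that would produce a dump like the one above is given here for orientation. It is reconstructed only from the dump and the description (the loop condition, the digit range tests with return -1 on failure, and the (LONG_MAX - digit) / 10 overflow check); the function name, parameter names, and the success return value are assumptions, not the original test case.

```c
#include <limits.h>

/* Hypothetical reconstruction, not the original source.
   Returns 1 and stores the parsed value on success, -1 on error. */
static int parse_digits_sketch(const char *str, const char *end, long *out)
{
    long value = 0;

    /* Pointer comparison through unsigned long, as in the report;
       assumes unsigned long is wide enough to hold a pointer. */
    while ((unsigned long) str < (unsigned long) end) {
        if (*str >= '0' && *str <= '9') {
            int digit = *str++ - '0';
            /* Overflow guard: value * 10 + digit must not exceed
               LONG_MAX.  This is the "/ 10" seen in block 15. */
            if (value <= (LONG_MAX - digit) / 10)
                value = value * 10 + digit;
            else
                return -1;
        } else {
            return -1;   /* non-digit: the negative-return path */
        }
    }
    *out = value;
    return 1;
}
```

In code of this shape the return -1 exits are the rare paths and the digit path is the hot one, so a prediction of 4% for entering the loop body and passing the digit tests is the opposite of what the negative return heuristic is meant to produce.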
Comment 1 Jan Hubicka 2010-12-15 12:24:34 UTC
Predictions for bb 6
  DS theory heuristics: 4.0%
  first match heuristics (ignored): 4.0%
  combined heuristics: 4.0%
  negative return heuristics: 4.0%

So it is negative return heuristics.
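
As a rough illustration of how the "DS theory" line relates to the individual heuristics, here is a minimal sketch of the two-outcome Dempster-Shafer combination rule, assuming the combination starts from a neutral 50% and folds in one heuristic at a time. The function name ds_combine and the use of doubles are choices made for this sketch; the real predictor works on scaled integer probabilities.

```c
/* Sketch of Dempster-Shafer combination for a taken/not-taken branch.
   'combined' is the running estimate, 'p' is the probability suggested
   by one more heuristic.  Assumes the two inputs are not contradictory
   certainties (1.0 with 0.0), which would divide by zero. */
static double ds_combine(double combined, double p)
{
    double agree_taken = combined * p;
    double agree_not_taken = (1.0 - combined) * (1.0 - p);
    return agree_taken / (agree_taken + agree_not_taken);
}
```

Under these assumptions ds_combine(0.5, 0.04) evaluates to 0.04, which is consistent with the output above: with the negative return heuristic as the only one that fired, the combined prediction stays at 4.0%.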

Richi's profiling reorg broke the statistics code, so I will need to dig into the archives before I can fix it again.  But the last time I updated the tables, this heuristic was 96% right on SPEC. I am not terribly opposed to making it less reliable, with a comment explaining why. On SPEC it is most surely one of the best-performing heuristics; we have only a few of them above 60%.

The heuristic guesses that negative return values tend to be used to report error states. That is not true in this case.
Comment 2 Jan Hubicka 2010-12-15 12:54:10 UTC
As observed by Jakub, it is actually a bug in the return heuristics implementation.  I will make a patch shortly.
Comment 3 Jan Hubicka 2010-12-15 13:28:56 UTC
Created attachment 22763
patch I am testing
Comment 4 Jan Hubicka 2010-12-16 01:27:25 UTC
Author: hubicka
Date: Thu Dec 16 01:27:23 2010
New Revision: 167893

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=167893
Log:
	PR middle-end/46939
	* predict.c (predict_paths_leading_to_edge): New function.
	(apply_return_prediction): Use it.
	(predict_paths_for_bb): Do not special case abnormals.
	* gcc.target/i386/pr46939.c: New testcase.

Added:
    trunk/gcc/testsuite/gcc.target/i386/pr46939.c
Modified:
    trunk/gcc/ChangeLog
    trunk/gcc/predict.c
    trunk/gcc/testsuite/ChangeLog
Comment 5 Jan Hubicka 2010-12-16 01:40:29 UTC
The idiv issue is fixed now. It would be nice to check whether the function now runs at the same speed as with the other compilers.
Comment 6 Jakub Jelinek 2010-12-16 09:26:35 UTC
Fixed.
Comment 7 John Regehr 2010-12-16 15:10:41 UTC
(In reply to comment #6)
> Fixed.

Awesome.  We will re-test this code.
Comment 8 John Regehr 2010-12-17 17:11:05 UTC
Here is the current performance that we measure, in cycles:

gcc-head: 43
icc: 41
clang-head: 41
suncc: 42