This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Fortran] RFC / RFA patch for using build_predict_expr instead of builtin_expect / PR 58721


Hi,
sorry for taking time to return back to this.
> Pre-remark:
> 
> I had hoped that something like the following patch would work.
> However, it will lead to a bunch of run-time segfaults in the test
> suite - but the original dump seems to be fine and I also fail to
> spot the problem when looking at the patch. Thus, instead of posting
> a working patch, I have below a nonworking draft patch. Questions:
> 
> (a) Does the build_predict_expr() usage look in principle right - or
> do I do something obviously wrong?
> 
> (b) Is it possible to use build_predict_expr() for code like "a =
> (cond) ? (expr1) : (expr2)"? After gimplifying, there should be a BB
> [and a phi] - thus, adding the predict to the BB should be possible.
> But is it also possible on tree level with code like above? That
> mainly applies to trans-array.c but also in another case.
> 
> (c) How to handle:  if (cond1 || overflow_cond) {...}? Currently,
> "overflow_cond" is marked as unlikely but as build_predict_expr()
> applies to the branch/basic block and not to the condition, how
> should it be handled?
> 
> (d) And in general: Does the approach look right and the choice of
> PRED_ and their values?
> 
> * * *
> 
> gfortran uses internally builtin_expect [i.e. gfc_likely()/gfc_unlikely()].
> 
> On the other hand, builtin_expect is also used by (C/C++) users.
> Google did some tests and empirical result was that unlikely is more
> likely than GCC's prediction handling expected. I believe, the
> likelyhood for likely() reduced from a hit rate of 99 to 90.
> 
> For gfortran, the change leads to PR58721, where some code is
> regarded as less likely as before and to no inlining. In many cases,
> the unlikely() is not necessary as those branches have a call to an
> error function, annotated with "noreturn". As explicit user request
> overrides other heurisitic, the builtin_expect will cause that the
> predictor regards the gfc_unlikely() as *more* likely than without
> (as "noreturn" is less likely).
> 
> 
> This patch tries to address this for the different cases:
> 
> a) If there is a "noreturn" function called, the gfc_(un)likely has
> been removed. [Unless, I wrongly classified is in one of the next
> items; however, a "noreturn" will override another PRED_*. Thus, it
> shouldn't matter.]
> 
> b) As OVERFLOWs should only occur extremely rarely, I used
> PRED_OVERFLOW with PROB_VERY_LIKELY (i.e. the branch is very likely
> to be NOT_TAKEN).
> 
> c) In many cases, GCC permits to replace run-time aborts for
> failures by asking for a status, e.g. "DEALLOCATE(var,
> stat=status)"; I have used PRED_FAIL with HITRATE(80) for those -
> but one can argue about the value.
> 
> d) gfortran supports a run-time warning. That's currently only used
> for copy-in/out. I think both branches can be approximately equally
> likely (with no-warning being more common in most codes). However,
> the warning is only shown once. I have set PRED_WARN_CALL's hitrate
> to  to 70 - but that's also an arbitrary value.
> 
> e) As in the trans_image_index: There one had:  "if (cond ||
> invalid_bound) {... } else {...}". The "invalid_bound" is marked as
> unlikely but PRED_* cannot be use as only part of the branch
> condition is unlikely. [I've removed the unlikely.]
> 
> 
> Does the patch look sensible? Does the addition of the PRED_* make
> sense? How about their values?
> 
> 
> Tobias

>  predict.def               |   16 +++++++
> 
>  fortran/trans-array.c     |   26 +++++++-----
>  fortran/trans-expr.c      |   15 ++++---
>  fortran/trans-intrinsic.c |    3 -
>  fortran/trans-io.c        |    1 
>  fortran/trans-stmt.c      |   26 ++++++++----
>  fortran/trans.c           |   97 +++++++++++++++++++++++-----------------------
>  fortran/trans.h           |    4 -
> 
>  8 files changed, 107 insertions(+), 81 deletions(-)
> 
> diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
> index 78b08d7..8f4db5a 100644
> --- a/gcc/fortran/trans-array.c
> +++ b/gcc/fortran/trans-array.c
> @@ -79,6 +79,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "system.h"
>  #include "coretypes.h"
>  #include "tree.h"
> +#include "predict.h"	/* For PRED_FAIL.  */
> @@ -5222,6 +5222,8 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
>    overflow = integer_zero_node;
>  
>    gfc_init_block (&set_descriptor_block);
> +  gfc_add_expr_to_block (&set_descriptor_block,
> +			 build_predict_expr (PRED_FAIL, TAKEN));
>    size = gfc_array_init_size (se->expr, ref->u.ar.as->rank,
>  			      ref->u.ar.as->corank, &offset, lower, upper,
>  			      &se->pre, &set_descriptor_block, &overflow,
> @@ -5248,6 +5250,8 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
>  	  stmtblock_t set_status_block;
>  
>  	  gfc_start_block (&set_status_block);
> +	  gfc_add_expr_to_block (&set_status_block,
> +				 build_predict_expr (PRED_FAIL, NOT_TAKEN));
>  	  gfc_add_modify (&set_status_block, status,
>  			  build_int_cst (status_type, LIBERROR_ALLOCATION));
>  	  error = gfc_finish_block (&set_status_block);

I am bit confused by the expansion, but in general it seems resonable.
You do not need to attach predict_expr on both sides of one branch.  I.e. it seems enough
to attach the NOT_TAKEN predict_expr to the failure path.

There is important different in between builtin_expect and predict_expr.
The first speaks about the value, while the second speaks about its current basic block.

I.e.

if (__builtin_expect (a,0))
  unlikely path
else
  likely path.

translates as
if (a)
  predict_expr not_taken
  unlikely path
else
  likely path.

But if you have something like

a=__builtin_expect (b?1:0,0)

and you produce

a=b?predict_expr not_taken, 0,0
...
if (a)
  unlikely path

We need to check how it goes down to gimple.  Most probably we will end up with:

if (b)
  predict_expr not_taken
  a=0
else
  a=1;
if (a)
  unlikely path

predict.c itself is not smart enough to work out that A in this case has
expected value.  I am not sure if it should be - this is a difficult case
where you already need to have branch predictions to derive predictions
on value ranges.  There is a paper on the topic, but overall it seems
bit expensive and with questionable benefits for production use.

We still get proper prediction if we manage to crossjump second condition.
Crossjumping currently happens only after IPA so it won't help the branch
prediction accuracy in general. Moreover we will not have

So in short if we want the predictions to work reliably, we need to be sure
we produce them in a way backend can reliably consume them.

I think we should
  1) make use of predict_expr where it makes sense (i.e. forntend know it is
     generating the fallback path by itself).  it is cheaper and also more safe
     about fact that we will actually take the hint.
     __builtin_expect handling has its limitations and there are cases where we
     simply end up ignoring it since its use is too complicated.
  2) if there are the other cases (i.e. fortran language allows the failure
     flag to be stored into user variable and handled by user later),
     I will add internal use only argument to predict_expr and extend its handling.
     Again it won't be 100% reliable, since one conditional can then be handled
     by multiple predict_exprs that leads to generally unsolvable problem on
     how to combine probabilities.
     But hopefully those cases will mostly be simple - i.e. user will have
     if (failure_flag)
       make horrible death in controlled way
     statement somewhre after the allcation.
     We are smart enough to work out simple variants like
     if (failure_flag1 || failure_flag2)
       make horrible death in controlled way
  3) make sure the runtime library calls are correctly annotated with noreturn/cold
     flags that may help back end to work out the cases it failed otherwise.

Let me know if you need me to make patch for 2).
otherwise the patch seems OK with changes bellow

> diff --git a/gcc/predict.def b/gcc/predict.def
> index 2ce135c..ca58b26 100644
> --- a/gcc/predict.def
> +++ b/gcc/predict.def
> @@ -133,3 +133,19 @@ DEF_PREDICTOR (PRED_HOT_LABEL, "hot label", HITRATE (85), 0)
>  /* Branches to cold labels are extremely unlikely.  */
>  DEF_PREDICTOR (PRED_COLD_LABEL, "cold label", PROB_VERY_LIKELY,
>  	       PRED_FLAG_FIRST_MATCH)
> +
> +
> +/* The following are used in Fortran. */
> +
> +/* Branch leading to a run-time warning message are less likely.  */
> +DEF_PREDICTOR (PRED_WARN_CALL, "warn call", HITRATE (70), 0)
> +
> +/* Branch leading to an failure status are unlikely.  */
> +DEF_PREDICTOR (PRED_FAIL, "fail", HITRATE(80), 0)
> +
> +/* Branch belonging to a zero-sized array or absent argument are unlikely.  */
> +DEF_PREDICTOR (PRED_ZERO, "zero", HITRATE(80), 0)
> +
> +/* Branch leading to an overflow are extremely unlikely.  */
> +DEF_PREDICTOR (PRED_OVERFLOW, "overflow", PROB_VERY_LIKELY,
> +	       PRED_FLAG_FIRST_MATCH)

I would even gor for PROB_ALWAYS here.  As observed in the testcase, we may have
_very_ many early exits from the function because it allocated buch of arrays.
It realy depends if you compute 1^50, 0.99^50 or 0.9^50

I would go for a bit more dscriptive names, like "fortran alloc overflow"/"fortran zero size alloc" etc.
Those names will hopefuly be more obvious for middle end developer like myself.

Honza


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]