This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Fix up NaN and +-Inf handling and speed up m{in,ax}{loc,val} intrinsics (PR fortran/40643, PR fortran/31067, take 3)


Hi Jakub,

Jakub Jelinek wrote:
> Here is a new version.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> The changes to the generated files are attached bzip2ed, as they are really
> huge.
>   
Thanks for your impressive patch!

+           if (__builtin_expect (n >= len, 0))

Can you use unlikely()/likely() instead? It is a bit more readable;
cf. libgfortran.h:
  #define unlikely(x)     __builtin_expect(!!(x), 0)
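
To illustrate the readability point, here is a minimal sketch of such a check written with the macro. The function maxval_sketch, its signature, and its treatment of an empty array are hypothetical and chosen only for this example; they are not taken from the patch. The macros have the same shape as the libgfortran.h definition quoted above and rely on the GCC/Clang builtin.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Same shape as the libgfortran.h macros; needs GCC or Clang.  */
#define likely(x)       __builtin_expect(!!(x), 1)
#define unlikely(x)     __builtin_expect(!!(x), 0)

/* Hypothetical helper: largest non-NaN element of src[0..len-1].
   Returns NaN if every element is NaN and, as a choice made only
   for this sketch, -Infinity if the array is empty.  */
static float
maxval_sketch (const float *src, size_t len)
{
  float result = -INFINITY;
  size_t n;

  assert (src != NULL || len == 0);

  /* Skip leading NaNs: any ordered comparison with a NaN is false,
     so the loop stops at the first element that is not a NaN.  */
  for (n = 0; n < len; n++)
    if (src[n] >= result)
      break;

  /* Rare case: the array is empty or holds nothing but NaNs.  */
  if (unlikely (n >= len))
    return len == 0 ? -INFINITY : NAN;

  /* Plain maximum search; later NaNs never compare greater.  */
  for (; n < len; n++)
    if (src[n] > result)
      result = src[n];

  return result;
}
```

For example, calling maxval_sketch on { NaN, 1, 3, NaN, 2 } with len = 5 yields 3, while an array holding only NaNs yields NaN. The branch hint does not change the result; it only tells the compiler which path to lay out as the fall-through.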

OK for the trunk.

Tobias

PS: Regarding min*/max*, and completely independent of this patch, one
could consider (a) making use of
HONOR_NANS/HONOR_INFINITIES/HONOR_SIGNED_ZEROS and MIN_EXPR/MAX_EXPR for
MIN/MAX, and (b) inlining the M{IN,AX}{VAL,LOC} intrinsics for rank-1
array results when compiling with -O<n>, where n > 0 and n =/= "s".

> 2009-07-22  Jakub Jelinek  <jakub@redhat.com>
>
> 	PR fortran/40643
> 	PR fortran/31067
> 	* trans-intrinsic.c (gfc_conv_intrinsic_minmaxloc,
> 	gfc_conv_intrinsic_minmaxval): Handle Infinities and NaNs properly,
> 	optimize.
> 	* trans-array.c (gfc_trans_scalarized_loop_end): No longer static.
> 	* trans-array.h (gfc_trans_scalarized_loop_end): New prototype.
>
> 	* libgfortran.h (GFC_REAL_4_INFINITY, GFC_REAL_8_INFINITY,
> 	GFC_REAL_10_INFINITY, GFC_REAL_16_INFINITY, GFC_REAL_4_QUIET_NAN,
> 	GFC_REAL_8_QUIET_NAN, GFC_REAL_10_QUIET_NAN, GFC_REAL_16_QUIET_NAN):
> 	Define.
> 	* m4/iparm.m4 (atype_inf, atype_nan): Define.
> 	* m4/ifunction.m4: Formatting.
> 	* m4/iforeach.m4: Likewise.
> 	(START_FOREACH_FUNCTION): Initialize dest to all 1s, not all 0s.
> 	(START_FOREACH_BLOCK, FINISH_FOREACH_FUNCTION,
> 	FINISH_MASKED_FOREACH_FUNCTION): Run foreach block inside a loop
> 	until count[0] == extent[0].
> 	* m4/minval.m4: Formatting.  Handle NaNs and infinities.  Optimize.
> 	* m4/maxval.m4: Likewise.
> 	* m4/minloc0.m4: Likewise.
> 	* m4/maxloc0.m4: Likewise.
> 	* m4/minloc1.m4: Likewise.
> 	* m4/maxloc1.m4: Likewise.
> 	* generated/maxloc0_16_i16.c: Regenerated.
> 	* generated/maxloc0_16_i1.c: Likewise.
> 	* generated/maxloc0_16_i2.c: Likewise.
> 	* generated/maxloc0_16_i4.c: Likewise.
> 	* generated/maxloc0_16_i8.c: Likewise.
> 	* generated/maxloc0_16_r10.c: Likewise.
> 	* generated/maxloc0_16_r16.c: Likewise.
> 	* generated/maxloc0_16_r4.c: Likewise.
> 	* generated/maxloc0_16_r8.c: Likewise.
> 	* generated/maxloc0_4_i16.c: Likewise.
> 	* generated/maxloc0_4_i1.c: Likewise.
> 	* generated/maxloc0_4_i2.c: Likewise.
> 	* generated/maxloc0_4_i4.c: Likewise.
> 	* generated/maxloc0_4_i8.c: Likewise.
> 	* generated/maxloc0_4_r10.c: Likewise.
> 	* generated/maxloc0_4_r16.c: Likewise.
> 	* generated/maxloc0_4_r4.c: Likewise.
> 	* generated/maxloc0_4_r8.c: Likewise.
> 	* generated/maxloc0_8_i16.c: Likewise.
> 	* generated/maxloc0_8_i1.c: Likewise.
> 	* generated/maxloc0_8_i2.c: Likewise.
> 	* generated/maxloc0_8_i4.c: Likewise.
> 	* generated/maxloc0_8_i8.c: Likewise.
> 	* generated/maxloc0_8_r10.c: Likewise.
> 	* generated/maxloc0_8_r16.c: Likewise.
> 	* generated/maxloc0_8_r4.c: Likewise.
> 	* generated/maxloc0_8_r8.c: Likewise.
> 	* generated/maxloc1_16_i16.c: Likewise.
> 	* generated/maxloc1_16_i1.c: Likewise.
> 	* generated/maxloc1_16_i2.c: Likewise.
> 	* generated/maxloc1_16_i4.c: Likewise.
> 	* generated/maxloc1_16_i8.c: Likewise.
> 	* generated/maxloc1_16_r10.c: Likewise.
> 	* generated/maxloc1_16_r16.c: Likewise.
> 	* generated/maxloc1_16_r4.c: Likewise.
> 	* generated/maxloc1_16_r8.c: Likewise.
> 	* generated/maxloc1_4_i16.c: Likewise.
> 	* generated/maxloc1_4_i1.c: Likewise.
> 	* generated/maxloc1_4_i2.c: Likewise.
> 	* generated/maxloc1_4_i4.c: Likewise.
> 	* generated/maxloc1_4_i8.c: Likewise.
> 	* generated/maxloc1_4_r10.c: Likewise.
> 	* generated/maxloc1_4_r16.c: Likewise.
> 	* generated/maxloc1_4_r4.c: Likewise.
> 	* generated/maxloc1_4_r8.c: Likewise.
> 	* generated/maxloc1_8_i16.c: Likewise.
> 	* generated/maxloc1_8_i1.c: Likewise.
> 	* generated/maxloc1_8_i2.c: Likewise.
> 	* generated/maxloc1_8_i4.c: Likewise.
> 	* generated/maxloc1_8_i8.c: Likewise.
> 	* generated/maxloc1_8_r10.c: Likewise.
> 	* generated/maxloc1_8_r16.c: Likewise.
> 	* generated/maxloc1_8_r4.c: Likewise.
> 	* generated/maxloc1_8_r8.c: Likewise.
> 	* generated/maxval_i16.c: Likewise.
> 	* generated/maxval_i1.c: Likewise.
> 	* generated/maxval_i2.c: Likewise.
> 	* generated/maxval_i4.c: Likewise.
> 	* generated/maxval_i8.c: Likewise.
> 	* generated/maxval_r10.c: Likewise.
> 	* generated/maxval_r16.c: Likewise.
> 	* generated/maxval_r4.c: Likewise.
> 	* generated/maxval_r8.c: Likewise.
> 	* generated/minloc0_16_i16.c: Likewise.
> 	* generated/minloc0_16_i1.c: Likewise.
> 	* generated/minloc0_16_i2.c: Likewise.
> 	* generated/minloc0_16_i4.c: Likewise.
> 	* generated/minloc0_16_i8.c: Likewise.
> 	* generated/minloc0_16_r10.c: Likewise.
> 	* generated/minloc0_16_r16.c: Likewise.
> 	* generated/minloc0_16_r4.c: Likewise.
> 	* generated/minloc0_16_r8.c: Likewise.
> 	* generated/minloc0_4_i16.c: Likewise.
> 	* generated/minloc0_4_i1.c: Likewise.
> 	* generated/minloc0_4_i2.c: Likewise.
> 	* generated/minloc0_4_i4.c: Likewise.
> 	* generated/minloc0_4_i8.c: Likewise.
> 	* generated/minloc0_4_r10.c: Likewise.
> 	* generated/minloc0_4_r16.c: Likewise.
> 	* generated/minloc0_4_r4.c: Likewise.
> 	* generated/minloc0_4_r8.c: Likewise.
> 	* generated/minloc0_8_i16.c: Likewise.
> 	* generated/minloc0_8_i1.c: Likewise.
> 	* generated/minloc0_8_i2.c: Likewise.
> 	* generated/minloc0_8_i4.c: Likewise.
> 	* generated/minloc0_8_i8.c: Likewise.
> 	* generated/minloc0_8_r10.c: Likewise.
> 	* generated/minloc0_8_r16.c: Likewise.
> 	* generated/minloc0_8_r4.c: Likewise.
> 	* generated/minloc0_8_r8.c: Likewise.
> 	* generated/minloc1_16_i16.c: Likewise.
> 	* generated/minloc1_16_i1.c: Likewise.
> 	* generated/minloc1_16_i2.c: Likewise.
> 	* generated/minloc1_16_i4.c: Likewise.
> 	* generated/minloc1_16_i8.c: Likewise.
> 	* generated/minloc1_16_r10.c: Likewise.
> 	* generated/minloc1_16_r16.c: Likewise.
> 	* generated/minloc1_16_r4.c: Likewise.
> 	* generated/minloc1_16_r8.c: Likewise.
> 	* generated/minloc1_4_i16.c: Likewise.
> 	* generated/minloc1_4_i1.c: Likewise.
> 	* generated/minloc1_4_i2.c: Likewise.
> 	* generated/minloc1_4_i4.c: Likewise.
> 	* generated/minloc1_4_i8.c: Likewise.
> 	* generated/minloc1_4_r10.c: Likewise.
> 	* generated/minloc1_4_r16.c: Likewise.
> 	* generated/minloc1_4_r4.c: Likewise.
> 	* generated/minloc1_4_r8.c: Likewise.
> 	* generated/minloc1_8_i16.c: Likewise.
> 	* generated/minloc1_8_i1.c: Likewise.
> 	* generated/minloc1_8_i2.c: Likewise.
> 	* generated/minloc1_8_i4.c: Likewise.
> 	* generated/minloc1_8_i8.c: Likewise.
> 	* generated/minloc1_8_r10.c: Likewise.
> 	* generated/minloc1_8_r16.c: Likewise.
> 	* generated/minloc1_8_r4.c: Likewise.
> 	* generated/minloc1_8_r8.c: Likewise.
> 	* generated/minval_i16.c: Likewise.
> 	* generated/minval_i1.c: Likewise.
> 	* generated/minval_i2.c: Likewise.
> 	* generated/minval_i4.c: Likewise.
> 	* generated/minval_i8.c: Likewise.
> 	* generated/minval_r10.c: Likewise.
> 	* generated/minval_r16.c: Likewise.
> 	* generated/minval_r4.c: Likewise.
> 	* generated/minval_r8.c: Likewise.
> 	* generated/product_c10.c: Likewise.
> 	* generated/product_c16.c: Likewise.
> 	* generated/product_c4.c: Likewise.
> 	* generated/product_c8.c: Likewise.
> 	* generated/product_i16.c: Likewise.
> 	* generated/product_i1.c: Likewise.
> 	* generated/product_i2.c: Likewise.
> 	* generated/product_i4.c: Likewise.
> 	* generated/product_i8.c: Likewise.
> 	* generated/product_r10.c: Likewise.
> 	* generated/product_r16.c: Likewise.
> 	* generated/product_r4.c: Likewise.
> 	* generated/product_r8.c: Likewise.
> 	* generated/sum_c10.c: Likewise.
> 	* generated/sum_c16.c: Likewise.
> 	* generated/sum_c4.c: Likewise.
> 	* generated/sum_c8.c: Likewise.
> 	* generated/sum_i16.c: Likewise.
> 	* generated/sum_i1.c: Likewise.
> 	* generated/sum_i2.c: Likewise.
> 	* generated/sum_i4.c: Likewise.
> 	* generated/sum_i8.c: Likewise.
> 	* generated/sum_r10.c: Likewise.
> 	* generated/sum_r16.c: Likewise.
> 	* generated/sum_r4.c: Likewise.
> 	* generated/sum_r8.c: Likewise.
>
> 	* gfortran.dg/maxlocval_2.f90: New test.
> 	* gfortran.dg/maxlocval_3.f90: New test.
> 	* gfortran.dg/maxlocval_4.f90: New test.
> 	* gfortran.dg/minlocval_1.f90: New test.
> 	* gfortran.dg/minlocval_2.f90: New test.
> 	* gfortran.dg/minlocval_3.f90: New test.
> 	* gfortran.dg/minlocval_4.f90: New test.
>   

