This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Re: [PATCH] Fix up NaN and +-Inf handling and speed up m{in,ax}{loc,val} intrinsics (PR fortran/40643, PR fortran/31067, take 3)
Hi Jakub,
Jakub Jelinek wrote:
> Here is a new version.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> The changes to the generated files are attached bzip2ed, as they are really
> huge.
>
Thanks for your impressive patch!
+ if (__builtin_expect (n >= len, 0))
Can you use unlikely()/likely() instead? It makes it a bit more
readable, cf. libgfortran.h:
#define unlikely(x) __builtin_expect(!!(x), 0)
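
For illustration, a minimal sketch of how the flagged line might read with the macro. The `unlikely` definition is the one quoted from libgfortran.h above; the `likely` counterpart and the helper `first_at_least` are hypothetical and only there to make the example compile on its own with GCC or Clang.

```c
#include <assert.h>

/* unlikely() as quoted from libgfortran.h; likely() is the assumed
   counterpart, written the same way.  */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper, not part of the patch: return the index of the
   first element that is >= threshold, or -1 if there is none.  */
static long
first_at_least (const int *src, long len, int threshold)
{
  long n;

  assert (len >= 0);

  for (n = 0; n < len; n++)
    if (src[n] >= threshold)
      break;

  /* Reads as "this is the rare case" rather than as a bare
     __builtin_expect (n >= len, 0).  */
  if (unlikely (n >= len))
    return -1;

  return n;
}
```

The generated code should be the same either way; the macro only documents the intent at the call site.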
OK for the trunk.
Tobias
PS: Regarding min*/max*, and completely independent of this patch, one
could consider: (a) making use of
HONOR_NAN/HONOR_INF/HONOR_SIGNZERO/EXPR_MIN/EXPR_MAX for MIN/MAX, and (b)
inlining the M{IN,AX}{VAL,LOC} intrinsics for rank-1 array results when
compiling with -O<n>, where n is neither 0 nor "s".
> 2009-07-22 Jakub Jelinek <jakub@redhat.com>
>
> PR fortran/40643
> PR fortran/31067
> * trans-intrinsic.c (gfc_conv_intrinsic_minmaxloc,
> gfc_conv_intrinsic_minmaxval): Handle Infinities and NaNs properly,
> optimize.
> * trans-array.c (gfc_trans_scalarized_loop_end): No longer static.
> * trans-array.h (gfc_trans_scalarized_loop_end): New prototype.
>
> * libgfortran.h (GFC_REAL_4_INFINITY, GFC_REAL_8_INFINITY,
> GFC_REAL_10_INFINITY, GFC_REAL_16_INFINITY, GFC_REAL_4_QUIET_NAN,
> GFC_REAL_8_QUIET_NAN, GFC_REAL_10_QUIET_NAN, GFC_REAL_16_QUIET_NAN):
> Define.
> * m4/iparm.m4 (atype_inf, atype_nan): Define.
> * m4/ifunction.m4: Formatting.
> * m4/iforeach.m4: Likewise.
> (START_FOREACH_FUNCTION): Initialize dest to all 1s, not all 0s.
> (START_FOREACH_BLOCK, FINISH_FOREACH_FUNCTION,
> FINISH_MASKED_FOREACH_FUNCTION): Run foreach block inside a loop
> until count[0] == extent[0].
> * m4/minval.m4: Formatting. Handle NaNs and infinities. Optimize.
> * m4/maxval.m4: Likewise.
> * m4/minloc0.m4: Likewise.
> * m4/maxloc0.m4: Likewise.
> * m4/minloc1.m4: Likewise.
> * m4/maxloc1.m4: Likewise.
> * generated/maxloc0_16_i16.c: Regenerated.
> * generated/maxloc0_16_i1.c: Likewise.
> * generated/maxloc0_16_i2.c: Likewise.
> * generated/maxloc0_16_i4.c: Likewise.
> * generated/maxloc0_16_i8.c: Likewise.
> * generated/maxloc0_16_r10.c: Likewise.
> * generated/maxloc0_16_r16.c: Likewise.
> * generated/maxloc0_16_r4.c: Likewise.
> * generated/maxloc0_16_r8.c: Likewise.
> * generated/maxloc0_4_i16.c: Likewise.
> * generated/maxloc0_4_i1.c: Likewise.
> * generated/maxloc0_4_i2.c: Likewise.
> * generated/maxloc0_4_i4.c: Likewise.
> * generated/maxloc0_4_i8.c: Likewise.
> * generated/maxloc0_4_r10.c: Likewise.
> * generated/maxloc0_4_r16.c: Likewise.
> * generated/maxloc0_4_r4.c: Likewise.
> * generated/maxloc0_4_r8.c: Likewise.
> * generated/maxloc0_8_i16.c: Likewise.
> * generated/maxloc0_8_i1.c: Likewise.
> * generated/maxloc0_8_i2.c: Likewise.
> * generated/maxloc0_8_i4.c: Likewise.
> * generated/maxloc0_8_i8.c: Likewise.
> * generated/maxloc0_8_r10.c: Likewise.
> * generated/maxloc0_8_r16.c: Likewise.
> * generated/maxloc0_8_r4.c: Likewise.
> * generated/maxloc0_8_r8.c: Likewise.
> * generated/maxloc1_16_i16.c: Likewise.
> * generated/maxloc1_16_i1.c: Likewise.
> * generated/maxloc1_16_i2.c: Likewise.
> * generated/maxloc1_16_i4.c: Likewise.
> * generated/maxloc1_16_i8.c: Likewise.
> * generated/maxloc1_16_r10.c: Likewise.
> * generated/maxloc1_16_r16.c: Likewise.
> * generated/maxloc1_16_r4.c: Likewise.
> * generated/maxloc1_16_r8.c: Likewise.
> * generated/maxloc1_4_i16.c: Likewise.
> * generated/maxloc1_4_i1.c: Likewise.
> * generated/maxloc1_4_i2.c: Likewise.
> * generated/maxloc1_4_i4.c: Likewise.
> * generated/maxloc1_4_i8.c: Likewise.
> * generated/maxloc1_4_r10.c: Likewise.
> * generated/maxloc1_4_r16.c: Likewise.
> * generated/maxloc1_4_r4.c: Likewise.
> * generated/maxloc1_4_r8.c: Likewise.
> * generated/maxloc1_8_i16.c: Likewise.
> * generated/maxloc1_8_i1.c: Likewise.
> * generated/maxloc1_8_i2.c: Likewise.
> * generated/maxloc1_8_i4.c: Likewise.
> * generated/maxloc1_8_i8.c: Likewise.
> * generated/maxloc1_8_r10.c: Likewise.
> * generated/maxloc1_8_r16.c: Likewise.
> * generated/maxloc1_8_r4.c: Likewise.
> * generated/maxloc1_8_r8.c: Likewise.
> * generated/maxval_i16.c: Likewise.
> * generated/maxval_i1.c: Likewise.
> * generated/maxval_i2.c: Likewise.
> * generated/maxval_i4.c: Likewise.
> * generated/maxval_i8.c: Likewise.
> * generated/maxval_r10.c: Likewise.
> * generated/maxval_r16.c: Likewise.
> * generated/maxval_r4.c: Likewise.
> * generated/maxval_r8.c: Likewise.
> * generated/minloc0_16_i16.c: Likewise.
> * generated/minloc0_16_i1.c: Likewise.
> * generated/minloc0_16_i2.c: Likewise.
> * generated/minloc0_16_i4.c: Likewise.
> * generated/minloc0_16_i8.c: Likewise.
> * generated/minloc0_16_r10.c: Likewise.
> * generated/minloc0_16_r16.c: Likewise.
> * generated/minloc0_16_r4.c: Likewise.
> * generated/minloc0_16_r8.c: Likewise.
> * generated/minloc0_4_i16.c: Likewise.
> * generated/minloc0_4_i1.c: Likewise.
> * generated/minloc0_4_i2.c: Likewise.
> * generated/minloc0_4_i4.c: Likewise.
> * generated/minloc0_4_i8.c: Likewise.
> * generated/minloc0_4_r10.c: Likewise.
> * generated/minloc0_4_r16.c: Likewise.
> * generated/minloc0_4_r4.c: Likewise.
> * generated/minloc0_4_r8.c: Likewise.
> * generated/minloc0_8_i16.c: Likewise.
> * generated/minloc0_8_i1.c: Likewise.
> * generated/minloc0_8_i2.c: Likewise.
> * generated/minloc0_8_i4.c: Likewise.
> * generated/minloc0_8_i8.c: Likewise.
> * generated/minloc0_8_r10.c: Likewise.
> * generated/minloc0_8_r16.c: Likewise.
> * generated/minloc0_8_r4.c: Likewise.
> * generated/minloc0_8_r8.c: Likewise.
> * generated/minloc1_16_i16.c: Likewise.
> * generated/minloc1_16_i1.c: Likewise.
> * generated/minloc1_16_i2.c: Likewise.
> * generated/minloc1_16_i4.c: Likewise.
> * generated/minloc1_16_i8.c: Likewise.
> * generated/minloc1_16_r10.c: Likewise.
> * generated/minloc1_16_r16.c: Likewise.
> * generated/minloc1_16_r4.c: Likewise.
> * generated/minloc1_16_r8.c: Likewise.
> * generated/minloc1_4_i16.c: Likewise.
> * generated/minloc1_4_i1.c: Likewise.
> * generated/minloc1_4_i2.c: Likewise.
> * generated/minloc1_4_i4.c: Likewise.
> * generated/minloc1_4_i8.c: Likewise.
> * generated/minloc1_4_r10.c: Likewise.
> * generated/minloc1_4_r16.c: Likewise.
> * generated/minloc1_4_r4.c: Likewise.
> * generated/minloc1_4_r8.c: Likewise.
> * generated/minloc1_8_i16.c: Likewise.
> * generated/minloc1_8_i1.c: Likewise.
> * generated/minloc1_8_i2.c: Likewise.
> * generated/minloc1_8_i4.c: Likewise.
> * generated/minloc1_8_i8.c: Likewise.
> * generated/minloc1_8_r10.c: Likewise.
> * generated/minloc1_8_r16.c: Likewise.
> * generated/minloc1_8_r4.c: Likewise.
> * generated/minloc1_8_r8.c: Likewise.
> * generated/minval_i16.c: Likewise.
> * generated/minval_i1.c: Likewise.
> * generated/minval_i2.c: Likewise.
> * generated/minval_i4.c: Likewise.
> * generated/minval_i8.c: Likewise.
> * generated/minval_r10.c: Likewise.
> * generated/minval_r16.c: Likewise.
> * generated/minval_r4.c: Likewise.
> * generated/minval_r8.c: Likewise.
> * generated/product_c10.c: Likewise.
> * generated/product_c16.c: Likewise.
> * generated/product_c4.c: Likewise.
> * generated/product_c8.c: Likewise.
> * generated/product_i16.c: Likewise.
> * generated/product_i1.c: Likewise.
> * generated/product_i2.c: Likewise.
> * generated/product_i4.c: Likewise.
> * generated/product_i8.c: Likewise.
> * generated/product_r10.c: Likewise.
> * generated/product_r16.c: Likewise.
> * generated/product_r4.c: Likewise.
> * generated/product_r8.c: Likewise.
> * generated/sum_c10.c: Likewise.
> * generated/sum_c16.c: Likewise.
> * generated/sum_c4.c: Likewise.
> * generated/sum_c8.c: Likewise.
> * generated/sum_i16.c: Likewise.
> * generated/sum_i1.c: Likewise.
> * generated/sum_i2.c: Likewise.
> * generated/sum_i4.c: Likewise.
> * generated/sum_i8.c: Likewise.
> * generated/sum_r10.c: Likewise.
> * generated/sum_r16.c: Likewise.
> * generated/sum_r4.c: Likewise.
> * generated/sum_r8.c: Likewise.
>
> * gfortran.dg/maxlocval_2.f90: New test.
> * gfortran.dg/maxlocval_3.f90: New test.
> * gfortran.dg/maxlocval_4.f90: New test.
> * gfortran.dg/minlocval_1.f90: New test.
> * gfortran.dg/minlocval_2.f90: New test.
> * gfortran.dg/minlocval_3.f90: New test.
> * gfortran.dg/minlocval_4.f90: New test.
>
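
As a rough illustration of what "Handle NaNs and infinities" can mean for a MAXVAL-style reduction, here is a minimal two-phase sketch. It is not the code from the m4 templates; the function name `maxval_sketch` is hypothetical, and the semantics are assumptions: NaNs are skipped unless every element is NaN, and an empty array yields the most negative finite value, as Fortran specifies for zero-sized MAXVAL.

```c
#include <assert.h>
#include <float.h>
#include <math.h>

/* Hypothetical sketch, not the libgfortran routine.  */
static double
maxval_sketch (const double *src, long len)
{
  double result = -INFINITY;
  long n;

  assert (len >= 0);

  /* Assumed empty-array result: -HUGE(x) in Fortran terms.  */
  if (len == 0)
    return -DBL_MAX;

  /* Phase 1: skip leading NaNs.  "x >= -Inf" is false only for NaN,
     so the loop stops at the first ordinary value, -Inf included.  */
  for (n = 0; n < len; n++)
    if (src[n] >= result)
      break;

  /* Every element was NaN.  */
  if (n >= len)
    return NAN;

  /* Phase 2: plain maximum.  Later NaNs compare false and are
     ignored without an explicit isnan() test in the loop.  */
  for (; n < len; n++)
    if (src[n] > result)
      result = src[n];

  return result;
}
```

Starting from -Infinity rather than from the most negative finite value lets an array made up only of -Inf return -Inf. The comparisons rely on IEEE semantics, so the sketch assumes options such as -ffast-math are not in effect.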