This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [Patch,Fortran] PR 33197 Add NORM2 and PARITY

Hi Tobias,

Tobias Burnus wrote:
This patch adds NORM2 (= sqrt(sum(array**2))) and PARITY(logical_array) - and thus the last F2008 math intrinsics, which means that PR 33197 can be closed afterwards. (Though, there are still lots of bits-related intrinsics missing.)

All algorithms are implemented three times: (a) in simplify, (b) in trans-intrinsics (for scalar results, i.e. rank-1 arrays or if no dim= has been given), and (c) in libgfortran.

Please check when reading/testing the patch that it works correctly for zero-sized arrays, size-1 arrays and (for NORM2) arrays with values > sqrt(huge(a)).

For norm2 (L2 norm), one essentially does:
tmp = max(array) ! = L_infinity norm
norm2 = tmp * sqrt( (array/tmp)**2)
to avoid an overflow. However, it is implemented using a single pass. For NORM2, an algorithm based on the one in BLAS is used though I have realized that I start with SCALE == 1, RESULT = 0 while BLAS uses the opposite starting values. Both seem to lead to the same result, thus, I do not know which one is better. Quickly thinking about it, mine seems to be better (faster + more accurate) for small values (<= 1.0), but I might be wrong. Cf.

For simplify, NORM2 does not divide by max(array) as seemingly no overflow (truncation) occurs until the norm2 calculation has been completed. Thus, lazy as I was, I did not change that part to the more complicated algorithm.

Build and regtested on x86-64-linux.
OK for the trunk?

In general ok, but:

+ resolve_mask_arg (array); /* FIXME: Does this make sense? */

I think this is ok.

-    gfc_init_block (&block);

+    gfc_init_block (&block);
   /* Do the actual summation/product.  */

I'd rather have the blank line *after* than *before* the gfc_init_block.

+      absX = fold_build1 (ABS_EXPR, type, arrayse.expr);
+      cond = fold_build2 (GT_EXPR, boolean_type_node, absX, scale);
+      gfc_add_expr_to_block (&block, tmp);
+      res1 = fold_build2 (MULT_EXPR, type, scale, scale);
+      res1 = fold_build2 (RDIV_EXPR, type, res1,

Are you sure you want the gfc_add_expr_to_block here? tmp is only used last a long way above... What does this do?

+Calculates the Euclidean vector norm (@math{L_2} norm of

This seems to miss a closing parenthesis.

And most importantly:

+	     if (absX > scale)
+	       {
+		 result = 1.0 + result * (scale*scale) / (absX * absX);
+		 scale = absX;
+	       }
+	     else
+	       result += absX * absX / (scale*scale)

I think you should not use (scale*scale)/(absX*absX) here (or the other way round) but rather (scale/absX)**2. This may lead to an overflow. See the test:

  real(10) :: a(2)
  real(10) :: val

val = sqrt(huge(1.0_10)) * 2.0_10

  a = (/ val, val /)
  print *, norm2(a) / val / sqrt(2.0_10)

This prints NaN with your patch but the expected 1.00000... when the calculation is done as suggested above. Note that I had to use KIND=10 here because probably otherwise the internal calculation is done with "higher presicion" than the values (on my system). The same is true I guess for the library version.


Done:  Arc-Bar-Cav-Kni-Ran-Rog-Sam-Tou-Val-Wiz
To go: Hea-Mon-Pri

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]