This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.
Hi Cesar!
(At least several of) the issues that I pointed out (see below) have
never been fixed on gomp-4_0-branch, but the test cases have now been
merged from gomp-4_0-branch into trunk, so the regression (PASS -> FAIL
for libgomp.oacc-c-c++-common/reduction-2.c) as well as the other
"oddities" are now to be fixed in trunk. I re-assigned
<https://gcc.gnu.org/PR68242> from Nathan to Cesar. (I didn't verify
that the following list of items is conclusive/complete.)
On Fri, 18 Sep 2015 15:37:58 +0200, I wrote:
> Hi Cesar!
>
> On Fri, 17 Jul 2015 11:13:59 -0700, Cesar Philippidis <cesar@codesourcery.com> wrote:
> > This patch updates the libgomp OpenACC reduction test cases to check
> > worker, vector and combined gang worker vector reductions. I tried to
> > use some macros to simplify the C test cases a bit. I probably could
> > have made them more generic with an additional header file/macro, but
> > then that makes it too confusing to debug. The Fortran tests are a bit
> > of a lost cause, unless someone knows how to use the preprocessor with
> > !$acc loops.
>
> > --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-2.c
> > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-2.c
>
> > +static void
> > +test_reductions (void)
> > {
>
> > - [...]
> > + const int n = 100;
> > int i;
> > - [...]
> > + float array[n];
> >
> > for (i = 0; i < n; i++)
> > - [...]
> > + array[i] = i+1;
> >
> > - [...]
> > + /* Gang reductions. */
> > + check_reduction_op (float, +, 0, array[i], num_gangs (ng), gang);
> > + check_reduction_op (float, *, 1, array[i], num_gangs (ng), gang);
>
> I see this one reproducibly FAIL in the x86_64 -m32 multilib's
> host-fallback testing (there is no nvptx offloading for 32-bit
> configurations). (The -m32 multilib is configured/enabled by default, so
> fixing this is a prerequisite for trunk integration.) From a very quick
> glance, might it be that we're overflowing the float data type with the
> "1 * 2 * 3 * [...] * 100" computation? The OpenACC reduction computes
> "inf" which is then compared against a very high finite reference value
> -- or the other way round (I lost my debugging session). Instead of
> multiplying these "big" numbers, I guess we should just do a more
> idiomatic floating point computation?
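To make the overflow concrete, here is a minimal sketch (not part of the
patch; the function names float_product and bounded_product are
hypothetical). With n = 100 the product is 100!, roughly 9.3e157, while
FLT_MAX is roughly 3.4e38, so a float accumulator saturates to infinity
once the running product passes 34!. The second function is only one
possible "bounded" replacement, assumed here for illustration: its
factors telescope to n + 1.

```c
#include <assert.h>
#include <math.h>

/* Multiply 1 * 2 * ... * n in single precision, like the '*' reduction
   over array[i] = i+1.  The volatile qualifier forces every partial
   product to be rounded to float, even on targets that keep
   intermediates in wider registers.  */
float
float_product (int n)
{
  volatile float res = 1.0f;
  int i;

  assert (n >= 0);
  for (i = 1; i <= n; i++)
    res = res * (float) i;
  return res;
}

/* Hypothetical alternative operands: (1 + 1/1) * (1 + 1/2) * ... *
   (1 + 1/n) telescopes to n + 1, so the product stays small.  */
float
bounded_product (int n)
{
  volatile float res = 1.0f;
  int i;

  assert (n >= 0);
  for (i = 1; i <= n; i++)
    res = res * (1.0f + 1.0f / (float) i);
  return res;
}
```

Calling float_product (100) yields an infinity, whereas
bounded_product (100) stays close to 101 and remains comparable against
a reference value.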
>
> > --- a/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-4.c
> > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction-4.c
>
> > /* complex reductions. */
>
> > +static void
> > +test_reductions (void)
> > {
>
> > + double _Complex array[n];
> > +
> > + for (i = 0; i < n; i++)
> > + array[i] = i+1;
> > +
> > + /* Gang reductions. */
> > + check_reduction_op (double, +, 0, creal (array[i]), num_gangs (ng), gang);
>
> Given that in the check_reduction_op instantiations you're specifying a
> "double" data type (instead of "double _Complex", for example), and
> "creal (array[i])" reduction operands (instead of "array[i]", for
> example), we're not actually testing reductions with complex data types,
> so I guess that should be changed. :-)
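As a rough sketch of what reducing on the complex type itself could look
like (complex_sum is a hypothetical helper, not something from the
patch, and whether a given compiler accepts a double _Complex reduction
variable is an assumption to be verified): the accumulator is declared
double _Complex and the operand is array[i] rather than
creal (array[i]). Without OpenACC support enabled the pragmas are
ignored and the loop simply runs sequentially on the host.

```c
#include <assert.h>
#include <complex.h>

/* Sum n complex values, accumulating in a double _Complex variable so
   that both the real and the imaginary parts take part in the
   reduction.  */
double _Complex
complex_sum (const double _Complex *array, int n)
{
  double _Complex res = 0;
  int i;

  assert (n >= 0);
#pragma acc parallel copy (res) copyin (array[0:n])
#pragma acc loop gang reduction (+:res)
  for (i = 0; i < n; i++)
    res += array[i];
  return res;
}
```

For the test to be meaningful, the input data would also need non-zero
imaginary parts; with array[i] = i+1 the imaginary part is always zero.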
>
> > --- /dev/null
> > +++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/reduction.h
> > @@ -0,0 +1,43 @@
> > +#ifndef REDUCTION_H
> > +#define REDUCTION_H
> > +
> > +#define DO_PRAGMA(x) _Pragma (#x)
> > +
> > +#define check_reduction_op(type, op, init, b, gwv_par, gwv_loop) \
> > + { \
> > + type res, vres; \
> > + res = (init); \
> > +DO_PRAGMA (acc parallel gwv_par copy (res)) \
> > +DO_PRAGMA (acc loop gwv_loop reduction (op:res)) \
> > + for (i = 0; i < n; i++) \
> > + res = res op (b); \
> > + \
> > + vres = (init); \
> > + for (i = 0; i < n; i++) \
> > + vres = vres op (b); \
> > + \
> > + if (res != vres) \
> > + abort (); \
> > + }
>
> It's the right thing for integer data types, but for anything floating
> point, we should be allowing for some small difference (epsilon) between
> res and vres, due to rounding differences in the OpenACC reduction
> (possibly offloaded) and reference value computation, and similar.
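A minimal sketch of such a tolerance check, assuming a relative epsilon
is acceptable (the name close_enough and the choice of tolerance are
hypothetical, not taken from the patch):

```c
#include <assert.h>
#include <math.h>

/* Return nonzero if a and b agree to within the relative tolerance
   rel_tol, scaled by the larger of the two magnitudes.  Exact equality
   is tested first so that equal infinities and zeros compare equal; a
   NaN or a lone infinity never matches.  */
int
close_enough (double a, double b, double rel_tol)
{
  double scale;

  assert (rel_tol >= 0.0);
  if (a == b)
    return 1;
  if (isinf (a) || isinf (b))
    return 0;
  scale = fabs (a) > fabs (b) ? fabs (a) : fabs (b);
  return fabs (a - b) <= rel_tol * scale;
}
```

In the floating point instantiations, "if (res != vres)" could then
become something like "if (!close_enough (res, vres, 1e-6))", while the
integer instantiations keep the exact comparison.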
>
> > +#define check_reduction_macro(type, op, init, b, gwv_par, gwv_loop) \
> > + { \
> > + type res, vres; \
> > + res = (init); \
> > + DO_PRAGMA (acc parallel gwv_par copy(res)) \
> > +DO_PRAGMA (acc loop gwv_loop reduction (op:res)) \
> > + for (i = 0; i < n; i++) \
> > + res = op (res, (b)); \
> > + \
> > + vres = (init); \
> > + for (i = 0; i < n; i++) \
> > + vres = op (vres, (b)); \
> > + \
> > + if (res != vres) \
> > + abort (); \
> > + }
>
> Likewise.
>
> > +#define max(a, b) (((a) > (b)) ? (a) : (b))
> > +#define min(a, b) (((a) < (b)) ? (a) : (b))
> > +
> > +#endif
>
> > --- a/libgomp/testsuite/libgomp.oacc-fortran/reduction-4.f90
> > +++ b/libgomp/testsuite/libgomp.oacc-fortran/reduction-4.f90
> > @@ -5,50 +5,108 @@
> > program reduction_4
> > implicit none
> >
> > - integer, parameter :: n = 10, gangs = 20
> > + integer, parameter :: n = 10, ng = 8, nw = 4, vl = 32
> > integer :: i
> > - complex :: vresult, result
> > + real :: vresult, rg, rw, rv, rc
> > complex, dimension (n) :: array
>
> Same problem as in the C test case: not actually testing complex data
> types:
>
> > do i = 1, n
> > array(i) = i
> > end do
> >
> > -[...]
> > + !
> > + ! '+' reductions
> > + !
> > +
> > + rg = 0
> > + rw = 0
> > + rv = 0
> > + rc = 0
> > vresult = 0
> >
> > -[...]
> > + !$acc parallel num_gangs(ng) copy(rg)
> > + !$acc loop reduction(+:rg) gang
> > + do i = 1, n
> > + rg = rg + REAL(array(i))
> > + end do
> > + !$acc end parallel
Grüße
Thomas