This is the mail archive of the
libstdc++@gcc.gnu.org
mailing list for the libstdc++ project.
Re: [Patch] Simple __gslice_to_index tweak
- From: Paolo Carlini <pcarlini at suse dot de>
- To: Paolo Carlini <pcarlini at suse dot de>
- Cc: libstdc++ <libstdc++ at gcc dot gnu dot org>, Gabriel Dos Reis <gdr at integrable-solutions dot net>
- Date: Fri, 28 Jul 2006 22:26:31 +0200
- Subject: Re: [Patch] Simple __gslice_to_index tweak
- References: <44CA0E5B.5030103@suse.de>
Hi again,
base:
30.973u 8.068s 0:39.14 99.7% 0+0k 0+0io 0pf+0w
peak:
23.053u 7.856s 0:30.91 99.9% 0+0k 0+0io 0pf+0w
Actually, we can do quite a bit better with a more invasive change,
which, however, does not increase the complexity of the resulting code:
new peak
12.364u 7.732s 0:20.21 99.4% 0+0k 0+0io 0pf+0w
Passes the testsuite and I'm doing some additional tests...
Paolo.
////////////////
2006-07-28 Paolo Carlini <pcarlini@suse.de>
* src/valarray-inst.cc (__gslice_to_index): Do not compute
__a from scratch for each __j, instead compute the variation
from __j to __j + 1.
Index: src/valarray-inst.cc
===================================================================
--- src/valarray-inst.cc (revision 115791)
+++ src/valarray-inst.cc (working copy)
@@ -1,6 +1,7 @@
// Explicit instantiation file.
-// Copyright (C) 2001, 2004, 2005 Free Software Foundation, Inc.
+// Copyright (C) 2001, 2002, 2003, 2004, 2005, 2006
+// Free Software Foundation, Inc.
//
// This file is part of the GNU ISO C++ Library. This library is free
// software; you can redistribute it and/or modify it under the
@@ -81,28 +82,27 @@
// in __l which describes the multidimensional sizes of the
// the generalized slice.
const size_t __z = __i.size();
-
+
+ size_t __a = __o;
for (size_t __j = 0; __j < __z; ++__j)
{
- // Compute the linear-index image of (t_0, ... t_{n-1}).
- // Normaly, we should use inner_product<>(), but we do it the
- // the hard way here to avoid link-time can of worms.
- size_t __a = __o;
- for (size_t __k = 0; __k < __n; ++__k)
- __a += __s[__k] * __t[__k];
+ // Compute the linear-index image of (t_0, ... t_{n-1}).
+ // Done differentially for efficiency reasons, by computing
+ // the change going from __j to __j + 1.
+ __i[__j] = __a;
- __i[__j] = __a;
+ ++__t[__n - 1];
+ __a += __s[__n - 1];
// Process the next multi-index. The loop ought to be
- // backward since we're making a lexicagraphical visit.
- ++__t[__n - 1];
- for (size_t __k2 = __n - 1; __k2; --__k2)
+ // backward since we're making a lexicographical visit.
+ for (size_t __k2 = __n - 1; __k2 && __t[__k2] >= __l[__k2]; --__k2)
{
- if (__t[__k2] >= __l[__k2])
- {
- __t[__k2] = 0;
- ++__t[__k2 - 1];
- }
+ __a -= __s[__k2] * __t[__k2];
+ __t[__k2] = 0;
+
+ ++__t[__k2 - 1];
+ __a += __s[__k2 - 1];
}
}
}
Index: testsuite/util/regression/rand/assoc/rand_regression_test.hpp
===================================================================
--- testsuite/util/regression/rand/assoc/rand_regression_test.hpp (revision 115791)
+++ testsuite/util/regression/rand/assoc/rand_regression_test.hpp (working copy)
@@ -111,7 +111,7 @@
// Sane defaults.
size_t n = iter;
size_t m = keys;
- size_t sd = 0; // 0 = time-determined arbitrary
+ size_t sd = 1153519516; // 0 = time-determined arbitrary
double tp = 0.2;
double ip = 0.6;
double ep = 0.2;