RFC: fp printing speedup patch
Jerry Quinn
jlquinn@optonline.net
Mon Nov 10 06:06:00 GMT 2003
Hi, Paolo. Thanks for responding...
OK, a bunch of good issues raised. First off, I'm not that sure about
the algorithm used, but this is essentially the code from v2. It's
also used in a slightly different form in libgcj.
Paolo Carlini writes:
> Hi Jerry. I'm just back and still trying to catch up...
>
> Anyway, I have already printed the patch and was hoping to return
> to you tomorrow or at most on monday.
>
> Some things I can tell you immediately:
>
> 1- For something so delicate (it affects the output of *most*
> programs!) we definitely must be *very* careful! You should
> post *many* details about the tests you have done, perhaps
> many actual testcases too, for instance something like a loop
> generating random floats and comparing the output of the
> library to that of plain "C" printfs (using strings, of course
> for that).
My comparison program just beats on the routine, comparing the output
for random doubles against sprintf output. I've run 5 million examples, as
well as consecutive numbers, and everything matches. I did it for %f,
%.0f, %e, %.0e, %.60e, %g, %.0g, and %.60g. Actually, I ran the
precision-60 tests before integrating. The C++ routine clips the precision,
so I then ran %e and %g at numeric_limits<double>::digits10 + 2.
How much testing is sufficient? Do we need to add a huge test to the
testsuite? The test program I used is at the end of this message.
I just found an fp testsuite from the gcc 2.95 libio and am working
now to convert and validate against it.
> 2- Some more numbers about performance: for sure it's a big
> improvement (thanks again) but which is still the gap wrt v2
> for some interesting examples (big number of digits/small
> number of digits)
For the record, I get the following on your posted test:
             real        user        sys
2.95         0m4.383s    0m4.380s    0m0.000s
3.3.2        0m14.521s   0m14.310s   0m0.010s
3.4+fppatch  0m7.026s    0m7.010s    0m0.000s
As best I can tell, trying to print 15 or more digits of precision
causes the dtoa implementation to fall back to big ints. There is a
further optimization for powers of 2.
The big int implementation seems to be very slow in comparison to the
mp lib used in glibc. I used the floatconv code for three reasons:
dragging in glibc's implementation seemed like even more work, I didn't
know how to sort out the licensing differences (although I assume
that could be worked out between two GNU projects), and I wanted the
speed of the native fp optimizations in floatconv.
At one point, I had some code to bail out of floatconv when the fast
conversions weren't used and fall back to sprintf instead. That fixes the
speed issue for high-precision printing, but lands us back with
__convert_from_v. I don't have a big problem with trashing the bigint
implementation in floatconv, but I don't know if I have the time or
stamina to import the mp lib in glibc.
> 3- Due to 1-, above, I (we) would appreciate if you could outline
> a bit the algorithms used, the role of the dtoa thing, and so on.
I think others can probably do a much better job here. I did a few
things:
- convert to c++
- remove the char to float code
- split out the chunk of dtoa that determines the buffer size so I
could fix the buffer handling
- fix dtoa to use an external buffer
- move the big int code into a class
- wrap up the fp stuff into a class rather than the mess of defines
- add long double support
- remove and clean as much code as I could
floatconv provides dtoa, which is a routine that returns ecvt- and
fcvt-style output. The remainder of the patch cleans up this output
to be what we expect from the C++ library.
> If you want, I have time to help soon for 1- and also, I hope, for the
> various configury issues, but I have to really allocate some time to
> this in the next days.
I welcome testing help, and I especially welcome help on the configury
stuff. I started to look at it, but got bogged down really quickly.
> Paolo.
>
> P.S. Later on, we should of course deal with the other uses of
> __convert_from_v, since the final goal is removing it completely.
Definitely! Once this code is in place, the money facet should be
pretty easy to fix up.
Jerry
#include <stdio.h>   // sprintf, printf, puts
#include <stdlib.h>  // mrand48 (POSIX)
#include <sstream>
#include <iostream>
#include <iomanip>
#include <limits>
int main(void)
{
// Can't test 60 since libstdc++ clips the precision - but 60 works without that clipping
const int __max_digits = std::numeric_limits<double>::digits10 + 2;
// int ndig = 60;
int ndig = __max_digits;
char buf[25600];
char buf2[25600];
union fpstruct {
double d;
int x[2]; // two 32-bit halves of the double; long is 64 bits on LP64 targets and would not overlay it
long long y;
} fd;
bool dorand=true;
bool outcpp = false;
bool dosprintf=true;
bool outsprintf=false;
bool doprintf=false;
bool cmpall=true;
for (int i=0; i<5000000; i++) {
// Tends to make lots of extreme numbers. dtoa is slow here. mp
// math can be improved. for double, use long double math if it
// gets fixed.
if (dorand) {
fd.x[0] = mrand48();
fd.x[1] = mrand48();
}
// "Reasonable" numbers are efficient in dtoa. prob more realistic
else
fd.d = i;
if (outcpp)
std::cout << std::setprecision(6) << std::fixed << fd.d << "\n";
std::ostringstream os;
os << std::setprecision(6) << std::fixed << fd.d;
if (dosprintf) sprintf(buf2, "%f",fd.d);
if (outsprintf) std::cout << buf2 << "\n";
if (cmpall && os.str() != buf2) {
printf("mismatch mode f\n");
printf("fd.x[0]=0x%x; fd.x[1]=0x%x;\n", fd.x[0], fd.x[1]);
printf("dtoa %s\n", os.str().c_str());
printf("printf %s\n", buf2);
}
os.str("");
if (outcpp)
std::cout << std::setprecision(0) << std::fixed << fd.d << "\n";
os << std::setprecision(0) << std::fixed << fd.d;
if (dosprintf) sprintf(buf2, "%.0f",fd.d);
if (outsprintf) std::cout << buf2<<"\n";
if (cmpall && os.str() != buf2) {
printf("mismatch mode f.0\n");
printf("fd.x[0]=0x%x; fd.x[1]=0x%x;\n", fd.x[0], fd.x[1]);
printf("dtoa %s\n", os.str().c_str());
printf("printf %s\n", buf2);
}
os.str("");
if (outcpp)
std::cout << std::setprecision(6) << std::scientific << fd.d << "\n";
os << std::setprecision(6) << std::scientific << fd.d;
if (dosprintf) sprintf(buf2, "%e",fd.d);
if (outsprintf) std::cout << buf2 << "\n";
if (cmpall && os.str() != buf2) {
printf("mismatch mode e\n");
printf("fd.x[0]=0x%x; fd.x[1]=0x%x;\n", fd.x[0], fd.x[1]);
printf("dtoa %s\n", os.str().c_str());
printf("printf %s\n", buf2);
}
os.str("");
if (outcpp)
std::cout << std::scientific << std::setprecision(0) << fd.d << "\n";
os << std::scientific << std::setprecision(0) << fd.d;
if (dosprintf) sprintf(buf2, "%.0e",fd.d);
if (outsprintf) std::cout << buf2 << "\n";
if (cmpall && os.str() != buf2) {
printf("mismatch mode e.0\n");
printf("fd.x[0]=0x%x; fd.x[1]=0x%x;\n", fd.x[0], fd.x[1]);
printf("dtoa %s\n", os.str().c_str());
printf("printf %s\n", buf2);
}
os.str("");
if (outcpp)
std::cout << std::setprecision(ndig) << std::scientific << fd.d << "\n";
os << std::setprecision(ndig) << std::scientific << fd.d;
if (dosprintf) sprintf(buf2, "%.*e",ndig,fd.d);
if (outsprintf) std::cout << buf2 << "\n";
if (cmpall && os.str() != buf2) {
printf("mismatch mode e.60\n");
printf("fd.x[0]=0x%x; fd.x[1]=0x%x;\n", fd.x[0], fd.x[1]);
printf("dtoa %s\n", os.str().c_str());
printf("printf %s\n", buf2);
}
os.str("");
if (outcpp) {
std::cout.unsetf(std::ios_base::scientific);
std::cout << std::setprecision(6) << fd.d << "\n";
}
os.unsetf(std::ios_base::scientific);
os << std::setprecision(6) << fd.d;
if (dosprintf) sprintf(buf2, "%g",fd.d);
if (outsprintf) std::cout << buf2 << "\n";
if (cmpall && os.str() != buf2) {
printf("mismatch mode g\n");
printf("fd.x[0]=0x%x; fd.x[1]=0x%x;\n", fd.x[0], fd.x[1]);
printf("dtoa %s\n", os.str().c_str());
printf("printf %s\n", buf2);
}
os.str("");
if (outcpp)
std::cout << std::setprecision(0) << fd.d << "\n";
os << std::setprecision(0) << fd.d;
if (dosprintf) sprintf(buf2, "%.0g",fd.d);
if (outsprintf) std::cout << buf2 << "\n";
if (cmpall && os.str() != buf2) {
printf("mismatch mode g.0\n");
printf("fd.x[0]=0x%x; fd.x[1]=0x%x;\n", fd.x[0], fd.x[1]);
printf("dtoa %s\n", os.str().c_str());
printf("printf %s\n", buf2);
}
os.str("");
if (outcpp)
std::cout << std::setprecision(ndig) << fd.d << "\n";
os << std::setprecision(ndig) << fd.d;
if (dosprintf) sprintf(buf2, "%.*g",ndig, fd.d);
if (outsprintf) std::cout << buf2 << "\n";
if (cmpall && os.str() != buf2) {
printf("mismatch mode g.60\n");
printf("fd.x[0]=0x%x; fd.x[1]=0x%x;\n", fd.x[0], fd.x[1]);
printf("dtoa %s\n", os.str().c_str());
printf("printf %s\n", buf2);
}
if (outcpp || outsprintf) puts("-----");
}
}