$ g++ -v Configured with: ../gcc-3.4.2/configure --prefix=/tools/pkg/gcc/3.4.2 --enable- languages=c,c++ --with-ld=/bin/ld --with-as=/bin/as --disable-threads Thread model: aix gcc version 3.4.2 /// t.cc #include <iostream> #include <cstdio> using namespace std; int main( int argc, char** argv ) { std::ios_base::sync_with_stdio( false ); int count = 100000; if( argc > 1 ) for( int i = 0; i < count; ++i ) std::cout << (double)i; else for( int i = 0; i < count; ++i ) { char buf[50]; sprintf( buf, "%g", (double)i ); std::cout << buf; } return 0; } /// $ g++ -o t t.cc $ time t > /dev/null real 0m0.07s user 0m0.05s sys 0m0.00s $ time t cout > /dev/null real 0m4.22s user 0m2.60s sys 0m1.60s The std::ios_base::sync_with_stdio( false ); seems to have no effect. Note: On i686-pc-linux-gnu, both cout and sprintf/<<(char*) take the same time. Using the AIX system g++ 2.9-aix51-020209, there is also no difference (both are fast).
*** This bug has been marked as a duplicate of 17140 ***
Wait a minute... Ok, we have got 17140, but here, if the issue is confirmed, we are talking about 2 orders of magnitude!?!?! Indeed, on my P4-2400 (gcc3.4.3, actually, similar bevavior for mainline) I cannot confirm the disaster: paolo:~/Work> g++ -o t t.cc paolo:~/Work> time t > /dev/null 0.150u 0.000s 0:00.15 100.0% 0+0k 0+0io 219pf+0w paolo:~/Work> time t cout > /dev/null 0.170u 0.000s 0:00.17 100.0% 0+0k 0+0io 223pf+0w Either something AIX specific or I don't know, right now... Can you possibly profile?
I took some backtraces during a longer run of the program and most of the time it looked like this: (I removed some templates to make it readable) __issetuid load_locale setlocale int std::__convert_from_v<double>(...) std::ostreambuf_iterator<> std::num_put<>::_M_insert_float<double>(...) const std::num_put<>::do_put(std::ostreambuf_iterator<>, std::ios_base&, char, double) const std::basic_ostream<>::operator<<(double) main
Honestly, this doesn't help much, the percentages are important: suppose, for some reason, setlocale is very slow on AIX... Is _GLIBCXX_USE_C99 defined, on AIX? I'm asking because internally __convert_from_v uses snprintf in that case, not sprintf, and we want to be sure about that detail.
Ok, lets say I took 100 backtraces and got 99 times this one. I hope this is enough percentage. _GLIBCXX_USE_C99 is defined
> Ok, lets say I took 100 backtraces and got 99 times this one. I hope this is > enough percentage. ;) No, I mean *relative* percentages: setlocal in your profile is on top, higher than __convert_from_v and everything else. I'm asking: is consuming 1%, 10%, or 99% of the total runtime? > _GLIBCXX_USE_C99 is defined In this case, can you possibly check whether, for some reason, snprintf is slower than sprintf, on AIX? Thanks
For reference, on linux, a typical profile (count = 10000000) begins as follows, and: 1- The various snprintf are on top. 2- uselocale (the thread safe equivalent of setlocale) is way below Therefore, we can easily guess that the performance are not so different from those of snprintf alone. Each sample counts as 0.01 seconds. % cumulative self self total time seconds seconds calls ms/call ms/call name 28.65 5.71 5.71 __printf_fp 13.50 8.40 2.69 __mpn_divrem 7.28 9.85 1.45 memcpy 6.62 11.17 1.32 buffered_vfprintf 5.67 12.30 1.13 vfprintf 5.22 13.34 1.04 __mpn_mul_1 2.51 13.84 0.50 mempcpy 2.31 14.30 0.46 std::ostream::operator<<(double) 2.21 14.74 0.44 __mpn_rshift 2.01 15.14 0.40 std::ostreambuf_iterator<char, std::char_traits<char> > std::num_put<char, std::ostreambuf_iterator<char, std::char_traits<char> > >::_M_insert_float<double>(std::ostreambuf_iterator<char, std::char_traits<char> >, std::ios_base&, char, char, double) const 2.01 15.54 0.40 isinf 1.81 15.90 0.36 isnan 1.46 16.19 0.29 std::basic_filebuf<char, std::char_traits<char> >::xsputn(char const*, int) 1.30 16.45 0.26 __mpn_extract_double 1.30 16.71 0.26 uselocale
> ;) No, I mean *relative* percentages: setlocal in your profile is on top, higher > than __convert_from_v and everything else. I'm asking: is consuming 1%, 10%, or > 99% of the total runtime? Doesn't a function that is on 99% of all call stacks automatically consume 99% of the runtime? Ok, the time can also be spend in the childs like here in __issetuid, but this is an implementation detail of setlocale, isn't it? > In this case, can you possibly check whether, for some reason, snprintf is > slower than sprintf, on AIX? I rewrote the original program to use snprintf and got the same time as with sprintf.
> Doesn't a function that is on 99% of all call stacks automatically consume 99% > of the runtime? Ok, the time can also be spend in the childs like here in > __issetuid, but this is an implementation detail of setlocale, isn't it? % in *time*, see the linux example, nothing special. Ok about snprintf, it's not that.
If it helps you, here the first lines of prof. But in practice some backtraces help me a lot more than a a [g]prof output. Its very fast in finding big bottlenecks, like setlocale in this PR. count = 1000000 Name %Time Seconds Cumsecs #Calls msec/call .strncpy 31.5 5.36 5.36 .getenv 19.6 3.34 8.70 5000000 0.0007 .mystrncpy 6.2 1.06 9.76 7000000 0.0002 .getgidx.GL 5.5 0.94 10.70 .__mcount 5.3 0.90 11.60 ._doprnt 5.2 0.88 12.48 2000001 0.0004 .setlocale 4.6 0.79 13.27 3000000 0.0003 .strchr 4.2 0.72 13.99 .__issetuid 3.4 0.58 14.57 6000000 0.0001 .getuidx.GL 3.4 0.57 15.14 ._ZNSolsEd 1.1 0.18 15.32 .main 0.8 0.13 15.45 1 130.0 .malloc_y_start 0.8 0.13 15.58 .text 0.7 0.12 15.70 ._ZNKSt7num_putIcSt1 0.6 0.11 15.81 .malloc_y 0.6 0.11 15.92 1000007 0.0001 .strcmp 0.6 0.10 16.02 .free_y 0.6 0.10 16.12 1000002 0.0001 .malloc_y_heap 0.5 0.08 16.20 1000007 0.0001 .strcpy 0.4 0.07 16.27 .text 0.3 0.05 16.32 .sprintf 0.3 0.05 16.37 1000000 0.0001 .__nl_langinfo_std 0.3 0.05 16.42 2000000 0.0000 .mf2x2 0.2 0.04 16.46 999999 0.0000 .strcmp 0.2 0.04 16.50 ._ZNKSt7num_putIcSt1 0.2 0.04 16.54 ._ZdaPv.GL 0.2 0.04 16.58 .setlocale.GL 0.2 0.03 16.61
> If it helps you, here the first lines of prof. But in practice some backtraces > help me a lot more than a a [g]prof output. Its very fast in finding big > bottlenecks, like setlocale in this PR. Ok, thank you very much. Of course, you are very welcome to provide additional useful info, but I think we have already a rather good picture of the issue. Indeed, isn't really the same of 17140, which is more about speeding-up the task currently performed by snprintf, and only secondarily avoiding setlocale (what we are using for that, on gnu configs like linux, uselocale that is, is pretty fast). All in all, this means, that whatever we prepare for 17140 had better avoiding *completely* setlocale & co, otherwise, on AIX we are going to gain *nothing*. I cannot see alternate short term fixes... Maybe the AIX maintainers could help with an AIX config exploiting any fast thread-safe uselocale-equivalent available there and then AIX would use that specific locale model instead of the generic one. Other ideas?
FWIW, the issue doesn't affect linux even if configured --enable-clocale=generic (thus using setlocale). Therefore, on *some* targets setlocale is slow and causes it, not everywhere.
AIX does not yet implement uselocale(). I am not aware of any faster interface. Does "C" locale help?
std::cout.imbue( std::locale( "C" ) ); nor std::cout.imbue( std::locale::classic() ); nor export LANG=C does change anything Joerg
Hi everyone. No, switching by hand to "C" locale cannot help, because, given the current generic locale model, the setlocale calls are issued *anyway*. Indeed, we could envisage improving on this...
Can you test the attached? Should help, and we can apply it in 3_4-branch, maybe in mainline too depending on the res of 17140.
Created attachment 8081 [details] To be tested on AIX
Notice, however, that as-is, the patch is not perfect from the correctness point of view: we should also check LC_NUMERIC...
A quick test shows, that std::setlocale(LC_ALL, NULL) returns "C C C C C C". But std::setlocale(LC_NUMERIC, NULL) returns "C" I think it is enough to test LC_NUMERIC. I'll try to test a modified patch. But this may take a while.
Thanks! For the correctness issue, notice that LC_ALL overrides any LC_*, therefore checking only LC_NUMERIC seems weaker, still, probably effective in most circumstances...
I think this runtimes of the different setlocale() calls might be interesting: (This is for 1000000 calls) category locale time LC_ALL 0 0.70s LC_NUMERIC 0 0.29s LC_ALL "C" 1.63s LC_NUMERIC "C" 8.98s
Eh! The runtimes are indeed **very** interesting, thanks! And, also, it looks like I was wrong in the last comment: in case LC_ALL is set, this is automatically seen in setlocale(LC_NUMERIC, 0), therefore your suggestion of changing my patch to just use LC_NUMERIC should work well!
Created attachment 8082 [details] Should help even more on AIX, regtested Ok on linux (--enable-clocale=generic)
Created attachment 8083 [details] Wrong attachment (meaningless mix of LC_ALL and LC_NUMERIC), this one is Ok
With patch draft_19642_3 and count=1000000 $ time t > /dev/null real 0,64 user 0,60 sys 0,01 $ time t cout > /dev/null real 1,14 user 1,12 sys 0,00 Now its only 2 times slower than sprintf. Much better than the ~75 times before. Also note, that when an other locale than "C" is in place the times also get better: without patch: (count=100000) real 28,27 user 6,34 sys 21,87 with: real 5,66 user 1,89 sys 3,78 (sprintf:0,07s) This case is still horrible slow, but 4-5 times better than before. Good work so far, but I think the final solution is to avoid any frequent setlocale() calls.
Excellent. Can you please check whether on AIX this is ok ("de_DE" is just an example, any actually installed named locale should do): #include <locale.h> #include <string.h> #include <assert.h> int main() { char* save = setlocale(LC_ALL, "de_DE"); assert( !save || !strcmp(setlocale(LC_NUMERIC, NULL), "de_DE")); return 0; } ??? In short, we should make sure that any change to LC_ALL is reflected to LC_NUMERIC. I have checked that this is the case with glibc and also FreeBSD but we should check all the supported OSes. Thanks again!
Paolo, just a note. This issue has been raised before on the mailing list. (Or bugzilla, sorry so vague.) If you search stable/AIX/setlocale you might find something. There's known problems with setlocale not being "stable" on AIX. You're right, the correct long term fix is something like Jerry's work in 17140. -benjamin
> There's known problems with setlocale not being "stable" on AIX. Thanks for the pointer. I can find only some rather vague hints, around, however... "(un)stable"?!? Or 'just' plain slow?!? ;) ;) > You're right, the correct long term fix is something like Jerry's work in > 17140. Beyond doubt. Avoiding completely *any* sort of setlocale/uselocale would fix those issues completely. However, short and mid-term, the latest attached patch (*_3) does a good job and seems safe (already checked on glibc (generic) + FreeBSD + Solaris).
(In reply to comment #26) > Excellent. Can you please check whether on AIX this is ok ("de_DE" is just > an example, any actually installed named locale should do): > [code] > In short, we should make sure that any change to LC_ALL is reflected to > LC_NUMERIC. I have checked that this is the case with glibc and also > FreeBSD but we should check all the supported OSes. Thanks again! I've checked it with all installed locales I've found. It works for all.
Thanks. We are now reasonably confident that the patch is safe, besides providing a good performance improvement. Most probably we shall apply it soon, I'll take care of the next steps. Thanks again.
Subject: Bug 19642 CVSROOT: /cvs/gcc Module name: gcc Changes by: paolo@gcc.gnu.org 2005-01-30 14:09:59 Modified files: libstdc++-v3 : ChangeLog libstdc++-v3/config/locale/generic: c_locale.h Log message: 2005-01-30 Paolo Carlini <pcarlini@suse.de> PR libstdc++/19642 * config/locale/generic/c_locale.h (__convert_from_v): Switch only LC_NUMERIC, and only when actually != "C". Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/ChangeLog.diff?cvsroot=gcc&r1=1.2880&r2=1.2881 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/config/locale/generic/c_locale.h.diff?cvsroot=gcc&r1=1.13&r2=1.14
Subject: Bug 19642 CVSROOT: /cvs/gcc Module name: gcc Branch: gcc-3_4-branch Changes by: paolo@gcc.gnu.org 2005-02-02 16:46:40 Modified files: libstdc++-v3 : ChangeLog libstdc++-v3/config/locale/generic: c_locale.h Log message: 2005-02-02 Paolo Carlini <pcarlini@suse.de> PR libstdc++/19642 * config/locale/generic/c_locale.h (__convert_from_v): Switch only LC_NUMERIC, and only when actually != "C". Patches: http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/ChangeLog.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.2224.2.213&r2=1.2224.2.214 http://gcc.gnu.org/cgi-bin/cvsweb.cgi/gcc/libstdc++-v3/config/locale/generic/c_locale.h.diff?cvsroot=gcc&only_with_tag=gcc-3_4-branch&r1=1.11.10.2&r2=1.11.10.3
This specific issue is fixed for 4.0.0 and 3.4.4.