Parallel sorts get ~10 times slower as one increases the vector size from 4*10^9 to 5*10^9, perhaps at exactly 2^32, but this wasn't checked. The example below is for a vector of ints, but the same phenomenon is observed on a vector of long longs.

To reproduce (sort_test.cc is below):
0. Adjust 'processors' in sort_test.cc.
1. g++ -O3 -fopenmp sort_test.cc -lgomp
2. ./a.out

output:
58 seconds used in sort [for vector of size 4,000,000,000]
667 seconds used in sort [for vector of size 5,000,000,000]

gcc version information:
crd4% gcc -v
Using built-in specs.
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-4.4.1/configure --with-gmp=/broad/tools/Linux/x86_64/pkgs/gcc_4.4.1 --with-mpfr=/broad/tools/Linux/x86_64/pkgs/gcc_4.4.1 --prefix=/broad/tools/Linux/x86_64/pkgs/gcc_4.4.1
Thread model: posix
gcc version 4.4.1 (GCC)

We first observed the problem under gcc 4.3.3.

hardware info:
crd4% uname -a
Linux crd4 2.6.16.54-0.2.5-smp #1 SMP Mon Jan 21 13:29:51 UTC 2008 x86_64 x86_64 x86_64 GNU/Linux

This is a 32-processor machine with 256 GB of memory, but I don't think the problem is specific to this architecture.

sort_test.cc:

#include <iostream>
#include <omp.h>
#include <time.h>
#include <vector>

using namespace std;

int main( )
{    for ( long long m = 4; m <= 5; m++ )
     {    const long long entries = m * (long long) 1000000000;
          const int processors = 32;
          vector<int> x(entries);
          for ( long long i = 0; i < entries; i++ )
               x[i] = (i*i) % 123456789;
          time_t clock1, clock2;
          time( &clock1 );
          omp_set_num_threads(processors);
          sort( x.begin( ), x.end( ) );
          time( &clock2 );
          cout << clock2 - clock1 << " seconds used in sort" << endl;    }    }
I suppose you are running into cache effects. Why do you think this is a GCC bug?
Subject: Re: parallel sort run time increases ~10 fold when vector size gets over ~4*10^9

If instead of sorting a vec<int>, one sorts a vec<long long>, there is still a ten-fold slowdown as one increases the vector size from 4 to 5 billion. So it's not the total amount of memory that matters, but rather the number of entries in the vector. I don't think this is about cache effects.

Best, David
Out of curiosity, did you try parallel-mode on that machine? Basically, just add -D_GLIBCXX_PARALLEL, but refer to the documentation of course: http://gcc.gnu.org/onlinedocs/libstdc++/manual/parallel_mode.html#manual.ext.parallel_mode.intro I'm also adding Johannes, in CC... Note, I don't think we have any specific issue with the normal, serial, std::sort...
Subject: Re: parallel sort run time increases ~10 fold when vector size gets over ~4*10^9

Oh crap, yes I did, and now I see that I accidentally left off the first three lines of sort_test.cc. They are:

#define _GLIBCXX_PARALLEL
#include <algorithm>
#include <iomanip>

David
So the issue is just that you are not completely happy with the behavior of parallel-mode. Ok... Let's see what Johannes thinks.
Have you tried selecting a different sort algorithm? The default seems to be the multi-way mergesort, but there are two quicksort options as well.
Sorry, the CC never reached me. So concerning comment #4: Was the parallel mode actually activated? The multiway mergesort needs temporary space equal to the size of the input. Are you sure that the program was not swapping?
Subject: Re: [parallel-mode] parallel sort run time increases ~10 fold when vector size gets over ~4*10^9

Regarding comment #7, I just ran this now on a machine with 32 processors and 512 GB memory.
(a) Sorting 4 x 10^9 ints took 0.9 minutes.
(b) Sorting 5 x 10^9 ints took 16 minutes. The second test used about 40 GB, which is a small fraction of the available memory.
(c) Sorting 2.5 x 10^9 structures having 2 ints each took 1.1 minutes.

Regarding comment #6, repeating (a) and (b) with __gnu_parallel::balanced_quicksort_tag( ):
(a') 6.3 minutes
(b') 8.1 minutes,
so the algorithm is slower on these data but does not exhibit the same jump in runtime. I also tried __gnu_parallel::quicksort_tag( ), which was about the same for (b) [(a) not tested].
I can reproduce the bug on my machine (2 quad-core Nehalems, 48 GB RAM):
4 x 10^9 ints: 65 seconds used in sort
5 x 10^9 ints: 193 seconds used in sort
The problem is in multiseq_selection.h, where this expression overflows if (__total * __rank) exceeds 64 bits:

(static_cast<uint64_t>(__total) * __rank / __N - __leftsize)

The quick fix is to use a temporary double, which solves the original test case:
4 x 10^9 ints: 64 seconds used in sort
5 x 10^9 ints: 80 seconds used in sort

Find patches for branch (4.4) and trunk (4.5) attached.

However, I do not fully trust the double arithmetic yet, although some test cases work. Does anybody else know a better way to avoid an overflow in ((a * b) / c) with only integer arithmetic and normal rounding? Maybe I can find a way to avoid this calculation altogether.
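To illustrate why the jump shows up between 4 x 10^9 and 5 x 10^9, here is a minimal sketch; it is not the library code. The name scaled_rank_naive is hypothetical, and setting all three operands to the input size is an assumption made only for illustration. 4*10^9 squared is 1.6*10^19, which is below 2^64 (about 1.845*10^19), whereas 5*10^9 squared is 2.5*10^19, which wraps around.

```cpp
#include <cstdint>

// Hypothetical stand-in for the multiply-then-divide part of the expression
// quoted above; the name and the operand values are assumptions.
constexpr std::uint64_t
scaled_rank_naive(std::uint64_t total, std::uint64_t rank, std::uint64_t n)
{
  // The product is reduced modulo 2^64 before the division is applied.
  return total * rank / n;
}

// 4e9 * 4e9 still fits in 64 bits, so the quotient is what one expects.
static_assert(scaled_rank_naive(4000000000ULL, 4000000000ULL, 4000000000ULL)
              == 4000000000ULL, "product fits in 64 bits");

// 5e9 * 5e9 exceeds 2^64, so the quotient is computed from a wrapped product.
static_assert(scaled_rank_naive(5000000000ULL, 5000000000ULL, 5000000000ULL)
              != 5000000000ULL, "product wrapped modulo 2^64");
```

Both checks are evaluated at compile time, so compiling the file (C++11 or later) is enough to confirm them.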
Created attachment 18862
Patch replacing uint64_t by double to avoid overflow, for trunk.
Created attachment 18863
Patch replacing uint64_t by double to avoid overflow, for branch 4.4.
(In reply to comment #10)
> However, I do not fully trust the double arithmetic yet, although some test
> cases work.

Er, this sounded a bit pessimistic; all sort tests I have tried so far work with the patch.

And some more explanation: the overflow resulted in erratic and thus very poor load balancing in the merge step, causing the huge running times.
(In reply to comment #10)
> However, I do not fully trust the double arithmetic yet, although some test
> cases work. Does anybody else know a better way to avoid an overflow in ((a *
> b) / c) with only integer arithmetic and normal rounding?

You can use a 128-bit integer type on x86-64.
Subject: Re: [parallel-mode] parallel sort run time increases ~10 fold when vector size gets over ~4*10^9 Wonderful! Thank you very much for fixing this problem.
(In reply to comment #14)
> (In reply to comment #10)
> > However, I do not fully trust the double arithmetic yet, although some test
> > cases work. Does anybody else know a better way to avoid an overflow in ((a *
> > b) / c) with only integer arithmetic and normal rounding?
>
> you can use a 128-bit integer type on x86-64.

Very good idea. Do you know a good #ifdef clause to check its availability? Is it really just x86-64? Also, I probably want to use it only when really needed, because I assume it to be implemented in software, in particular the division.
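One possible guard is sketched below, under stated assumptions. Compilers that provide unsigned __int128 (later GCC releases and Clang) predefine the macro __SIZEOF_INT128__; the sketch does not assume that the 4.4 series discussed here does. The name mul_div is hypothetical, and the long double fallback is only an approximation whose exactness depends on the target's mantissa width.

```cpp
#include <cstdint>

// Hypothetical helper, not the library's code: computes a * b / c for
// unsigned 64-bit operands, assuming c != 0 and that the quotient fits
// in 64 bits.
#ifdef __SIZEOF_INT128__
constexpr std::uint64_t
mul_div(std::uint64_t a, std::uint64_t b, std::uint64_t c)
{
  // The product of two 64-bit values always fits in 128 bits.
  return static_cast<std::uint64_t>(static_cast<unsigned __int128>(a) * b / c);
}
#else
constexpr std::uint64_t
mul_div(std::uint64_t a, std::uint64_t b, std::uint64_t c)
{
  // Fallback without a 128-bit type: floating point, so the result may be
  // rounded when the product needs more bits than the mantissa holds.
  return static_cast<std::uint64_t>(static_cast<long double>(a) * b / c);
}
#endif
```

For example, mul_div(5000000000ULL, 3000000000ULL, 5000000000ULL) yields 3000000000 with the 128-bit path, because the intermediate product 1.5*10^19 is held exactly.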
Is anything known about the actual sizes of a, b, and c? Also, I don't know what precision is required for the result: must it be exact if representable? I suppose not, otherwise the suggestion of using double would not make sense. Depending on the answers, there are various options, such as checking whether a * b overflows (if the quantities are all positive, then checking for wraparound is easy) and then taking the appropriate action. Anyway, barring more sophisticated solutions, using long double seems a better idea to me, because on most widespread targets a long double is at least 80 bits, with a mantissa of at least 64 bits, and is thus able to represent any long long integer exactly.
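Both ideas above can be sketched in a few lines. This is an illustration only: the names mul_would_wrap and long_double_holds_u64 are hypothetical, and the overflow test shown is a division-based pre-check rather than a check of the wrapped result.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: true when a * b would not fit in an unsigned 64-bit
// integer. The check divides instead of multiplying, so it cannot wrap.
constexpr bool mul_would_wrap(std::uint64_t a, std::uint64_t b)
{
  return a != 0 && b > std::numeric_limits<std::uint64_t>::max() / a;
}

// True on targets where long double has at least 64 mantissa bits and can
// therefore represent every 64-bit integer exactly. This is a property of
// the target, so it has to be queried rather than assumed.
constexpr bool long_double_holds_u64 =
  std::numeric_limits<long double>::digits >= 64;

static_assert(!mul_would_wrap(4000000000ULL, 4000000000ULL), "1.6e19 < 2^64");
static_assert(mul_would_wrap(5000000000ULL, 5000000000ULL), "2.5e19 > 2^64");
```

A caller could use the pre-check to take the fast 64-bit path when it is safe and a wider or floating-point path otherwise.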
(In reply to comment #17)
> Is something known about the actual size of a, b, and c?

They can be as large as the input size.

> Also, I don't know which is the required precision for the result: must be
> exact if representable?

In the last iteration, __n == 0 => __total == __N, and then the result must be exactly __rank, according to the specification.

Anyway, I think I have found a solution that is easier, faster, and avoids the large intermediate altogether (see attached patch). It also fixes similar problems in two other locations. However, this patch needs further thorough testing.

Also, __n == 2 ^ __r - 1, so __n + 1 == 2 ^ __r, and the divisions could be replaced by shifts.
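The closing remark about shifts can be checked with a small sketch. The helper names are hypothetical and this is not the patch itself; it only shows that when n + 1 is a power of two, dividing a non-negative value by n + 1 equals a right shift by the exponent. For negative signed values the two operations can differ, so the equivalence is assumed here for unsigned operands only.

```cpp
#include <cstdint>

// Hypothetical helper: n of the form 2^r - 1, so that n + 1 == 2^r.
constexpr std::uint64_t n_from_r(unsigned r)
{
  return (std::uint64_t(1) << r) - 1;
}

// Division by (n + 1), written as a shift by r.
constexpr std::uint64_t div_by_n_plus_1(std::uint64_t x, unsigned r)
{
  return x >> r;
}

// Compile-time spot checks that the shift matches the division.
static_assert(div_by_n_plus_1(5000000000ULL, 10)
              == 5000000000ULL / (n_from_r(10) + 1), "shift equals division");
static_assert(div_by_n_plus_1(5000000000ULL, 0) == 5000000000ULL,
              "n == 0 leaves the value unchanged");
```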
Created attachment 18878
Patch avoiding large intermediates to prevent overflow, for trunk.
Excellent. Let's wait a bit for feedback from people experiencing this issue and then commit the patch, first mainline and then probably 4_4-branch too. Make sure to also regression test the fix on a "normal" ;) machine...
Subject: Re: [parallel-mode] parallel sort run time increases ~10 fold when vector size gets over ~4*10^9 I tested the patch from comment #19, sorting X billion integers on a machine having 32 processors and 256 GB memory, X = 4, 6, ..., 26. The overall behavior is very close to linear. For example, X = 4 took 1.02 minutes, whereas X = 20 took 5.22 minutes. Very nice!
Patch regtests fine on x86_64-linux. Johannes, can you prepare a ChangeLog entry, post and commit both? Thanks!
Subject: Bug 40852

Author: singler
Date: Wed Oct 28 10:04:03 2009
New Revision: 153648

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=153648
Log:
2009-10-28  Johannes Singler  <singler@kit.edu>

        PR libstdc++/40852
        * include/parallel/multiseq_selection.h (multiseq_partition,
        multiseq_selection): Avoid intermediate values exceeding the integer
        type range for very large inputs.

Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/include/parallel/multiseq_selection.h
Subject: Bug 40852

Author: singler
Date: Wed Oct 28 10:04:35 2009
New Revision: 153649

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=153649
Log:
2009-10-28  Johannes Singler  <singler@kit.edu>

        PR libstdc++/40852
        * include/parallel/multiseq_selection.h (multiseq_partition,
        multiseq_selection): Avoid intermediate values exceeding the integer
        type range for very large inputs.

Modified:
    branches/gcc-4_4-branch/libstdc++-v3/ChangeLog
    branches/gcc-4_4-branch/libstdc++-v3/include/parallel/multiseq_selection.h
Closing this bug.
Fixed for 4.4.3 and mainline.
Subject: Bug 40852

Author: law
Date: Thu Oct 29 16:48:00 2009
New Revision: 153715

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=153715
Log:
Recorded merge of revisions 153580-153581,153584,153586-153600,153604,153606,153610,153613,153615-153618,153621,153643,153646-153648,153650-153652,153654-153667,153669-153671 via svnmerge from
svn+ssh://law@gcc.gnu.org/svn/gcc/trunk

........
  r153648 | singler | 2009-10-28 04:04:03 -0600 (Wed, 28 Oct 2009) | 8 lines

  2009-10-28  Johannes Singler  <singler@kit.edu>

          PR libstdc++/40852
          * include/parallel/multiseq_selection.h (multiseq_partition,
          multiseq_selection): Avoid intermediate values exceeding the integer
          type range for very large inputs.
........

Modified:
    branches/reload-v2a/ (props changed)

Propchange: branches/reload-v2a/
            ('svnmerge-integrated' modified)