This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: Tremendous performance regression in 1.1.2 -> mainline


> With a reduced version of your test I had a big improvement in speed:
> (from 85 to 8 seconds in flow).

This works great, thanks!  I'm wondering if the same change in the
order of processing the blocks might work in the lcm routine 
compute_pre_data to reduce the runtime there.  There is about a factor
of five regression in the time to compile

http://www.math.purdue.edu/~lucier/all.i.gz

that occurs mainly in the gcse pass:

-----------------------------------------------
                0.00  151.92       3/3           rest_of_compilation [5]
[6]     49.6    0.00  151.92       3         gcse_main [6]
                0.00  132.11       1/1           one_pre_gcse_pass [7]
                0.00   19.66       2/2           one_cprop_pass [27]
                0.01    0.08       2/2           compute_sets [258]
                0.04    0.02       2/2           alloc_gcse_mem [300]
                0.00    0.00       1/1           compute_can_copy [1161]
                0.00    0.00       6/185600      max_reg_num [440]
                0.00    0.00       1/66300       gcse_alloc [467]
                0.00    0.00       2/2           alloc_reg_set_mem [1572]
                0.00    0.00       2/2           free_reg_set_mem [1581]
                0.00    0.00       2/2           free_gcse_mem [1580]
                0.00    0.00       1/14          gcc_obstack_init [1420]
-----------------------------------------------
                0.00  132.11       1/1           gcse_main [6]
[7]     43.1    0.00  132.11       1         one_pre_gcse_pass [7]
                0.00  105.12       1/1           compute_pre_data [8]
                0.00   25.12       1/1           pre_gcse [19]
                1.03    0.77       1/3           compute_hash_table [35]
                0.00    0.07       1/1           alloc_pre_mem [288]
                0.00    0.00       1/1           remove_fake_edges [957]
                0.00    0.00       1/4728        remove_fake_successors [929]
                0.00    0.00       1/1           alloc_expr_hash_table [1590]
                0.00    0.00       1/1           add_noreturn_fake_exit_edges [1589]
                0.00    0.00       1/1           compute_expr_hash_table [1594]
                0.00    0.00       1/1           free_edge_list [1603]
                0.00    0.00       1/1           free_pre_mem [1605]
                0.00    0.00       1/1           free_expr_hash_table [1604]
-----------------------------------------------
                0.00  105.12       1/1           one_pre_gcse_pass [7]
[8]     34.3    0.00  105.12       1         compute_pre_data [8]
                0.00   64.15       1/1           pre_edge_lcm [13]
                9.05   30.58       1/1           compute_ae_kill [14]
                0.02    1.31       1/3           compute_local_properties [39]
                0.00    0.00       1/1           compute_transpout [810]
                0.00    0.00       1/36          sbitmap_vector_zero [591]
-----------------------------------------------
...
-----------------------------------------------
                0.00   64.15       1/1           compute_pre_data [8]
[13]    20.9    0.00   64.15       1         pre_edge_lcm [13]
                0.01   34.89       1/1           compute_antinout_edge [15]
                0.14   22.10       1/1           compute_laterin [25]
                0.01    6.58       1/13          compute_available [10]
                0.00    0.20       1/1           compute_earliest [176]
                0.00    0.12       1/1           compute_insert_delete [227]
                0.10    0.00       9/42          sbitmap_vector_alloc [110]
                0.00    0.00       1/1           create_edge_list [847]
-----------------------------------------------
                9.05   30.58       1/1           compute_pre_data [8]
[14]    12.9    9.05   30.58       1         compute_ae_kill [14]
               30.58    0.00 60619050/60619050     expr_killed_p [17]
-----------------------------------------------
                0.01   34.89       1/1           pre_edge_lcm [13]
[15]    11.4    0.01   34.89       1         compute_antinout_edge [15]
               34.71    0.00   11204/11204       sbitmap_intersection_of_succs [16]
                0.18    0.00   11205/11205       sbitmap_a_or_b_and_c [186]
                0.00    0.00       1/20          sbitmap_vector_ones [587]
                0.00    0.00       1/182778      sbitmap_zero [733]
-----------------------------------------------
               34.71    0.00   11204/11204       compute_antinout_edge [15]
[16]    11.3   34.71    0.00   11204         sbitmap_intersection_of_succs [16]
                0.00    0.00   11204/108038      sbitmap_copy [549]
-----------------------------------------------
                             37606148             expr_killed_p [17]
               30.58    0.00 60619050/60619050     compute_ae_kill [14]
[17]    10.0   30.58    0.00 60619050+37606148 expr_killed_p [17]
                             37606148             expr_killed_p [17]
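
To make the block-ordering question concrete, here is a minimal sketch of
why visit order matters for a backward problem such as the one solved in
compute_antinout_edge. It is not GCC code. It assumes the textbook
equations antout(b) = intersection of antin(s) over the successors s of b,
and antin(b) = antloc(b) | (transp(b) & antout(b)), which is the shape the
names sbitmap_intersection_of_succs and sbitmap_a_or_b_and_c in the profile
suggest. The function count_passes, the straight-line CFG, and the use of a
single 32-bit word in place of an sbitmap are all hypothetical choices made
for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define N_BLOCKS 8

/* Hypothetical toy solver, not GCC code.  The CFG is a straight line:
   block i falls through to block i + 1, and the last block is the exit.
   One 32-bit word stands in for an sbitmap of expressions.
   Returns the number of sweeps over all blocks until nothing changes.  */
static int
count_passes (int reverse_order)
{
  uint32_t antloc[N_BLOCKS] = { 0 };
  uint32_t transp[N_BLOCKS];
  uint32_t antin[N_BLOCKS];
  int passes = 0;
  int changed = 1;

  for (int i = 0; i < N_BLOCKS; i++)
    {
      transp[i] = UINT32_MAX;   /* no block kills any expression */
      antin[i] = UINT32_MAX;    /* optimistic initial value */
    }
  antloc[N_BLOCKS - 1] = 1u;    /* expression 0 is computed in the last block */

  while (changed)
    {
      changed = 0;
      passes++;
      for (int k = 0; k < N_BLOCKS; k++)
        {
          /* Visit blocks last-to-first or first-to-last.  */
          int b = reverse_order ? N_BLOCKS - 1 - k : k;
          /* antout = intersection of antin over successors; the exit
             block has none, so nothing is anticipatable out of it.  */
          uint32_t antout = (b + 1 < N_BLOCKS) ? antin[b + 1] : 0u;
          uint32_t new_in = antloc[b] | (transp[b] & antout);
          if (new_in != antin[b])
            {
              antin[b] = new_in;
              changed = 1;
            }
        }
    }

  /* Both visit orders must reach the same fixed point.  */
  for (int i = 0; i < N_BLOCKS; i++)
    assert (antin[i] == 1u);
  return passes;
}
```

On this chain, count_passes (0) (first block to last) takes N_BLOCKS + 1
sweeps, because each sweep moves the information back by only one block,
while count_passes (1) (last block to first) takes 2: one sweep to
propagate and one to confirm that nothing changed. Whether visit order
explains the time seen in compute_antinout_edge above is an assumption;
the profile alone does not establish it.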

gcse also spends a fair amount of time in delete_null_pointer_checks
(66.52 seconds, 21.7% of the total), nearly all of it in compute_available:

-----------------------------------------------
                0.00   66.52       6/6           rest_of_compilation [5]
[11]    21.7    0.00   66.52       6         delete_null_pointer_checks [11]
                0.10   66.33      10/10          delete_null_pointer_checks_1 [12]
                0.09    0.00       8/42          sbitmap_vector_alloc [110]
                0.00    0.00    2563/2670        canonicalize_condition [721]
                0.00    0.00    8905/421847      condjump_p [381]
                0.00    0.00    8903/850373      simplejump_p [279]
                0.00    0.00    2563/2670        get_condition [988]
                0.00    0.00       2/185600      max_reg_num [440]
                0.00    0.00       2/2           get_bitmap_width [1583]
-----------------------------------------------
                0.10   66.33      10/10          delete_null_pointer_checks [11]
[12]    21.7    0.10   66.33      10         delete_null_pointer_checks_1 [12]
                0.10   65.78      10/13          compute_available [10]
                0.05    0.36  461510/2655811     note_stores [44]
                0.03    0.00  461510/3538732     single_set [163]
                0.00    0.00      20/36          sbitmap_vector_zero [591]
                0.00    0.00      99/2670        canonicalize_condition [721]
                0.00    0.00      99/2670        get_condition [988]
-----------------------------------------------

but (as usual) I have no idea what's going on there.

Brad
