This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



RE: Possible LRA issue?



-----Original Message-----
From: Daniel Gutson [mailto:daniel.gutson@tallertechnologies.com] 
Sent: Wednesday, August 27, 2014 8:53 PM
To: Ajit Kumar Agarwal
Cc: gcc Mailing List
Subject: Re: Possible LRA issue?

On Wed, Aug 27, 2014 at 12:16 PM, Ajit Kumar Agarwal <ajit.kumar.agarwal@xilinx.com> wrote:
> The xmalloc calls you see in the register allocator are not caused solely by the structure itself or by changing the S passed as the template argument.
> They depend on how the structure below is referenced and used. From the stack trace I can see that live range creation depends on how the structure is referenced and used.

>>Could you please show me an example of such different usages and references?

I would like you to formulate the exact code with the templatized structures. Here is the flow for def and use.
                      Live range<1>    Live range<2>

1 = Def <struct>            |
2 = Def <struct>            |                |
.                           |                |
.                           |                |
Use of <2>                  |                |
.                           |
.                           |
.                           |
Use of <1>                  |
                                  
If the number of array elements increases, more live ranges are created, which in turn increases the number of xmalloc calls made to create them. The DEF and USE points above are what populate the live ranges. So it is the use of and references to the array elements, rather than the declaration of the structures, that cause the calls.

Thanks & Regards
Ajit

>
> Thanks & Regards
> Ajit
>
> -----Original Message-----
> From: gcc-owner@gcc.gnu.org [mailto:gcc-owner@gcc.gnu.org] On Behalf 
> Of Daniel Gutson
> Sent: Wednesday, August 27, 2014 7:58 PM
> To: gcc Mailing List
> Subject: Possible LRA issue?
>
> Hi,
>
>    I have a large codebase where, at some point, there's a structure 
> that takes an unsigned integer template argument and uses it as the size 
> of an array, something like
>
> template <class T, size_t S>
> struct Struct
> {
>     typedef std::array<T, S> Chunk;
>     typedef std::list<Chunk> Content;
>
>    Content c;
> };
>
> Changing the value of S significantly alters the compile time and the memory that the compiler uses. We use some large numbers there.
> At some point, the compiler runs out of memory (xmalloc fails). I wondered why, so I did some analysis by debugging 4.8.2 (same with 4.8.3), and ran the following experiment with all optimizations turned off (-fno-* and -O0):
>   I generated a report of xmalloc usage for two programs: one with S=10u and another with S=11u, just to see the effect of a difference of 1.
> The report was generated as follows: I set a breakpoint at xmalloc that appends a bt to a file. Then I found the common stack traces and counted how many times xmalloc was called in each version of the program (S=10u and S=11u, as mentioned above).
> The differences were:
>
> a) Stack trace:
>       xmalloc | pool_alloc | create_live_range | mark_pseudo_live | 
> mark_regno_live | process_bb_lives | lra_create_live_ranges | lra | 
> do_reload | rest_of_handle_reload | execute_one_pass | 
> execute_pass_list | execute_pass_list | expand_function | 
> output_in_order | compile | finalize_compilation_unit | 
> cp_write_global_declarations | compile_file | do_compile | toplev_main
> | __libc_start_main | _start |
>
>      S=10u: 15 times
>      S=11u: 16 times
>
>
> b) Stack trace:
>       xmalloc | lra_set_insn_recog_data | lra_get_insn_recog_data | 
> lra_update_insn_regno_info | lra_update_insn_regno_info |
> lra_push_insn_1 | lra_push_insn | push_insns | lra_process_new_insns | 
> curr_insn_transform | lra_constraints | lra | do_reload | 
> rest_of_handle_reload | execute_one_pass | execute_pass_list | 
> execute_pass_list | expand_function | output_in_order | compile | 
> finalize_compilation_unit | cp_write_global_declarations | 
> compile_file | do_compile | toplev_main | __libc_start_main | _start |
>
>      S=10u: 186 times
>      S=11u: 192 times
>
> c) Stack trace:
>      xmalloc | df_install_refs | df_refs_add_to_chains | 
> df_insn_rescan | emit_insn_after_1 | emit_pattern_after_noloc | 
> emit_pattern_after_setloc | emit_insn_after_setloc | try_split | 
> split_insn | split_all_insns | rest_of_handle_split_after_reload | 
> execute_one_pass | execute_pass_list | execute_pass_list | 
> execute_pass_list | expand_function | output_in_order | compile | 
> finalize_compilation_unit | cp_write_global_declarations | 
> compile_file | do_compile | toplev_main | __libc_start_main | _start |
>
>      S=10u: 617 times
>      S=11u: 619 times
>
> d) Stack trace:
>      xmalloc | df_install_refs | df_refs_add_to_chains | 
> df_bb_refs_record | df_scan_blocks | rest_of_handle_df_initialize | 
> execute_one_pass | execute_pass_list | execute_pass_list | 
> expand_function | output_in_order | compile | 
> finalize_compilation_unit | cp_write_global_declarations | 
> compile_file | do_compile | toplev_main | __libc_start_main | _start |
>
>     S=10u: 13223 times
>     S=11u: 13227 times
>
> e) Stack trace:
>      xmalloc | __GI__obstack_newchunk | bitmap_element_allocate | 
> bitmap_set_bit | update_lives | assign_hard_regno | assign_by_spills | 
> lra_assign | lra | do_reload | rest_of_handle_reload | 
> execute_one_pass | execute_pass_list | execute_pass_list | 
> expand_function | output_in_order | compile | 
> finalize_compilation_unit | cp_write_global_declarations | 
> compile_file | do_compile | toplev_main | __libc_start_main | _start |
>
>     S=10u: 0 times (never!)
>     S=11u: 1
>
> Unfortunately I can't disclose the source code, nor do I have the time to isolate a piece of code that reproduces the issue.
> Some comments about the code: I don't do any template metaprogramming that depends on S, but I do iterate over the Content with range-based for loops.
>
> I can extend the analysis to S=12 and compare with the previous values.
> I thought about fixing this myself, but I lack the time and the background on these optimizations. Any hints?
> I'm open to doing more experiments if anybody asks, or to posting -fdumps.
>
> I suspect that this issue could be worked around by playing with ggc-min-heapsize and similar parameters, but I'd like to know why just changing the size of an array has such a consequence.
>
> Thanks!
>
>     Daniel.
>
> --
>
> Daniel F. Gutson
> Chief Engineering Officer, SPD
>
>
> San Lorenzo 47, 3rd Floor, Office 5
>
> Córdoba, Argentina
>
>
> Phone: +54 351 4217888 / +54 351 4218211
>
> Skype: dgutson



