This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: The speed of the compiler, was: Re: Combine four insns


On Tue, Aug 10, 2010 at 1:19 AM, Joseph S. Myers
<joseph@codesourcery.com> wrote:
> On Mon, 9 Aug 2010, Diego Novillo wrote:
>
>> Additionally, the very worst offender in terms of compile time is -g. The size
>> of debugging information is such, that I/O and communication times increase
>> significantly.
>
> If communication between the compiler and assembler is an important part
> of the cost there, it's possible that a binary interface between them as
> suggested by Ian at <http://www.airs.com/blog/archives/268> would help.
> I would imagine it should be possible to get the assembler to accept some
> form of mixed text/binary input so you could just transmit debug info that
> way and transition to a more efficient interface incrementally (assembler
> input goes through a rather complicated sequence of preprocessing /
> processing steps, but cleaning them up to work with such input should be
> possible).
>
Ian Lance Taylor wrote:
"What does make sense is using a structured data format, rather than
text, to communicate between the compiler and the assembler. In gcc's
terms, the compiler should generate insn patterns with associated
operands. The assembler should piece those together into its own
internal data structures. In fact, of course, ideally gcc and the
assembler would use the same internal data structure. It would be
interesting to design such a structure so that it could be transmitted
in a file, or over a pipe, or in shared memory."


Using shared memory is by far the most efficient way to transmit large
amounts of data between processes.  Its I/O and communication cost is
roughly zero if you have enough physical memory.
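To make the shared memory point concrete, here is a minimal sketch, assuming
a POSIX system that supports anonymous shared mappings.  The helper name
share_via_mmap is hypothetical, and the child process only stands in for
a "compiler" producing output that the parent (the "assembler") then reads
from the same pages, without any read() or write() on the data itself.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc in strict modes */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, for illustration only.
   The child writes `len` bytes of `payload` into a shared mapping and
   the parent checks that it sees the same bytes.
   Returns 0 on success, -1 on any failure. */
int share_via_mmap(const char *payload, size_t len)
{
    if (len == 0)
        return -1;

    /* One mapping, visible to both processes after fork(). */
    char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, len);
        return -1;
    }
    if (pid == 0) {
        /* Producer: store directly into the shared pages. */
        memcpy(buf, payload, len);
        _exit(0);
    }

    /* Consumer: wait for the producer, then read the pages in place. */
    int status;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
        memcmp(buf, payload, len) == 0)
        result = 0;

    munmap(buf, len);
    return result;
}
```

Calling share_via_mmap("\tmovl\t%eax, %ebx\n", 17) should return 0.  A real
compiler/assembler pair would need a named object (for example shm_open)
and some synchronization instead of fork() and waitpid(); that part is
left out here.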

Using a temp file or a pipe is less efficient.  This is because syscalls
like read() and write() copy large amounts of data from user space to
kernel space and back.  A pipe consumes less memory, but its buffer in
the Linux kernel is small (one 4k page in older kernels, 64k by default
since 2.6.11), so transmitting large data over a pipe also costs a lot
of CPU time in scheduling the two processes, on top of the cost of
read() and write().
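The pipe buffer size can be measured rather than guessed.  The following
is a rough sketch, assuming a POSIX system: it makes the write end
non-blocking and writes until the kernel refuses more data.  The function
name pipe_capacity is made up for this example, and the value it reports
depends on the kernel and its configuration.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Illustrative helper (not part of any library).
   Returns the number of bytes a fresh pipe accepts before a writer
   would have to block, or -1 on error. */
long pipe_capacity(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    /* Non-blocking write end: a full buffer yields EAGAIN
       instead of putting the writer to sleep. */
    int flags = fcntl(fds[1], F_GETFL);
    if (flags < 0 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    char chunk[512] = {0};
    long total = 0;
    for (;;) {
        ssize_t n = write(fds[1], chunk, sizeof chunk);
        if (n > 0) {
            total += n;
            continue;
        }
        if (n < 0 && errno == EINTR)
            continue;   /* interrupted, try again */
        break;          /* buffer is full (EAGAIN) or another error */
    }

    close(fds[0]);
    close(fds[1]);
    return total;
}
```

Once a writer has put that many bytes into the pipe it must wait for the
reader to drain it, which is where the extra scheduling comes from when
the data is much larger than the buffer.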


Using a structured data format, rather than text, to communicate
between the compiler and the assembler may require big changes to the
current compiler architecture, making the compiler even harder to
maintain.  For a given machine instruction, its gcc representation is
simple RTL, but its assembly representation is complex and varying.
And, after all, you must still provide a text "dump" of the assembly for
debugging purposes.  Separating the compiler from the assembler conforms
to modern software engineering practice.

-- 
Chiheng Xu
Wuhan,China

