This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.



Re: A sick idea - mmapped file output


Nick Ing-Simmons said:
> Linus Torvalds <torvalds@transmeta.com> writes:
> >
> >Comparing the mmap() with a stdio-based approach is unfair.  You should
> >at least try with direct write() calls.  If the common case is _really_
> >short writes, you could go half-way, with a really simple buffering
> >scheme rather than stdio (expose the buffer or similar).
> 
> My experiments reading/writing huge netlist files for work showed best
> performance was mmap() for reading and write() of "large" chunks
> for writing.
> 

Depending on the 'flow' or timing of your read operations, and depending
on the OS (whether it dynamically determines the need for read-ahead), mmap
for reads could be a gain (it avoids a copy of the data) or a slight loss
(due to lack of read-ahead, page fault overhead, or process map management
issues).  If the data doesn't have to be read in (because it is already in
memory), then some efficient mmap schemes will pre-map the data when the mmap
system call is issued or when other adjacent pages are faulted in.

In rather exhaustive tests, I have found that there is sometimes some
gain from using mmap for read operations.  If the data that is read gets
any significant amount of use, then the gain from mmap is proportionally
very small.  There are all sorts of tradeoffs (cache issues and the like),
but for data that is going to be significantly processed, the gain is
minuscule.

Mmap for read and/or write is best used for fast file (or network) copies.
If the OS does the mmap and the necessary VM and I/O operations quickly,
and has the same read-ahead hooks for VM as for normal read/write type I/O,
then mmap is a good choice.  I suspect that using mmap for read and/or
write will, on most OSes, perform worse than regular read/write type I/O.
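
A minimal sketch of the copy case, assuming a regular file that fits in
the address space: the source is mapped read-only and handed to write()
in one piece.  copy_file_mmap is a hypothetical name, and nothing here
says this beats a plain read()/write() loop on any given OS.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy src to dst by mapping src and writing the
   mapping out.  Returns 0 on success, -1 on failure. */
int copy_file_mmap(const char *src, const char *dst)
{
    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;

    struct stat st;
    if (fstat(in, &st) < 0) {
        close(in);
        return -1;
    }

    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    int rc = 0;
    size_t len = (size_t)st.st_size;
    if (len > 0) {                  /* mmap() rejects a zero length */
        unsigned char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, in, 0);
        if (p == MAP_FAILED) {
            rc = -1;
        } else {
            size_t done = 0;
            while (done < len) {    /* write() may accept only part */
                ssize_t n = write(out, p + done, len - done);
                if (n < 0) {
                    if (errno == EINTR)
                        continue;
                    rc = -1;
                    break;
                }
                done += (size_t)n;
            }
            munmap(p, len);
        }
    }

    close(in);
    if (close(out) < 0)
        rc = -1;
    return rc;
}
```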

Mmapped write is likely to be worse than mmapped read, and is almost never
wise for writing normal, sequential files.  Mmapped read is plausibly
advantageous in some cases.

John
