This is the mail archive of the libstdc++@gcc.gnu.org mailing list
for the libstdc++ project.
Re: WIP: Implement Filesystem TS
- From: Marc Glisse <marc dot glisse at inria dot fr>
- To: Jonathan Wakely <jwakely at redhat dot com>
- Cc: libstdc++ at gcc dot gnu dot org
- Date: Mon, 4 Aug 2014 21:40:57 +0200 (CEST)
- Subject: Re: WIP: Implement Filesystem TS
- References: <20140804135012 dot GQ2361 at redhat dot com> <20140804171148 dot GS2361 at redhat dot com>
- Reply-to: libstdc++ at gcc dot gnu dot org
On Mon, 4 Aug 2014, Jonathan Wakely wrote:
On 04/08/14 14:50 +0100, Jonathan Wakely wrote:
This is a 99% complete implementation of the Filesystem TS as defined
by the N4099 draft,
http://open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4099.html
The missing 1% relates to code conversions from wide character
strings, which will be easier with the C++11 codecvt facets.
And that 99% only refers to POSIX systems; there are large chunks
missing for Windows, but as I don't have access to a Windows PC and
don't know the API, someone else will have to provide those missing
pieces. I hope no major redesign will be needed to add Windows support
though.
I plan to document the design, but I'm not sure when that will happen,
so for now:
* filesystem::path is implemented as a basic_string containing the
native path, an enumeration describing the type of path represented
(either an individual component such as root-name, root-dir,
directory, filename, or a composite path of multiple components) and
a std::list containing the separate components of a composite path
(which is empty unless the path has multiple components).
Any reason not to make it a vector?
(N4099 writes --end() in a few places, I don't remember seeing text
explaining that this is ok even if end() returns a pointer, while we do
have text explaining what end()-1 means)
The list exists so that dereferencing a filesystem::path::iterator can
return a reference to an object that is stored somewhere outside the
iterator, as required for forward iterators.
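To make the layout described above concrete, here is a minimal sketch. It is
not the libstdc++ code: toy_path, kind and check_reference_stability are
hypothetical names, components are plain strings rather than paths, the list
is always populated, and trailing separators are ignored. It only shows why
storing the components lets operator* return a reference to something that
lives outside the iterator.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>

// Hypothetical toy type, not the libstdc++ implementation.
class toy_path {
public:
    // Simplified stand-in for the enumeration describing the kind of path.
    enum class kind { composite, root_dir, filename };

    explicit toy_path(std::string native) : native_(std::move(native)) {
        split();
    }

    // Iteration just walks the stored components, so operator* yields a
    // reference to a string owned by the path, not by the iterator.
    using iterator = std::list<std::string>::const_iterator;
    iterator begin() const { return parts_.begin(); }
    iterator end() const { return parts_.end(); }

    const std::string& native() const { return native_; }
    kind type() const { return kind_; }

private:
    // Splits native_ on '/' (POSIX-style separator assumed).
    void split() {
        if (!native_.empty() && native_[0] == '/')
            parts_.push_back("/");  // root directory component
        std::string::size_type pos = 0;
        while (pos < native_.size()) {
            if (native_[pos] == '/') { ++pos; continue; }
            std::string::size_type next = native_.find('/', pos);
            if (next == std::string::npos) next = native_.size();
            parts_.push_back(native_.substr(pos, next - pos));
            pos = next;
        }
        if (parts_.size() > 1)
            kind_ = kind::composite;
        else if (!parts_.empty() && parts_.front() == "/")
            kind_ = kind::root_dir;
        else
            kind_ = kind::filename;
    }

    std::string native_;
    kind kind_ = kind::filename;
    std::list<std::string> parts_;
};

// Equal iterators dereference to the very same stored object, which is
// the property the forward iterator requirements ask for.
inline void check_reference_stability(const toy_path& p) {
    if (p.begin() == p.end()) return;
    toy_path::iterator a = p.begin(), b = p.begin();
    assert(a == b && &*a == &*b);
}
```

With this sketch, iterating toy_path("/usr/lib/gcc") visits "/", "usr", "lib"
and "gcc", and each reference stays valid for as long as the path itself does.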
I find it a bit scary that this wart in the standard (the iterator
concepts mix traversal and access properties) has such an impact on the
design of the rest of the library. We might still have kept a vector with
the indices of the '/' or something, but having never looked at the FS
proposals I was expecting iterators to return something similar to a
string_view. Now I agree you have little choice with the current wording
(I didn't check the status of the LWG issue about iterators returning
references to themselves but you nicely added a reminder of what the
conclusion was :-).
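For comparison, the alternative I had in mind can be sketched roughly as
follows. Everything here is hypothetical: indexed_path and component_view are
made-up names, component_view is only a stand-in for something string_view-like,
and I store the offsets of each component rather than the indices of the '/'
characters. The point is that operator* returns a small view by value, which is
exactly what the current forward iterator wording does not allow.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a string_view-like type: non-owning pointer
// plus length.
struct component_view {
    const char* data;
    std::size_t size;
    std::string str() const { return std::string(data, size); }
};

// Hypothetical path that keeps only the native string and component offsets.
class indexed_path {
public:
    class iterator {
    public:
        iterator(const indexed_path* p, std::size_t i) : path_(p), index_(i) {}

        // Returns a view *by value*; no per-component object is stored.
        component_view operator*() const {
            assert(index_ < path_->spans_.size());
            const std::pair<std::size_t, std::size_t>& s = path_->spans_[index_];
            return { path_->native_.data() + s.first, s.second - s.first };
        }
        iterator& operator++() { ++index_; return *this; }
        bool operator==(const iterator& o) const {
            return path_ == o.path_ && index_ == o.index_;
        }
        bool operator!=(const iterator& o) const { return !(*this == o); }

    private:
        const indexed_path* path_;
        std::size_t index_;
    };

    explicit indexed_path(std::string native) : native_(std::move(native)) {
        // Record the [begin, end) offsets of every non-empty component.
        std::size_t pos = 0;
        while (pos < native_.size()) {
            std::size_t next = native_.find('/', pos);
            if (next == std::string::npos) next = native_.size();
            if (next > pos) spans_.emplace_back(pos, next);
            pos = next + 1;
        }
    }

    iterator begin() const { return iterator(this, 0); }
    iterator end() const { return iterator(this, spans_.size()); }
    const std::string& native() const { return native_; }

private:
    std::string native_;
    std::vector<std::pair<std::size_t, std::size_t>> spans_;
};
```

Since the vector holds offsets rather than pointers, copying such a path needs
no fix-up, and the views always point into the native string of the path being
iterated.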
* It might be possible to optimize path by lazily populating the
std::list, so that copying paths and passing them around by value
just copies the basic_string containing the native path, and it only
gets parsed to find the individual components as needed.
Or we could parse eagerly and not need to store the full string, but
that's probably less efficient if we are going to need the string soon to
pass it to the system.
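A rough sketch of the lazy variant, to make the trade-off visible. lazy_path
and its members are hypothetical names, and the cache is a plain mutable
member, so this version is not safe if several threads call components() on
the same object at once.

```cpp
#include <cassert>
#include <list>
#include <string>
#include <utility>

// Hypothetical path that stores the native string and parses on demand.
class lazy_path {
public:
    explicit lazy_path(std::string native) : native_(std::move(native)) {}

    // Ready to hand to a system call without any parsing.
    const char* c_str() const { return native_.c_str(); }

    // Components are computed on the first request and cached afterwards.
    const std::list<std::string>& components() const {
        if (!parsed_) {
            parse();
            parsed_ = true;
        }
        return parts_;
    }

    // Exposed only so the laziness can be observed in this sketch.
    bool parsed() const { return parsed_; }

private:
    void parse() const {
        assert(parts_.empty());
        std::string::size_type pos = 0;
        while (pos < native_.size()) {
            std::string::size_type next = native_.find('/', pos);
            if (next == std::string::npos) next = native_.size();
            if (next > pos) parts_.push_back(native_.substr(pos, next - pos));
            pos = next + 1;
        }
    }

    std::string native_;
    mutable std::list<std::string> parts_;  // empty until first use
    mutable bool parsed_ = false;
};
```

Copying a lazy_path that has never been iterated copies only the string and an
empty list. The eager alternative would pay for the parse in every constructor,
whether or not the components are ever looked at.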
* directory_iterator holds a shared_ptr<_Dir> where _Dir is a pimpl
class containing a DIR* returned by opendir(), a path object
containing the path the dir was opened with, and a directory_entry
object that gets returned by dereferencing the iterator. It also
contains a file_type enumeration, which gets used on GNU and BSD
platforms where the dirent struct contains the file type, which
means no stat() system call is needed to find out whether the
current entry is a directory and should be recursed into.
* recursive_directory_iterator holds a shared_ptr to a
stack<pair<_Dir, directory_iterator>> representing each directory
recursed into and the position within that directory. The
shared_ptr<_Dir>s belonging to the directory_iterator objects in the
stack alias the shared_ptr held by the parent
recursive_directory_iterator, so the reference counts are shared by
the whole stack.
I don't understand the last sentence of this paragraph. I don't know what
a parent recursive_directory_iterator is, and from what I understand each
directory_iterator in the stack iterates on a different directory so there
is nothing to share.
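One possible reading, and it is only a guess, is that "alias" refers to the
aliasing constructor of std::shared_ptr: each inner pointer would point at its
own directory object but share ownership, and hence the reference count, with
the pointer that owns the whole stack. A sketch of that mechanism, with
hypothetical names (dir_state standing in for _Dir, dir_stack, alias_entry)
and a std::deque in place of the stack so that elements can be addressed by
index:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <string>

// Hypothetical stand-in for the _Dir pimpl class.
struct dir_state {
    std::string path;
};

// Hypothetical container owned by the recursive iterator. A deque keeps
// references to existing elements valid when pushing or popping at the ends.
using dir_stack = std::deque<dir_state>;

// Builds a shared_ptr<dir_state> that points at (*owner)[i] but shares
// ownership, and therefore the reference count, with `owner`.
inline std::shared_ptr<dir_state>
alias_entry(const std::shared_ptr<dir_stack>& owner, std::size_t i) {
    assert(owner && i < owner->size());
    return std::shared_ptr<dir_state>(owner, &(*owner)[i]);  // aliasing ctor
}
```

Under that reading, each directory_iterator does refer to a different
directory, but the stack as a whole is only destroyed once the last of these
pointers goes away.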
--
Marc Glisse