[Bug libstdc++/90281] New: utf-8 encoded std::filesystem::path can not be converted to utf-16.

ssh at pobox dot com gcc-bugzilla@gcc.gnu.org
Mon Apr 29 15:14:00 GMT 2019


https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90281

            Bug ID: 90281
           Summary: utf-8 encoded std::filesystem::path can not be
                    converted to utf-16.
           Product: gcc
           Version: 8.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: ssh at pobox dot com
  Target Milestone: ---

While testing a std::filesystem-compatible backport for C++11/C++14, I found
the following issue when running my checks against GCC's C++17
std::filesystem implementation. The following example

#include <filesystem>

int main()
{
    // "\xf0\x9d\x84\x9e" is U+1D11E encoded as UTF-8; u16string() throws here.
    auto p = std::filesystem::u8path("\xf0\x9d\x84\x9e").u16string();
    return 0;
}

fails with:

terminate called after throwing an instance of
'std::filesystem::__cxx11::filesystem_error'
  what():  filesystem error: Cannot convert character sequence: Invalid or
incomplete multibyte or wide character

Tested with GCC 8.1.0 and 8.2.0 on Ubuntu 18.04 and macOS, and GCC 8.3.0 on
Wandbox.

The UTF-8 sequence is the musical symbol G clef (U+1D11E), and on both systems
I can create a file with that name, e.g. from the shell. The same exception is
thrown when iterating over the directory and calling u16string() on the
filename of the directory_entry, but the example code above is simpler.
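
For completeness, a minimal sketch of the directory-iteration variant
described above. The helper name count_unconvertible_names is hypothetical,
and the sketch assumes the failure surfaces as a filesystem_error, as in the
output shown earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: counts the entries of `dir` whose filename cannot be
// converted to UTF-16 via path::u16string().
std::size_t count_unconvertible_names(const fs::path& dir)
{
    std::size_t failures = 0;
    for (const fs::directory_entry& entry : fs::directory_iterator(dir)) {
        try {
            // Only the conversion matters here; the result is discarded.
            (void)entry.path().filename().u16string();
        } catch (const fs::filesystem_error&) {
            ++failures;  // the conversion was rejected for this name
        }
    }
    return failures;
}
```

With GCC 8 this has to be linked with -lstdc++fs. On an affected libstdc++,
a directory containing the G clef file would be expected to yield a non-zero
count.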

(Just as additional info: \xf0\x9d\x84\x9e is the correct UTF-8 encoding for
U+1D11E, and this works with clang and its libc++/libc++fs.)
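
As a cross-check of the encoding claim, here is a small sketch that decodes
the four bytes by hand and derives the UTF-16 surrogate pair a successful
conversion should produce. The names SurrogatePair, decode_utf8_4 and
to_surrogates are hypothetical helpers for illustration, not part of
libstdc++, and no input validation is attempted.

```cpp
#include <cassert>

// Hypothetical helper type: one UTF-16 surrogate pair.
struct SurrogatePair {
    char16_t high;
    char16_t low;
};

// Decode one 4-byte UTF-8 sequence (lead byte 11110xxx followed by three
// 10xxxxxx continuation bytes). The caller must pass a well-formed sequence.
constexpr char32_t decode_utf8_4(unsigned char b0, unsigned char b1,
                                 unsigned char b2, unsigned char b3)
{
    return (char32_t(b0 & 0x07) << 18) | (char32_t(b1 & 0x3F) << 12)
         | (char32_t(b2 & 0x3F) << 6)  |  char32_t(b3 & 0x3F);
}

// Split a supplementary-plane code point (U+10000..U+10FFFF) into its
// UTF-16 high and low surrogates.
constexpr SurrogatePair to_surrogates(char32_t cp)
{
    return { char16_t(0xD800 + ((cp - 0x10000) >> 10)),
             char16_t(0xDC00 + ((cp - 0x10000) & 0x3FF)) };
}

// Compile-time checks for the sequence from the report.
static_assert(decode_utf8_4(0xF0, 0x9D, 0x84, 0x9E) == 0x1D11E,
              "bytes decode to U+1D11E");
static_assert(to_surrogates(0x1D11E).high == 0xD834, "high surrogate");
static_assert(to_surrogates(0x1D11E).low == 0xDD1E, "low surrogate");
```

So the expected result of u16string() for this path is the two code units
0xD834 0xDD1E.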

