This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Proposal to patch libiberty.a for controlling whether pathnames on Windows are converted to lower case


This is a proposed patch to libiberty, with an accompanying patch to GCC.
The purpose of this patch is to make it possible for Windows-hosted toolchains to have the ability to control whether Canonicalized filenames are converted to all lower-case.
Most Windows users are not affected by this behavior.
However, we have at least one customer who does use Windows systems where their hard drives are case-sensitive.  Thus, they desire this ability.

The first implementation of Windows support in libiberty/lrealpath.c was added back in 2004.  The new code included a call to GetFullPathName(), to Canonicalize the filename.  Next,  if there was sufficient room in the buffer, the following code was run:

        /* The file system is case-preserving but case-insensitive,
           Canonicalize to lowercase, using the codepage associated
           with the process locale.  */
        CharLowerBuff (buf, len);
        return strdup (buf);

In effect, the assumption of the code is that all Windows file systems will be case-preserving but case-insensitive, so converting a filename to lowercase is not a problem.  And tools would always find the resulting file on the file system.  That turns out not to be true, but lrealpath() was mostly used just for system header files, so no one noticed.

However, in April 2014, libcpp was patched, to cause even non-system headers on Windows systems to be Canonicalized.  This patch has caused problems for users that have case-sensitive file systems on their Windows systems.  A patch to libcpp was proposed to do additional Canonicalize non-system headers on Windows systems. The discussion on the patch starts at https://gcc.gnu.org/ml/gcc-patches/2014-04/msg00009.html
As is noted in the comments:
      For DOS based file system, we always try to shorten non-system headers, as DOS has a tighter constraint on max path length. The okay to add the patch was given May 9, 2014 at https://gcc.gnu.org/ml/gcc-patches/2014-05/msg00557.html , with the note:       I've spoke with Kai a bit about this and he thinks it's appropriate and desirable to shorten paths on these kinds of filesystems.

The libcpp change meant that lrealpath() began to get called for non-system header files.  There is other code in both bfd and libiberty that can reach the lrealpath() function, but apparently those functions are not as much of a problem, as in not really being used in most tools (obviously since our customer had no complaints before 2014).

What I am proposing is to modify lrealpath() to only call CharLowerBuff when the user has a filesystem where changing the case of the filename is not an issue.
That is actually most users, so the default should be to call CharLowerBuff.
But I want to give the user the ability to control that behavior.

My proposal is to change that section of code in libiberty/lrealpath.c as follows:

     else
       {
-       /* The file system is case-preserving but case-insensitive,
-          Canonicalize to lowercase, using the codepage associated
+       /* The file system is normally case-preserving but case-insensitive,
+          Canonicalize to lowercase if desired, using the codepage associated
           with the process locale.  */
-        CharLowerBuff (buf, len);
+       if (libiberty_lowerpath)
+          CharLowerBuff (buf, len);
         return strdup (buf);
       }

In effect, this just adds a control to let the user decide whether or not a pathname is converted to lowercase on Windows systems.

I also added a global definition for that control at the start of the libiberty/lrealpath.c source, setting it so the default behavior on Windows is to convert pathnames to lowercase:

+/* External option to control whether a pathname is converted
+   to all lower-case on Windows systems.  The default behavior
+   is to convert.
+*/
+#if defined (_WIN32)
+#ifdef __cplusplus
+extern "C" {
+#endif
+unsigned char libiberty_lowerpath = 1;
+#ifdef __cplusplus
+}
+#endif
+#endif

And, for use by tools that link to libiberty.a, I added an external reference to that control in include/libiberty.h:

+#if defined (_WIN32)
+/* Determine if filenames should be converted to lower case */
+extern unsigned char libiberty_lowerpath;
+#endif


Adding the above code to include/libiberty.h and libiberty/lrealpath.c results in a libiberty.a that behaves exactly the same as it does today for most users. It also makes available a method of affecting whether or not a given tool affects canonicalized pathnames on Windows by also converting those pathnames to lowercase.

The questions to discuss:
1.    Is this a reasonable solution to this problem, by adding such a controlling symbol? 2.    Is it reasonable to use ‘libiberty_lowerpath’ as the symbol’s name?  There are other global symbols defined in libiberty that use a similar name, so this seems reasonable. 3.    Should the symbol be called ‘libiberty_lowerpath’ or something else, such as ‘libiberty_lowerfilename’?  Does that matter? 4.    While the ‘CharLowerBuff()’ code is within a _WIN32 section of code, should the global symbol also be defined so it only exists in libiberty.a for Windows hosts?  Or should that global symbol always be defined in libiberty.a, regardless of whether it is for Windows or Linux?  Obviously that symbol is only meaningful for Windows hosts.  But the existence (or non-existence)  of the symbol obviously affects how tools write their code to modify that symbol.  As presently written, tools would only modify this symbol when being built for running on Windows.

Depending on the answers to the above, a followup can show the proposed changes to tools, to add options to allow control of changing filenames to lower case on Windows.

If the above change (or something similar) is done to libiberty, then GCC should also be changed to add an option for accessing this variable.  I would propose adding the '-flowerpath' option, which would be the default.  That also adds '-fno-lowerpath', which is what users would use to keep their filenames from being converted to lower-case.  Most users, of course, could just ignore this option.  And it would only be meaningful for Windows-hosted tools.

Here are the patches that would implement the above:

diff -raup a/include/libiberty.h b/include/libiberty.h
--- a/include/libiberty.h    2019-08-07 08:16:05.269694038 -0700
+++ b/include/libiberty.h    2019-08-07 09:02:05.771381502 -0700
@@ -750,6 +750,11 @@ extern unsigned long libiberty_len;
    (char *) memcpy (libiberty_nptr, libiberty_optr, libiberty_len))
 #endif

+#if defined (_WIN32)
+/* Determine if filenames should be converted to lower case */
+extern unsigned char libiberty_lowerpath;
+#endif
+
 #ifdef __cplusplus
 }
 #endif
diff -raup a/libiberty/lrealpath.c b/libiberty/lrealpath.c
--- a/libiberty/lrealpath.c    2019-08-07 08:16:05.957694774 -0700
+++ b/libiberty/lrealpath.c    2019-08-07 09:03:03.271408306 -0700
@@ -72,6 +72,20 @@ extern char *canonicalize_file_name (con
 # endif
 #endif

+/* External option to control whether a pathname is converted
+   to all lower-case on Windows systems.  The default behavior
+   is to convert.
+*/
+#if defined (_WIN32)
+#ifdef __cplusplus
+extern "C" {
+#endif
+unsigned char libiberty_lowerpath = 1;
+#ifdef __cplusplus
+}
+#endif
+#endif
+
 char *
 lrealpath (const char *filename)
 {
@@ -159,10 +173,11 @@ lrealpath (const char *filename)
       return strdup (filename);
     else
       {
-    /* The file system is case-preserving but case-insensitive,
-       Canonicalize to lowercase, using the codepage associated
+    /* The file system is normally case-preserving but case-insensitive,
+       Canonicalize to lowercase if desired, using the codepage associated
        with the process locale.  */
-        CharLowerBuff (buf, len);
+    if (libiberty_lowerpath)
+          CharLowerBuff (buf, len);
         return strdup (buf);
       }
   }
diff -raup a/gcc/common.opt b/gcc/common.opt
--- a/gcc/common.opt    2019-08-05 09:42:54.889054582 -0700
+++ b/gcc/common.opt    2019-08-05 10:09:01.708902089 -0700
@@ -1543,6 +1543,10 @@ floop-nest-optimize
 Common Report Var(flag_loop_nest_optimize) Optimization
 Enable the loop nest optimizer.

+flowerpath
+Common Report
+Convert pathnames to all lowercase on Windows (ignore elsewhere).
+
 fstrict-volatile-bitfields
 Common Report Var(flag_strict_volatile_bitfields) Init(-1) Optimization
 Force bitfield accesses to match their type width.
diff -raup a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
--- a/gcc/doc/invoke.texi    2019-08-05 09:42:55.213053535 -0700
+++ b/gcc/doc/invoke.texi    2019-08-05 10:27:53.749415626 -0700
@@ -174,7 +174,8 @@ in the following sections.
 -pass-exit-codes  -pipe  -specs=@var{file}  -wrapper  @gol
 @@@var{file}  -ffile-prefix-map=@var{old}=@var{new}  @gol
 -fplugin=@var{file}  -fplugin-arg-@var{name}=@var{arg}  @gol
--fdump-ada-spec@r{[}-slim@r{]}  -fada-spec-parent=@var{unit} -fdump-go-spec=@var{file}}
+-fdump-ada-spec@r{[}-slim@r{]}  -fada-spec-parent=@var{unit} @gol
+-fdump-go-spec=@var{file} -fno-lowerpath}

 @item C Language Options
 @xref{C Dialect Options,,Options Controlling C Dialect}.
diff -raup a/gcc/opts.c b/gcc/opts.c
--- a/gcc/opts.c    2019-08-05 09:42:55.385052980 -0700
+++ b/gcc/opts.c    2019-08-05 10:09:01.724902054 -0700
@@ -2412,6 +2412,12 @@ common_handle_option (struct gcc_options
       opts->x_flag_stack_usage_info = value != 0;
       break;

+    case OPT_flowerpath:
+#ifdef __WIN32
+      libiberty_lowerpath = value;
+#endif
+      break;
+
     case OPT_g:
       set_debug_level (NO_DEBUG, DEFAULT_GDB_EXTENSIONS, arg, opts, opts_set,
                        loc);





Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]