Bug 111527 - COLLECT_GCC_OPTIONS option hits single-variable limits too early
Status: NEW
Alias: None
Product: gcc
Classification: Unclassified
Component: driver
Version: 14.0
Importance: P3 normal
Target Milestone: ---
Assignee: Not yet assigned to anyone
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-09-21 19:06 UTC by Sergei Trofimovich
Modified: 2024-07-02 03:31 UTC
CC List: 3 users

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed: 2024-05-18 00:00:00


Attachments
workaround for gcc driver long argument list error (2.60 KB, patch)
2024-05-06 11:39 UTC, Deepthi H
With_Files_Workaround_for_GCC_ArgumentLongList (1.60 KB, patch)
2024-05-15 10:58 UTC, Deepthi H

Description Sergei Trofimovich 2023-09-21 19:06:00 UTC
TL;DR:

  Linux allows you to pass up to 2MB of command-line arguments to a program, limiting each individual argument or environment string to 128KB. Unfortunately `gcc` collects all the options into a single `COLLECT_GCC_OPTIONS` environment variable and hits the smaller 128KB limit.

  This causes periodic problems for distributions that install each individual package into a separate directory. One of them is `NixOS`.

  Could `gcc` be amended not to use a single limiting `COLLECT_GCC_OPTIONS` variable and instead use, say, direct arguments to pass options around? Or even use a response file if it hits `E2BIG`?

  Fake reproducer: 2 huge variables that programs normally accept, but not `gcc`:

  $ big_100k_var=$(printf "%0*d" 100000 0)

  # this works: 200KB of options for `printf` external command
  $ $(which printf) "%s %s" $big_100k_var $big_100k_var >/dev/null; echo $?
  0

  # this fails: 200KB of options for `gcc`, fails in `cc1`
  $ touch a.c; gcc -c a.c -DA=$big_100k_var -DB=$big_100k_var
  gcc: fatal error: cannot execute 'cc1': execv: Argument list too long
  compilation terminated.
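
A minimal sketch of why the reproducer fails even though each option is individually under the limit. The helper name `collect_options_len` is hypothetical, and the quoting scheme (each option in single quotes, separated by one space) is an assumption based on how `gcc -v` prints the variable; the real driver also escapes embedded quotes, which is ignored here.

```sh
# Sketch only: approximates the length of the joined options string.
# Assumption: every option is wrapped in single quotes and options are
# separated by one space; ASCII-only options are assumed.
collect_options_len() {
    total=0
    for opt in "$@"; do
        # two quote characters plus the option itself
        total=$((total + ${#opt} + 2))
    done
    # one separating space between consecutive options
    if [ "$#" -gt 1 ]; then
        total=$((total + $# - 1))
    fi
    echo "$total"
}

# Assumed per-string limit: 32 pages of 4 KiB each.
limit=131072
big=$(printf '%0*d' 100000 0)

echo "one -D option:  $(collect_options_len "-DA=$big") bytes"
echo "joined options: $(collect_options_len -c "-DA=$big" "-DB=$big") bytes (limit $limit)"
```

Each `-D` option alone stays below the assumed limit, while the joined string that ends up in one environment variable does not.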

  Thanks!

More words

In the `nixpkgs` repository each package gets installed into its own unique directory:

    /nix/store/hash1-foo/include/foo.h
    /nix/store/hash2-bar/include/bar.h
    ...

If any of those headers use the `__FILE__` macro in static inline functions, the full path to the header file usually gets embedded into the final binaries as is.

I wanted to remap all those paths into a form that does not contain hashes:

    -fmacro-prefix-map=/nix/store/hash1-foo/include/foo.h=/nix/store/00000000-foo/include/foo.h
    -fmacro-prefix-map=/nix/store/hash2-bar/include/bar.h=/nix/store/00000000-bar/include/bar.h
    ...

When I got to packages that use about 100 include directories I started getting errors from `cc1`:

```
Command line: `gcc -m64 -mcx16 $remap_options /build/qemu-8.1.0/build/meson-private/tmpbyikv8nc/testfile.c \
  -o /build/qemu-8.1.0/build/meson-private/tmpbyikv8nc/output.exe -D_FILE_OFFSET_BITS=64 \
  -O0 -Wl,--start-group -laio -Wl,--end-group -Wl,--allow-shlib-undefined` -> 1
stderr:
gcc: fatal error: cannot execute 'cc1': execv: Argument list too long
compilation terminated.
```

The limit felt too low and I found the `COLLECT_GCC_OPTIONS` variable.
Comment 1 Richard Biener 2023-09-22 05:22:37 UTC
Hm, but the COLLECT_GCC_OPTIONS variable is only used for communicating between the driver and the linker, the options therein are individually passed to the program execved?

You are maybe looking for the -f*-map options to take a file as input containing multiple mappings?
Comment 2 Sergei Trofimovich 2023-09-22 08:15:13 UTC
(In reply to Richard Biener from comment #1)
> Hm, but the COLLECT_GCC_OPTIONS variable is only used for communicating
> between the driver and the linker, the options therein are individually
> passed to the program execved?

AFAIU the driver sets the `COLLECT_GCC_OPTIONS` variable and never unsets it. As a result it affects all the `execve()` calls, be it `cc1`, `as` or anything else, regardless of whether the program uses the variable or not. `cc1` is probably the first casualty.

As a simplistic example, here we break `ls` with a too-large environment variable:

    $ COLLECT_GCC_OPTIONS=$(printf "%0*d" 200000 0) ls
    -bash: /run/current-system/sw/bin/ls: Argument list too long

> You are maybe looking for the -f*-map options to take a file as input
> containing multiple mappings?

`NixOS` is also occasionally hitting the same limit by passing too many include and library paths:

    -I/nix/store/hash1-foo/include
    -I/nix/store/hash2-bar/include
    ...
    -L/nix/store/hash1-foo/lib
    -L/nix/store/hash2-bar/lib
    ...
    -Wl,-rpath,/nix/store/hash1-foo/lib
    -Wl,-rpath,/nix/store/hash2-bar/lib

I wonder if we could solve all of these limitations here by at least avoiding `COLLECT_GCC_OPTIONS`.

Otherwise, if a generic fix is too invasive a change, then passing a mapping file should do as well.

What would be an acceptable form of the file? A new option, like
    -fmacro-prefix-map-file=./foo
with entries of exactly the same form
    $ cat foo
    /nix/store/hash1-foo=/nix/store/00000000-foo
    /nix/store/hash2-bar=/nix/store/00000000-bar
    ...
    
Or maybe reuse existing -fmacro-prefix-map= and use response-style file input? Like -fmacro-prefix-map=@./foo.
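
To make the proposed file format concrete, here is a small sketch of how each line of such a file would correspond to one `-fmacro-prefix-map=` option that exists today. The helper name `prefix_map_options` is hypothetical, and the file format is only the proposal above, not an implemented GCC feature.

```sh
# Hypothetical helper: reads a mapping file with one "old=new" entry per
# line and prints the equivalent -fmacro-prefix-map=old=new options.
prefix_map_options() {
    while IFS= read -r line || [ -n "$line" ]; do
        # skip empty lines
        if [ -z "$line" ]; then
            continue
        fi
        printf '%s\n' "-fmacro-prefix-map=$line"
    done < "$1"
}
```

Calling `prefix_map_options foo` on the example file prints one option per entry. Note that expanding the file in the driver this way would still run into the same limit, so the expansion would have to happen in the program that consumes the option.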

clang would probably need the same handling if we were to extend the driver.
Comment 3 Deepthi H 2024-03-16 09:52:48 UTC
I have been investigating this issue further. Hence checking the source code and debugging the gcc sources. However, I wasn't able to find where the COLLECT_GCC_OPTION has been set to 128kb

I couldn't find it being set in gcc. Can you please let us know how can we increase the limit of collect options?
Comment 4 Deepthi H 2024-03-16 09:53:09 UTC
I have been investigating this issue further. Hence checking the source code and debugging the gcc sources. However, I wasn't able to find where the COLLECT_GCC_OPTION has been set to 128kb

I couldn't find it being set in gcc. Can you please let us know how can we increase the limit of collect options?
Comment 5 Sergei Trofimovich 2024-03-16 20:18:21 UTC
(In reply to Deepthi H from comment #4)
> I have been investigating this issue further. Hence checking the source code
> and debugging the gcc sources. However, I wasn't able to find where the
> COLLECT_GCC_OPTION has been set to 128kb
> 
> I couldn't find it being set in gcc. Can you please let us know how can we
> increase the limit of collect options?

The 128K limit on a single environment variable is a Linux kernel limitation set by this define in include/uapi/linux/binfmts.h:

    #define MAX_ARG_STRLEN (PAGE_SIZE * 32)

https://trofi.github.io/posts/299-maximum-argument-count-on-linux-and-in-gcc.html has more words on that.
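
As a worked example of the define above, the limit follows directly from the page size. The function name `max_arg_strlen` is made up for illustration; `getconf PAGESIZE` is used as the default when no page size is given.

```sh
# MAX_ARG_STRLEN is defined as PAGE_SIZE * 32.
max_arg_strlen() {
    # $1: page size in bytes; defaults to what getconf reports
    page_size=${1:-$(getconf PAGESIZE)}
    echo $((page_size * 32))
}

max_arg_strlen 4096    # 4 KiB pages: 131072 bytes, i.e. 128 KiB
max_arg_strlen 65536   # 64 KiB pages: 2097152 bytes, i.e. 2 MiB
```

So the 128K figure is specific to systems with 4 KiB pages; the limit is not set anywhere in gcc itself.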
Comment 6 Deepthi H 2024-05-06 11:39:18 UTC
Created attachment 58107 [details]
workaround for gcc driver long argument list error
Comment 7 Deepthi H 2024-05-06 11:42:24 UTC
We have a solution for this issue.

When gcc/g++ is called using the @response-file.rsp syntax, gcc should forward the argument to its subprocesses. Previously the files were expanded, which could lead to excessively long argument lists and 'cc1: execv: Argument list too long' errors.

In particular, CMake and Ninja tend to create a lot of '-isystem' include directories, which requires allowing the forwarding in spec files by using %@{i*}.

In the xputenv function, if an environment variable (i.e. COLLECT_GCC_OPTIONS) is larger than 128kb, we split it into chunks of 128kb each.

GCC passes the entire command line, including expanded @rsp-files, to collect2 in the environment variable COLLECT_GCC_OPTIONS. This can exceed the kernel's limit on the length of a single environment variable. In this workaround, environment variables longer than 128kb are split into multiple variables and stitched back together in collect2.
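
The split-and-stitch idea can be sketched as follows. This is not the code from the attached patch; the helper names `split_env` and `join_env` are hypothetical, the `NAME_0`, `NAME_1`, ..., `NAME_COUNT` naming simply mirrors the `-v` output shown in this comment, and the value is assumed to contain no newline characters.

```sh
# split_env NAME VALUE CHUNK_SIZE
# Prints NAME_0=..., NAME_1=..., and finally NAME_COUNT=<number of chunks>.
# A real implementation would also budget for the "NAME_i=" prefix.
split_env() {
    name=$1
    count=0
    # fold -b breaks the value every CHUNK_SIZE bytes, one chunk per line
    while IFS= read -r chunk || [ -n "$chunk" ]; do
        printf '%s_%d=%s\n' "$name" "$count" "$chunk"
        count=$((count + 1))
    done <<EOF
$(printf '%s' "$2" | fold -b -w "$3")
EOF
    printf '%s_COUNT=%d\n' "$name" "$count"
}

# join_env NAME
# Reads the lines produced by split_env on stdin and prints the original value.
join_env() {
    name=$1
    result=
    while IFS= read -r line; do
        case $line in
            "${name}_COUNT="*) ;;                     # bookkeeping line, skip
            "${name}_"*) result=$result${line#*=} ;;  # append chunk in order
        esac
    done
    printf '%s\n' "$result"
}
```

For example, `split_env COLLECT_GCC_OPTIONS "$value" 131072 | join_env COLLECT_GCC_OPTIONS` should give back `$value` unchanged.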
 
The patch is attached here.
The output of the patched compiler for the example given in the description is below:
 
=======
sunild@BFT-LPT-I-051:~$ $GCC_PATH/gcc -c a.c -DA=$big_100k_var -DB=$big_100k_var -v
Using built-in specs.
COLLECT_GCC=/home/sunild/GCC_Driver/bin/home/sunild/GCC_Driver/build/bin/gcc
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/home/sunild/GCC_Driver/build --enable-languages=c --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 14.0.1 20240426 (experimental) (GCC)
COLLECT_GCC_OPTIONS_0='-c' '-D' 'A=00000000000000000000000000000000000000000000000000000000
COLLECT_GCC_OPTIONS_1=000000000000000000000000000000000000000000000000000000000000000000000
COLLECT_GCC_OPTIONS_COUNT=2
 
=======
 
Let us know your comments on this solution. Such a solution is acceptable to change the gcc driver?
Comment 8 Andrew Pinski 2024-05-06 16:02:43 UTC
(In reply to Deepthi H from comment #7)
>  
> Let us know your comments on this solution. Such a solution is acceptable to
> change the gcc driver?

Seems better to place the arguments in a file instead and just pass around that file name instead of passing all arguments via an env variable.
Comment 9 Deepthi H 2024-05-15 10:58:41 UTC
Created attachment 58212 [details]
With_Files_Workaround_for_GCC_ArgumentLongList
Comment 10 Deepthi H 2024-05-15 11:00:11 UTC
As suggested, we have updated the patch to place the arguments in a file instead of passing them in an env variable.
When COLLECT_GCC_OPTIONS is smaller than 128kb the env variable is used; when it is larger than 128kb the arguments are moved to a file.

In collect2, we read the arguments back from the file into a buffer and pass them on to the linker.
 
See the updated patch : With_Files_Workaround_for_GCC_driver_LongArgumentList.patch
 
The output of the example program from the description section is below:
=============================================================
$ big_100k_var=$(printf "%0*d" 100000 0)
$ <<GCC_PATH_With_Fix>>/gcc -c a.c -DA=$big_100k_var -DB=$big_100k_var -v
Using built-in specs.
COLLECT_GCC=/home/sunild/Gcc_Driver_with_FILE/install/home/Gcc_Driver_with_FILE/build/bin/gcc
Target: x86_64-pc-linux-gnu
Configured with: ../gcc/configure --prefix=/home/Gcc_Driver_with_FILE/build --enable-languages=c --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 15.0.0 20240513 (experimental) (GCC)
/home/Gcc_Driver_with_FILE/install/home/Gcc_Driver_with_FILE/build/bin/../libexec/gcc/x86_64-pc-linux-gnu/15.0.0/cc1 -quiet -v -imultiarch x86_64-linux-gnu -iprefix /home/Gcc_Driver_with_FILE/install/home/Gcc_Driver_with_FILE/build/bin/../lib/gcc/x86_64-pc-linux-gnu/15.0.0/ -D A=00000000000000000000000000000000000000000000000
....
....
<<trimmed output>>
....
....
0 -D B=000000000000000000000000000...
....
....
<<trimmed output>>
....
....
00000000000 a.c -quiet -dumpbase a.c -dumpbase-ext .c -mtune=generic -march=x86-64 -version -o /tmp/cc7Ox5mH.s
=============================================================
 
Please let us know if this solution is ok.
Comment 11 Andrew Pinski 2024-05-18 21:33:37 UTC
Note using "/tmp/rsp.txt" is not acceptable at all since there might be several links happening on the machine at the same time. You should use make_at_file from gcc.cc instead (it makes either a tmp file which is deleted at the end of compilation, or an .args.N file which can be looked at if -save-temps is used), then pass the filename to collect2 as "@filename" and expand it inside collect2 as needed.
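
The flow suggested here can be sketched in shell. This is not GCC's make_at_file, only an illustration of the idea; the helper names `write_at_file` and `expand_at_arg` are hypothetical, and the sketch stores one option per line, ignoring the quoting rules of real response files.

```sh
# write_at_file OPTION...
# Stores the options in a uniquely named temporary file and prints "@<file>".
write_at_file() {
    # mktemp yields a unique name, so parallel links do not clash the way
    # a fixed /tmp/rsp.txt would
    at_file=$(mktemp) || return 1
    printf '%s\n' "$@" > "$at_file"
    printf '@%s\n' "$at_file"
}

# expand_at_arg ARG
# Prints the options stored in the file if ARG is "@<file>", else ARG itself.
expand_at_arg() {
    case $1 in
        @*) cat "${1#@}" ;;          # read the options back from the file
        *)  printf '%s\n' "$1" ;;    # ordinary argument, pass through
    esac
}
```

The caller remains responsible for removing the temporary file once the consumer has read it; this sketch does not handle that cleanup.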
Comment 12 Deepthi H 2024-07-01 10:30:46 UTC
As suggested, we made the necessary changes and submitted the updated patch to the gcc community. Please find the link below:
https://gcc.gnu.org/pipermail/gcc-patches/2024-July/656073.html
Comment 13 Deepthi H 2024-07-02 03:31:34 UTC
Test logs ::
 
:~/$ big_100k_var=$(printf "%0*d" 100000 0)
 
Host gcc binary (Without fix)::
 
:~/$ gcc a.c -DA=$big_100k_var -DB=$big_100k_var
gcc: fatal error: cannot execute ‘/usr/lib/gcc/x86_64-linux-gnu/11/cc1’: execv: Argument list too long
compilation terminated.
 
 
Gcc binary with fix included ::
:~$ ./gcc a.c -DA=$big_100k_var -DB=$big_100k_var
:~$ ls a.o*
a.out
 
Test Logs::
 
                === gcc Summary ===
 
# of expected passes            8
# of expected failures          2
 
PASS: gcc.dg/longcmd/pr111527-1.c (test for excess errors)
PASS: gcc.dg/longcmd/pr111527-1.c execution test  
 
PASS: gcc.dg/longcmd/pr111527-2.c (test for excess errors)
PASS: gcc.dg/longcmd/pr111527-2.c execution test
PASS: gcc.dg/longcmd/pr111527-2.c output-exists ./pr111527-2.exe
 
PASS: gcc.dg/longcmd/pr111527-3.c (test for excess errors)
PASS: gcc.dg/longcmd/pr111527-3.c execution test
PASS: gcc.dg/longcmd/pr111527-3.c output-exists ./pr111527-3.exe
 
XFAIL: *-*-*
xgcc: fatal error: cannot execute 'cc1': posix_spawn: Argument list too long
compilation terminated.
compiler exited with status 1
XFAIL: gcc.dg/longcmd/pr111527-4.c (test for excess errors)