Bug 113575 - [14 Regression] memory hog building insn-opinit.o (i686-linux-gnu -> riscv64-linux-gnu)
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: other
Version: 14.0
Importance: P3 normal
Target Milestone: 14.0
Assignee: Not yet assigned to anyone
URL:
Keywords: build, memory-hog
Depends on:
Blocks: 84402
Reported: 2024-01-24 07:59 UTC by Matthias Klose
Modified: 2024-03-04 04:29 UTC
CC List: 3 users

See Also:
Host: i686-linux-gnu
Target: riscv64-linux-gnu
Build:
Known to work: 13.2.1
Known to fail: 14.0
Last reconfirmed: 2024-01-24 00:00:00


Attachments
Tentative (1.18 KB, patch)
2024-01-24 19:30 UTC, Robin Dapp

Description Matthias Klose 2024-01-24 07:59:03 UTC
Seen with trunk 20240121 while building a riscv64-linux-gnu cross compiler on i686-linux-gnu:

cc1plus: out of memory allocating 65536 bytes after a total of 3543261184 bytes
make[5]: *** [Makefile:1198: insn-opinit.o] Error 1
make[5]: *** Waiting for unfinished jobs....

Configured with: -v
         --with-pkgversion='Debian 14-20240121-1'
         --with-bugurl='file:///usr/share/doc/gcc-14/README.Bugs'
         --enable-languages=c,ada,c++,go,d,fortran,objc,obj-c++,m2,rust
         --prefix=/usr
         --with-gcc-major-version-only
         --program-suffix=-14
         --enable-shared
         --enable-linker-build-id
         --libexecdir=/usr/libexec
         --without-included-gettext
         --enable-threads=posix
         --libdir=/usr/lib
         --enable-nls
         --with-sysroot=/
         --enable-clocale=gnu
         --enable-libstdcxx-debug
         --enable-libstdcxx-time=yes
         --with-default-libstdcxx-abi=new
         --enable-libstdcxx-backtrace
         --enable-gnu-unique-object
         --disable-libitm
         --disable-libquadmath
         --disable-libquadmath-support
         --enable-plugin
         --enable-default-pie
         --with-system-zlib
         --enable-libphobos-checking=release
         --without-target-system-zlib
         --enable-multiarch
         --disable-werror
         --disable-multilib
         --with-arch=rv64gc
         --with-abi=lp64d
         --enable-checking=yes
         --build=i686-linux-gnu
         --host=i686-linux-gnu
         --target=riscv64-linux-gnu
         --program-prefix=riscv64-linux-gnu-
         --includedir=/usr/riscv64-linux-gnu/include

Other cross builds on i686-linux-gnu, targeting amd64, arm64, s390x, ppc64el and armhf, built OK.
Comment 1 Andrew Pinski 2024-01-24 08:05:15 UTC
Which version is your host compiler?
Comment 2 Andrew Pinski 2024-01-24 08:06:10 UTC
riscv has a very large number of RTL patterns, which definitely does not help the size of insn-opinit.
Comment 3 Sam James 2024-01-24 08:08:44 UTC
Yeah, the split needs doing anyway, but it's especially bad on riscv.
Comment 4 Matthias Klose 2024-01-24 08:25:38 UTC
The host compiler is the same version, r14-8314-g29f931e39f2.
Comment 5 Robin Dapp 2024-01-24 08:53:24 UTC
Yes, this is a known issue, and it's due to our large number of patterns.  Unlike insn-emit, insn-opinit cannot be split that easily; it would probably need a tree-like approach or similar.
I wouldn't see this as a regression in the classical sense, as we simply have many more patterns because of the vector extension.
Is increasing the available memory an option in the meantime, or does this urgently require fixing?
Comment 6 Richard Biener 2024-01-24 09:00:52 UTC
The source isn't unreasonable; we should see what takes all the memory there.
Comment 7 Robin Dapp 2024-01-24 09:01:29 UTC
Ok, I'm going to check.
Comment 8 Richard Biener 2024-01-24 09:09:45 UTC
My host compiler (x86_64, older trunk) uses "just" 800MB.  3.5GB looks like a runaway?  What uarch is your i586 compiler targeting?
Comment 9 Richard Biener 2024-01-24 09:43:57 UTC
Ah, I can reproduce it: there are allocation spikes with an i586 compiler,
always during DF, which oddly enough do not happen with an x86_64 compiler.

The course of action would be to reduce the testcase to a single function
(I've started from preprocessed source from an x86_64->riscv cross tree with
-m32 added, so a bit of a "wrong" testcase).  Just keeping init_all_optabs
seems to reproduce it.

Reducing the number of stores to ena[] makes it eventually fit: cutting
it in half seems to work for me, but I still see a 1.5GB peak.

Oddly enough, a checking-enabled x86_64-hosted compiler shows similarly bad
performance.

IIRC DF issues with many adjacent stores are not unheard of.  Possibly
on x86_64 store-merging helps to avoid this.

That said, a "fix" could be to adjust the insn-opinit generator to
emit multiple init_all_optabs functions, doing 1000 at a time.
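
A minimal sketch of what such a generator change could look like, written in Python for brevity rather than in the generator's own language. Everything here is an assumption for illustration: the helper name `emit_split_init`, the `init_optabs_NN` function names, the simplified `bool *ena` parameter and the `HAVE_pattern_<i>` placeholders are hypothetical and are not taken from genopinit.

```python
# Hypothetical sketch: none of these names come from genopinit itself.
CHUNK_SIZE = 1000  # stores per generated function, as suggested above


def emit_split_init(num_patterns, chunk_size=CHUNK_SIZE):
    """Return C-like source text that fills ena[] in fixed-size chunks."""
    lines = []
    # Ceiling division: the last chunk may hold fewer than chunk_size stores.
    num_chunks = (num_patterns + chunk_size - 1) // chunk_size
    for chunk in range(num_chunks):
        lines.append(f"static void init_optabs_{chunk:02d} (bool *ena)")
        lines.append("{")
        start = chunk * chunk_size
        stop = min(start + chunk_size, num_patterns)
        for i in range(start, stop):
            # HAVE_pattern_<i> stands in for the real per-pattern condition.
            lines.append(f"  ena[{i}] = HAVE_pattern_{i};")
        lines.append("}")
        lines.append("")
    # The top-level function only dispatches, so it stays small.
    lines.append("void init_all_optabs (bool *ena)")
    lines.append("{")
    for chunk in range(num_chunks):
        lines.append(f"  init_optabs_{chunk:02d} (ena);")
    lines.append("}")
    return "\n".join(lines)
```

Calling `emit_split_init(2500)` would produce three helper functions (two with 1000 stores, one with 500) plus a small `init_all_optabs` that calls them in order, so no single function contains more than `chunk_size` adjacent stores.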
Comment 10 Richard Biener 2024-01-24 09:48:10 UTC
By the way, -fno-var-tracking also greatly improves compile time (but does nothing for memory use).  Compiling with -O1 reduces memory use to 300MB even when
var-tracking is enabled.  So an option might be to force building this
generated file with -O1 (I can hardly see anything in there that would
require more).  On the DF side this replaces LIVE with LR_IN IIRC.
It's also in line with us suggesting -O1 for (large) machine-generated code ...
Comment 11 Matthias Klose 2024-01-24 17:21:44 UTC
> Is increasing the available memory an option
> in the meantime or does this urgently require fixing?

There is a buffer of 500 MB, but the build is already using 3.5 GB.  Building without any parallelism would probably work, with some Makefile changes, but building the whole compiler sequentially because of that is not a good option.
Comment 12 Robin Dapp 2024-01-24 19:30:13 UTC
Created attachment 57209
Tentative

I tested the attached "fix".  On my machine, with a 13.2 host compiler, it reduced the build time for insn-opinit.cc from > 4 mins to < 2 mins and the memory usage from > 1G to roughly 600M.  I didn't observe 3.5G before, though.

For now I just went with an arbitrary threshold of 5000 patterns and splitting into 10 functions.  After testing on x86 and aarch64 I realized that both have < 3000 patterns, so right now it would only split riscv's init function.

Or should it rather be the other way around, i.e. splitting into fixed-size chunks (of 1000) instead?
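
To make the difference between the two schemes concrete, here is a small Python sketch that computes the per-function pattern counts each would produce. The function names, the exact threshold comparison (`<=`) and the way a remainder is distributed are assumptions for illustration and are not taken from the attached patch.

```python
def split_by_count(num_patterns, threshold=5000, num_functions=10):
    """Tentative scheme: above a threshold, split into a fixed number of functions."""
    if num_patterns <= threshold:  # assumed comparison; the patch may differ
        return [num_patterns]      # a single, unsplit function
    base, extra = divmod(num_patterns, num_functions)
    # Spread the remainder over the first `extra` functions.
    return [base + 1 if i < extra else base for i in range(num_functions)]


def split_by_size(num_patterns, chunk_size=1000):
    """Alternative scheme: fixed-size chunks; the last chunk may be shorter."""
    full, rest = divmod(num_patterns, chunk_size)
    return [chunk_size] * full + ([rest] if rest else [])
```

With a made-up count of 2800 patterns, `split_by_count` leaves one function of 2800 while `split_by_size` yields chunks of 1000, 1000 and 800. With a made-up count of 30000, the first scheme produces ten functions of 3000 patterns each, whereas the second keeps every function at 1000 patterns and simply emits more of them, so the per-function size stays bounded however many patterns a target adds.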
Comment 13 Richard Biener 2024-01-25 13:08:02 UTC
(In reply to Robin Dapp from comment #12)
> Or rather the other way, i.e. splitting into fixed-size chunks (of 1000)
> instead?

Yeah, I'd simplify it by doing exactly that.
Comment 14 Robin Dapp 2024-01-25 14:38:12 UTC
OK, I'm running tests with the adjusted version and will post a patch afterwards.

However, during a recent run, compiling insn-recog took 2G, and insn-emit-7 as well as insn-emit-10 required > 1.5G each.  It looks like they could cause problems as well, then?  The insn-emit files can be split into 20 instead of 10, which might help, but I haven't had a look at insn-recog yet.
Comment 15 GCC Commits 2024-01-26 21:44:38 UTC
The master branch has been updated by Robin Dapp <rdapp@gcc.gnu.org>:

https://gcc.gnu.org/g:861997a9c7088da25ed4dc0bd339060ca063514f

commit r14-8457-g861997a9c7088da25ed4dc0bd339060ca063514f
Author: Robin Dapp <rdapp@ventanamicro.com>
Date:   Wed Jan 24 17:28:31 2024 +0100

    genopinit: Split init_all_optabs [PR113575].
    
    init_all_optabs initializes > 10000 patterns for riscv targets.  This
    leads to pathological situations in dataflow analysis (which can occur
    with many adjacent stores).
    To alleviate this this patch makes genopinit split the init_all_optabs
    function into several init_optabs_xx functions that each initialize 1000
    patterns.
    
    With this change insn-opinit.cc's compilation time is reduced from 4+
    minutes to 1:30 and memory consumption decreases from 1.2G to 630M.
    
    gcc/ChangeLog:
    
            PR other/113575
    
            * genopinit.cc (main): Split init_all_optabs into functions
            of 1000 patterns each.
Comment 16 Jeffrey A. Law 2024-03-04 04:13:02 UTC
Fixed on the trunk.
Comment 17 Jeffrey A. Law 2024-03-04 04:29:57 UTC
Forgot to change state.  Fixed on the trunk.