This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][RFC] Add new ipa-reorder pass
- From: Fangrui Song <i at maskray dot me>
- To: Andrew Pinski <pinskia at gmail dot com>
- Cc: Martin Liška <mliska at suse dot cz>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Jan Hubicka <hubicka at ucw dot cz>, Nathan Sidwell <nathan at acm dot org>, Vyacheslav Barinov <v dot barinov at samsung dot com>
- Date: Thu, 3 Oct 2019 07:45:49 +0000
- Subject: Re: [PATCH][RFC] Add new ipa-reorder pass
- References: <12f761ce-012d-f9be-ceef-6ad8de6324e8@suse.cz> <fd999b53-e0f0-7a0f-ee13-c8f56bef7a7e@suse.cz> <20191003045210.s4aqwi46y2yg4dbv@google.com> <CA+=Sn1ni95cPehqZLUQy4C0goWSvynQqD0Gj_b0aNXNSVxzVSA@mail.gmail.com>
On 2019-10-03, Andrew Pinski wrote:
On Wed, Oct 2, 2019 at 9:52 PM Fangrui Song <i@maskray.me> wrote:
On 2019-09-24, Martin Liška wrote:
>On 9/19/19 10:33 AM, Martin Liška wrote:
>> - One needs modified binutils, and that would probably require a configure detection. The only way
>> which I see is based on ld --version. I'm planning to make the binutils submission soon.
>
>The patch submission link:
>https://sourceware.org/ml/binutils/2019-09/msg00219.html
Hi Martin,
I have a question about why .text.sorted.* sections are needed.
The Sony presentation (your [2] link) describes embedding a new section,
.llvm.call-graph-profile[3], to represent edges in the object files. The
linker (lld) collects all .llvm.call-graph-profile sections and does a
C3 layout. There is no need for a new section type .text.sorted.*
[3]: https://github.com/llvm/llvm-project/blob/master/lld/test/ELF/cgprofile-obj.s
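To make the "collect edges, then do a C3 layout" step concrete, here is a
minimal sketch of call-chain clustering in Python. It is a simplified
illustration and not lld's implementation: the function name c3_order, the
input dictionaries, and the max_cluster_size threshold are all hypothetical,
and the real algorithm has additional merge conditions that are omitted here.

```python
def c3_order(sizes, weights, edges, max_cluster_size=4096):
    """Simplified call-chain clustering (illustrative only).

    sizes:   {function: size in bytes}, every size must be > 0
    weights: {function: how hot the function is}
    edges:   {(caller, callee): call count}
    """
    # Every function starts in its own cluster, keyed by its leader.
    cluster_of = {f: f for f in sizes}
    members = {f: [f] for f in sizes}

    # Remember the heaviest incoming edge of each callee.
    best_caller = {}
    for (caller, callee), count in edges.items():
        if caller == callee:
            continue
        if callee not in best_caller or count > best_caller[callee][1]:
            best_caller[callee] = (caller, count)

    def cluster_size(cid):
        return sum(sizes[f] for f in members[cid])

    # Visit functions from hottest to coldest and append each one's
    # cluster to the cluster of its most likely caller.
    for func in sorted(sizes, key=lambda f: weights.get(f, 0), reverse=True):
        if func not in best_caller:
            continue
        src = cluster_of[func]
        dst = cluster_of[best_caller[func][0]]
        if src == dst:
            continue
        if cluster_size(src) + cluster_size(dst) > max_cluster_size:
            continue  # merged cluster would be too large
        for f in members[src]:
            cluster_of[f] = dst
        members[dst].extend(members.pop(src))

    def density(cid):
        return sum(weights.get(f, 0) for f in members[cid]) / cluster_size(cid)

    # Lay out the densest clusters first.
    ordered = sorted(members, key=density, reverse=True)
    return [f for cid in ordered for f in members[cid]]


# Hypothetical profile: main calls parse, parse calls emit.
sizes = {"main": 100, "parse": 200, "emit": 150, "cold": 300}
weights = {"main": 10, "parse": 90, "emit": 80, "cold": 1}
edges = {("main", "parse"): 90, ("parse", "emit"): 80, ("main", "cold"): 1}
layout = c3_order(sizes, weights, edges)
```

Because the edges come from every object file, the resulting order can place
a caller from one translation unit next to its callee from another.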
(Please CC me. I am not subscribed.)
The idea of GCC's version is that the modification needed to the
linker is very small. And even without patching the linker
script/linker, you don't need to do much special and you don't need LD to
do much work at all.
I am afraid this can be a large limitation. Then .text.sorted.* can
only a) reorder functions within a translation unit, or b) reorder all
functions when LTO is enabled.
b) is possible only if all .text.sorted.* sections can be numbered.
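A small sketch of why the numbering matters. It assumes, for illustration
only, that each function lands in a section named .text.sorted.<N> and that
the linker orders those sections by N, breaking ties by command-line order.
The helper sorted_layout and the object names are hypothetical.

```python
def sorted_layout(objects):
    """objects: list of (object file, [section names]) in link order."""
    placed = []
    for obj_index, (obj, sections) in enumerate(objects):
        for name in sections:
            # The numeric suffix is the only ordering information available.
            rank = int(name.rsplit(".", 1)[1])
            placed.append((rank, obj_index, obj, name))
    placed.sort()
    return [(obj, name) for _, _, obj, name in placed]


# Without LTO, each translation unit numbers its own functions from 1.
per_tu = [
    ("a.o", [".text.sorted.1", ".text.sorted.2"]),
    ("b.o", [".text.sorted.1", ".text.sorted.2"]),
]
interleaved = sorted_layout(per_tu)
```

Here the ranks from a.o and b.o collide, so the cross-unit order carries no
call-graph information; a single global numbering avoids that.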
For the LLVM case, the call graph profile can be used without LTO. When
ThinLTO and PGO are both enabled, however, the additional performance
improvement offered by the global call graph profile reordering is
insignificant, smaller than 1%.