This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: [PATCH] Redesign jump threading profile updates


Sorry, yes, will try to reproduce.
Teresa

On Wed, Oct 1, 2014 at 12:03 AM, Christophe Lyon
<christophe.lyon@linaro.org> wrote:
> On 30 September 2014 20:20, Teresa Johnson <tejohnson@google.com> wrote:
>> On Mon, Sep 29, 2014 at 9:33 PM, Jeff Law <law@redhat.com> wrote:
>>> On 09/29/14 08:19, Teresa Johnson wrote:
>>>>>
>>>>>
>>>>> Just an update - I found some good test cases by compiling the
>>>>> c-torture tests with profile feedback with and without my patch. But
>>>>> in the cases I pulled out I saw that there were still a couple of
>>>>> profile or probability insanities introduced by jump threading (albeit
>>>>> far fewer than before), so I wanted to investigate before I commit. I
>>>>> ran out of time this week and will not get to this until I get back
>>>>> from vacation the week after next.
>>>>
>>>>
>>>> Hi Jeff,
>>>>
>>>> I finally had a chance to get back to this and look at the remaining
>>>> insanities in the new test cases I created. It turns out that there
>>>> were still a few issues in the case where there were guessed
>>>> frequencies and no profile counts. The two test cases I created do use
>>>> FDO, and the insanities in the routines with profile counts went away
>>>> with my patch. But the outlined copies of routines that were also
>>>> inlined into the main routine still had estimated frequencies, and
>>>> these still had a few issues.
>>>>
>>>> The problem is that the profile updates are done incrementally as we
>>>> walk and update the paths in ssa_fix_duplicate_block_edges, including
>>>> the block and edge counts, the block frequencies and the
>>>> probabilities. This is very difficult to do when only operating on
>>>> frequencies since the edge frequencies are derived from the source
>>>> block frequency and the probability. Therefore, once the source block
>>>> frequency is updated, the edge frequency is also affected, and it is
>>>> really difficult to figure out what the update to the edge frequency
>>>> (essentially the probability) is using the same incremental update
>>>> approach. I was attempting to handle this with the routine
>>>> deduce_freq, for example, but this turned out to have issues for
>>>> certain types of paths. I tried a few other approaches, but they
>>>> started to look really ugly and I didn't want to add a parallel but
>>>> different algorithm for the case of no profile counts.
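
The difficulty described above can be shown with a minimal C sketch. It
is not GCC's code: the struct layouts, PROB_BASE, and the function names
edge_frequency and thread_away are assumptions made up for illustration.
Only the relationship itself (an edge frequency is derived from the
source block frequency and the edge probability) comes from the message.

```c
#include <assert.h>

/* Hypothetical stand-ins for the CFG structures; only the fields
   needed for the illustration are modelled.  */
#define PROB_BASE 10000   /* assumed fixed-point scale for probabilities */

struct block { int frequency; };
struct edge  { struct block *src; int probability; };

/* An edge stores no frequency of its own: it is derived from the
   source block's frequency and the edge's probability, using a
   rounded division.  */
static int
edge_frequency (const struct edge *e)
{
  assert (e->probability >= 0 && e->probability <= PROB_BASE);
  return (e->src->frequency * e->probability + PROB_BASE / 2) / PROB_BASE;
}

/* Remove FREQ units of threaded flow from block B.  Every outgoing
   edge's derived frequency shrinks as a side effect, whether or not
   the threaded path actually used that edge.  */
static void
thread_away (struct block *b, int freq)
{
  b->frequency -= freq;
}
```

With a block of frequency 1000 and two outgoing edges at probabilities
6000 and 4000, the derived edge frequencies are 600 and 400. After
thread_away removes 500 from the block, they read 300 and 200, although
the off-path edge should still carry 400. Correcting that requires
working out new probabilities after the fact, which is the step that is
hard to do incrementally.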
>>>>
>>>> So by far the simplest approach was simply to take a snapshot of the
>>>> existing block and edge frequencies along the path before we start the
>>>> updates in ssa_fix_duplicate_block_edges, by copying them into the
>>>> profile count fields of those blocks and edges. Then the existing
>>>> algorithm operates the same as when we do have counts, and can
>>>> essentially operate incrementally on the edge frequencies since they
>>>> live in the count field of the edge and are no longer affected anytime
>>>> the source block is updated. Since the algorithm does update block
>>>> frequencies and probabilities as well (based on the count updates
>>>> performed), we can simply clear out these fake count fields at the end
>>>> of ssa_fix_duplicate_block_edges. This takes care of the remaining
>>>> insanities introduced by jump threading from these test cases. During
>>>> testing I also added in some checking to ensure that the count fields
>>>> for the whole routine were cleared properly to make sure the new
>>>> clear_counts_path was not missing anything (checking is a little too
>>>> heavyweight to add in normally).
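
A rough sketch of the snapshot idea follows, again in self-contained C
rather than GCC's own code. The structs and the names snapshot_freqs,
thread_away and clear_counts are hypothetical and greatly simplified
(one block, a flat array of outgoing edges). They are not the
freqs_to_counts_path and clear_counts_path routines from the patch.

```c
#include <stddef.h>

#define PROB_BASE 10000   /* assumed fixed-point scale for probabilities */

/* Hypothetical, simplified CFG structures.  */
struct block { int frequency; long count; };
struct edge  { struct block *src; int probability; long count; };

/* Copy the block frequency and the derived edge frequencies into the
   count fields, so each edge holds a value of its own.  */
static void
snapshot_freqs (struct block *b, struct edge *edges, size_t n)
{
  b->count = b->frequency;
  for (size_t i = 0; i < n; i++)
    edges[i].count = ((long) b->frequency * edges[i].probability
                      + PROB_BASE / 2) / PROB_BASE;
}

/* Thread COUNT units of flow out of B along edges[on_path], then
   recompute the block frequency and the probabilities from counts.  */
static void
thread_away (struct block *b, struct edge *edges, size_t n,
             size_t on_path, long count)
{
  b->count -= count;
  edges[on_path].count -= count;
  b->frequency = (int) b->count;
  for (size_t i = 0; i < n; i++)
    edges[i].probability
      = b->count ? (int) ((edges[i].count * PROB_BASE + b->count / 2)
                          / b->count)
                 : 0;
}

/* Drop the temporary counts once the update is finished.  */
static void
clear_counts (struct block *b, struct edge *edges, size_t n)
{
  b->count = 0;
  for (size_t i = 0; i < n; i++)
    edges[i].count = 0;
}
```

Starting from the same block (frequency 1000, probabilities 6000 and
4000) and threading 500 units along the first edge, the probabilities
come out as 2000 and 8000, so the off-path edge still has a derived
frequency of 400. Because the edge values live in their own field during
the update, changing the block no longer disturbs them.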
>>>>
>>>> New patch below (also attached since my mailer sometimes eats spaces).
>>>> The differences between the old patch and the new one:
>>>> - removed deduce_freq (which was my least favorite part of the patch
>>>> anyway!), and its call from recompute_probabilities, since it is no
>>>> longer necessary.
>>>> - two new routines freqs_to_counts_path and clear_counts_path, invoked
>>>> from ssa_fix_duplicate_block_edges.
>>>> - two new tests
>>>>
>>>> Bootstrapped and tested on x86_64-unknown-linux-gnu, ok for trunk?
>>>>
>>>> Thanks,
>>>> Teresa
>>>>
>>>> gcc:
>>>>
>>>> 2014-09-29  Teresa Johnson  <tejohnson@google.com>
>>>>
>>>>          * tree-ssa-threadupdate.c (struct ssa_local_info_t): New
>>>>          duplicate_blocks bitmap.
>>>>          (remove_ctrl_stmt_and_useless_edges): Ditto.
>>>>          (create_block_for_threading): Ditto.
>>>>          (compute_path_counts): New function.
>>>>          (update_profile): Ditto.
>>>>          (recompute_probabilities): Ditto.
>>>>          (update_joiner_offpath_counts): Ditto.
>>>>          (freqs_to_counts_path): Ditto.
>>>>          (clear_counts_path): Ditto.
>>>>          (ssa_fix_duplicate_block_edges): Update profile info.
>>>>          (ssa_create_duplicates): Pass new parameter.
>>>>          (ssa_redirect_edges): Remove old profile update.
>>>>          (thread_block_1): New duplicate_blocks bitmap,
>>>>          remove old profile update.
>>>>          (thread_single_edge): Pass new parameter.
>>>>
>>>> gcc/testsuite:
>>>>
>>>> 2014-09-29  Teresa Johnson  <tejohnson@google.com>
>>>>
>>>>          * testsuite/gcc.dg/tree-prof/20050826-2.c: New test.
>>>>          * testsuite/gcc.dg/tree-prof/cmpsf-1.c: Ditto.
>>>
>>> Given I'd already been through this pretty thoroughly, I just gave this a
>>> cursory review.
>>>
>>> clear_counts_path needs a function comment.  It's pretty obvious what it's
>>> doing, but for completeness let's go ahead and get the obvious comment in
>>> there.
>>
>> Done and committed as r215739.
>>
>
> Since this commit, I can see all my builds for arm*linux* and
> aarch64*linux* fail while building glibc:
>
> /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/tools/bin/aarch64-none-linux-gnu-gcc
> iso-2022-cn.c -c -std=gnu99 -fgnu89-inline -O2 -Wall -Winline -Wundef
> -Wwrite-strings -fmerge-all-constants -frounding-math -g
> -Wstrict-prototypes -fPIC -I../include
> -I/tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/glibc-1/iconvdata
> -I/tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/glibc-1
> -I../sysdeps/unix/sysv/linux/aarch64/nptl
> -I../sysdeps/unix/sysv/linux/aarch64
> -I../sysdeps/unix/sysv/linux/generic
> -I../sysdeps/unix/sysv/linux/wordsize-64
> -I../nptl/sysdeps/unix/sysv/linux -I../nptl/sysdeps/pthread
> -I../sysdeps/pthread -I../sysdeps/unix/sysv/linux -I../sysdeps/gnu
> -I../sysdeps/unix/inet -I../nptl/sysdeps/unix/sysv
> -I../sysdeps/unix/sysv -I../nptl/sysdeps/unix -I../sysdeps/unix
> -I../sysdeps/posix -I../sysdeps/aarch64/fpu -I../sysdeps/aarch64/nptl
> -I../sysdeps/aarch64 -I../sysdeps/wordsize-64
> -I../sysdeps/ieee754/ldbl-128 -I../sysdeps/ieee754/dbl-64/wordsize-64
> -I../sysdeps/ieee754/dbl-64 -I../sysdeps/ieee754/flt-32
> -I../sysdeps/aarch64/soft-fp -I../sysdeps/ieee754 -I../sysdeps/generic
> -I../nptl -I.. -I../libio -I. -nostdinc
> -isystem /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include
> -isystem /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/tools/lib/gcc/aarch64-none-linux-gnu/5.0.0/include-fixed
> -isystem /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/sysroot-aarch64-none-linux-gnu/usr/include
> -D_LIBC_REENTRANT -include ../include/libc-symbols.h -DPIC -DSHARED
> -DNOT_IN_libc
> -o /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/glibc-1/iconvdata/iso-2022-cn.os
> -MD -MP
> -MF /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/glibc-1/iconvdata/iso-2022-cn.os.dt
> -MT /tmp/3496222_18.tmpdir/aci-gcc-fsf/builds/gcc-fsf-gccsrc/obj-aarch64-none-linux-gnu/glibc-1/iconvdata/iso-2022-cn.os
>
> In file included from iso-2022-cn.c:407:0:
> ../iconv/skeleton.c: In function 'gconv':
> ../iconv/skeleton.c:800:1: internal compiler error: in check_probability, at basic-block.h:959
> 0xe4e2fb find_many_sub_basic_blocks(simple_bitmap_def*)
>         /tmp/3496222_18.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/basic-block.h:959
> 0x6623f0 execute
>         /tmp/3496222_18.tmpdir/aci-gcc-fsf/sources/gcc-fsf/gccsrc/gcc/cfgexpand.c:5916
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See <http://gcc.gnu.org/bugs.html> for instructions.
>
> Can you have a look?
>
> Thanks,
>
> Christophe.
>
>> Thanks,
>> Teresa
>>
>>>
>>> With that fix, approved for the trunk.  Thanks for taking the time to sort
>>> out all these issues.
>>>
>>> jeff
>>>
>>>
>>
>>
>>
>> --
>> Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413



-- 
Teresa Johnson | Software Engineer | tejohnson@google.com | 408-460-2413

