This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]
Other format:	[Raw text]

Re: Disable accumulate-outgoing-args for Generic and Buldozers

From: Richard Biener <richard dot guenther at gmail dot com>
To: Jakub Jelinek <jakub at redhat dot com>
Cc: Jan Hubicka <hubicka at ucw dot cz>, Alexandre Oliva <aoliva at redhat dot com>, GCC Patches <gcc-patches at gcc dot gnu dot org>
Date: Wed, 29 Jan 2014 16:24:17 +0100
Subject: Re: Disable accumulate-outgoing-args for Generic and Buldozers
Authentication-results: sourceware.org; auth=none
References: <20140101143004 dot GD26209 at kam dot mff dot cuni dot cz> <20140124211119 dot GX892 at tucnak dot redhat dot com> <20140127221939 dot GH892 at tucnak dot redhat dot com> <20140127222546 dot GI892 at tucnak dot redhat dot com> <20140127235628 dot GC12944 at kam dot mff dot cuni dot cz> <20140128062647 dot GJ892 at tucnak dot redhat dot com> <20140129100205 dot GS892 at tucnak dot redhat dot com>

On Wed, Jan 29, 2014 at 11:02 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Jan 28, 2014 at 07:26:47AM +0100, Jakub Jelinek wrote:
>> > I wonder if this is just some of --enable-checking tests in dwarf2out going wild
>> > or if it is just expensive sanity checking code?
>> > I used to have chroot environment for 32bit builds, I will need to re-install it now.
>>
>>  variable tracking       :2914.85 (83%) usr   1.88 ( 7%) sys2931.22 (82%) wall   80844 kB ( 3%) ggc
>>  var-tracking dataflow   :  18.19 ( 1%) usr   0.19 ( 1%) sys  18.49 ( 1%) wall   10899 kB ( 0%) ggc
>>  var-tracking emit       :  29.41 ( 1%) usr   0.11 ( 0%) sys  29.65 ( 1%) wall  148128 kB ( 6%) ggc
>>  TOTAL                 :3525.97            25.73          3570.33            2321043 kB
>>
>> So, strangely both vt_find_locations and vt_emit_notes, typically the most expensive ones,
>> are quite unexpensive and most of the time is spent elsewhere, in vt_initialize?
>
> So, most of the time seems to be spent in cselib.c remove_useless_values
> (both from Ctrl-C in gdb profiling, and callgrind).  For callgrind I've
> actually built 64-bit cc1plus with --enable-checking=release, and still compiled
> the same --enable-checking=yes,rtl -m32 -O2 -g insn-recog.c, the build then
> took just 14 minutes instead of 60 minutes, and in that case only about 30%
> of compile time has been spent in var-tracking and 20% of compile time
> in remove_useless_values in particular.
>
> The problem with remove_useless_values is that we have quickly very big
> cselib hash table (over 200000 entries) and very large number of very small
> basic blocks (over 34000) and at the end of every basic block we call
> cselib_preserve_only_values which walks the whole hash table, and e.g.
> references_value_p is called 845114869x from discard_useless_locs.

It looks like remove_useless_values () is an "optimization" (shrinks the
hashtable) and thus can be omitted?  After all it's already limited:

  if (n_useless_values > MAX_USELESS_VALUES
      /* remove_useless_values is linear in the hash table size.  Avoid
         quadratic behavior for very large hashtables with very few
         useless elements.  */
      && ((unsigned int)n_useless_values
          > (cselib_hash_table.elements () - n_debug_values) / 4))
    remove_useless_values ();

why don't we remove them at the point we discover them useless?
Thus when n_useless_values increases?  Can't we do that in O(1)
there (with a hashtable lookup?)

> A micro-optimization could be e.g. to turn references_value_p into a
> template where only_useless would be a template parameter rather than actual
> parameter (due to recursion inlining doesn't help here) or just two
> functions.
>
> Also, for RTL checking, I wonder if the functions like reference_values_p
> and similar ones that use GET_RTX_FORMAT/GET_RTX_LENGTH to walk the elements
> couldn't use special non-checking macros in doing so if the compiler can't
> figure out checking is redundant there because it is being performed by hand
> by the function (haven't verified).  And, perhaps also an approach similar
> to http://gcc.gnu.org/ml/gcc-patches/2014-01/msg00140.html for
> GET_RTX_FORMAT and/or GET_RTX_LENGTH (so that at least for the cases where
> the compiler knows which rtx code it is (if some code is guarded with
> specific GET_CODE () == test), it could avoid loading from the const
> arrays).
>
> Anyway, I guess more important is whether all the values in the
> cselib_hash_table will be ever useful for some lookup, or if there are
> e.g. values that only reference preserved values where none of those
> referenced values have any locations other than preserved values.
>
> Or if we can somehow quickly find out what VALUEs have changed during
> processing of the last bb and only process that subset instead of all the
> hash table entries all the time.
>
> Alex?  Your thoughts?
>
>         Jakub

Follow-Ups:
- Re: Disable accumulate-outgoing-args for Generic and Buldozers
  - From: Richard Biener

References:
- Disable accumulate-outgoing-args for Generic and Buldozers
  - From: Jan Hubicka
- Re: Disable accumulate-outgoing-args for Generic and Buldozers
  - From: Jakub Jelinek
- Re: Disable accumulate-outgoing-args for Generic and Buldozers
  - From: Jakub Jelinek
- Re: Disable accumulate-outgoing-args for Generic and Buldozers
  - From: Jakub Jelinek
- Re: Disable accumulate-outgoing-args for Generic and Buldozers
  - From: Jan Hubicka
- Re: Disable accumulate-outgoing-args for Generic and Buldozers
  - From: Jakub Jelinek
- Re: Disable accumulate-outgoing-args for Generic and Buldozers
  - From: Jakub Jelinek

Index Nav:	[Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav:	[Date Prev] [Date Next]	[Thread Prev] [Thread Next]