[PATCH][RFC] Sanitize equals and hash functions in hash-tables.
Martin Liška
mliska@suse.cz
Fri Jun 7 12:04:00 GMT 2019
On 6/7/19 10:57 AM, Richard Biener wrote:
> On Mon, Jun 3, 2019 at 3:35 PM Martin Liška <mliska@suse.cz> wrote:
>>
>> On 6/1/19 12:06 AM, Jeff Law wrote:
>>> On 5/22/19 3:13 AM, Martin Liška wrote:
>>>> On 5/21/19 1:51 PM, Richard Biener wrote:
>>>>> On Tue, May 21, 2019 at 1:02 PM Martin Liška <mliska@suse.cz> wrote:
>>>>>>
>>>>>> On 5/21/19 11:38 AM, Richard Biener wrote:
>>>>>>> On Tue, May 21, 2019 at 12:07 AM Jeff Law <law@redhat.com> wrote:
>>>>>>>>
>>>>>>>> On 5/13/19 1:41 AM, Martin Liška wrote:
>>>>>>>>> On 11/8/18 9:56 AM, Martin Liška wrote:
>>>>>>>>>> On 11/7/18 11:23 PM, Jeff Law wrote:
>>>>>>>>>>> On 10/30/18 6:28 AM, Martin Liška wrote:
>>>>>>>>>>>> On 10/30/18 11:03 AM, Jakub Jelinek wrote:
>>>>>>>>>>>>> On Mon, Oct 29, 2018 at 04:14:21PM +0100, Martin Liška wrote:
>>>>>>>>>>>>>> +hashtab_chk_error ()
>>>>>>>>>>>>>> +{
>>>>>>>>>>>>>> + fprintf (stderr, "hash table checking failed: "
>>>>>>>>>>>>>> + "equal operator returns true for a pair "
>>>>>>>>>>>>>> + "of values with a different hash value");
>>>>>>>>>>>>> BTW, either use internal_error here, or at least if using fprintf
>>>>>>>>>>>>> terminate with \n, in your recent mail I saw:
>>>>>>>>>>>>> ...different hash valueduring RTL pass: vartrack
>>>>>>>>>>>>> ^^^^^^
>>>>>>>>>>>> Sure, fixed in attached patch.
>>>>>>>>>>>>
>>>>>>>>>>>> Martin
>>>>>>>>>>>>
>>>>>>>>>>>>>> + gcc_unreachable ();
>>>>>>>>>>>>>> +}
>>>>>>>>>>>>> Jakub
>>>>>>>>>>>>>
>>>>>>>>>>>> 0001-Sanitize-equals-and-hash-functions-in-hash-tables.patch
>>>>>>>>>>>>
>>>>>>>>>>>> From 0d9c979c845580a98767b83c099053d36eb49bb9 Mon Sep 17 00:00:00 2001
>>>>>>>>>>>> From: marxin <mliska@suse.cz>
>>>>>>>>>>>> Date: Mon, 29 Oct 2018 09:38:21 +0100
>>>>>>>>>>>> Subject: [PATCH] Sanitize equals and hash functions in hash-tables.
>>>>>>>>>>>>
>>>>>>>>>>>> ---
>>>>>>>>>>>> gcc/hash-table.h | 40 +++++++++++++++++++++++++++++++++++++++-
>>>>>>>>>>>> 1 file changed, 39 insertions(+), 1 deletion(-)
>>>>>>>>>>>>
>>>>>>>>>>>> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
>>>>>>>>>>>> index bd83345c7b8..694eedfc4be 100644
>>>>>>>>>>>> --- a/gcc/hash-table.h
>>>>>>>>>>>> +++ b/gcc/hash-table.h
>>>>>>>>>>>> @@ -503,6 +503,7 @@ private:
>>>>>>>>>>>>
>>>>>>>>>>>> value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) const;
>>>>>>>>>>>> value_type *find_empty_slot_for_expand (hashval_t);
>>>>>>>>>>>> + void verify (const compare_type &comparable, hashval_t hash);
>>>>>>>>>>>> bool too_empty_p (unsigned int);
>>>>>>>>>>>> void expand ();
>>>>>>>>>>>> static bool is_deleted (value_type &v)
>>>>>>>>>>>> @@ -882,8 +883,12 @@ hash_table<Descriptor, Allocator>
>>>>>>>>>>>> if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
>>>>>>>>>>>> expand ();
>>>>>>>>>>>>
>>>>>>>>>>>> - m_searches++;
>>>>>>>>>>>> +#if ENABLE_EXTRA_CHECKING
>>>>>>>>>>>> + if (insert == INSERT)
>>>>>>>>>>>> + verify (comparable, hash);
>>>>>>>>>>>> +#endif
>>>>>>>>>>>>
>>>>>>>>>>>> + m_searches++;
>>>>>>>>>>>> value_type *first_deleted_slot = NULL;
>>>>>>>>>>>> hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
>>>>>>>>>>>> hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
>>>>>>>>>>>> @@ -930,6 +935,39 @@ hash_table<Descriptor, Allocator>
>>>>>>>>>>>> return &m_entries[index];
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> +#if ENABLE_EXTRA_CHECKING
>>>>>>>>>>>> +
>>>>>>>>>>>> +/* Report a hash table checking error. */
>>>>>>>>>>>> +
>>>>>>>>>>>> +ATTRIBUTE_NORETURN ATTRIBUTE_COLD
>>>>>>>>>>>> +static void
>>>>>>>>>>>> +hashtab_chk_error ()
>>>>>>>>>>>> +{
>>>>>>>>>>>> + fprintf (stderr, "hash table checking failed: "
>>>>>>>>>>>> + "equal operator returns true for a pair "
>>>>>>>>>>>> + "of values with a different hash value\n");
>>>>>>>>>>>> + gcc_unreachable ();
>>>>>>>>>>>> +}
>>>>>>>>>>> I think an internal_error here is probably still better than a simple
>>>>>>>>>>> fprintf, even if the fprintf is terminated with a \n :-)
>>>>>>>>>> Fully agree with that, but I see a lot of build errors when using internal_error.
>>>>>>>>>>
>>>>>>>>>>> The question then becomes can we bootstrap with this stuff enabled and
>>>>>>>>>>> if not, are we likely to soon? It'd be a shame to put it into
>>>>>>>>>>> EXTRA_CHECKING, but then not be able to really use EXTRA_CHECKING
>>>>>>>>>>> because we've got too many bugs to fix.
>>>>>>>>>> Unfortunately it's blocked with these 2 PRs:
>>>>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87845
>>>>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87847
>>>>>>>>> Hi.
>>>>>>>>>
>>>>>>>>> I've just added one more PR:
>>>>>>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90450
>>>>>>>>>
>>>>>>>>> I'm sending updated version of the patch that provides a disablement for the 3 PRs
>>>>>>>>> with a new function disable_sanitize_eq_and_hash.
>>>>>>>>>
>>>>>>>>> With that I can bootstrap and finish tests. However, I've done that with a patch
>>>>>>>>> limits maximal number of checks:
>>>>>>>> So rather than call the disable_sanitize_eq_and_hash, can you have its
>>>>>>>> state set up when you instantiate the object? It's not a huge deal,
>>>>>>>> just thinking about loud.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> So how do we want to go forward, particularly the EXTRA_EXTRA checking
>>>>>>>> issue :-)
>>>>>>>
>>>>>>> There is at least one PR where we have a table where elements _in_ the
>>>>>>> table are never compared against each other but always against another
>>>>>>> object (I guess that's usual even), but the setup is in a way that the
>>>>>>> comparison function only works with those. With the patch we verify
>>>>>>> hashing/comparison for something that is never used.
>>>>>>>
>>>>>>> So - wouldn't it be more "correct" to only verify comparison/hashing
>>>>>>> at lookup time, using the object from the lookup and verify that against
>>>>>>> all other elements?
>>>>>>
>>>>>> I don't a have problem with that. Apparently this changes fixes
>>>>>> PR90450 and PR87847.
>>>>>>
>>>>>> Changes from previous version:
>>>>>> - verification happens only when an element is searched (not inserted)
>>>>>> - new argument 'sanitize_eq_and_hash' added for hash_table::hash_table
>>>>>> - new param has been introduced hash-table-verification-limit in order
>>>>>> to limit number of elements that are compared within a table
>>>>>> - verification happens only with flag_checking >= 2
>>>>>>
>>>>>> I've been bootstrapping and testing the patch right now.
>>>>>
>>>>> Looks like I misremembered the original patch. The issue isn't
>>>>> comparing random two elements in the table.
>>>>>
>>>>> That it fixes PR90450 is because LIM never calls find_slot_with_hash
>>>>> without INSERTing.
>>>>>
>>>>
>>>> There's updated version of the patch where I check all find operations
>>>> (both w/ and w/o insertion).
>>>>
>>>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests
>>>> except for:
>>>>
>>>> $ ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/torture/pr63941.c -O2 -c
>>>> hash table checking failed: equal operator returns true for a pair of values with a different hash value
>>>> during GIMPLE pass: lim
>>>> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/torture/pr63941.c: In function âfn1â:
>>>> /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/torture/pr63941.c:6:1: internal compiler error: in hashtab_chk_error, at hash-table.h:1019
>>>> 6 | fn1 ()
>>>> | ^~~
>>>> 0x6c5725 hashtab_chk_error
>>>> /home/marxin/Programming/gcc/gcc/hash-table.h:1019
>>>> 0xe504ea hash_table<mem_ref_hasher, false, xcallocator>::verify(ao_ref* const&, unsigned int)
>>>> /home/marxin/Programming/gcc/gcc/hash-table.h:1040
>>>> 0xe504ea hash_table<mem_ref_hasher, false, xcallocator>::find_slot_with_hash(ao_ref* const&, unsigned int, insert_option)
>>>> /home/marxin/Programming/gcc/gcc/hash-table.h:960
>>>> 0xe504ea gather_mem_refs_stmt
>>>> /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:1501
>>>> 0xe504ea analyze_memory_references
>>>> /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:1625
>>>> 0xe504ea tree_ssa_lim
>>>> /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:2646
>>>> 0xe504ea execute
>>>> /home/marxin/Programming/gcc/gcc/tree-ssa-loop-im.c:2708
>>>>
>>>> Richi: it's after your recent patch.
>>>>
>>>> For some reason I don't see PR87847 issue any longer.
>>>>
>>>>
>>>> May I install the patch with disabled sanitization in tree-ssa-loop-im.c ?
>>> Don't we still need to deal with the naked fprintf when there's a
>>> failure. ie, shouldn't we be raising it with a gcc_assert or somesuch?
>>
>> Good point, I've just adjusted that.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
>
> Ugh, the cselib one is really bad. But I don't hold my breath for anyone
> fixing it ...
Yes :D It's been some time and there's no interest in the PR.
>
> One question - there's unconditional
>
> + if (m_sanitize_eq_and_hash)
> + verify (comparable, hash);
>
> which will read a global variable and have (possibly not inline) call
> to verify on a common path even with checking disabled. So I think
> we want to compile this checking feature out for !CHECKING_P
> or at least make the if __builtin_expect (..., 0), ::verify not
> inlined and marked pure () (thus, !CHECKING_P is simplest ;)).
Fixed. May I install the patch? The cselib issue can be solved later..
Martin
>
> Thanks,
> Richard.
>
>> Thanks,
>> Martin
>>
>>>
>>> jeff
>>>
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Enable-sanitization-for-hash-tables.patch
Type: text/x-patch
Size: 8671 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20190607/5abe3649/attachment.bin>
More information about the Gcc-patches
mailing list