This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.
- From: Martin Liška <mliska at suse dot cz>
- To: Richard Biener <richard dot guenther at gmail dot com>, Jeff Law <law at redhat dot com>
- Cc: Jakub Jelinek <jakub at redhat dot com>, Alexander Monakov <amonakov at ispras dot ru>, GCC Patches <gcc-patches at gcc dot gnu dot org>, Nathan Sidwell <nathan at acm dot org>, Jason Merrill <jason at redhat dot com>, Paul Richard Thomas <paul dot richard dot thomas at gmail dot com>, Martin Jambor <mjambor at suse dot cz>
- Date: Tue, 21 May 2019 13:02:11 +0200
- Subject: Re: [PATCH][RFC] Sanitize equals and hash functions in hash-tables.
- References: <23ffca95-6492-e609-aebb-bbdd83b5185d@suse.cz> <alpine.LNX.2.20.13.1810291537420.16308@monopod.intra.ispras.ru> <b3f846d3-b387-a986-22b0-071ef92791d1@suse.cz> <20181030100342.GN11625@tucnak> <32744d50-09fd-496c-e97e-9ec478d64ec4@suse.cz> <e48993c5-bb85-ddf4-2c6f-89145bf4a2dd@redhat.com> <492d87a7-0210-0df3-f484-f126baa6866c@suse.cz> <47fcf0aa-4b89-5354-1b59-4e6c623f5c3a@suse.cz> <be20f9e6-0de7-58f2-e668-62727274fb4e@redhat.com> <CAFiYyc1oRW5_qrQ3=DDK5SqD7XhKo7X8-AApXyKgQ2xs3mWgcg@mail.gmail.com>
On 5/21/19 11:38 AM, Richard Biener wrote:
> On Tue, May 21, 2019 at 12:07 AM Jeff Law <law@redhat.com> wrote:
>>
>> On 5/13/19 1:41 AM, Martin Liška wrote:
>>> On 11/8/18 9:56 AM, Martin Liška wrote:
>>>> On 11/7/18 11:23 PM, Jeff Law wrote:
>>>>> On 10/30/18 6:28 AM, Martin Liška wrote:
>>>>>> On 10/30/18 11:03 AM, Jakub Jelinek wrote:
>>>>>>> On Mon, Oct 29, 2018 at 04:14:21PM +0100, Martin Liška wrote:
>>>>>>>> +hashtab_chk_error ()
>>>>>>>> +{
>>>>>>>> + fprintf (stderr, "hash table checking failed: "
>>>>>>>> + "equal operator returns true for a pair "
>>>>>>>> + "of values with a different hash value");
>>>>>>> BTW, either use internal_error here, or at least if using fprintf
>>>>>>> terminate with \n, in your recent mail I saw:
>>>>>>> ...different hash valueduring RTL pass: vartrack
>>>>>>> ^^^^^^
>>>>>> Sure, fixed in attached patch.
>>>>>>
>>>>>> Martin
>>>>>>
>>>>>>>> + gcc_unreachable ();
>>>>>>>> +}
>>>>>>> Jakub
>>>>>>>
>>>>>> 0001-Sanitize-equals-and-hash-functions-in-hash-tables.patch
>>>>>>
>>>>>> From 0d9c979c845580a98767b83c099053d36eb49bb9 Mon Sep 17 00:00:00 2001
>>>>>> From: marxin <mliska@suse.cz>
>>>>>> Date: Mon, 29 Oct 2018 09:38:21 +0100
>>>>>> Subject: [PATCH] Sanitize equals and hash functions in hash-tables.
>>>>>>
>>>>>> ---
>>>>>> gcc/hash-table.h | 40 +++++++++++++++++++++++++++++++++++++++-
>>>>>> 1 file changed, 39 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/gcc/hash-table.h b/gcc/hash-table.h
>>>>>> index bd83345c7b8..694eedfc4be 100644
>>>>>> --- a/gcc/hash-table.h
>>>>>> +++ b/gcc/hash-table.h
>>>>>> @@ -503,6 +503,7 @@ private:
>>>>>>
>>>>>> value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) const;
>>>>>> value_type *find_empty_slot_for_expand (hashval_t);
>>>>>> + void verify (const compare_type &comparable, hashval_t hash);
>>>>>> bool too_empty_p (unsigned int);
>>>>>> void expand ();
>>>>>> static bool is_deleted (value_type &v)
>>>>>> @@ -882,8 +883,12 @@ hash_table<Descriptor, Allocator>
>>>>>> if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
>>>>>> expand ();
>>>>>>
>>>>>> - m_searches++;
>>>>>> +#if ENABLE_EXTRA_CHECKING
>>>>>> + if (insert == INSERT)
>>>>>> + verify (comparable, hash);
>>>>>> +#endif
>>>>>>
>>>>>> + m_searches++;
>>>>>> value_type *first_deleted_slot = NULL;
>>>>>> hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
>>>>>> hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
>>>>>> @@ -930,6 +935,39 @@ hash_table<Descriptor, Allocator>
>>>>>> return &m_entries[index];
>>>>>> }
>>>>>>
>>>>>> +#if ENABLE_EXTRA_CHECKING
>>>>>> +
>>>>>> +/* Report a hash table checking error. */
>>>>>> +
>>>>>> +ATTRIBUTE_NORETURN ATTRIBUTE_COLD
>>>>>> +static void
>>>>>> +hashtab_chk_error ()
>>>>>> +{
>>>>>> + fprintf (stderr, "hash table checking failed: "
>>>>>> + "equal operator returns true for a pair "
>>>>>> + "of values with a different hash value\n");
>>>>>> + gcc_unreachable ();
>>>>>> +}
>>>>> I think an internal_error here is probably still better than a simple
>>>>> fprintf, even if the fprintf is terminated with a \n :-)
>>>> Fully agree with that, but I see a lot of build errors when using internal_error.
>>>>
>>>>> The question then becomes can we bootstrap with this stuff enabled and
>>>>> if not, are we likely to soon? It'd be a shame to put it into
>>>>> EXTRA_CHECKING, but then not be able to really use EXTRA_CHECKING
>>>>> because we've got too many bugs to fix.
>>>> Unfortunately it's blocked with these 2 PRs:
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87845
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87847
>>> Hi.
>>>
>>> I've just added one more PR:
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90450
>>>
>>> I'm sending updated version of the patch that provides a disablement for the 3 PRs
>>> with a new function disable_sanitize_eq_and_hash.
>>>
>>> With that I can bootstrap and finish tests. However, I've done that with a patch
>>> limits maximal number of checks:
>> So rather than call the disable_sanitize_eq_and_hash, can you have its
>> state set up when you instantiate the object? It's not a huge deal,
>> just thinking about loud.
>>
>>
>>
>> So how do we want to go forward, particularly the EXTRA_EXTRA checking
>> issue :-)
>
> There is at least one PR where we have a table where elements _in_ the
> table are never compared against each other but always against another
> object (I guess that's usual even), but the setup is in a way that the
> comparison function only works with those. With the patch we verify
> hashing/comparison for something that is never used.
>
> So - wouldn't it be more "correct" to only verify comparison/hashing
> at lookup time, using the object from the lookup and verify that against
> all other elements?
I don't a have problem with that. Apparently this changes fixes
PR90450 and PR87847.
Changes from previous version:
- verification happens only when an element is searched (not inserted)
- new argument 'sanitize_eq_and_hash' added for hash_table::hash_table
- new param has been introduced hash-table-verification-limit in order
to limit number of elements that are compared within a table
- verification happens only with flag_checking >= 2
I've been bootstrapping and testing the patch right now.
Martin
>
> Richard.
>
>>
>> Jeff
>From 47df69feb035672056d2ad86c53eda9c286ed92a Mon Sep 17 00:00:00 2001
From: marxin <mliska@suse.cz>
Date: Mon, 13 May 2019 07:16:22 +0200
Subject: [PATCH] Enable sanitization for hash tables.
---
gcc/cselib.c | 9 ++++++--
gcc/hash-set.h | 2 +-
gcc/hash-table.c | 3 +++
gcc/hash-table.h | 60 ++++++++++++++++++++++++++++++++++++++++++++----
gcc/params.def | 6 +++++
gcc/toplev.c | 4 ++++
6 files changed, 77 insertions(+), 7 deletions(-)
diff --git a/gcc/cselib.c b/gcc/cselib.c
index 84c17c23f6d..a1cbdec9718 100644
--- a/gcc/cselib.c
+++ b/gcc/cselib.c
@@ -2858,9 +2858,14 @@ cselib_init (int record_what)
}
used_regs = XNEWVEC (unsigned int, cselib_nregs);
n_used_regs = 0;
- cselib_hash_table = new hash_table<cselib_hasher> (31);
+ /* FIXME: enable sanitization (PR87845) */
+ cselib_hash_table
+ = new hash_table<cselib_hasher> (31, /* ggc */ false,
+ /* sanitize_eq_and_hash */ false);
if (cselib_preserve_constants)
- cselib_preserved_hash_table = new hash_table<cselib_hasher> (31);
+ cselib_preserved_hash_table
+ = new hash_table<cselib_hasher> (31, /* ggc */ false,
+ /* sanitize_eq_and_hash */ false);
next_uid = 1;
}
diff --git a/gcc/hash-set.h b/gcc/hash-set.h
index de3532f5f68..d891ed78297 100644
--- a/gcc/hash-set.h
+++ b/gcc/hash-set.h
@@ -28,7 +28,7 @@ class hash_set
public:
typedef typename Traits::value_type Key;
explicit hash_set (size_t n = 13, bool ggc = false CXX_MEM_STAT_INFO)
- : m_table (n, ggc, GATHER_STATISTICS, HASH_SET_ORIGIN PASS_MEM_STAT) {}
+ : m_table (n, ggc, true, GATHER_STATISTICS, HASH_SET_ORIGIN PASS_MEM_STAT) {}
/* Create a hash_set in gc memory with space for at least n elements. */
diff --git a/gcc/hash-table.c b/gcc/hash-table.c
index 646a7a1c497..8e86fffa36f 100644
--- a/gcc/hash-table.c
+++ b/gcc/hash-table.c
@@ -74,6 +74,9 @@ struct prime_ent const prime_tab[] = {
{ 0xfffffffb, 0x00000006, 0x00000008, 31 }
};
+/* Limit number of comparisons when calling hash_table<>::verify. */
+unsigned int hash_table_sanitize_eq_limit;
+
/* The following function returns an index into the above table of the
nearest prime number which is greater than N, and near a power of two. */
diff --git a/gcc/hash-table.h b/gcc/hash-table.h
index 4178616478e..b3e579b8822 100644
--- a/gcc/hash-table.h
+++ b/gcc/hash-table.h
@@ -295,6 +295,8 @@ struct prime_ent
extern struct prime_ent const prime_tab[];
+/* Limit number of comparisons when calling hash_table<>::verify. */
+extern unsigned int hash_table_sanitize_eq_limit;
/* Functions for computing hash table indexes. */
@@ -371,10 +373,12 @@ class hash_table
public:
explicit hash_table (size_t, bool ggc = false,
+ bool sanitize_eq_and_hash = true,
bool gather_mem_stats = GATHER_STATISTICS,
mem_alloc_origin origin = HASH_TABLE_ORIGIN
CXX_MEM_STAT_INFO);
explicit hash_table (const hash_table &, bool ggc = false,
+ bool sanitize_eq_and_hash = true,
bool gather_mem_stats = GATHER_STATISTICS,
mem_alloc_origin origin = HASH_TABLE_ORIGIN
CXX_MEM_STAT_INFO);
@@ -516,6 +520,7 @@ private:
value_type *alloc_entries (size_t n CXX_MEM_STAT_INFO) const;
value_type *find_empty_slot_for_expand (hashval_t);
+ void verify (const compare_type &comparable, hashval_t hash);
bool too_empty_p (unsigned int);
void expand ();
static bool is_deleted (value_type &v)
@@ -564,6 +569,9 @@ private:
/* if m_entries is stored in ggc memory. */
bool m_ggc;
+ /* True if the table should be sanitized for equal and hash functions. */
+ bool m_sanitize_eq_and_hash;
+
/* If we should gather memory statistics for the table. */
#if GATHER_STATISTICS
bool m_gather_mem_stats;
@@ -586,12 +594,13 @@ extern void dump_hash_table_loc_statistics (void);
template<typename Descriptor, bool Lazy,
template<typename Type> class Allocator>
hash_table<Descriptor, Lazy, Allocator>::hash_table (size_t size, bool ggc,
+ bool sanitize_eq_and_hash,
bool gather_mem_stats
ATTRIBUTE_UNUSED,
mem_alloc_origin origin
MEM_STAT_DECL) :
m_n_elements (0), m_n_deleted (0), m_searches (0), m_collisions (0),
- m_ggc (ggc)
+ m_ggc (ggc), m_sanitize_eq_and_hash (sanitize_eq_and_hash)
#if GATHER_STATISTICS
, m_gather_mem_stats (gather_mem_stats)
#endif
@@ -617,12 +626,14 @@ template<typename Descriptor, bool Lazy,
template<typename Type> class Allocator>
hash_table<Descriptor, Lazy, Allocator>::hash_table (const hash_table &h,
bool ggc,
+ bool sanitize_eq_and_hash,
bool gather_mem_stats
ATTRIBUTE_UNUSED,
mem_alloc_origin origin
MEM_STAT_DECL) :
m_n_elements (h.m_n_elements), m_n_deleted (h.m_n_deleted),
- m_searches (0), m_collisions (0), m_ggc (ggc)
+ m_searches (0), m_collisions (0), m_ggc (ggc),
+ m_sanitize_eq_and_hash (sanitize_eq_and_hash)
#if GATHER_STATISTICS
, m_gather_mem_stats (gather_mem_stats)
#endif
@@ -912,7 +923,11 @@ hash_table<Descriptor, Lazy, Allocator>
entry = &m_entries[index];
if (is_empty (*entry)
|| (!is_deleted (*entry) && Descriptor::equal (*entry, comparable)))
- return *entry;
+ {
+ if (m_sanitize_eq_and_hash)
+ verify (comparable, hash);
+ return *entry;
+ }
}
}
@@ -941,8 +956,10 @@ hash_table<Descriptor, Lazy, Allocator>
if (insert == INSERT && m_size * 3 <= m_n_elements * 4)
expand ();
- m_searches++;
+ if (m_sanitize_eq_and_hash && insert == NO_INSERT)
+ verify (comparable, hash);
+ m_searches++;
value_type *first_deleted_slot = NULL;
hashval_t index = hash_table_mod1 (hash, m_size_prime_index);
hashval_t hash2 = hash_table_mod2 (hash, m_size_prime_index);
@@ -989,6 +1006,41 @@ hash_table<Descriptor, Lazy, Allocator>
return &m_entries[index];
}
+/* Report a hash table checking error. */
+
+ATTRIBUTE_NORETURN ATTRIBUTE_COLD
+static void
+hashtab_chk_error ()
+{
+ fprintf (stderr, "hash table checking failed: "
+ "equal operator returns true for a pair "
+ "of values with a different hash value\n");
+#ifndef GENERATOR_FILE
+ gcc_unreachable ();
+#else
+ exit (1);
+#endif
+}
+
+/* Verify that all existing elements in th hash table which are
+ equal to COMPARABLE have an equal HASH value provided as argument. */
+
+template<typename Descriptor, bool Lazy,
+ template<typename Type> class Allocator>
+void
+hash_table<Descriptor, Lazy, Allocator>
+::verify (const compare_type &comparable, hashval_t hash)
+{
+ for (size_t i = 0; i < MIN (hash_table_sanitize_eq_limit, m_size); i++)
+ {
+ value_type *entry = &m_entries[i];
+ if (!is_empty (*entry) && !is_deleted (*entry)
+ && hash != Descriptor::hash (*entry)
+ && Descriptor::equal (*entry, comparable))
+ hashtab_chk_error ();
+ }
+}
+
/* This function deletes an element with the given COMPARABLE value
from hash table starting with the given HASH. If there is no
matching element in the hash table, this function does nothing. */
diff --git a/gcc/params.def b/gcc/params.def
index 6b7f7eb5bae..e5ca6ff45e4 100644
--- a/gcc/params.def
+++ b/gcc/params.def
@@ -1439,6 +1439,12 @@ DEFPARAM(PARAM_GIMPLE_FE_COMPUTED_HOT_BB_THRESHOLD,
" The parameter is used only in GIMPLE FE.",
0, 0, 0)
+DEFPARAM(PARAM_HASH_TABLE_VERIFICATION_LIMIT,
+ "hash-table-verification-limit",
+ "The number of elements for which hash table verification is done for "
+ "each searched element.",
+ 100, 0, 0)
+
/*
Local variables:
diff --git a/gcc/toplev.c b/gcc/toplev.c
index d300ac2ec89..116be7be395 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -1799,6 +1799,10 @@ process_options (void)
optimization_default_node = build_optimization_node (&global_options);
optimization_current_node = optimization_default_node;
+ if (flag_checking >= 2)
+ hash_table_sanitize_eq_limit
+ = PARAM_VALUE (PARAM_HASH_TABLE_VERIFICATION_LIMIT);
+
/* Please don't change global_options after this point, those changes won't
be reflected in optimization_{default,current}_node. */
}
--
2.21.0