[PATCH] cache compute_objsize results in strlen/sprintf (PR 97373)

Jeff Law law@redhat.com
Mon Nov 23 21:04:58 GMT 2020



On 11/4/20 5:58 PM, Martin Sebor via Gcc-patches wrote:
> To determine the target of a pointer expression and the offset into
> it, the increasingly widely used compute_objsize function traverses
> the IL following the DEF statements of pointer variables, aggregating
> offsets from POINTER_PLUS assignments along the way.  It does that
> for many statements that involve pointers, including calls to
> built-in functions and (so far only) accesses to char arrays.  When
> a function has many such statements with pointers to the same objects
> but with different offsets, the traversal ends up visiting the same
> pointer assignments repeatedly and unnecessarily.
>
> To avoid this repeated traversal, the attached patch adds the ability
> to cache results obtained in prior calls to the function.  The cache
> is optional and only used when enabled.
>
> To exercise the cache I have enabled it for the strlen pass (which
> is probably the heaviest compute_objsize user).  That happens to
> resolve PR 97373 which tracks the pass' failure to detect sprintf
> overflowing allocated buffers at a non-constant offset.  I thought
> about making this a separate patch but the sprintf/strlen changes
> are completely mechanical so it didn't seem worth the effort.
>
> In the benchmarking I've done the cache isn't a huge win there but
> it does have a measurable difference in the project I'm wrapping up
> where most pointer assignments need to be examined.  The space used
> for the cache is negligible on average: fewer than 20 entries per
> Glibc function and about 6 for GCC.  The worst case in Glibc is
> 6 thousand entries and 10k in GCC.  Since the entries are sizable
> (216 bytes each) the worst case memory consumption can be reduced
> by adding a level of indirection.  A further savings can be obtained
> by replacing some of the offset_int members of the entries with
> HOST_WIDE_INT.
>
> The efficiency benefits of the cache should increase further as more
> of the access checking code is integrated into the same pass.  This
> should eventually include the checking currently done in the built-in
> expanders.
>
> Tested on x86_64-linux, along with Glibc and Binutils/GDB.
>
> Martin
>
> PS The patch add the new pointer_query class (loosely modeled on
> range_query) to builtins.{h,c}.  This should be only temporary,
> until the access checking code is moved into a file (and ultimately
> a pass) of its own.
>
> gcc-97373.diff
>
> PR middle-end/97373 - missing warning on sprintf into allocated destination
>
> gcc/ChangeLog:
>
> 	PR middle-end/97373
> 	* builtins.c (compute_objsize): Rename...
> 	(compute_objsize_r): to this.  Change order and types of arguments.
> 	Use new argument.  Adjust calls to self.
> 	(access_ref::get_ref): New member function.
> 	(pointer_query::pointer_query): New member function.
> 	(pointer_query::get_ref): Same.
> 	(pointer_query::put_ref): Same.
> 	(handle_min_max_size): Change order and types of arguments.
> 	(maybe_emit_free_warning): Add a test.
> 	* builtins.h (class pointer_query): New class.
> 	(compute_objsize): Declare an overload.
> 	* gimple-ssa-sprintf.c (get_destination_size):
> 	(handle_printf_call):
> 	* tree-ssa-strlen.c (adjust_last_stmt): Add an argument and use it.
> 	(maybe_warn_overflow): Same.
> 	(handle_builtin_strcpy): Same.
> 	(maybe_diag_stxncpy_trunc): Same.
> 	(handle_builtin_memcpy): Change argument type.  Adjust calls.
> 	(handle_builtin_strcat): Same.
> 	(handle_builtin_memset): Same.
> 	(handle_store): Same.
> 	(strlen_check_and_optimize_call): Same.
> 	(check_and_optimize_stmt): Same.
> 	(strlen_dom_walker): Add new data members.
> 	(strlen_dom_walker::before_dom_children): Use new member.
> 	(printf_strlen_execute): Dump cache performance counters.
> 	* tree-ssa-strlen.h (maybe_diag_stxncpy_trunc): Add argument.
> 	(handle_printf_call): Change argument type.
>
> gcc/testsuite/ChangeLog:
>
> 	PR middle-end/97373
> 	* gcc.dg/warn-strnlen-no-nul.c:
> 	* g++.dg/warn/Wmismatched-new-delete-2.C: New test.
> 	* gcc.dg/tree-ssa/builtin-sprintf-warn-25.c: New test.
I'm going to hesitatingly ACK this. 

The reason I saw hesitatingly is because I suspect that we're going to
see more cases where caching would be useful and I'd like to have a
general way to do that.  We've already got a ton of caching code in
Ranger and I fully expect more as we continue to write analysis code
that wants to walk backwards.  I don't want to see multiple
implementations of simple caching code.

Additionally, if you're doing so many queries in a subsequent patch that
caching is needed, then that may be an indication that you're doing too
much work.  I can't really evaluate that yet as I haven't looked at the
subsequent patch where this becomes a significant issue.

So expect pushback if further cache implementations are introduced.

Jeff



More information about the Gcc-patches mailing list