Bug 47170 reports a leak of a FILE * in gcc/intl/localealias.c found by cppcheck. This appears to be a genuine leak. -fanalyzer does not currently report this but ought to, so I'm filing this bug to track this.
Code is actually in intl/localealias.c
The master branch has been updated by David Malcolm <dmalcolm@gcc.gnu.org>: https://gcc.gnu.org/g:15e7b93ba4256884c90198c678ed7eded4e73464 commit r11-3340-g15e7b93ba4256884c90198c678ed7eded4e73464 Author: David Malcolm <dmalcolm@redhat.com> Date: Fri Sep 18 17:34:50 2020 -0400 analyzer: decls are not on the heap Whilst debugging the remaining state explosion in PR analyzer/93355 I noticed that half of the states at an exploding program point had: 'malloc': {'&buf': 'non-heap'} whereas the other half didn't, presumably depending on whether the path to each enode had used this local buffer: char buf[400]; This patch tweaks malloc_state_machine::get_default_state to be smarter about this, so that we can implicitly treat pointers to decls as non-heap, preventing pointless differences between sm_state_map instances. With that, all of the states in question have equal (empty) malloc sm-state - though the state explosion continues for other reasons. gcc/analyzer/ChangeLog: PR analyzer/93355 * sm-malloc.cc (malloc_state_machine::get_default_state): Look at the base region when considering pointers. Treat pointers to decls as being non-heap.
The master branch has been updated by David Malcolm <dmalcolm@gcc.gnu.org>: https://gcc.gnu.org/g:6b828f69519a50e6e2961b62ea552bf89d287199 commit r11-3415-g6b828f69519a50e6e2961b62ea552bf89d287199 Author: David Malcolm <dmalcolm@redhat.com> Date: Thu Dec 19 16:15:09 2019 -0500 analyzer: add testcases for PR 93355 (intl/localealias.c leak) PR analyzer/93355 reports a missing diagnostic about a FILE leak in intl/localealias.c. This appears to be due to a issue in the feasibility-checking code, though there is also a state explosion. This patch adds test cases that I've been using when investigating this, two of them currently requiring -fno-analyzer-feasibility, and one currently requiring -Wno-analyzer-too-complex. gcc/testsuite/ChangeLog: PR analyzer/93355 * gcc.dg/analyzer/pr93355-localealias-feasibility.c: New test. * gcc.dg/analyzer/pr93355-localealias-simplified.c: New test. * gcc.dg/analyzer/pr93355-localealias.c: New test.
Current status is that there is testcase coverage for this in git, but the test requires: /* { dg-additional-options "-Wno-analyzer-too-complex -fno-analyzer-feasibility" } */ (a) It happens to successfully explore enough of the graph to find the leak, but hits complexity limits at 1 program point: pr93355-localealias.c:263:41: warning: terminating analysis for this program point: callstring: [(SN: 66 -> SN: 85 in _nl_expand_alias)] before (SN: 21 stmt: 0): _7 = *cp_107;EN: 163, EN: 170, EN: 177, EN: 370, EN: 557, EN: 794, EN: 801, EN: 808 [-Wanalyzer-too-complex] 263 | while (isspace ((unsigned char) cp[0])) | ~~^~~ (b) Without -fno-analyzer-feasibility, it falsely rejects the path as infeasible at the "locale_alias_path[0] != '\0'" within _nl_expand_alias: | 185 | added = 0; | | ~~~~~~~~~ | | | | | (3) ...to here | 186 | while (added == 0 && locale_alias_path[0] != '\0') | | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | | | | | | | (5) ...to here | | (4) following ‘true’ branch (when ‘added == 0’)... | | (6) following ‘true’ branch... | ‘_nl_expand_alias’: event 7 | |cc1: | (7): ...to here | ‘_nl_expand_alias’: events 8-9 | | 198 | if (start < locale_alias_path) | | ~~~~~~^~~~~~~~~~~~~~~~~~~ | | | | | (8) this path would have been rejected as infeasible at this edge: edge from EN: 88 to EN: 90; rejected constraint: INIT_VAL((*INIT_VAL(locale_alias_path))) == (const char)0; where region: {start_39, value: INIT_VAL(locale_alias_path)} which looks like an instance of PR 96374. (c) Also, the path it reports for the leak is rather verbose: it reports various events in _nl_expand_alias leading to the call into read_alias_file, but it really ought to just report the events in read_alias_file. diagnostic_manager ought to prune leading interprocedural events (I think I had code for this in one of my working copies)
The master branch has been updated by David Malcolm <dmalcolm@gcc.gnu.org>: https://gcc.gnu.org/g:f2f639c4a781016ad146d44f463714fe4295cb6e commit r11-7028-gf2f639c4a781016ad146d44f463714fe4295cb6e Author: David Malcolm <dmalcolm@redhat.com> Date: Mon Feb 1 21:52:41 2021 -0500 analyzer: add more feasibility test cases [PR93355,PR96374] This patch adds a couple more reduced test cases derived from the integration test for PR analyzer/93355. In both cases, the analyzer falsely rejects the buggy code paths as being infeasible due to PR analyzer/96374, and so the tests are marked as XFAIL for now. gcc/testsuite/ChangeLog: PR analyzer/93355 PR analyzer/96374 * gcc.dg/analyzer/pr93355-localealias-feasibility-2.c: New test. * gcc.dg/analyzer/pr93355-localealias-feasibility-3.c: New test.
The master branch has been updated by David Malcolm <dmalcolm@gcc.gnu.org>: https://gcc.gnu.org/g:8a2750086d57d1a2251d9239fa4e6c2dc9ec3a86 commit r11-7029-g8a2750086d57d1a2251d9239fa4e6c2dc9ec3a86 Author: David Malcolm <dmalcolm@redhat.com> Date: Mon Feb 1 21:54:11 2021 -0500 analyzer: directly explore within static functions [PR93355,PR96374] PR analyzer/93355 tracks that -fanalyzer fails to report the FILE * leak in read_alias_file in intl/localealias.c. One reason for the failure is that read_alias_file is marked as "static", and the path leading to the single call of read_alias_file is falsely rejected as infeasible due to PR analyzer/96374. I have been attempting to fix that bug, but don't have a good solution yet. Previously, -fanalyzer only directly explored "static" functions if they were needed for call summaries, instead forcing them to be indirectly explored, but if we have a feasibility bug like above, we will fail to report any issues in a function that's only called by such a falsely infeasible path. It now seems wrong to me to reject directly exploring static functions: even if there is currently no way to call a function, it seems reasonable to warn about bugs within them, since otherwise these latent bugs are a timebomb in the code. Hence this patch reworks toplevel_function_p to directly explore almost all functions, working around these feasiblity issues. It introduces a naming convention that "__analyzer_"-prefixed function names don't get directly explored, since this is useful in the analyzer's DejaGnu-based tests. This workaround gets PR analyzer/93355 closer to working, but unfortunately there is a second instance of PR analyzer/96374 within read_alias_file itself which means even with this patch -fanalyzer falsely rejects the path as infeasible. Still, this ought to help in other cases, and simplifies the implementation. gcc/analyzer/ChangeLog: PR analyzer/93355 PR analyzer/96374 * engine.cc (toplevel_function_p): Simplify so that we only reject functions with a "__analyzer_" prefix. (add_any_callbacks): Delete. (exploded_graph::build_initial_worklist): Update for dropped param of toplevel_function_p. (exploded_graph::build_initial_worklist): Don't bother looking for callbacks that are reachable from global initializers. gcc/testsuite/ChangeLog: PR analyzer/93355 PR analyzer/96374 * gcc.dg/analyzer/conditionals-3.c: Add "__analyzer_" prefix to support subroutines where necessary. * gcc.dg/analyzer/data-model-1.c: Likewise. * gcc.dg/analyzer/feasibility-1.c (called_by_test_6a): New. (test_6a): New. * gcc.dg/analyzer/params.c: Add "__analyzer_" prefix to support subroutines where necessary. * gcc.dg/analyzer/pr96651-2.c: Likewise. * gcc.dg/analyzer/signal-4b.c: Likewise. * gcc.dg/analyzer/single-field.c: Likewise. * gcc.dg/analyzer/torture/conditionals-2.c: Likewise.
(In reply to CVS Commits from comment #6) > The master branch has been updated by David Malcolm <dmalcolm@gcc.gnu.org>: > > https://gcc.gnu.org/g:8a2750086d57d1a2251d9239fa4e6c2dc9ec3a86 > > commit r11-7029-g8a2750086d57d1a2251d9239fa4e6c2dc9ec3a86 This patch probably also fixes (c) within comment #4 above.
(In reply to David Malcolm from comment #4) > (a) It happens to successfully explore enough of the graph to find the leak, > but hits complexity limits at 1 program point: > > pr93355-localealias.c:263:41: warning: terminating analysis for this program > point: callstring: [(SN: 66 -> SN: 85 in _nl_expand_alias)] before (SN: 21 > stmt: 0): _7 = *cp_107;EN: 163, EN: 170, EN: 177, EN: 370, EN: 557, EN: > 794, EN: 801, EN: 808 [-Wanalyzer-too-complex] > 263 | while (isspace ((unsigned char) cp[0])) > | ~~^~~ If I get rid of _nl_expand_alias then the state explosion goes away. There are 2 enodes at the callsite to read_alias_file, so the entrypoint starts with 3 enodes (both calls, plus independent analysis) Hence there's already a 3x for that function. Without that 3x, things seem reasonable within read_alias_file. So (a) may be really "just" a call summarization problem.
Going back to the summary from comment #4: (a) It happens to successfully explore enough of the graph to find the leak, but hits complexity limits at a program point (b) Without -fno-analyzer-feasibility, it falsely rejects the path as infeasible (c) is effectively fixed by r11-7029-g8a2750086d57d1a2251d9239fa4e6c2dc9ec3a86. The issue with (a) is that we have a non-trivial function calling a non-trivial function, and the lack of call summaries means that we get a state explosion due to expanding the callee 3 times (once directly, and twice at the single callsite, exploded in the callee). The callee has an issue that is found standalone (since r11-7029-g8a2750086d57d1a2251d9239fa4e6c2dc9ec3a86). Hence the state explosion on nested calls means that we won't fail to find the standalone issue. So although it would be good to fix the state explosion/call summary issue, I think it's lower priority than fixing the feasibility issue. (I spent some time this week looking at call summaries, but now think this is lower priority than the feasibility issue). For (b), I've tried several approaches, but now that we're trying multiple paths per dedupe candidate, it's hard to see where things go wrong. I've been experimenting with better ways to visualize this to try to make debugging it less painful.
The master branch has been updated by David Malcolm <dmalcolm@gcc.gnu.org>: https://gcc.gnu.org/g:3857edb5d32dcdc11d9a2fe3ad7c156c52a1ec7f commit r11-7640-g3857edb5d32dcdc11d9a2fe3ad7c156c52a1ec7f Author: David Malcolm <dmalcolm@redhat.com> Date: Thu Mar 11 17:46:37 2021 -0500 analyzer: new implementation of shortest feasible path [PR96374] The analyzer builds an exploded graph of (point,state) pairs and when it finds a problem, records a diagnostic at the relevant exploded node. Once it has finished exploring the graph, the analyzer needs to generate the shortest feasible path through the graph to each diagnostic's node. This is used: - for rejecting diagnostics that are infeasible (due to impossible sets of constraints), - for use in determining which diagnostic to use in each deduplication set (the one with the shortest path), and - for building checker_paths for the "winning" diagnostics, giving a list of events Prior to this patch the analyzer simply found the shortest path to the node, and then checked it for feasibility, which could lead to falsely rejecting diagnostics: "the shortest path, if feasible" is not the same as "the shortest feasible path" (PR analyzer/96374). An example is PR analyzer/93355, where this issue causes the analyzer to fail to emit a leak warning for a missing fclose on an error-handling path in intl/localealias.c. This patch implements a new algorithm for finding the shortest feasible path to an exploded node: instead of simply finding the shortest path, the new algorithm uses a worklist to iteratively build a tree of path prefixes, which are feasible paths by construction, until a path to the target node is found. The worklist is prioritized, so that the first feasible path discovered is the shortest possible feasible path. The algorithm continues trying paths until the target node is reached or a limit is exceeded, in which case the diagnostic is treated as being infeasible (which could still be a false negative, but is much less likely to happen than before). Iteratively building a tree of paths allows for work to be reused, and the tree can be dumped in .dot form (via a new -fdump-analyzer-feasibility option), making it much easier to debug compared to other approaches I tried. Doing so fixes the missing leak warning for PR analyzer/93355 and various other test cases. Testing: - I manually verified that the behavior is determistic using 50 builds of pr93355-localealias.c. All dumps were identical. - I manually verified that it still builds with --disable-analyzer. - Lightly tested with valgrind; no additional issues. - Lightly performance tested, showing a slight speed regression to the analyzer relative to before the patch, but correctness for this issue is more important than the slight performance hit for the analyzer. gcc/ChangeLog: PR analyzer/96374 * Makefile.in (ANALYZER_OBJS): Add analyzer/feasible-graph.o and analyzer/trimmed-graph.o. * doc/analyzer.texi (Analyzer Paths): Rewrite description of feasibility checking to reflect new implementation. * doc/invoke.texi (-fdump-analyzer-feasibility): Document new option. * shortest-paths.h (shortest_paths::get_shortest_distance): New. gcc/analyzer/ChangeLog: PR analyzer/96374 * analyzer.opt (-param=analyzer-max-infeasible-edges=): New param. (fdump-analyzer-feasibility): New flag. * diagnostic-manager.cc: Include "analyzer/trimmed-graph.h" and "analyzer/feasible-graph.h". (epath_finder::epath_finder): Convert m_sep to a pointer and only create it if !flag_analyzer_feasibility. (epath_finder::~epath_finder): New. (epath_finder::m_sep): Convert to a pointer. (epath_finder::get_best_epath): Add param "diag_idx" and use it when logging. Rather than finding the shortest path and then checking feasibility, instead use explore_feasible_paths unless !flag_analyzer_feasibility, in which case simply use the shortest path, and note if it is infeasible. Update for m_sep becoming a pointer. (class feasible_worklist): New. (epath_finder::explore_feasible_paths): New. (epath_finder::process_worklist_item): New. (class dump_eg_with_shortest_path): New. (epath_finder::dump_trimmed_graph): New. (epath_finder::dump_feasible_graph): New. (saved_diagnostic::saved_diagnostic): Add "idx" param, using it on new field m_idx. (saved_diagnostic::to_json): Dump m_idx. (saved_diagnostic::calc_best_epath): Pass m_idx to get_best_epath. Remove assertion that m_problem was set when m_best_epath is NULL. (diagnostic_manager::add_diagnostic): Pass an index when created saved_diagnostic instances. * diagnostic-manager.h (saved_diagnostic::saved_diagnostic): Add "idx" param. (saved_diagnostic::get_index): New accessor. (saved_diagnostic::m_idx): New field. * engine.cc (exploded_node::dump_dot): Call args.dump_extra_info. Move code to... (exploded_node::dump_processed_stmts): ...this new function and... (exploded_node::dump_saved_diagnostics): ...this new function. Add index of each diagnostic. (exploded_edge::dump_dot): Move bulk of code to... (exploded_edge::dump_dot_label): ...this new function. * exploded-graph.h (eg_traits::dump_args_t::dump_extra_info): New vfunc. (exploded_node::dump_processed_stmts): New decl. (exploded_node::dump_saved_diagnostics): New decl. (exploded_edge::dump_dot_label): New decl. * feasible-graph.cc: New file. * feasible-graph.h: New file. * trimmed-graph.cc: New file. * trimmed-graph.h: New file. gcc/testsuite/ChangeLog: PR analyzer/96374 * gcc.dg/analyzer/dot-output.c: Add -fdump-analyzer-feasibility to options. * gcc.dg/analyzer/feasibility-1.c (test_6): Remove xfail. (test_7): New. * gcc.dg/analyzer/pr93355-localealias-feasibility-2.c: Remove xfail. * gcc.dg/analyzer/pr93355-localealias-feasibility-3.c: Remove xfails. * gcc.dg/analyzer/pr93355-localealias-feasibility.c: Remove -fno-analyzer-feasibility from options. * gcc.dg/analyzer/pr93355-localealias.c: Likewise. * gcc.dg/analyzer/unknown-fns-4.c: Remove xfail.
The above patch fixes the feasibility issue in (b) above, and the analyzer now successfully emits a diagnostic for the leak. The only remaining issue is (a) (see comment #9 above).
(In reply to David Malcolm from comment #11) > The only remaining issue is (a) (see comment #9 above). Updating Summary accordingly, and adding to call summaries tracker