Bug 99260 - analyzer does not track outcomes of realloc
Summary: analyzer does not track outcomes of realloc
Status: RESOLVED FIXED
Alias: None
Product: gcc
Classification: Unclassified
Component: analyzer (show other bugs)
Version: 11.0
: P3 normal
Target Milestone: ---
Assignee: David Malcolm
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-02-24 21:02 UTC by David Malcolm
Modified: 2021-08-31 09:50 UTC (History)
2 users (show)

See Also:
Host:
Target:
Build:
Known to work:
Known to fail:
Last reconfirmed:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description David Malcolm 2021-02-24 21:02:32 UTC
The analyzer currently has no knowledge of the behavior of "realloc" (leading e.g. to bug 99193).

For example, it currently fails to issue a warning for the classic "self-assignment realloc" gotcha, code of the form:
  p = realloc (p, 4096);

e.g.
  void *p = malloc (1024);
  p = realloc (p, 4096);
  free (p);

which ought to be reported as a leak (for the case where the realloc fails, losing the ptr to the original buffer).

Ideally, the analyzer would track various possible outcomes of realloc:
  - buffer grew successfully in-place
  - buffer was successfully moved to a larger allocation
  - buffer was successfully contracted
  - realloc failed, returning NULL, without freeing existing buffer.
or simply:
  - success: non-NULL is returned
  - failure: NULL is returned, existing buffer is not freed.

Unfortunately, the analyzer is currently structured so that each call
statement can only have at most one successor state; there is no
way to "bifurcate" the state, or have N-way splits into multiple
outcomes.  The existing "on_stmt" code works on a copy of the next
state, updating it in place, rather than copying it and making any
necessary changes.  I did this as an optimization to avoid unnecessary
copying of state objects, but it makes it hard to support multiple
outcomes.  (ideally our state objects would be immutable and thus
support trivial copying, alternatively, C++11 move semantics may
help here)
Comment 1 GCC Commits 2021-02-25 00:57:14 UTC
The master branch has been updated by David Malcolm <dmalcolm@gcc.gnu.org>:

https://gcc.gnu.org/g:a6baafcac5308be1a5d92c0b2a179495b7a24b52

commit r11-7381-ga6baafcac5308be1a5d92c0b2a179495b7a24b52
Author: David Malcolm <dmalcolm@redhat.com>
Date:   Wed Feb 24 19:55:40 2021 -0500

    analyzer: fix false positive on realloc [PR99193]
    
    PR analyzer/99193 describes various false positives from
    -Wanalyzer-mismatching-deallocation on realloc(3) calls
    of the form:
    
        |   31 |   void *p = malloc (1024);
        |      |             ^~~~~~~~~~~~~
        |      |             |
        |      |             (1) allocated here (expects deallocation with âfreeâ)
        |   32 |   void *q = realloc (p, 4096);
        |      |             ~~~~~~~~~~~~~~~~~
        |      |             |
        |      |             (2) deallocated with âreallocâ here; allocation at (1) expects deallocation with âfreeâ
        |
    
    The underlying issue is that the analyzer has no knowledge of
    realloc(3), and realloc has awkward semantics.
    
    Unfortunately, the analyzer is currently structured so that each call
    statement can only have at most one successor state; there is no
    way to "bifurcate" the state, or have N-way splits into multiple
    outcomes.  The existing "on_stmt" code works on a copy of the next
    state, updating it in place, rather than copying it and making any
    necessary changes.  I did this as an optimization to avoid unnecessary
    copying of state objects, but it makes it hard to support multiple
    outcomes.  (ideally our state objects would be immutable and thus
    support trivial copying, alternatively, C++11 move semantics may
    help here)
    
    I attempted a few approaches to implementing bifurcation within the
    existing state-update framework, but they were messy and thus likely
    buggy; a proper implementation would rework state-updating to
    generate copies, but this would be a major change, and seems too
    late for GCC 11.
    
    As a workaround, this patch implements enough of realloc(3) to
    suppress the false positives.
    
    This fixes the false positives in PR analyzer/99193.
    I've filed PR analyzer/99260 to track "properly" implementing realloc(3).
    
    gcc/analyzer/ChangeLog:
            PR analyzer/99193
            * region-model-impl-calls.cc (region_model::impl_call_realloc): New.
            * region-model.cc (region_model::on_call_pre): Call it.
            * region-model.h (region_model::impl_call_realloc): New decl.
            * sm-malloc.cc (enum wording): Add WORDING_REALLOCATED.
            (malloc_state_machine::m_realloc): New field.
            (use_after_free::describe_state_change): Add case for
            WORDING_REALLOCATED.
            (use_after_free::describe_final_event): Likewise.
            (malloc_state_machine::malloc_state_machine): Initialize
            m_realloc.
            (malloc_state_machine::on_stmt): Handle realloc by calling...
            (malloc_state_machine::on_realloc_call): New.
    
    gcc/testsuite/ChangeLog:
            PR analyzer/99193
            * gcc.dg/analyzer/pr99193-1.c: New test.
            * gcc.dg/analyzer/pr99193-2.c: New test.
            * gcc.dg/analyzer/pr99193-3.c: New test.
            * gcc.dg/analyzer/realloc-1.c: New test.
Comment 2 David Malcolm 2021-03-19 14:14:52 UTC
In reply to David Malcolm from comment #0)
> The analyzer currently has no knowledge of the behavior of "realloc"
> (leading e.g. to bug 99193).
> 
> For example, it currently fails to issue a warning for the classic
> "self-assignment realloc" gotcha, code of the form:
>   p = realloc (p, 4096);

Bug 56370 tracks this (for the non-analyzer case)
Comment 3 David Malcolm 2021-03-19 14:19:24 UTC
Also, bug 81452 tracks warning on realloc(p, 0)
Comment 4 GCC Commits 2021-08-30 22:40:11 UTC
The master branch has been updated by David Malcolm <dmalcolm@gcc.gnu.org>:

https://gcc.gnu.org/g:eafa9d969237fd8f712c4b25a8c58932c01f44b4

commit r12-3237-geafa9d969237fd8f712c4b25a8c58932c01f44b4
Author: David Malcolm <dmalcolm@redhat.com>
Date:   Mon Aug 30 18:36:31 2021 -0400

    analyzer: support "bifurcation"; reimplement realloc [PR99260]
    
    Most of the state-management code in the analyzer involves
    modifying state objects in-place, which implies a single outcome.
    (I originally implemented in-place modification because I wanted
    to avoid having to create copies of state objects, and it's now
    very difficult to change this aspect of the analyzer's design)
    
    However, there are various special-cases such as "realloc" for which
    it's best to split the state into multiple outcomes.
    
    This patch adds a mechanism for "bifurcating" the analysis in places
    where there isn't a split in the CFG, and uses it to implement realloc,
    in this case treating it as having 3 possible outcomes:
    - failure, returning NULL
    - success, growing the buffer in-place without moving it
    - success, allocating a new buffer, copying the content of the old
      buffer to it, and freeing the old buffer.
    
    gcc/ChangeLog:
            PR analyzer/99260
            * Makefile.in (ANALYZER_OBJS): Add analyzer/call-info.o.
    
    gcc/analyzer/ChangeLog:
            PR analyzer/99260
            * analyzer.h (class custom_edge_info): New class, adapted from
            exploded_edge::custom_info_t.  Make member functions const.
            Make update_model return bool, converting edge param from
            reference to a pointer, and adding a ctxt param.
            (class path_context): New class.
            * call-info.cc: New file.
            * call-info.h: New file.
            * engine.cc: Include "analyzer/call-info.h" and <memory>.
            (impl_region_model_context::impl_region_model_context): Update for
            new m_path_ctxt field.
            (impl_region_model_context::bifurcate): New.
            (impl_region_model_context::terminate_path): New.
            (impl_region_model_context::get_malloc_map): New.
            (impl_sm_context::impl_sm_context): Update for new m_path_ctxt
            field.
            (impl_sm_context::get_fndecl_for_call): Likewise.
            (impl_sm_context::set_next_state): Likewise.
            (impl_sm_context::warn): Likewise.
            (impl_sm_context::is_zero_assignment): Likewise.
            (impl_sm_context::get_path_context): New.
            (impl_sm_context::m_path_ctxt): New.
            (impl_region_model_context::on_condition): Update for new
            path_ctxt param.  Handle m_enode_for_diag being NULL.
            (impl_region_model_context::on_phi): Update for new path_ctxt
            param.
            (exploded_node::on_stmt): Add path_ctxt param, updating ctor calls
            to use it as necessary.  Use it to bail out after sm-handling,
            if needed.
            (exploded_node::detect_leaks): Update for new path_ctxt param.
            (dynamic_call_info_t::update_model): Update for conversion of
            exploded_edge::custom_info_t to custom_edge_info.
            (dynamic_call_info_t::add_events_to_path): Likewise.
            (rewind_info_t::update_model): Likewise.
            (rewind_info_t::add_events_to_path): Likewise.
            (exploded_edge::exploded_edge): Likewise.
            (exploded_graph::add_edge): Likewise.
            (exploded_graph::maybe_process_run_of_before_supernode_enodes):
            Update for new path_ctxt param.
            (class impl_path_context): New.
            (exploded_graph::process_node): Update for new path_ctxt param.
            Create an impl_path_context and pass it to exploded_node::on_stmt.
            Use it to terminate iterating stmts if terminate_path is called
            on it.  After processing a run of stmts, query path_ctxt to
            potentially terminate the analysis path, and/or to "bifurcate" the
            analysis into multiple additional paths.
            (feasibility_state::maybe_update_for_edge): Update for new
            update_model ctxt param.
            * exploded-graph.h
            (impl_region_model_context::impl_region_model_context): Add
            path_ctxt param.
            (impl_region_model_context::bifurcate): New.
            (impl_region_model_context::terminate_path): New
            (impl_region_model_context::get_ext_state): New.
            (impl_region_model_context::get_malloc_map): New.
            (impl_region_model_context::m_path_ctxt): New field.
            (exploded_node::on_stmt): Add path_ctxt param.
            (class exploded_edge::custom_info_t): Move to analyzer.h, renaming
            to custom_edge_info, and making the changes as noted in analyzer.h
            above.
            (exploded_edge::exploded_edge): Update for these changes to
            exploded_edge::custom_info_t.
            (exploded_edge::m_custom_info): Likewise.
            (class dynamic_call_info_t): Likewise.
            (class rewind_info_t): Likewise.
            (exploded_graph::add_edge): Likewise.
            * program-state.cc (program_state::on_edge): Update for new
            path_ctxt param.
            (program_state::push_call): Likewise.
            (program_state::returning_call): Likewise.
            (program_state::prune_for_point): Likewise.
            * region-model-impl-calls.cc: Include "analyzer/call-info.h".
            (call_details::get_fndecl_for_call): New.
            (region_model::impl_call_realloc): Reimplement.
            * region-model.cc (region_model::on_call_pre): Move call to
            impl_call_realloc to...
            (region_model::on_call_post): ...here.  Consolidate creation
            of call_details instance.
            (noop_region_model_context::bifurcate): New.
            (noop_region_model_context::terminate_path): New.
            * region-model.h (call_details::get_call_stmt): New.
            (call_details::get_fndecl_for_call): New.
            (region_model::on_realloc_with_move): New.
            (region_model_context::bifurcate): New.
            (region_model_context::terminate_path): New.
            (region_model_context::get_ext_state): New.
            (region_model_context::get_malloc_map): New.
            (noop_region_model_context::bifurcate): New.
            (noop_region_model_context::terminate_path): New.
            (noop_region_model_context::get_ext_state): New.
            (noop_region_model_context::get_malloc_map): New.
            * sm-malloc.cc: Include "analyzer/program-state.h".
            (malloc_state_machine::on_realloc_call): Reimplement.
            (malloc_state_machine::on_realloc_with_move): New.
            (region_model::on_realloc_with_move): New.
            * sm-signal.cc (class signal_delivery_edge_info_t): Update for
            conversion from exploded_edge::custom_info_t to custom_edge_info.
            * sm.h (sm_context::get_path_context): New.
            * svalue.cc (svalue::maybe_get_constant): Call
            unwrap_any_unmergeable.
    
    gcc/testsuite/ChangeLog:
            PR analyzer/99260
            * gcc.dg/analyzer/capacity-2.c: Update for changes to realloc
            analysis.
            * gcc.dg/analyzer/pr99193-1.c: Likewise.
            * gcc.dg/analyzer/pr99193-3.c: Likewise.
            * gcc.dg/analyzer/realloc-1.c: Likewise.  Add test coverage for
            realloc of non-heap pointer, realloc from mismatching allocator,
            and realloc on a freed pointer.
            * gcc.dg/analyzer/realloc-2.c: New test.
Comment 5 David Malcolm 2021-08-30 22:47:33 UTC
Should be fixed by the above patch for gcc 12.
Comment 6 Richard W.M. Jones 2021-08-31 09:50:55 UTC
That's excellent news, thanks.  We'll get around to trying this
when GCC 12 appears in Rawhide.