[PATCH 3/3] Implementation of -Wclobbered on tree-ssa

Vladislav Ivanishin vlad@ispras.ru
Thu Oct 3 12:01:00 GMT 2019


What it does
============

A fundamental limitation of the new approach is that it requires phi
nodes for variables to perform the analysis it needs to issue the
warning for them.  No phis - no warning.  In particular, it doesn't deal
with register asm variables (see gcc.target/i386/attr-returns_twice-1.c
which I xfailed) because of how they are currently implemented in GCC.
Also, in theory there can be struct members that SRA doesn't scalarize,
but some other compiler might.

Another "downgrade" from the old implementation is that currently, the
new pass is placed in such a position in the pipeline that it requires
optimize > 0 in order to be run.  But there's nothing fundamental about
it; that is something I just haven't experimented with.

I placed the new pass in tree-ssa-uninit.c, because I used static
functions defined there.  The pass ended up using only struct pred_info
and is_pred_expr_subset_of().

The main PR tracking the problems of the old implementation is 21161.
I also went through the "See also" list of PRs for it.  This new
implementation addresses them all, and the only failing test I saw for
x86_64-pc-linux-gnu is the already mentioned register asm test.

The existing RTL implementation doesn't check return values of
setjmp/vfork and thus does not distinguish the paths only realizable
upon a non-abnormal [*] return (this is what Wclobbered-pr21161-2.c is
about).  The new implementation does deal with it.

How it does it
==============
Though I think, the comments for the new pass are extensive, they are
scattered throughout the code, so here's a brief overview of the patch.

The pass considers all returns-twice functions, though {,_,__}setjmp,
{,_,__}sigsetjmp, and vfork are special in that we know the meaning of
their return values and that allows to suppress false positives (see
below [fp]).  I refer to the "setjmp-like (returns-twice) function call
site" as the setjmp_bb or simply "setjmp" in the code and the comments.
Without losing generality, let's do the same for the purpose of this
discussion.

The representation currently used for setjmp calls is convenient enough
for the analysis.  There is one ABNORMAL_DISPATCHER block per function
and abnormal edges from it to all the setjmps in the function. (A setjmp
call is the first non-debug instruction in its block).  Also, there is
an abnormal edge for every function call leading from the point right
after the call to the abnormal dispatcher.

Now let me walk you through the code in the order of execution (leading
to producing a warning).

First, the gate() method checks whether calls_setjmp is set for the
function.  No need to do anything if it isn't.

Then the main driver - warn_clobbered_setjmp() - is run.  It iterates
over all the setjmps.  For a setjmp, all phi functions in its block are
considered.  The basic idea is to warn when there's a redefinition of a
variable on the path from setjmp to a longjmp call (i.e. to the abnormal
dispatcher).  That redefinition is exactly the thing we intend to warn
about: its effect might not be visible (the value might get clobbered)
after an abnormal return from setjmp - when some register values get
restored from a stash.  A warning is warranted in such a situation
(given that there's also an initial definition flowing into the phi),
because the value of the variable upon an abnormal return from setjmp is
potentially ambiguous (unspecified).  phi_opnd_phantom_redefinition()
discovers such redefinitions.

It does so by "recursively" (via a worklist of phis) walking the phi's
operands.  So when we find an actual non-phi definition we already know
that it flows into the phi in the setjmp's basic block.  We also need to
check that the redefinition is actually reachable from the setjmp and
that it is actually a *re*definition flowing into the setjmp bb via an
abnormal edge (what we actually check is that it flows into the abnormal
dispatcher).

[fp] Now back to the driver.  In some cases such a redefinition may not
do any real harm: if the unspecified value is never used.  We now walk
all ultimate uses of the phi in all_uses_reference_specified_values_p()
and check that all SSA values that reach a use (via phis) are not
unspecified in all_phi_args_are_specified_p().  This is where the
knowledge of the meaning of return values comes into play.  In case of
the setjmp family and vfork, the return value indicates whether this is
the first (normal) or a subsequent (abnormal) return.  And the value may
only be unspecified upon an abnormal return.  So along with plain
reachability we check for compatibility of predicates and thus suppress
false positives on programs such as Wclobbered-pr21161-2.c.

I don't know how to check that a function is one of *setjmp or vfork
reliably, so I copied some code from special_function_p that uses string
comparison (see setjmp_or_vfork_p in the patch).  One option would be to
have builtins for all kinds of setjmp and vfork and compare against
enum built_in_function values, but creating new builtins just for that
doesn't feel right.

Aside
=====

There's a rough analogy with loop analysis here. Let's consider all
loops that flow via the setjmp bb, reach the abnormal dispatcher by
arbitrary path, and go back to the setjmp bb. We are interested in
variables that are live in the dispatcher and modified in a loop (and
thus have a phi node in a setjmp bb). Such variables are candidates for
the warning, but we need to filter the set further. Namely, we want the
definition in the loop eventually used, and not under a predicate that
is executable only prior to going via the abnormal dispatcher.

Testing
=======

Bootstrapped with BOOT_CFLAGS="-O2 -Wclobbered", regtested on
x86_64-pc-linux-gnu, and successfully built emacs from git with
CFLAGS="-O -Wclobbered".

I would like to add a couple more test cases: PR 61118 (there's no
preprocessed version in the BZ), and internal_lisp_condition_case() from
emacs/src/eval.c.  I don't yet have reduced versions, so let me do that
after the discussion of the main patch.


[*] About the terminology: "abnormal" is a GCC term.  There are abnormal
    edges and the ABNORMAL_DISPATHER internal function.  Quite
    intuitively, I call returns abnormal if they are reached at runtime
    via the abnormal dispatcher.  Linux man pages refer to them as
    '"fake" returns' (in quotes).  The C standard does not provide a
    terminology apart from "direct invocation"/"return from a call to
    the longjmp function".

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-Implementation-of-Wclobbered-on-tree-ssa.patch
Type: text/x-patch
Size: 34232 bytes
Desc: not available
URL: <http://gcc.gnu.org/pipermail/gcc-patches/attachments/20191003/0484cc37/attachment.bin>


More information about the Gcc-patches mailing list