This is the mail archive of the
gcc-patches@gcc.gnu.org
mailing list for the GCC project.
Re: [PATCH 4/4] c-format.c: suggest the correct format string to use (PR c/64955)
On Thu, 2016-08-04 at 13:55 -0600, Jeff Law wrote:
> On 08/03/2016 09:45 AM, David Malcolm wrote:
> > This adds fix-it hints to c-format.c so that it can (sometimes)
> > suggest
> > the format string the user should have used.
> >
> > The patch adds selftests for the new code in c-format.c. These
> > selftests are thus lang-specific. This is the first time we've had
> > lang-specific selftests, and hence the patch also adds a langhook
> > for
> > running them. (Note that currently the Makefile only invokes the
> > selftests for cc1).
> >
> > Successfully bootstrapped®rtested in conjunction with the rest
> > of the
> > patch kit on x86_64-pc-linux-gnu.
> >
> > (The v2 version of the patch had a successful selftest run for
> > stage 1 on
> > powerpc-ibm-aix7.1.3.0 (gcc111) in conjunction with the rest of the
> > patch
> > kit, and a successful build of stage1 for all targets via config
> > -list.mk;
> > the patch has only been rebased since)
> >
> > OK for trunk if it passes testing?
> >
> > gcc/c-family/ChangeLog:
> > PR c/64955
> > * c-common.h (selftest::c_format_c_tests): New declaration.
> > (selftest::run_c_tests): New declaration.
> > * c-format.c: Include "selftest.h.
> > (format_warning_va): Add param "corrected_substring" and use
> > it to add a replacement fix-it hint.
> > (format_warning_at_substring): Likewise.
> > (format_warning_at_char): Update for new param of
> > format_warning_va.
> > (check_format_info_main): Pass "fki" to check_format_types.
> > (check_format_types): Add param "fki" and pass it to
> > format_type_warning.
> > (deref_n_times): New function.
> > (get_modifier_for_format_len): New function.
> > (selftest::test_get_modifier_for_format_len): New function.
> > (get_format_for_type): New function.
> > (format_type_warning): Add param "fki" and use it to attempt
> > to provide hints for argument types when calling
> > format_warning_at_substring.
> > (selftest::get_info): New function.
> > (selftest::assert_format_for_type_streq): New function.
> > (ASSERT_FORMAT_FOR_TYPE_STREQ): New macro.
> > (selftest::test_get_format_for_type_printf): New function.
> > (selftest::test_get_format_for_type_scanf): New function.
> > (selftest::c_format_c_tests): New function.
> >
> > gcc/c/ChangeLog:
> > PR c/64955
> > * c-lang.c (LANG_HOOKS_RUN_LANG_SELFTESTS): If CHECKING_P, wire
> > this up to selftest::run_c_tests.
> > (selftest::run_c_tests): New function.
> >
> > gcc/ChangeLog:
> > PR c/64955
> > * langhooks-def.h (LANG_HOOKS_RUN_LANG_SELFTESTS): New default
> > do-nothing langhook.
> > (LANG_HOOKS_INITIALIZER): Add LANG_HOOKS_RUN_LANG_SELFTESTS.
> > * langhooks.h (struct lang_hooks): Add run_lang_selftests.
> > * selftest-run-tests.c: Include "tree.h" and "langhooks.h".
> > (selftest::run_tests): Call lang_hooks.run_lang_selftests.
> >
> > gcc/testsuite/ChangeLog:
> > PR c/64955
> > * gcc.dg/format/diagnostic-ranges.c: Add fix-it hints to
> > expected
> > output.
> So presumably we always use the type of the argument as the "correct"
> type and assume the format string is what needs to be fixed (with the
> exception of getting the right amount of *s handled). That seems
> intuitively the right thing to do, but do we have a hit rate better
> than
> 50% in practice?
>
> This is OK. I'm really just curious about your thoughts/experiences
> on
> the heuristics.
Our current behavior is to emit a warning of the form:
test.c:9:18: warning: format ‘%i’ expects argument of type ‘int’, but
argument 2 has type ‘const char *’ [-Wformat=]
printf("hello %i", msg);
^
Note that the text of the diagnostic tells the user the two types
involved within the message.
My experience with printf-style issues of this form is that I know want
expression I want to print, but I don't necessarily know exactly
whether it's say, an int vs a long: it's not that I want to print "an
int", it's that I wanted to print some specific expression. Hence
(assuming this experience is typical) for mismatches of printf-style
calls, I think it's most helpful to the user to tell the user what
format code they need for the expression they want to print, which the
patch does, via the fix-it:
test.c:9:18: warning: format ‘%i’ expects argument of type ‘int’, but
argument 2 has type ‘const char *’ [-Wformat=]
printf("hello %i",
msg);
~^
%s
Note how the text of the diagnostic is unchanged; it's just the fixit that's new (and the underline, which is patch 3 of the kit).
For scanf-style calls this argument may not be so strong, but I think it still holds for the "do I have an int or a long?" cases (giving int * vs long *). It would be nice for scanf to detect the "you forgot to put an & in front of the destination lvalue" case and offer a fix-it hint for it, but I think that's a followup.
So I think it's likely to be much better than 50%, but that's a gut feeling, based on the above arguments.
I'm not aware of anyone who's done formal usability testing of a compiler's diagnostics. I think there are two distinct types of activity:
(a) the edit->try to compile->edit->try to compile -> "it compiles!" cycle (nested within a debug cycle)
(b) compiling pre-existing code, perhaps with a different configuration than it was written on, often written by someone else
I'd be very interested in seeing usability studies of both aspects of a compiler, but in particular of (a): is there a published corpus somewhere of the kind of half-written non-compiling code that happens during (a) out there? (I know this invites various sarcastic responses, but I'm serious :) )
(For myself, I attempt to run with a relatively recent build of gcc trunk as my day-to-day compiler, and if I see a diagnostic that could be improved whilst I'm hacking on something I make a note of it, and try to come back to it later as an RFE; bug reports of this form are most welcome).
Dave