Glibc allows a project to define custom printf conversions, via one of two APIs: register_printf_function, and more recently, register_printf_specifier. For instance, my project has a custom %v conversion, which takes a pointer to a vector structure that is heavily used within the project, and pretty-prints it.

The problem is, every time the custom format conversion is used, gcc (which is invoked with -Wall) generates warnings.

test.c:198: warning: unknown conversion type character ‘v’ in format
test.c:198: warning: too many arguments for format

I can get rid of the warnings with -Wno-format, but that also disables the rest of gcc's format string checking (which is very helpful!).

I'd like to request a finer grained means of control. A syntactical element (builtin/pragma/attribute/whatever) to pre-declare a format conversion and the typedef to check it against would be very nice, if complex. A much simpler solution would be a -Wno-format-unknown-specifier option, which skips the argument in the argument list and otherwise ignores conversions it doesn't recognize.

Any solution along those lines would be very helpful.
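For readers who have not used the glibc API, here is a minimal sketch of the kind of setup the report describes. It is not the reporter's code: struct vec3, print_vec3, vec3_arginfo and format_vec3 are hypothetical names, and the sketch only works with glibc. The pragmas silence the format warnings around the one call that uses %v, which is roughly the only per-call workaround available today.

```c
#include <printf.h>   /* glibc-specific: register_printf_specifier */
#include <stdio.h>

/* Hypothetical project type; the report does not show the real structure. */
struct vec3 { double x, y, z; };

/* Output handler: glibc calls this for every %v conversion. */
static int print_vec3(FILE *stream, const struct printf_info *info,
                      const void *const *args)
{
    (void)info;
    /* args[0] points at the stored argument, which is itself a pointer. */
    const struct vec3 *v = *(const struct vec3 *const *)args[0];
    return fprintf(stream, "<%g, %g, %g>", v->x, v->y, v->z);
}

/* Arginfo handler: tells glibc that %v consumes one pointer argument. */
static int vec3_arginfo(const struct printf_info *info, size_t n,
                        int *argtypes, int *size)
{
    (void)info;
    (void)size;
    if (n > 0)
        argtypes[0] = PA_POINTER;
    return 1;
}

/* Formats "v = <x, y, z>" into buf; returns snprintf's result or -1. */
int format_vec3(char *buf, size_t len, const struct vec3 *v)
{
    static int registered;
    if (!registered) {
        if (register_printf_specifier('v', print_vec3, vec3_arginfo) != 0)
            return -1;
        registered = 1;
    }
    /* GCC does not know %v, so the format warnings are disabled locally. */
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wformat"
#pragma GCC diagnostic ignored "-Wformat-extra-args"
    int n = snprintf(buf, len, "v = %v", (const void *)v);
#pragma GCC diagnostic pop
    return n;
}
```

Without the pragmas, the snprintf call is what triggers the two warnings quoted in the report when compiling with -Wall.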
Which project is this?

I think a patch that adds -Wno-format-unknown-specifier would be accepted if properly submitted: http://gcc.gnu.org/contribute.html

See how the other Wformat-* options are defined in gcc/c-family/c.opt. Then, grep for "unknown conversion type character", and just change OPT_Wformat in the warning call. You'll have to add new testcases and adjust existing ones.
Created attachment 23380
47781.c

Here's a rather silly test case that demonstrates the problem with a simple "bool" type.

$ gcc -O2 -Wall -o 47781 47781.c
47781.c: In function ‘main’:
47781.c:12: warning: unknown conversion type character ‘b’ in format
47781.c:12: warning: unknown conversion type character ‘b’ in format
47781.c:12: warning: too many arguments for format
$ ./47781
true bool: TRUE
false bool: FALSE
$

(That's on x86-64 linux with gcc 4.4.4-14ubuntu5 and libc6 2.12.1-0ubuntu10.2.)
(In reply to comment #1)
> I think a patch that adds -Wno-format-unknown-specifier would be accepted if
> properly submitted:

Okay, I'll take a look at putting together a patch. Thanks!
On Thu, 17 Feb 2011, mark-gcc at glines dot org wrote:

> I'd like to request a finer grained means of control. A syntactical element
> (builtin/pragma/attribute/whatever) to pre-declare a format conversion and the
> typedef to check it against would be very nice, if complex. A much simpler
> solution would be a -Wno-format-unknown-specifier option, which skips the
> argument in the argument list and otherwise ignores conversions it doesn't
> recognize.

You can't reliably know how many arguments the unknown specifier takes, though assuming them to take one argument would be a reasonable heuristic for such an option.

For the general issue, my inclination is that we should add plugin hooks into the format checking machinery that allow plugins to define formats with the full flexibility of all the format checking datastructures in GCC. Using GCC plugins for this avoids problems with defining complicated syntax in the source file to describe the peculiarities of different formats, which might constrain future changes to the format checking implementation by making too much of the internals visible to user source code, because by design GCC plugins can use GCC internals which are free to change incompatibly in ways that require plugin changes.
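To make the one-argument heuristic concrete, here is a toy sketch of a scanner that counts the arguments a format string consumes and assumes that every conversion character it does not recognize takes exactly one argument. It is purely illustrative and unrelated to the real implementation in c-format.c; struct format_scan and scan_format are made-up names, and the handling of flags and length modifiers is deliberately simplified.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Result of scanning a format string. */
struct format_scan {
    size_t args;     /* arguments consumed, including '*' width/precision */
    size_t unknown;  /* conversions not in the standard set */
};

/* Walks a printf-style format. Unknown conversion characters are assumed
   to consume exactly one argument (the heuristic discussed above). */
struct format_scan scan_format(const char *fmt)
{
    struct format_scan r = {0, 0};
    const char *p = fmt;

    while (*p != '\0') {
        if (*p++ != '%')
            continue;
        if (*p == '%') {                 /* "%%" is a literal percent sign */
            p++;
            continue;
        }
        while (*p != '\0' && strchr("-+ #0", *p) != NULL)   /* flags */
            p++;
        if (*p == '*') {                 /* width taken from an argument */
            r.args++;
            p++;
        } else {
            while (isdigit((unsigned char)*p))
                p++;
        }
        if (*p == '.') {                 /* optional precision */
            p++;
            if (*p == '*') {
                r.args++;
                p++;
            } else {
                while (isdigit((unsigned char)*p))
                    p++;
            }
        }
        while (*p != '\0' && strchr("hlLqjzt", *p) != NULL) /* length */
            p++;
        if (*p == '\0')                  /* incomplete conversion at end */
            break;
        if (strchr("diouxXeEfFgGaAcspn", *p) == NULL)
            r.unknown++;                 /* e.g. a custom %v or %b */
        r.args++;                        /* known or unknown: one argument */
        p++;
    }
    return r;
}
```

With this assumption, "%d %v %s" is counted as three arguments with one unknown conversion, so a checker could keep validating the %d and %s arguments instead of giving up. A conversion that takes zero or two arguments would break the count, which is the unreliability pointed out above.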
Confirmed.
*** Bug 58512 has been marked as a duplicate of this bug. ***
Related to bug 15338.
(In reply to joseph@codesourcery.com from comment #4)
> For the general issue, my inclination is that we should add plugin hooks
> into the format checking machinery that allow plugins to define formats
> with the full flexibility of all the format checking datastructures in
> GCC. Using GCC plugins for this avoids problems with defining complicated
> syntax in the source file to describe the peculiarities of different
> formats, which might constrain future changes to the format checking
> implementation by making too much of the internals visible to user source
> code, because by design GCC plugins can use GCC internals which are free
> to change incompatibly in ways that require plugin changes.

What about using pragmas to describe the new format specifier?
On Thu, 21 Aug 2014, philipp_subx@redfish-solutions.com wrote:

> What about using pragmas to describe the new format specifier?

Those have the issue of either being limited in the sorts of formats that can be described, or else exposing more internals than seems desirable to expose as a stable interface. Plugins allow full flexibility (with possible instability of interfaces), though a stable subset (e.g. formats that take no length modifiers or flags) could probably be defined that has a stable interface in source files (such as through attributes or pragmas) that doesn't unduly constrain the internals of the implementation. But I think any such stable interface would not be able to describe the full generality of the existing built-in formats.

One interesting question would be whether a good stable interface can be defined that is general enough to describe GCC's internal formats - whether those are regular enough that a description isn't tied to hardcoded special cases or extremely complicated descriptions of what cases should / should not get warnings.
On Aug 21, 2014, at 11:06 AM, joseph at codesourcery dot com <gcc-bugzilla@gcc.gnu.org> wrote:

> One interesting question would be whether a good stable interface can be
> defined that is general enough to describe GCC's internal formats -
> whether those are regular enough that a description isn't tied to
> hardcoded special cases or extremely complicated descriptions of what
> cases should / should not get warnings.

Yeah, I agree: if the notation is adequate, all existing formats should be expressible using it.
(In reply to joseph@codesourcery.com from comment #4)
> For the general issue, my inclination is that we should add plugin hooks
> into the format checking machinery that allow plugins to define formats
> with the full flexibility of all the format checking datastructures in
> GCC.

I agree this makes sense for the general case, but I wanted to point out that requiring a plugin for the simple cases is significantly harder for users than some in-source extension mechanism.

E.g., firefox has a logging printf that accepts "%hs" to print char16_t* strings. This extension means that printf checking can't be used here. Requiring a plugin to deal with this situation would also be difficult. However letting one write __attribute__((printf, 1, 2, "hs", char16_t*)) would solve this nicely.

I suppose I think that a format-for-a-specific-type is the most common kind of extension and so may deserve special treatment.
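For comparison, this is what the existing format attribute already provides for a printf-style wrapper. The sketch below is only an illustration: log_format is a hypothetical function, not firefox's logging printf. The attribute makes GCC check calls against the standard printf conversions, and it offers no way to declare an extra mapping such as "%hs" to char16_t*, which is the gap the extended syntax proposed above would fill.

```c
#include <stdarg.h>
#include <stdio.h>

/* The format string is parameter 3 and the variadic arguments start at
   parameter 4, so GCC checks callers against the standard conversions. */
__attribute__((format(printf, 3, 4)))
int log_format(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    int n = vsnprintf(buf, len, fmt, ap);  /* forwards to the C library */
    va_end(ap);
    return n;
}
```

A call such as log_format(buf, sizeof buf, "%s=%d", "x", 42) is checked as usual, while a call that uses a project-specific conversion runs into the unknown-conversion warning this bug is about.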
On Thu, 29 Jan 2015, tromey at gcc dot gnu.org wrote:

> E.g., firefox has a logging printf that accepts "%hs" to print char16_t*
> strings. This extension means that printf checking can't be used here.
> Requiring a plugin to deal with this situation would also be difficult.
> However letting one write __attribute__((printf, 1, 2, "hs", char16_t*))
> would solve this nicely.

Do you then take this as being length modifier 'h' followed by format specifier 's', or is it a complete specifier on its own with everything that would otherwise be length and specifier being reparsed as an extension if it can't be parsed as a standard format? Do the flags "-wp" and "cR" for %s formats apply to this format?
(In reply to joseph@codesourcery.com from comment #12)
> On Thu, 29 Jan 2015, tromey at gcc dot gnu.org wrote:
>
> > E.g., firefox has a logging printf that accepts "%hs" to print char16_t*
> > strings. This extension means that printf checking can't be used here.
> > Requiring a plugin to deal with this situation would also be difficult.
> > However letting one write __attribute__((printf, 1, 2, "hs", char16_t*))
> > would solve this nicely.
>
> Do you then take this as being length modifier 'h' followed by format
> specifier 's', or is it a complete specifier on its own with everything
> that would otherwise be length and specifier being reparsed as an
> extension if it can't be parsed as a standard format? Do the flags "-wp"
> and "cR" for %s formats apply to this format?

I see what you mean -- maybe "simple" isn't straightforward.

I have been reconsidering the plugin approach given some new things I learned about the details of the firefox code (namely that it doesn't faithfully follow printf semantics, sigh).

One additional note for this bug is that it would be nice if any such addition by a plugin worked properly with -Wmissing-format-attribute.
(In reply to Tom Tromey from comment #13)
> I have been reconsidering the plugin approach given some new things
> I learned about the details of the firefox code (namely that it doesn't
> faithfully follow printf semantics, sigh).
>
> One additional note for this bug is that it would be nice if any
> such addition by a plugin worked properly with -Wmissing-format-attribute.

Note that plugins can define attributes. Perhaps one way to go about this would be to create a plugin that parsed some kind of GCC_printf_format_info attribute that matches GCC internal printf checking. Then move GCC's own format checking to use this attribute and enable the plugin by default when building GCC. This will give you as much flexibility as GCC format checking supports, and the plugin will be developed, built, tested and distributed alongside GCC. Users outside GCC just need to use the plugin and add the attributes to their own printf-style functions. Moreover, since the plugin is developed alongside GCC, it would be logical to add whatever hooks the plugin needs.

Moreover, nothing stops users from creating some kind of intermediate language that simplifies custom printf attribute syntax. Probably some C preprocessor magic could be enough.

The challenge is to define the syntax of the attribute, but I think this challenge is unavoidable for whoever wants to implement this. You may present a simplified syntax to the user, but you still need to handle correctly all the complexity and corner cases in c-format.c.
(In reply to Tom Tromey from comment #11)
> ...I wanted to point out that requiring a plugin for the simple cases is
> significantly harder for users than some in-source extension mechanism.
>
> E.g., firefox has a logging printf that accepts "%hs" to print char16_t*
> strings. This extension means that printf checking can't be used here.
> Requiring a plugin to deal with this situation would also be difficult.
> However letting one write __attribute__((printf, 1, 2, "hs", char16_t*))
> would solve this nicely.
>
> I suppose I think that a format-for-a-specific-type is the most common
> kind of extension and so may deserve special treatment.

Wow, this is pretty much the same syntax I imagined when coming across this issue independently! Except in my idea, I changed the name of the format attribute to "printf-extended", to make it more obvious what the extra arguments are.

The case where I came across it was in trying to build a forked old version of bfd with -Wsuggest-attribute=format and -Wformat=2, where I was unable to attach a format attribute to the bfd_error_handler_type declaration. This is because _bfd_default_error_handler is extended to accept 2 new format specifiers: %A, which takes args of type asection*, and %B, which takes args of type bfd*. Using an attribute as proposed above, it'd be simple to just write something like,

__attribute__((format(printf-extended, 1, 2, "A", asection*, "B", bfd*)))

Although checking the commentary on newer mainline versions of the _bfd_default_error_handler function, it looks like it does some additional weird stuff with the argument order, but still, support for extending the format attribute like this would still be a good start!
(In reply to Eric Gallager from comment #15)
> Although checking the commentary on newer mainline versions of the
> _bfd_default_error_handler function, it looks like it does some additional
> weird stuff with the argument order, but still, support for extending the
> format attribute like this would still be a good start!

As suggested above, whoever wants to see progress on this should start developing a plugin that hooks into gcc/c-family/c-format.c. Whether your plugin will parse an attribute, a pragma, an internal representation or define the formats programmatically is up to you. The important thing is to figure out what plugin hooks you need in GCC to make it work, which will require making the format checking extensible at runtime. Until you get that part working, there is little benefit in discussing any possible syntax.
*** Bug 78183 has been marked as a duplicate of this bug. ***
The Linux kernel also has a bunch of printf format extensions that GCC doesn't know anything about: https://www.kernel.org/doc/Documentation/printk-formats.txt. The extensions take the form of a suffix to the %p directive and take a pointer argument, so the GCC format checker treats them all as a plain old %p. The sprintf optimization pass, however, punts when it sees a %p because it doesn't know how much output it might produce (largely because of the Linux kernel extensions, but partly also because each OS has its own slightly different format even for plain %p, and it was thought to be simpler to punt than to maintain a database of formats for all supported systems).

It would be nice if there were an easy way to describe these extensions not just for the benefit of the format checker but also so that the sprintf pass could do its own thing (i.e., check for buffer overflow).
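As a small illustration of why the output length of %p is hard to predict at compile time, the sketch below asks the C library at run time how many bytes a plain %p would produce for a given pointer. pointer_output_length is a hypothetical helper name; the result depends on the library and on the pointer value, and the kernel's suffix extensions would change it further.

```c
#include <stdio.h>

/* Returns the number of characters "%p" would produce for p, without
   writing anything: snprintf with a zero size only computes the length. */
int pointer_output_length(void *p)
{
    return snprintf(NULL, 0, "%p", p);
}
```

A description mechanism for format extensions would let the compiler bound this length statically instead of having to measure it like this.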
(In reply to Martin Sebor from comment #18)
> The Linux kernel also has a bunch of printf format extensions that GCC
> doesn't know anything about:
> https://www.kernel.org/doc/Documentation/printk-formats.txt.

Further, the printf format extensions in the kernel are designed so as to not create warnings and so are often two character combinations by using a standard format specifier followed by a modifying character. I think that I ran a script once to count how much extra memory the two bytes vs a single byte take and it ended up in the 10s of kilobytes. While this may not sound like much, remember that the kernel data is never paged out and on some embedded systems, it actually does make a difference.

Should GCC begin supporting custom printf format specifiers, then I would propose we begin changing them in the kernel to take advantage of that small savings.
Has anything changed since 2017 that would let me use register_printf_specifier and -Wformat warnings at the same time? These two features are in direct conflict with each other. I expected a GNU extension to be compatible with a GNU warning, and all I know to do right now is disable all of the warnings related to format specifiers.
(In reply to Cj Welborn from comment #20)
> Has anything changed since 2017 that would let me use
> register_printf_specifier and -Wformat warnings at the same time?

Not that I know of; people still can't agree on a proper design AFAIK... contributions welcome: https://gcc.gnu.org/wiki/GettingStarted#Basics:_Contributing_to_GCC_in_10_easy_steps
Thank you for the reply. It's probably out of my league, but I might take a look when I get time.