Glibc allows a project to define custom printf conversions, via one of two APIs: register_printf_function, and more recently, register_printf_specifier. For instance, my project has a custom %v conversion, which takes a pointer to a vector structure that is heavily used within the project, and pretty-prints it.

The problem is, every time the custom format conversion is used, gcc (which is invoked with -Wall) generates warnings.

test.c:198: warning: unknown conversion type character ‘v’ in format
test.c:198: warning: too many arguments for format

I can get rid of the warnings with -Wno-format, but that also disables the rest of gcc's format string checking (which is very helpful!).

I'd like to request a finer grained means of control. A syntactical element (builtin/pragma/attribute/whatever) to pre-declare a format conversion and the typedef to check it against would be very nice, if complex. A much simpler solution would be a -Wno-format-unknown-specifier option, which skips the argument in the argument list and otherwise ignores conversions it doesn't recognize.

Any solution along those lines would be very helpful.
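For readers unfamiliar with the glibc API, the setup described above might look roughly like the following. This is a minimal sketch, not the reporter's code: `struct vec3`, `print_vec3`, `vec3_arginfo`, `register_vec3_conversion` and `vec3_snprintf` are hypothetical names, and only `register_printf_specifier`, `struct printf_info` and `PA_POINTER` come from glibc's <printf.h>. The format is passed through a parameter here, so GCC does not check it; writing "%v" as a literal in a direct printf call is what produces the two warnings quoted above.

```c
#include <assert.h>
#include <printf.h>   /* glibc-specific: register_printf_specifier */
#include <stdio.h>

/* Hypothetical project type that the custom %v conversion prints. */
struct vec3 { double x, y, z; };

/* Output handler: glibc calls this when it meets %v.
   args[0] points at the stored pointer argument. */
static int print_vec3(FILE *stream, const struct printf_info *info,
                      const void *const *args)
{
    const struct vec3 *v = *(const struct vec3 *const *)args[0];
    (void)info;   /* flags, width and precision are ignored in this sketch */
    return fprintf(stream, "(%g, %g, %g)", v->x, v->y, v->z);
}

/* Arginfo handler: tells glibc that %v consumes one pointer argument. */
static int vec3_arginfo(const struct printf_info *info, size_t n,
                        int *argtypes, int *size)
{
    (void)info;
    (void)size;   /* only needed for user-defined argument types */
    if (n > 0)
        argtypes[0] = PA_POINTER;
    return 1;
}

/* Returns 0 on success, -1 on failure. */
int register_vec3_conversion(void)
{
    return register_printf_specifier('v', print_vec3, vec3_arginfo);
}

/* Formats one vector; fmt is expected to contain a single %v. */
int vec3_snprintf(char *buf, size_t len, const char *fmt,
                  const struct vec3 *v)
{
    assert(buf != NULL && fmt != NULL && v != NULL);
    return snprintf(buf, len, fmt, v);
}
```

After calling register_vec3_conversion() once, vec3_snprintf(buf, sizeof buf, "v=%v", &v) fills buf using the custom handler.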
Which project is this?

I think a patch that adds -Wno-format-unknown-specifier would be accepted if properly submitted: http://gcc.gnu.org/contribute.html

See how the other Wformat-* options are defined in gcc/c-family/c.opt. Then, grep for "unknown conversion type character", and just change OPT_Wformat in the warning call. You'll have to add new testcases and adjust existing ones.
Created attachment 23380
47781.c

Here's a rather silly test case that demonstrates the problem with a simple "bool" type.

$ gcc -O2 -Wall -o 47781 47781.c
47781.c: In function ‘main’:
47781.c:12: warning: unknown conversion type character ‘b’ in format
47781.c:12: warning: unknown conversion type character ‘b’ in format
47781.c:12: warning: too many arguments for format
$ ./47781
true bool: TRUE
false bool: FALSE
$

(That's on x86-64 linux with gcc 4.4.4-14ubuntu5 and libc6 2.12.1-0ubuntu10.2.)
(In reply to comment #1) > I think a patch that adds -Wno-format-unknown-specifier would be accepted if > properly submitted: Okay, I'll take a look at putting together a patch. Thanks!
On Thu, 17 Feb 2011, mark-gcc at glines dot org wrote: > I'd like to request a finer grained means of control. A syntactical element > (builtin/pragma/attribute/whatever) to pre-declare a format conversion and the > typedef to check it against would be very nice, if complex. A much simpler > solution would be a -Wno-format-unknown-specifier option, which skips the > argument in the argument list and otherwise ignores conversions it doesn't > recognize. You can't reliably know how many arguments the unknown specifier takes, though assuming them to take one argument would be a reasonable heuristic for such an option. For the general issue, my inclination is that we should add plugin hooks into the format checking machinery that allow plugins to define formats with the full flexibility of all the format checking datastructures in GCC. Using GCC plugins for this avoids problems with defining complicated syntax in the source file to describe the peculiarities of different formats, which might constrain future changes to the format checking implementation by making too much of the internals visible to user source code, because by design GCC plugins can use GCC internals which are free to change incompatibly in ways that require plugin changes.
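The one-argument heuristic mentioned above can be sketched as a small scanner. This is only an illustration of the idea, not how c-format.c works: `count_format_args` is a hypothetical helper, it ignores POSIX positional arguments such as %1$d, and it deliberately treats every conversion other than %% as consuming exactly one argument, which is wrong for conversions like glibc's %m.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the variadic arguments a printf-style format would consume,
   assuming every conversion character, known or unknown, takes exactly
   one argument.  '*' in width or precision also takes one. */
size_t count_format_args(const char *fmt)
{
    size_t count = 0;

    assert(fmt != NULL);
    while (*fmt != '\0') {
        if (*fmt++ != '%')
            continue;
        if (*fmt == '%') {          /* literal percent sign */
            fmt++;
            continue;
        }
        fmt += strspn(fmt, "-+ #0'");           /* flags */
        if (*fmt == '*') {                      /* width from argument */
            count++;
            fmt++;
        } else {
            fmt += strspn(fmt, "0123456789");   /* literal width */
        }
        if (*fmt == '.') {                      /* precision */
            fmt++;
            if (*fmt == '*') {
                count++;
                fmt++;
            } else {
                fmt += strspn(fmt, "0123456789");
            }
        }
        fmt += strspn(fmt, "hlLqjzt");          /* length modifiers */
        if (*fmt == '\0')
            break;                              /* truncated conversion */
        count++;    /* the conversion itself: assume one argument */
        fmt++;
    }
    return count;
}
```

With this rule, "%d %v %s" is counted as three arguments even though %v is unknown, so a checker built this way would no longer report "too many arguments for format" for the case in the original report.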
Confirmed.
*** Bug 58512 has been marked as a duplicate of this bug. ***
Related to bug 15338.
(In reply to joseph@codesourcery.com from comment #4) > For the general issue, my inclination is that we should add plugin hooks > into the format checking machinery that allow plugins to define formats > with the full flexibility of all the format checking datastructures in > GCC. Using GCC plugins for this avoids problems with defining complicated > syntax in the source file to describe the peculiarities of different > formats, which might constrain future changes to the format checking > implementation by making too much of the internals visible to user source > code, because by design GCC plugins can use GCC internals which are free > to change incompatibly in ways that require plugin changes. What about using pragmas to describe the new format specifier?
On Thu, 21 Aug 2014, philipp_subx@redfish-solutions.com wrote:

> What about using pragmas to describe the new format specifier?

Those have the issue of either being limited in the sorts of formats that can be described, or else exposing more internals than seems desirable to expose as a stable interface. Plugins allow full flexibility (with possible instability of interfaces), though a stable subset (e.g. formats that take no length modifiers or flags) could probably be defined that has a stable interface in source files (such as through attributes or pragmas) that doesn't unduly constrain the internals of the implementation. But I think any such stable interface would not be able to describe the full generality of the existing built-in formats.

One interesting question would be whether a good stable interface can be defined that is general enough to describe GCC's internal formats - whether those are regular enough that a description isn't tied to hardcoded special cases or extremely complicated descriptions of what cases should / should not get warnings.
On Aug 21, 2014, at 11:06 AM, joseph at codesourcery dot com <gcc-bugzilla@gcc.gnu.org> wrote:

> One interesting question would be whether a good stable interface can be
> defined that is general enough to describe GCC's internal formats -
> whether those are regular enough that a description isn't tied to
> hardcoded special cases or extremely complicated descriptions of what
> cases should / should not get warnings.

Yeah, I agree: if the notation is adequate, all existing formats should be expressible using it.
(In reply to joseph@codesourcery.com from comment #4) > For the general issue, my inclination is that we should add plugin hooks > into the format checking machinery that allow plugins to define formats > with the full flexibility of all the format checking datastructures in > GCC. I agree this makes sense for the general case, but I wanted to point out that requiring a plugin for the simple cases is significantly harder for users than some in-source extension mechanism. E.g., firefox has a logging printf that accepts "%hs" to print char16_t* strings. This extension means that printf checking can't be used here. Requiring a plugin to deal with this situation would also be difficult. However letting one write __attribute__((printf, 1, 2, "hs", char16_t*)) would solve this nicely. I suppose I think that a format-for-a-specific-type is the most common kind of extension and so may deserve special treatment.
On Thu, 29 Jan 2015, tromey at gcc dot gnu.org wrote: > E.g., firefox has a logging printf that accepts "%hs" to print char16_t* > strings. This extension means that printf checking can't be used here. > Requiring a plugin to deal with this situation would also be difficult. > However letting one write __attribute__((printf, 1, 2, "hs", char16_t*)) > would solve this nicely. Do you then take this as being length modifier 'h' followed by format specifier 's', or is it a complete specifier on its own with everything that would otherwise be length and specifier being reparsed as an extension if it can't be parsed as a standard format? Do the flags "-wp" and "cR" for %s formats apply to this format?
(In reply to joseph@codesourcery.com from comment #12) > On Thu, 29 Jan 2015, tromey at gcc dot gnu.org wrote: > > > E.g., firefox has a logging printf that accepts "%hs" to print char16_t* > > strings. This extension means that printf checking can't be used here. > > Requiring a plugin to deal with this situation would also be difficult. > > However letting one write __attribute__((printf, 1, 2, "hs", char16_t*)) > > would solve this nicely. > > Do you then take this as being length modifier 'h' followed by format > specifier 's', or is it a complete specifier on its own with everything > that would otherwise be length and specifier being reparsed as an > extension if it can't be parsed as a standard format? Do the flags "-wp" > and "cR" for %s formats apply to this format? I see what you mean -- maybe "simple" isn't straightforward. I have been reconsidering the plugin approach given some new things I learned about the details of the firefox code (namely that it doesn't faithfully follow printf semantics, sigh). One additional note for this bug is that it would be nice if any such addition by a plugin worked properly with -Wmissing-format-attribute.
(In reply to Tom Tromey from comment #13) > I have been reconsidering the plugin approach given some new things > I learned about the details of the firefox code (namely that it doesn't > faithfully follow printf semantics, sigh). > > One additional note for this bug is that it would be nice if any > such addition by a plugin worked properly with -Wmissing-format-attribute. Note that plugins can define attributes. Perhaps one way to go about this would be to create a plugin that parses some kind of GCC_printf_format_info attribute that matches GCC's internal printf checking. Then move GCC's own format checking to use this attribute and enable the plugin by default when building GCC. This will give you as much flexibility as GCC format checking supports, and the plugin will be developed, built, tested and distributed alongside GCC. Users outside GCC just need to use the plugin and add the attributes to their own printf-style functions. Moreover, since the plugin is developed alongside GCC, it would be logical to add whatever hooks the plugin needs. Finally, nothing stops users from creating some kind of intermediate language that simplifies custom printf attribute syntax. Probably some C preprocessor magic could be enough. The challenge is to define the syntax of the attribute, but I think this challenge is unavoidable for whoever wants to implement this. You may present a simplified syntax to the user, but you still need to handle correctly all the complexity and corner cases in c-format.c.
(In reply to Tom Tromey from comment #11) > ...I wanted to point out that requiring a plugin for the simple cases is > significantly harder for users than some in-source extension mechanism. > > E.g., firefox has a logging printf that accepts "%hs" to print char16_t* > strings. This extension means that printf checking can't be used here. > Requiring a plugin to deal with this situation would also be difficult. > However letting one write __attribute__((printf, 1, 2, "hs", char16_t*)) > would solve this nicely. > > I suppose I think that a format-for-a-specific-type is the most common > kind of extension and so may deserve special treatment.

Wow, this is pretty much the same syntax I imagined when coming across this issue independently! Except in my idea, I changed the name of the format attribute to "printf-extended", to make it more obvious what the extra arguments are.

The case where I came across it was in trying to build a forked old version of bfd with -Wsuggest-attribute=format and -Wformat=2, where I was unable to attach a format attribute to the bfd_error_handler_type declaration. This is because _bfd_default_error_handler is extended to accept 2 new format specifiers: %A, which takes args of type asection*, and %B, which takes args of type bfd*. Using an attribute as proposed above, it'd be simple to just write something like:

__attribute__((format(printf-extended, 1, 2, "A", asection*, "B", bfd*)))

Checking the commentary on newer mainline versions of the _bfd_default_error_handler function, it looks like it does some additional weird stuff with the argument order, but support for extending the format attribute like this would still be a good start!
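For comparison, this is what the existing format attribute looks like on a printf-style wrapper today. It is a minimal sketch: `log_format` and the `PRINTF_LIKE` macro are hypothetical names, and the extended attribute forms proposed in this thread appear only in a comment because no GCC release accepts them.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Expand to the real GCC attribute when available, to nothing otherwise. */
#if defined(__GNUC__)
#define PRINTF_LIKE(fmt_idx, first_arg) \
    __attribute__((format(printf, fmt_idx, first_arg)))
#else
#define PRINTF_LIKE(fmt_idx, first_arg)
#endif

/* Proposed in this thread, NOT valid GCC syntax today:
 *   __attribute__((format(printf-extended, 1, 2,
 *                         "A", asection*, "B", bfd*)))
 */

/* Argument 3 is the format string, checking starts at argument 4. */
int log_format(char *buf, size_t len, const char *fmt, ...) PRINTF_LIKE(3, 4);

int log_format(char *buf, size_t len, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(buf != NULL && fmt != NULL);
    va_start(ap, fmt);
    n = vsnprintf(buf, len, fmt, ap);   /* standard conversions only */
    va_end(ap);
    return n;
}
```

With this declaration, a call such as log_format(buf, sizeof buf, "%s=%d", "x", 42) is checked by -Wformat, but a format containing %A or %B would draw the "unknown conversion type character" warning, which is the gap the proposed extension is meant to close.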
(In reply to Eric Gallager from comment #15) > Although checking the commentary on newer mainline versions of the > _bfd_default_error_handler function, it looks like it does some additional > weird stuff with the argument order, but still, support for extending the > format attribute like this would still be a good start! As suggested above, whoever wants to see progress on this should start developing a plugin that hooks into gcc/c-family/c-format.c. Whether your plugin will parse an attribute, a pragma, an internal representation or define the formats programmatically is up to you. The important thing is to figure out what plugin hooks you need in GCC to make it work, which will require making the format checking extensible at runtime. Until you get that part working, there is little benefit in discussing any possible syntax.
*** Bug 78183 has been marked as a duplicate of this bug. ***
The Linux kernel also has a bunch of printf format extensions that GCC doesn't know anything about: https://www.kernel.org/doc/Documentation/printk-formats.txt. The extensions take the form of a suffix to the %p directive and take a pointer argument, so the GCC format checker treats them all as a plain old %p. The sprintf optimization pass, however, punts when it sees a %p because it doesn't know how much output it might produce. That is largely because of the Linux kernel extensions, but partly also because each OS has its own slightly different format even for plain %p, and it was thought to be simpler to punt than to maintain a database of formats for all supported systems. It would be nice if there were an easy way to describe these extensions, not just for the benefit of the format checker but also so that the sprintf pass could do its own thing (i.e., check for buffer overflow).
(In reply to Martin Sebor from comment #18) > The Linux kernel also has a bunch of printf format extensions that GCC > doesn't know anything about: > https://www.kernel.org/doc/Documentation/printk-formats.txt. Further, the printf format extensions in the kernel are designed so as to not create warnings and so are often two character combinations by using a standard format specifier followed by a modifying character. I think that I ran a script once to count how much extra memory the two bytes vs a single byte take and it ended up in the 10s of kilobytes. While this may not sound like much, remember that the kernel data is never paged out and on some embedded systems, it actually does make a difference. Should GCC begin supporting custom printf format specifiers, then I would propose we begin changing them in the kernel to take advantage of that small savings.
Has anything changed since 2017 that would let me use register_printf_specifier and -Wformat warnings at the same time? These two features are in direct conflict with each other. I expected a GNU extension to be compatible with a GNU warning, and all I know to do right now is disable all of the warnings related to format specifiers.
(In reply to Cj Welborn from comment #20) > Has anything changed since 2017 that would let me use > register_printf_specifier and -Wformat warnings at the same time? Not that I know of; people still can't agree on a proper design AFAIK... contributions welcome: https://gcc.gnu.org/wiki/GettingStarted#Basics:_Contributing_to_GCC_in_10_easy_steps
Thank you for the reply. It's probably out of my league, but I might take a look when I get time.
I need this feature too. Instead of waiting several more years for an all-singing all-dancing solution, PLEASE can we have a simple solution that allows me to use a custom format specifier and skips a single argument for that specifier. I believe this would cover the vast majority of uses of custom format specifiers. My particular use case is that my application generates a lot of JSON strings, so in my printf replacement I want to implement a specifier similar to %s that performs JSON escaping on characters in the string.
(In reply to David Crocker from comment #23) > I need this feature too. Instead of waiting several more years for an > all-singing all-dancing solution, PLEASE can we have a simple solution that > allows me to use a custom format specifier and skips a single argument for > that specifier. I believe this would cover the vast majority of uses of custom > format specifiers. My particular use case is that my application generates a > lot of JSON strings, so in my printf replacement I want to implement a > specifier similar to %s that performs JSON escaping on characters in the > string. As a workaround, see the kernel doc linked earlier in this bug. gdb uses this hack as well -- e.g., it uses "%ps" in its formatter to mean a styled string, passed as a pointer to get past gcc's checking.
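Applied to the JSON use case, the %p-suffix workaround might look like the sketch below. Everything here is hypothetical: `json_snprintf`, the "%pj" spelling and the helper names are invented for illustration, and the formatter handles only %%, %s, %d, %p and %pj. The point is that GCC's checker sees "%pj" as a plain %p followed by a literal 'j', so the format attribute can stay enabled.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct out { char *buf; size_t cap; size_t len; };

/* Store c if it fits (leaving room for the terminator); always count it. */
static void put(struct out *o, char c)
{
    if (o->len + 1 < o->cap)
        o->buf[o->len] = c;
    o->len++;
}

static void put_str(struct out *o, const char *s)
{
    size_t i, n = strlen(s);
    for (i = 0; i < n; i++)
        put(o, s[i]);
}

/* Escape quotes, backslashes and control characters for a JSON string. */
static void put_json(struct out *o, const char *s)
{
    char tmp[8];
    for (; *s != '\0'; s++) {
        unsigned char c = (unsigned char)*s;
        if (c == '"' || c == '\\') { put(o, '\\'); put(o, (char)c); }
        else if (c == '\n') put_str(o, "\\n");
        else if (c == '\t') put_str(o, "\\t");
        else if (c < 0x20) {
            snprintf(tmp, sizeof tmp, "\\u%04x", (unsigned)c);
            put_str(o, tmp);
        } else put(o, (char)c);
    }
}

#if defined(__GNUC__)
__attribute__((format(printf, 3, 4)))
#endif
int json_snprintf(char *buf, size_t cap, const char *fmt, ...);

int json_snprintf(char *buf, size_t cap, const char *fmt, ...)
{
    struct out o = { buf, cap, 0 };
    char tmp[32];
    va_list ap;

    assert(buf != NULL && cap > 0 && fmt != NULL);
    va_start(ap, fmt);
    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') { put(&o, *fmt); continue; }
        switch (*++fmt) {
        case '%': put(&o, '%'); break;
        case 's': put_str(&o, va_arg(ap, const char *)); break;
        case 'd':
            snprintf(tmp, sizeof tmp, "%d", va_arg(ap, int));
            put_str(&o, tmp);
            break;
        case 'p':
            if (fmt[1] == 'j') {            /* hypothetical %pj extension */
                fmt++;
                put_json(&o, va_arg(ap, const char *));
            } else {                        /* plain %p */
                snprintf(tmp, sizeof tmp, "%p", va_arg(ap, void *));
                put_str(&o, tmp);
            }
            break;
        case '\0': fmt--; break;            /* trailing '%': end the loop */
        default: put(&o, '%'); put(&o, *fmt); break;
        }
    }
    va_end(ap);
    buf[o.len < cap ? o.len : cap - 1] = '\0';
    return (int)o.len;                      /* length before truncation */
}
```

A call such as json_snprintf(buf, sizeof buf, "{\"%s\": \"%pj\"}", "msg", (const void *)text) passes -Wformat, at the cost of the two-character spelling and the pointer cast that later comments in this bug complain about.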
10 years later, still no solution? I too would really like to be able to use custom single-argument, single-character format specifiers (e.g. %b to print an integer in binary).

The Linux-kernel work-around with %p<whatever> is painful for two reasons:

* My printf function doesn't support format modifiers like that. All format specifiers are single characters.
* You have to cast the integer value to a void*, and that just confuses the reader.
It's hard to define something that is sufficiently general to be useful but doesn't expose too much of the details of GCC's internal data structures for describing standard formats. %b for binary is now a standard C23 format and supported for GCC 12 and later.
It is really a pity this can't be resolved :( We have quite a few extensions in the SWI-Prolog source code, mostly for debug messages that deal with internal data structures. It makes writing debug messages a lot easier.

What about this: add a pragma that associates a regular expression with a list of types. For example (don't take this literally, I know little about the #pragma conventions):

#pragma GCC printf "t" (term_t)

Now if the compiler scans a template and finds a %, it runs through these declarations in the order they have been declared. On the first match it knows the type(s) expected from the argument list and continues after the regex match.
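The lookup order described above can be sketched as follows. This is a simplification under stated assumptions: it uses a literal prefix match instead of a regular expression, the table stands in for whatever the hypothetical pragmas would declare, and `custom_conv`, `declared_convs` and `match_custom_conv` are invented names that have nothing to do with GCC's actual data structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One declared conversion: the text after '%' and the expected type. */
struct custom_conv {
    const char *spec;
    const char *type_name;
};

/* Stand-in for declarations collected from the hypothetical pragmas,
   kept in declaration order. */
static const struct custom_conv declared_convs[] = {
    { "hs", "char16_t *" },
    { "t",  "term_t" },
    { "v",  "struct vec3 *" },
};

/* Given the text following a '%', return the first declaration whose
   spec is a prefix of it and store how many characters it consumed.
   Returns NULL (consumed = 0) when nothing matches, in which case a
   checker would fall back to the built-in conversions. */
const struct custom_conv *match_custom_conv(const char *after_percent,
                                            size_t *consumed)
{
    size_t i;

    assert(after_percent != NULL && consumed != NULL);
    for (i = 0; i < sizeof declared_convs / sizeof declared_convs[0]; i++) {
        size_t n = strlen(declared_convs[i].spec);
        if (strncmp(after_percent, declared_convs[i].spec, n) == 0) {
            *consumed = n;
            return &declared_convs[i];
        }
    }
    *consumed = 0;
    return NULL;
}
```

Because the first match wins, declaration order matters: a declaration for "h" placed before "hs" would shadow it. That is one of the corner cases any real design would have to settle, along with how flags and length modifiers interact with custom conversions.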
(In reply to Jan Wielemaker from comment #27) > It is really a pity this can't be resolved :( We have quite a few > extensions in the SWI-Prolog source code, mostly for debug messages that > deal with internal data structures. It makes writing debug messages a lot > easier. This can be resolved. It only needs someone interested enough to implement it or to pay someone else to implement it. There are a lot of suggestions on this page on how to proceed. Personally, I think the best would be to start with a simple design for an attribute rather than a pragma and implement it as a plugin for faster development and testing. Then submit it for comments. The simplest design that will get you faster feedback would be something that replaces some of the current GCC-specific printf formats, like %E, %T, %q, etc. (I don't remember where these are documented and implemented right now.) It just needs people with time and patience to do it.
As I said before, the issue is still how to define something general enough to be useful but that doesn't expose too much of the details of GCC's internal data structures for format checking.
(In reply to joseph@codesourcery.com from comment #29) > As I said before, the issue is still how to define something general > enough to be useful but that doesn't expose too much of the details of > GCC's internal data structures for format checking. Indeed, the first step does not even require looking at GCC code or an implementation, but coming up with a design that is flexible enough to be useful.