This is the mail archive of the mailing list for the GCC project.

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: [PATCH] Add __attribute__((format(wprintf, 2, 3))) etc. formatchecking support

On Fri, 14 Dec 2001, Jakub Jelinek wrote:

> Following patch does 2 things:
> 1) adds wprintf/wscanf/wcsftime style format checking
> 2) ensures there are no format attribute duplicates added
>    (the latter causes all format checking warnings doubled
>     e.g. if glibc declares sprintf with format(printf, 2, 3) attribute
>     but builtin-attrs.def added that attribute already first).
>    This is needed because attribs.c only checks if attribute values
>    are equal only if they are simple constants, which is not the
>    case for format attributes.

Improve the generic checking, then.  Also try to work out some way of 
checking for duplicate messages in the test harness so a test can be added 
for this - there are other bugs for which such a test facility would be 
useful (e.g., when you give an invalid argument to -std, c/2896).

> I think not overloading printf/scanf/strftime formats is better than
> overloading them, glibc has such attributes in its headers (commented out)
> for ages. On the other side, using wformat doesn't look nicely either
> (IMHO).

My preference is also for wprintf/wscanf/wcsftime.

> The only thing that is overloaded in the following patch is
> format_arg attribute, which now handles both char * -> char * translator
> functions and wchar_t * -> wchar_t * translators.

However, when this was discussed no-one mentioned any need for wchar_t
format_arg attributes.  Thus, I think we should have no support for
wchar_t in such functions (and for that matter no wide strfmon support)  
unless and until someone produces a use for it.

> The code checks that the attribute is not used for char * -> wchar_t *
> or wchar_t * -> char *.
> Ok to commit?

Before such wide string format support can go in, I want the following
properly resolved (and failed to get them resolved on the lists when I
tried before):

* What is the format of STRING_CSTs representing narrow strings on systems
where target bytes are wider than host bytes?  The target bytes may be a
multiple of host bytes (e.g. C4X), or may not be (e.g. PDP10).  They must
represent all values of that type.  Several existing places in the
compiler, that presume them to be the same, must be adapted.  This
includes defining the interface of some target macros in terms of whether
they take a count of host bytes or of target bytes.

* What is the format of wide STRING_CSTs in such cases?

* How wide a target byte need we support?  Can it be wider than 
HOST_WIDE_INT?  Can it be wider than HOST_WIDEST_INT?

* Likewise, for target wchar_t?  In both cases, GCC should fail to build 
if the types are too wide.

* As above, where the target hardware byte is smaller than the target C 
char (BITS_PER_UNIT smaller than CHAR_TYPE_SIZE).  There are no such 
systems currently in tree.  If we don't really support them, we should say 

* Then, there should be common functions used throughout the tree for
accessing individual elements of both narrow and wide strings.  They
should not be specific to format checking.

* There should also be separate such functions for extracting chars from a
narrow string, and extracting multibyte characters.  Both of these are
used in format strings; a narrow format string is a sequence of single
bytes (without reference to the multibyte structure) and multibyte
characters in a slightly confusing way (explained in
and the next edition of POSIX / Single Unix).

(I hope that we will only ever support prefix-free execution multibyte
character sets, though in the UK C Panel we failed to agree whether it was
a defect that (per DR#091) non-prefix-free character sets are allowed,
given that new features in C94 and C99 make them cause more problems than
they did in C90.)

I therefore haven't made a detailed technical review of this patch.

> 	* gcc.dg/format/format.h: Add prototypes for wide character functions.
> 	* gcc.dg/format/c94-wprintf-1.c: New test.
> 	* gcc.dg/format/c94-wscanf-1.c: New test.
> 	* gcc.dg/format/c94-wcsftime-1.c: New test.
> 	* gcc.dg/format/wbranch-1.c: New test.

The format testsuite is already run twice with and without -DWIDE.  This 
is how wide character tests should work mostly.  For example, you'd have

#define CHAR char
#define CHAR wchar_t

in the different modes, and use CHAR where appropriate.  Likewise PRINTF,
etc..  A very few tests, that in C90 mode (narrow) format features that
were added in C94 are rejected, need to move out of their current tests to
ones that #undef WIDE at the top.  A macro L would create narrow or wide
strings, as appropriate.

Note that the testsuite changes to use CHAR etc., but without actually
activating the alternative definitions for -DWIDE, can always be
independent of the rest of the patch (and go in before it).

C94 specifies that most of the function names there are *only* reserved if
the relevant header is included.  The relevant one already in a C90
reserved namespace is wcsftime.  Therefore, while wcsftime should be
checked in C94 mode (being reserved unconditionally there - but shouldn't
be checked in C90 mode even though the name is reserved there too), the
others shouldn't.  By default format.h should provide the declarations
with the attributes on for the wide functions, but should have a #ifdef
NOATTR mode not to provide them.  All cases of which wide functions are
checked in which modes should be checked, but separately from the basic
cases of checking that format checking is working properly for both narrow
and wide formats.

> +  if (info.format_type == wcsftime_format_type && info.first_arg_num != 0)
> +    {
> +      error ("wcsftime formats cannot format arguments");
> +      *no_add_attrs = true;
> +      return NULL_TREE;
> +    }

Get rid of this hardcoding of a name of a particular format type and use
the structures providing information about the formats, to do this for
both strftime and wcsftime (moving declarations about the file if
necessary for this).

> +  /* Wide string version.  */
> +  FMT_FLAG_WIDE = 256

I think it would be better for each table entry to be able to specify both
wide and narrow names for this format, or only one if only one of wide and
narrow is supported, rather than duplicating like this.

Joseph S. Myers

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]