This is the mail archive of the gcc-patches@gcc.gnu.org mailing list for the GCC project.



Re: Push HOST_EBCDIC down into libiberty


DJ Delorie <dj@redhat.com> writes:

> I would prefer something less acronymish, like HOST_CHARSET_ASCII (vs
> HC_ASCII).  And we should fix libiberty/hex.c also, as it has its own
> test for the host charset.  We should also document the new
> functionality somehow (rather than have libiberty.texi fall even
> further behind).  Otherwise I'm OK with it.

How's this?

include:
        * safe-ctype.h (HC_UNKNOWN, HC_ASCII, HC_EBCDIC): Rename to
        HOST_CHARSET_UNKNOWN, HOST_CHARSET_ASCII, HOST_CHARSET_EBCDIC
        respectively.
libiberty:
        * safe-ctype.c: Use HOST_CHARSET_ASCII and HOST_CHARSET_EBCDIC,
        not HC_ASCII and HC_EBCDIC.
        Add documentation in form expected by gather-docs.
        * hex.c: Use HOST_CHARSET, not hand-coded check of character set.
        * Makefile.in, functions.texi: Regenerate.
gcc:
        * config/i370/i370.c, config/i370/i370.h: Use HOST_CHARSET_ASCII
        and HOST_CHARSET_EBCDIC, not HC_ASCII and HC_EBCDIC.
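
For illustration only (this is not part of the patch): a rough sketch of how
a caller might dispatch on the renamed macros at compile time.  It copies the
detection logic so that it builds standalone; real code would include
safe-ctype.h instead of redefining anything.  host_charset_name is a
hypothetical helper and does not exist in libiberty.

```c
/* Local copies of the macros from safe-ctype.h, repeated here only so
   the sketch builds without libiberty's include directory.  */
#define HOST_CHARSET_UNKNOWN 0
#define HOST_CHARSET_ASCII   1
#define HOST_CHARSET_EBCDIC  2

#if '\n' == 0x0A && ' ' == 0x20 && '0' == 0x30 \
    && 'A' == 0x41 && 'a' == 0x61 && '!' == 0x21
# define HOST_CHARSET HOST_CHARSET_ASCII
#elif '\n' == 0x15 && ' ' == 0x40 && '0' == 0xF0 \
    && 'A' == 0xC1 && 'a' == 0x81 && '!' == 0x5A
# define HOST_CHARSET HOST_CHARSET_EBCDIC
#else
# define HOST_CHARSET HOST_CHARSET_UNKNOWN
#endif

/* Hypothetical helper (not in libiberty): name the character set that
   the preprocessor detected.  The choice is made at compile time.  */
static const char *
host_charset_name (void)
{
#if HOST_CHARSET == HOST_CHARSET_ASCII
  return "ASCII";
#elif HOST_CHARSET == HOST_CHARSET_EBCDIC
  return "EBCDIC";
#else
  return "unknown";
#endif
}
```

Because the test is a preprocessor comparison, it costs nothing at run time,
and code for the other character set is never compiled in.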

===================================================================
Index: include/safe-ctype.h
--- include/safe-ctype.h	21 Jun 2003 23:22:29 -0000	1.5
+++ include/safe-ctype.h	22 Jun 2003 06:49:45 -0000
@@ -40,19 +40,19 @@ Boston, MA 02111-1307, USA.  */
 #endif
 
 /* Determine host character set.  */
-#define HC_UNKNOWN 0
-#define HC_ASCII   1
-#define HC_EBCDIC  2
+#define HOST_CHARSET_UNKNOWN 0
+#define HOST_CHARSET_ASCII   1
+#define HOST_CHARSET_EBCDIC  2
 
 #if  '\n' == 0x0A && ' ' == 0x20 && '0' == 0x30 \
    && 'A' == 0x41 && 'a' == 0x61 && '!' == 0x21
-#  define HOST_CHARSET HC_ASCII
+#  define HOST_CHARSET HOST_CHARSET_ASCII
 #else
 # if '\n' == 0x15 && ' ' == 0x40 && '0' == 0xF0 \
    && 'A' == 0xC1 && 'a' == 0x81 && '!' == 0x5A
-#  define HOST_CHARSET HC_EBCDIC
+#  define HOST_CHARSET HOST_CHARSET_EBCDIC
 # else
-#  define HOST_CHARSET HC_UNKNOWN
+#  define HOST_CHARSET HOST_CHARSET_UNKNOWN
 # endif
 #endif
 
===================================================================
Index: libiberty/safe-ctype.c
--- libiberty/safe-ctype.c	21 Jun 2003 23:22:30 -0000	1.4
+++ libiberty/safe-ctype.c	22 Jun 2003 06:49:46 -0000
@@ -19,15 +19,100 @@ License along with libiberty; see the fi
 not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330,
 Boston, MA 02111-1307, USA.  */
 
-/* This is a compatible replacement of the standard C library's <ctype.h>
-   with the following properties:
+/*
 
-   - Implements all isxxx() macros required by C99.
-   - Also implements some character classes useful when
-     parsing C-like languages.
-   - Does not change behavior depending on the current locale.
-   - Behaves properly for all values in the range of a signed or
-     unsigned char.  */
+@defvr Extension HOST_CHARSET
+This macro indicates the basic character set and encoding used by the
+host: more precisely, the encoding used for character constants in
+preprocessor @samp{#if} statements (the C "execution character set").
+It is defined by @file{safe-ctype.h}, and will be an integer constant
+with one of the following values:
+
+@ftable @code
+@item HOST_CHARSET_UNKNOWN
+The host character set is unknown - that is, not one of the next two
+possibilities.
+
+@item HOST_CHARSET_ASCII
+The host character set is ASCII.
+
+@item HOST_CHARSET_EBCDIC
+The host character set is some variant of EBCDIC.  (Only one of the
+nineteen EBCDIC varying characters is tested; exercise caution.)
+@end ftable
+@end defvr
+
+@deffn  Extension ISALPHA  (@var{c})
+@deffnx Extension ISALNUM  (@var{c})
+@deffnx Extension ISBLANK  (@var{c})
+@deffnx Extension ISCNTRL  (@var{c})
+@deffnx Extension ISDIGIT  (@var{c})
+@deffnx Extension ISGRAPH  (@var{c})
+@deffnx Extension ISLOWER  (@var{c})
+@deffnx Extension ISPRINT  (@var{c})
+@deffnx Extension ISPUNCT  (@var{c})
+@deffnx Extension ISSPACE  (@var{c})
+@deffnx Extension ISUPPER  (@var{c})
+@deffnx Extension ISXDIGIT (@var{c})
+
+These twelve macros are defined by @file{safe-ctype.h}.  Each has the
+same meaning as the corresponding macro (with name in lowercase)
+defined by the standard header @file{ctype.h}.  For example,
+@code{ISALPHA} returns true for alphabetic characters and false for
+others.  However, there are two differences between these macros and
+those provided by @file{ctype.h}:
+
+@itemize @bullet
+@item These macros are guaranteed to have well-defined behavior for all 
+values representable by @code{signed char} and @code{unsigned char}, and
+for @code{EOF}.
+
+@item These macros ignore the current locale; they are true for these
+fixed sets of characters:
+@multitable {@code{XDIGIT}} {yada yada yada yada yada yada yada yada}
+@item @code{ALPHA}  @tab @kbd{A-Za-z}
+@item @code{ALNUM}  @tab @kbd{A-Za-z0-9}
+@item @code{BLANK}  @tab @kbd{space tab}
+@item @code{CNTRL}  @tab @code{!PRINT}
+@item @code{DIGIT}  @tab @kbd{0-9}
+@item @code{GRAPH}  @tab @code{ALNUM || PUNCT}
+@item @code{LOWER}  @tab @kbd{a-z}
+@item @code{PRINT}  @tab @code{GRAPH ||} @kbd{space}
+@item @code{PUNCT}  @tab @kbd{`~!@@#$%^&*()_-=+[@{]@}\|;:'",<.>/?}
+@item @code{SPACE}  @tab @kbd{space tab \n \r \f \v}
+@item @code{UPPER}  @tab @kbd{A-Z}
+@item @code{XDIGIT} @tab @kbd{0-9A-Fa-f}
+@end multitable
+
+Note that, if the host character set is ASCII or a superset thereof,
+all these macros will return false for all values of @code{char} outside
+the range of 7-bit ASCII.  In particular, both ISPRINT and ISCNTRL return
+false for characters with numeric values from 128 to 255.
+@end itemize
+@end deffn
+
+@deffn  Extension ISIDNUM         (@var{c})
+@deffnx Extension ISIDST          (@var{c})
+@deffnx Extension IS_VSPACE       (@var{c})
+@deffnx Extension IS_NVSPACE      (@var{c})
+@deffnx Extension IS_SPACE_OR_NUL (@var{c})
+@deffnx Extension IS_ISOBASIC     (@var{c})
+These six macros are defined by @file{safe-ctype.h} and provide
+additional character classes which are useful when doing lexical
+analysis of C or similar languages.  They are true for the following
+sets of characters:
+
+@multitable {@code{SPACE_OR_NUL}} {yada yada yada yada yada yada yada yada}
+@item @code{IDNUM}        @tab @kbd{A-Za-z0-9_}
+@item @code{IDST}         @tab @kbd{A-Za-z_}
+@item @code{VSPACE}       @tab @kbd{\r \n}
+@item @code{NVSPACE}      @tab @kbd{space tab \f \v \0}
+@item @code{SPACE_OR_NUL} @tab @code{VSPACE || NVSPACE}
+@item @code{ISOBASIC}     @tab @code{VSPACE || NVSPACE || PRINT}
+@end multitable
+@end deffn
+
+*/
 
 #include "ansidecl.h"
 #include <safe-ctype.h>
@@ -68,7 +153,7 @@ Boston, MA 02111-1307, USA.  */
 #define S  (const unsigned short) (nv|sp|bl|pr)	/* space */
 
 /* Are we ASCII? */
-#if HOST_CHARSET == HC_ASCII
+#if HOST_CHARSET == HOST_CHARSET_ASCII
 
 const unsigned short _sch_istable[256] =
 {
@@ -161,7 +246,7 @@ const unsigned char _sch_toupper[256] =
 };
 
 #else
-# if HOST_CHARSET == HC_EBCDIC
+# if HOST_CHARSET == HOST_CHARSET_EBCDIC
   #error "FIXME: write tables for EBCDIC"
 # else
   #error "Unrecognized host character set"
===================================================================
Index: libiberty/hex.c
--- libiberty/hex.c	15 May 2003 19:02:13 -0000	1.6
+++ libiberty/hex.c	22 Jun 2003 06:49:46 -0000
@@ -19,6 +19,11 @@ Boston, MA 02111-1307, USA.  */
 
 #include <stdio.h>  /* for EOF */
 #include "libiberty.h"
+#include "safe-ctype.h" /* for HOST_CHARSET_ASCII */
+
+#if EOF != -1
+ #error "hex.c requires EOF == -1"
+#endif
 
 /*
 
@@ -62,9 +67,7 @@ systems.
 
 
 /* Are we ASCII? */
-#if '\n' == 0x0A && ' ' == 0x20 && '0' == 0x30 \
-  && 'A' == 0x41 && 'a' == 0x61 && '!' == 0x21 \
-  && EOF == -1
+#if HOST_CHARSET == HOST_CHARSET_ASCII
 
 const unsigned char _hex_value[_hex_array_size] =
 {
===================================================================
Index: gcc/config/i370/i370.c
--- gcc/config/i370/i370.c	21 Jun 2003 23:22:29 -0000	1.38
+++ gcc/config/i370/i370.c	22 Jun 2003 06:49:41 -0000
@@ -121,7 +121,7 @@ static bool i370_rtx_costs PARAMS ((rtx,
 #ifdef TARGET_HLASM
 
 #define MVS_HASH_PRIME 999983
-#if HOST_CHARSET == HC_EBCDIC
+#if HOST_CHARSET == HOST_CHARSET_EBCDIC
 #define MVS_SET_SIZE 256
 #else
 #define MVS_SET_SIZE 128
@@ -156,7 +156,7 @@ static alias_node_t *alias_anchor = 0;
    and must handled in a special manner.  */
 static const char *const mvs_function_table[MVS_FUNCTION_TABLE_LENGTH] =
 {
-#if HOST_CHARSET == HC_EBCDIC /* Changed for EBCDIC collating sequence */
+#if HOST_CHARSET == HOST_CHARSET_EBCDIC /* Changed for EBCDIC collating sequence */
    "ceil",     "edc_acos", "edc_asin", "edc_atan", "edc_ata2", "edc_cos",
    "edc_cosh", "edc_erf",  "edc_erfc", "edc_exp",  "edc_gamm", "edc_lg10",
    "edc_log",  "edc_sin",  "edc_sinh", "edc_sqrt", "edc_tan",  "edc_tanh",
@@ -176,7 +176,7 @@ static const char *const mvs_function_ta
 #endif /* TARGET_HLASM */
 /* ===================================================== */
 
-#if defined(TARGET_EBCDIC) && HOST_CHARSET == HC_ASCII
+#if defined(TARGET_EBCDIC) && HOST_CHARSET == HOST_CHARSET_ASCII
 /* ASCII to EBCDIC conversion table.  */
 static const unsigned char ascebc[256] =
 {
@@ -231,7 +231,7 @@ static const unsigned char ascebc[256] =
 };
 #endif /* target EBCDIC, host ASCII */
 
-#if !defined(TARGET_EBCDIC) && HOST_CHARSET == HC_EBCDIC
+#if !defined(TARGET_EBCDIC) && HOST_CHARSET == HOST_CHARSET_EBCDIC
 /* EBCDIC to ASCII conversion table.  */
 static const unsigned char ebcasc[256] =
 {
@@ -350,11 +350,11 @@ char
 mvs_map_char (c)
      int c;
 {
-#if defined(TARGET_EBCDIC) && HOST_CHARSET == HC_ASCII
+#if defined(TARGET_EBCDIC) && HOST_CHARSET == HOST_CHARSET_ASCII
   fprintf (stderr, "mvs_map_char: TE & !HE: c = %02x\n", c);
   return ascebc[c];
 #else
-#if !defined(TARGET_EBCDIC) && HOST_CHARSET == HC_EBCDIC
+#if !defined(TARGET_EBCDIC) && HOST_CHARSET == HOST_CHARSET_EBCDIC
   fprintf (stderr, "mvs_map_char: !TE & HE: c = %02x\n", c);
   return ebcasc[c];
 #else
===================================================================
Index: gcc/config/i370/i370.h
--- gcc/config/i370/i370.h	21 Jun 2003 23:22:29 -0000	1.65
+++ gcc/config/i370/i370.h	22 Jun 2003 06:49:43 -0000
@@ -141,7 +141,7 @@ extern size_t mvs_function_name_length;
 /* but only define it if really needed, since otherwise it will break builds */
 
 #ifdef TARGET_EBCDIC
-#if HOST_CHARSET == HC_EBCDIC
+#if HOST_CHARSET == HOST_CHARSET_EBCDIC
 #define MAP_CHARACTER(c) ((char)(c))
 #else
 #define MAP_CHARACTER(c) ((char)mvs_map_char (c))
===================================================================
Index: libiberty/Makefile.in
--- libiberty/Makefile.in	16 Apr 2003 22:42:07 -0000	1.92
+++ libiberty/Makefile.in	22 Jun 2003 06:49:46 -0000
@@ -447,7 +447,8 @@ getpwd.o: config.h $(INCDIR)/ansidecl.h 
 getruntime.o: config.h $(INCDIR)/ansidecl.h $(INCDIR)/libiberty.h
 hashtab.o: config.h $(INCDIR)/ansidecl.h $(INCDIR)/hashtab.h \
 	$(INCDIR)/libiberty.h
-hex.o: $(INCDIR)/ansidecl.h $(INCDIR)/libiberty.h
+hex.o: $(INCDIR)/ansidecl.h $(INCDIR)/libiberty.h \
+	$(INCDIR)/safe-ctype.h
 lbasename.o: $(INCDIR)/ansidecl.h $(INCDIR)/libiberty.h \
 	$(INCDIR)/safe-ctype.h
 lrealpath.o: config.h $(INCDIR)/ansidecl.h $(INCDIR)/libiberty.h
===================================================================
Index: libiberty/functions.texi
--- libiberty/functions.texi	3 Jun 2003 18:19:17 -0000	1.15
+++ libiberty/functions.texi	22 Jun 2003 06:49:46 -0000
@@ -3,6 +3,28 @@
 @c Edit the *.c files, configure with --enable-maintainer-mode,
 @c and let gather-docs build you a new copy.
 
+@c safe-ctype.c:24
+@defvr Extension HOST_CHARSET
+This macro indicates the basic character set and encoding used by the
+host: more precisely, the encoding used for character constants in
+preprocessor @samp{#if} statements (the C "execution character set").
+It is defined by @file{safe-ctype.h}, and will be an integer constant
+with one of the following values:
+
+@ftable @code
+@item HOST_CHARSET_UNKNOWN
+The host character set is unknown - that is, not one of the next two
+possibilities.
+
+@item HOST_CHARSET_ASCII
+The host character set is ASCII.
+
+@item HOST_CHARSET_EBCDIC
+The host character set is some variant of EBCDIC.  (Only one of the
+nineteen EBCDIC varying characters is tested; exercise caution.)
+@end ftable
+@end defvr
+
 @c alloca.c:26
 @deftypefn Replacement void* alloca (size_t @var{size})
 
@@ -317,7 +339,7 @@ between calls to @code{getpwd}.
 
 @end deftypefn
 
-@c hex.c:25
+@c hex.c:30
 @deftypefn Extension void hex_init (void)
 
 Initializes the array mapping the current character set to
@@ -327,7 +349,7 @@ default ASCII-based table will normally 
 
 @end deftypefn
 
-@c hex.c:34
+@c hex.c:39
 @deftypefn Extension int hex_p (int @var{c})
 
 Evaluates to non-zero if the given character is a valid hex character,
@@ -336,7 +358,7 @@ or zero if it is not.  Note that the val
 
 @end deftypefn
 
-@c hex.c:42
+@c hex.c:47
 @deftypefn Extension unsigned int hex_value (int @var{c})
 
 Returns the numeric equivalent of the given character when interpreted
@@ -381,6 +403,78 @@ struct qelem @{
 @end example
 
 @end deftypefn
+
+@c safe-ctype.c:45
+@deffn  Extension ISALPHA  (@var{c})
+@deffnx Extension ISALNUM  (@var{c})
+@deffnx Extension ISBLANK  (@var{c})
+@deffnx Extension ISCNTRL  (@var{c})
+@deffnx Extension ISDIGIT  (@var{c})
+@deffnx Extension ISGRAPH  (@var{c})
+@deffnx Extension ISLOWER  (@var{c})
+@deffnx Extension ISPRINT  (@var{c})
+@deffnx Extension ISPUNCT  (@var{c})
+@deffnx Extension ISSPACE  (@var{c})
+@deffnx Extension ISUPPER  (@var{c})
+@deffnx Extension ISXDIGIT (@var{c})
+
+These twelve macros are defined by @file{safe-ctype.h}.  Each has the
+same meaning as the corresponding macro (with name in lowercase)
+defined by the standard header @file{ctype.h}.  For example,
+@code{ISALPHA} returns true for alphabetic characters and false for
+others.  However, there are two differences between these macros and
+those provided by @file{ctype.h}:
+
+@itemize @bullet
+@item These macros are guaranteed to have well-defined behavior for all 
+values representable by @code{signed char} and @code{unsigned char}, and
+for @code{EOF}.
+
+@item These macros ignore the current locale; they are true for these
+fixed sets of characters:
+@multitable {@code{XDIGIT}} {yada yada yada yada yada yada yada yada}
+@item @code{ALPHA}  @tab @kbd{A-Za-z}
+@item @code{ALNUM}  @tab @kbd{A-Za-z0-9}
+@item @code{BLANK}  @tab @kbd{space tab}
+@item @code{CNTRL}  @tab @code{!PRINT}
+@item @code{DIGIT}  @tab @kbd{0-9}
+@item @code{GRAPH}  @tab @code{ALNUM || PUNCT}
+@item @code{LOWER}  @tab @kbd{a-z}
+@item @code{PRINT}  @tab @code{GRAPH ||} @kbd{space}
+@item @code{PUNCT}  @tab @kbd{`~!@@#$%^&*()_-=+[@{]@}\|;:'",<.>/?}
+@item @code{SPACE}  @tab @kbd{space tab \n \r \f \v}
+@item @code{UPPER}  @tab @kbd{A-Z}
+@item @code{XDIGIT} @tab @kbd{0-9A-Fa-f}
+@end multitable
+
+Note that, if the host character set is ASCII or a superset thereof,
+all these macros will return false for all values of @code{char} outside
+the range of 7-bit ASCII.  In particular, both ISPRINT and ISCNTRL return
+false for characters with numeric values from 128 to 255.
+@end itemize
+@end deffn
+
+@c safe-ctype.c:94
+@deffn  Extension ISIDNUM         (@var{c})
+@deffnx Extension ISIDST          (@var{c})
+@deffnx Extension IS_VSPACE       (@var{c})
+@deffnx Extension IS_NVSPACE      (@var{c})
+@deffnx Extension IS_SPACE_OR_NUL (@var{c})
+@deffnx Extension IS_ISOBASIC     (@var{c})
+These six macros are defined by @file{safe-ctype.h} and provide
+additional character classes which are useful when doing lexical
+analysis of C or similar languages.  They are true for the following
+sets of characters:
+
+@multitable {@code{SPACE_OR_NUL}} {yada yada yada yada yada yada yada yada}
+@item @code{IDNUM}        @tab @kbd{A-Za-z0-9_}
+@item @code{IDST}         @tab @kbd{A-Za-z_}
+@item @code{VSPACE}       @tab @kbd{\r \n}
+@item @code{NVSPACE}      @tab @kbd{space tab \f \v \0}
+@item @code{SPACE_OR_NUL} @tab @code{VSPACE || NVSPACE}
+@item @code{ISOBASIC}     @tab @code{VSPACE || NVSPACE || PRINT}
+@end multitable
+@end deffn
 
 @c lbasename.c:23
 @deftypefn Replacement {const char*} lbasename (const char *@var{name})
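
To make the documented semantics concrete, here is a minimal sketch of the
locale-independent classes described above.  It is not how safe-ctype.c is
written (the real macros index a precomputed table); the sketch_* functions
and identifier_length are hypothetical names, and the letter ranges assume an
ASCII host, where a-z and A-Z are contiguous.

```c
#include <stddef.h>  /* size_t */
#include <stdio.h>   /* EOF */

/* Stand-ins for ISALPHA, ISDIGIT, ISIDST and ISIDNUM.  Each takes an
   int and is well defined for every signed char or unsigned char value
   and for EOF, because it only compares and never indexes an array.  */
static int
sketch_isalpha (int c)
{
  /* Assumes ASCII: the letters form two contiguous ranges.  */
  return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}

static int
sketch_isdigit (int c)
{
  return c >= '0' && c <= '9';
}

static int
sketch_isidst (int c)
{
  return sketch_isalpha (c) || c == '_';
}

static int
sketch_isidnum (int c)
{
  return sketch_isidst (c) || sketch_isdigit (c);
}

/* Length of the C identifier at the start of S, or 0 if S does not
   begin with one.  Passing a plain (possibly negative) char is safe.  */
static size_t
identifier_length (const char *s)
{
  size_t n = 0;

  if (!sketch_isidst (s[0]))
    return 0;
  while (sketch_isidnum (s[n]))
    n++;
  return n;
}
```

Unlike isalpha from ctype.h, these give the same answer under any locale and
return false for values outside 7-bit ASCII, which is the property a lexer
for C-like languages wants.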

