This is the mail archive of the gcc@gcc.gnu.org mailing list for the GCC project.
Re: Ada.Characters.{Wide_}Latin_9 should be deleted
- From: starner at okstate dot edu
- To: gcc at gcc dot gnu dot org
- Date: Sun, 19 May 2002 18:51:40 -0500 (CDT)
- Subject: Re: Ada.Characters.{Wide_}Latin_9 should be deleted
> We are not in the business of telling people how to write code.
I seem to remember you telling someone that GNAT would never have
goto labels, because they were the wrong solution - speed over
safety. We don't tell people how to code, but we are in the
business of giving them the best tools to do so.
> but it is quite inappropriate to
> remove useful features in an attempt to force programmers to program the way
> you think they should.
Every feature is useful to someone. Surely we should remove poorly thought-
out features before they are released and become used.
> and the vast majority of Ada programs use Latin-1
No, they don't. The vast majority of Ada programs, when they use characters,
read some 8-bit characters in, possibly examine some ASCII characters, and
emit those characters. These programs don't care which character set is
being used; most of them don't even care if a multi-byte character set is
being used, as long as it doesn't put stuff in the ASCII range.
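A minimal sketch of that kind of program, written in Python rather than Ada purely for illustration. The function name upcase_ascii and the sample text are assumptions made up for this example. It examines only ASCII bytes and passes everything else through untouched, so it behaves the same on Latin-9 and on UTF-8 input:

```python
def upcase_ascii(raw: bytes) -> bytes:
    """Upper-case ASCII letters; pass every other byte through unchanged."""
    return bytes(b - 32 if 0x61 <= b <= 0x7A else b for b in raw)


# Hypothetical sample text containing two non-ASCII characters.
text = "caf\u00e9 5\u20ac"

results = {}
for charset in ("iso8859_15", "utf-8"):
    encoded = text.encode(charset)
    # The filter never looks at the charset, only at bytes in the ASCII range.
    results[charset] = upcase_ascii(encoded).decode(charset)
```

This works for UTF-8 because its multi-byte sequences use only bytes at or above 0x80, so nothing lands in the ASCII range.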
> But Latin-1 is now obsolete and for practical purposes replaced by Latin-9.
Latin-9 obsoletes Latin-1 no more than C++ obsoletes C. HTML assumes Latin-1
as the base characters; CP-1252 is based on Latin-1, not Latin-9. These
things will never change. In some ways, Latin-1 is _the_ 8-bit character set;
Latin-9 is just another 8-bit character set, no more important than any
other 8-bit character set.
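For reference, the two sets can be compared byte by byte. This is a sketch in Python, not Ada, relying on the standard library codecs iso8859_1, iso8859_15 and cp1252. The helper name differing_positions is made up for this example:

```python
def differing_positions(a: str = "iso8859_1", b: str = "iso8859_15") -> dict:
    """Map each byte value to (char in a, char in b) where the two sets disagree."""
    diffs = {}
    for byte in range(256):
        raw = bytes([byte])
        ca, cb = raw.decode(a), raw.decode(b)
        if ca != cb:
            diffs[byte] = (ca, cb)
    return diffs


diffs = differing_positions()
for byte, (l1, l9) in sorted(diffs.items()):
    print(f"0x{byte:02X}: Latin-1 U+{ord(l1):04X}  Latin-9 U+{ord(l9):04X}")

# CP-1252 agrees with Latin-1, not Latin-9, on the whole 0xA0..0xFF range.
cp1252_matches_latin1 = all(
    bytes([b]).decode("cp1252") == bytes([b]).decode("iso8859_1")
    for b in range(0xA0, 0x100)
)
```

Eight positions differ, among them 0xA4, which is CURRENCY SIGN in Latin-1 and EURO SIGN in Latin-9.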
> What you seem to want to do is to use the leverage of the introduction of
> the Euro symbol to force people to move to 16 (or 32!) bit character sets,
> but that's definitely not helpful to most people writing programs in
> Europe today.
Latin-9 is definitely not helpful to at least a third of the people
writing programs in Europe today, as their native languages aren't
included in Latin-9. EUR works just fine for the Euro symbol in most
cases; I see no reason to add something just for one character when
many more people need to use their native languages.
> Looking at wide_character, a basic assumption is that the first 256
> positions of Wide_Character correspond to Character. A lot of code
> depends on this,
Then it's buggy and needs to be fixed, just like any code that assumes
that every system is big-endian.
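A small illustration of how that assumption breaks, again in Python as a stand-in for Ada. The names widen_by_value and widen_via_charset are hypothetical. The first mimics code that converts Character to Wide_Character by position, and the second goes through an explicit character set:

```python
def widen_by_value(raw: bytes) -> str:
    """Widen each byte to the character at the same position (the assumption)."""
    return "".join(chr(b) for b in raw)


def widen_via_charset(raw: bytes, charset: str) -> str:
    """Widen by decoding with the character set the bytes are actually in."""
    return raw.decode(charset)


price = b"\xa4 5"  # reads as EURO SIGN, space, 5 when the bytes are Latin-9

# Widening by position is only correct when the 8-bit set is Latin-1.
assert widen_by_value(price) == widen_via_charset(price, "iso8859_1")
assert widen_by_value(price) != widen_via_charset(price, "iso8859_15")
```

With Latin-9 data, widening by position silently turns the Euro sign into U+00A4 CURRENCY SIGN.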
> The idea that all programs
> should be able to use the local character set is a wish you have as part of
> your multi-cultural mission, but it is unrealistic,
It's no more unrealistic than writing code portable to other systems. It's
not feasible in some situations, but it's a goal that proper compiler and
library support can encourage and make much simpler.
> Ada is NOT designed to force people in this
> direction, and you have no business distorting Ada to do this.
Poland is now part of NATO; I think it's about time the people designing Ada
for Western Europe uber alles rethought their goals. It's a wide
world, and almost anything - Ada.Characters.KOI8-R,
Ada.Characters.Latin-Greek, Ada.Characters.Latin-10 - would go further in
making Ada part of it.
> So anyway, rather than just produce a patch, I think the first thing is to
> propose a design for discussion (we never implement new features without
> first discussing the design, especially if they are actually or in effect
> language extensions).
I'm all ears, provided you're actually willing to let someone outside ACT in
on the discussion.
The code I have was designed as part of an add-on library, so it doesn't change
GNAT at all. (Ngeadal was designed as a Unicode library for Ada; I gave up at
some point upon realizing that a binding to ICU would be more useful. Having
some basic stuff, like 32-bit characters in the GNAT library itself, would be
useful, though.) Wide_Wide_Character, or whatever it should be called, is an
opaque type that is internally an integer from 0 to 16#10FFFF#. For a start,
the interfaces would parallel those of Ada.Strings.*. Comments, or should I
come up with the full interfaces for discussion?
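To give a rough idea of the shape of such a type, here is a sketch in Python rather than Ada. It is not the Ngeadal interface. The class name, the methods pos and from_character, and the default character set are all assumptions made for illustration. The only part taken from the description above is an opaque value holding an integer from 0 to 16#10FFFF#:

```python
class WideWideCharacter:
    """Hypothetical opaque code point type; the integer inside is not exposed directly."""

    __slots__ = ("_pos",)
    LAST = 0x10FFFF  # 16#10FFFF#, the upper bound given in the description

    def __init__(self, pos: int):
        if not 0 <= pos <= self.LAST:
            raise ValueError(f"code point out of range: {pos:#x}")
        self._pos = pos

    @classmethod
    def from_character(cls, byte: int, charset: str = "iso8859_1"):
        """Convert an 8-bit character through an explicit character set."""
        return cls(ord(bytes([byte]).decode(charset)))

    def pos(self) -> int:
        return self._pos

    def __eq__(self, other):
        return isinstance(other, WideWideCharacter) and self._pos == other._pos

    def __hash__(self):
        return hash(self._pos)


euro = WideWideCharacter.from_character(0xA4, "iso8859_15")
```

Keeping the type opaque forces every conversion from an 8-bit character to name its character set, which avoids relying on the first 256 positions matching Character.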