[Mingw-w64-public] toUpper()

papa@arbolone.ca papa@arbolone.ca
Wed Jul 1 18:01:00 GMT 2015


In an earlier, separate posting I expressed my concern about
MS-Windows' std::string/locales: what is the point of wstring, I asked, if
in the end there is no reliable UTF-8/16/32 support for them? The best
thing to do is to use Boost or some other third-party library if true UTF
support is needed.
towupper is no different from toupper when it comes to letters like á or ñ.
If you know of a way to handle them, especially the letter LL and its
lowercase counterpart when arranged in a SIAO (sort-in-alphabetical-order)
list, please do let me know.

Thanks in advance.

-----Original Message----- 
From: Martin Sebor
Sent: Wednesday, July 1, 2015 11:17 AM
To: papa@arbolone.ca ; Riot ; mingw-w64-public@lists.sourceforge.net
Cc: gcc-help Mailing List
Subject: Re: [Mingw-w64-public] toUpper()

On 07/01/2015 06:02 AM, papa@arbolone.ca wrote:
> std::wstring source(L"Hello World");
> std::wstring destination;
> destination.resize(source.size());
> std::transform (source.begin(), source.end(), destination.begin(),
> (int(*)(int))std::toupper);
>
> The above code is what did the trick, do not ask how, I am still
> digesting it. However, any suggestions would be very much appreciated

This solved problem (1) below but doesn't work correctly or
portably because of the second problem I described in my first
response. std::toupper(int) is defined for narrow characters in
the range [0, UCHAR_MAX] plus EOF. The function has undefined
behavior for characters outside that range (i.e., all wchar_t
greater than UCHAR_MAX).

I don't know what will happen on Windows(*) but on Linux, I can
see the program doesn't work correctly for the Latin Extended
Additional block of characters (the first one I noticed). For
instance, running the attached modified version of the program
in a UTF-8 locale such as en_US.utf8 to convert U+1EBD (LATIN
SMALL LETTER E WITH TILDE) to its uppercase form (U+1EBC)
prints:

     U+1EBD  U+1EBC  U+1EBD

when the expected output is:

     U+1EBD  U+1EBC  U+1EBC

If you want to use transform with wide characters, you need
to use towupper (declared in <wctype.h>).
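
A minimal sketch of that suggestion, for illustration only. The helper
name to_upper is hypothetical, and the result for non-ASCII characters
depends on the LC_CTYPE category of the current C locale (a program
starts in the "C" locale unless it calls setlocale), so only ASCII
input is assumed to convert everywhere:

```cpp
#include <algorithm>
#include <cwctype>
#include <string>

// Hypothetical helper, not part of the standard library.
// Returns a copy of s with each wide character passed through towupper.
std::wstring to_upper(std::wstring s)
{
    // The lambda avoids naming an overloaded function directly and makes
    // the wchar_t -> wint_t -> wchar_t conversions explicit.
    std::transform(s.begin(), s.end(), s.begin(),
                   [](wchar_t c) {
                       return static_cast<wchar_t>(
                           std::towupper(static_cast<std::wint_t>(c)));
                   });
    return s;
}
```

Calling std::setlocale(LC_ALL, "") before to_upper is one way to pick
up the user's locale; whether a given letter is then converted is up to
the C library.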

Martin

[*] I vaguely recall toupper and friends aborting on Windows
when passed an out-of-range argument but I'm not 100% sure.

>
> -----Original Message----- From: Martin Sebor
> Sent: Tuesday, June 30, 2015 10:01 PM
> To: Riot ; mingw-w64-public@lists.sourceforge.net
> Cc: gcc-help Mailing List
> Subject: Re: [Mingw-w64-public] toUpper()
>
> On 06/30/2015 05:24 PM, Riot wrote:
>>      #include <algorithm>
>>      #include <string>
>>
>>      std::string str = "Hello World";
>>      std::transform(str.begin(), str.end(), str.begin(), std::toupper);
>
> Please note this code is subtly incorrect for two reasons.
> There are two overloads of std::toupper:
>
> 1) int toupper(int) declared in <ctype.h> (and the equivalent
>     std::toupper in <cctype>)
> 2) template <class charT> charT std::toupper(charT, const locale&)
>     in <locale>
>
> Without the right #include directive, the above may or may
> not resolve to "the right" function (which depends on what
> declarations the two headers bring into scope).
>
> When it resolves to (2) it will fail to compile.
>
> When it resolves to (1), it will do the wrong thing (have
> undefined behavior) at runtime when char is a signed type
> and the argument is negative (because (1) is only defined
> for EOF and for values between 0 and UCHAR_MAX).
>
> But the question is about converting std::wstring to upper
> case and the above uses a narrow string. For wstring, the
> std::ctype<wchar_t>::toupper() function or its convenience
> non-member template function can be used.
>
>> See also: http://www.cplusplus.com/reference/locale/toupper/
>
> This is one possible way to do it. Another approach is along
> these lines:
>
>     std::locale loc (...);
>     std::wstring wstr = L"...";
>     const std::ctype<wchar_t> &ct =
>         std::use_facet<std::ctype<wchar_t> >(loc);
>     ct.toupper (&wstr[0], &wstr[0] + wstr.size());
>
> Martin
>
>>
>> This may also help in future: http://lmgtfy.com/?q=c%2B%2B+toupper
>>
>> -Riot
>>
>> On 30 June 2015 at 23:58,  <papa@arbolone.ca> wrote:
>>> I would like to write a function to capitalize letters, say...
>>> std::wstring toUpper(const std::wstring wstr){
>>> for ( auto it = wstr.begin(); it != wstr.end(); ++it){
>>>          global_wapstr.append(std::towupper(&it));
>>>
>>> }
>>> }
>>>
>>> This doesn’t work, but doesn’t the standard already have something like
>>> std::wstring::toUpper(...)?
>>>
>>> Thanks in advance
>>>
>>>
>>> ---
>>> This email has been checked for viruses by Avast antivirus software.
>>> http://www.avast.com
>>>
>>>
>>> _______________________________________________
>>> Mingw-w64-public mailing list
>>> Mingw-w64-public@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
>
>
>
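
For the narrow-string case discussed in the quoted message, a common way
to stay inside the range for which toupper(int) is defined is to convert
each char to unsigned char first. This is only a sketch; the helper name
to_upper_narrow is hypothetical, and the conversion works byte by byte,
so it is not suitable for multibyte encodings such as UTF-8:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper, not part of the standard library.
// Returns a copy of s with each byte passed through toupper(int).
std::string to_upper_narrow(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) {
                       // c is in [0, UCHAR_MAX], so the call is well defined
                       // even when plain char is a signed type.
                       return static_cast<char>(std::toupper(c));
                   });
    return s;
}
```

Using a lambda also sidesteps the overload ambiguity, because the call
inside it can only resolve to the one-argument function.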

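The std::ctype<wchar_t> approach from the quoted message can be wrapped
along these lines. Again a sketch under assumptions: the helper name
to_upper_facet is hypothetical, and the locale is left to the caller
because locale names and the quality of their case tables are
platform-specific:

```cpp
#include <locale>
#include <string>

// Hypothetical helper, not part of the standard library.
// Upper-cases s in place using the ctype facet of the given locale.
std::wstring to_upper_facet(std::wstring s,
                            const std::locale& loc = std::locale())
{
    const std::ctype<wchar_t>& ct =
        std::use_facet<std::ctype<wchar_t> >(loc);
    if (!s.empty()) {
        // The range overload converts [first, last) in place.
        ct.toupper(&s[0], &s[0] + s.size());
    }
    return s;
}
```

With std::locale::classic() only the basic Latin letters are expected to
change; passing a named locale (if the platform provides one) is what
would extend this to other scripts.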




More information about the Gcc-help mailing list