@upper[] and @lower[] don't like the accented characters

If a string contains accented characters, @upper[] does not write them in capital letters:
F:\>echo %@upper[Les élèves français apprennent où placer les caractères accentués]
LES éLèVES FRANçAIS APPRENNENT Où PLACER LES CARACTèRES ACCENTUéS
instead of:
LES ÉLÈVES FRANÇAIS APPRENNENT OÙ PLACER LES CARACTÈRES ACCENTUÉS

And @lower[] has the opposite problem:
F:\>echo %@lower[LES ÉLÈVES FRANÇAIS APPRENNENT OÙ PLACER LES CARACTÈRES ACCENTUÉS]
les ÉlÈves franÇais apprennent oÙ placer les caractÈres accentuÉs

@caps[] is affected too:
F:\>echo %@caps[cet étudiant vit à Paris]
Cet étudiant Vit à Paris

Is it possible to fix these functions?

Tested with TCMD 26.02.43 and TCC-RT 29.00.17.
 
It looks like these functions only know ASCII. Handling all the alpha characters in Unicode would not be reasonable. But I'm guessing you only really care about Latin-1, right?
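
To illustrate that guess, here is a minimal sketch in C of what an ASCII-only uppercase routine would do. The names `ascii_upper_char` and `ascii_upper` are hypothetical, and this is only an assumption about the behavior, not the actual TCC implementation:

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical ASCII-only mapping: only 'a'..'z' are converted, so
   accented letters such as U+00E9 (e acute) pass through unchanged. */
wchar_t ascii_upper_char(wchar_t ch)
{
    return (ch >= L'a' && ch <= L'z') ? (wchar_t)(ch - 32) : ch;
}

/* Converts a NUL-terminated wide string in place and returns it. */
wchar_t *ascii_upper(wchar_t *s)
{
    for (wchar_t *p = s; *p != L'\0'; ++p)
        *p = ascii_upper_char(*p);
    return s;
}
```

Fed the word "élèves", a routine like this uppercases the plain letters and leaves é and è alone, which matches the output reported in the first post.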

Replacing @UPPER and @LOWER with a plugin would be trivial. @CAPS might be a little more interesting; @CAPS has some funky idiosyncratic parsing to handle the optional separator argument.
 
Indeed, the Latin-1 charset (or better, Latin-9, which adds the French ligature œ) is sufficient for me. And I only need @UPPER and @LOWER, not @CAPS; I tested @CAPS just to see whether it was also affected.

Thanks for the Plugins suggestion, it's a good idea. I just downloaded the SDK, I will now study the documentation.
 
Maybe you could roll your own (one character at a time). These ought to be pretty easy to use in a "DO /C" loop in a subroutine.

Code:
v:\> echo %@char[%@winapi[user32.dll,CharLower,%@unicode[É]]]
é

v:\> echo %@char[%@winapi[user32.dll,CharUpper,%@unicode[é]]]
É
 
Unless you really want to write a plugin, you could try replacing each possible accented char (there can't be that many, can there?) with set upper=%@REPLACE[%@CHAR[...],%@CHAR[...],%lower]
 
I've been playing with something like this:

Code:
wchar_t UpperChar( wchar_t ch )
{
    if ( ( ch >= 'a' ) && ( ch <= 'z' ) )
        return ch - 32;

    if ( ( ch >= 0x00e0 ) && ( ch <= 0x00fe ) )
        if ( ch != 0x00f7 )
            return ch - 32;

    if ( ch == 0x0153 )
        return 0x0152;

    return ch;
}


wchar_t LowerChar( wchar_t ch )
{
    if ( ( ch >= 'A' ) && ( ch <= 'Z' ) )
        return ch + 32;

    if ( ( ch >= 0x00c0 ) && ( ch <= 0x00de ) )
        if ( ch != 0x00d7 )
            return ch + 32;

    if ( ch == 0x0152 )
        return 0x0153;

    if ( ch == 0x1e9e )
        return 0x00df;

    return ch;
}

This is a pretty naïve approach, though. It wouldn't work for, say, Greek, where sigma has two different lowercase forms. Or for uppercasing the German ß to SS.
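
As a rough sketch of the ß problem: a string-level wrapper can special-case U+00DF and expand it to "SS", which a one-character-in, one-character-out function cannot do. The helper `upper_string` and its buffer convention are hypothetical. `upper_char` repeats the Latin-1 mapping from the code above so that the block stands on its own:

```c
#include <stddef.h>
#include <wchar.h>

/* Per-character mapping modeled on UpperChar above:
   ASCII, Latin-1 (except the division sign), and the oe ligature. */
wchar_t upper_char(wchar_t ch)
{
    if (ch >= L'a' && ch <= L'z')
        return (wchar_t)(ch - 32);
    if (ch >= 0x00e0 && ch <= 0x00fe && ch != 0x00f7)
        return (wchar_t)(ch - 32);
    if (ch == 0x0153)
        return 0x0152;
    return ch;
}

/* Hypothetical helper: uppercases src into dst, expanding U+00DF
   (sharp s) to "SS". Returns the number of characters written, not
   counting the terminator, or (size_t)-1 if dst_len is too small. */
size_t upper_string(const wchar_t *src, wchar_t *dst, size_t dst_len)
{
    size_t n = 0;

    for (; *src != L'\0'; ++src) {
        if (*src == 0x00df) {
            /* Need room for two characters plus the terminator. */
            if (n + 2 >= dst_len)
                return (size_t)-1;
            dst[n++] = L'S';
            dst[n++] = L'S';
        } else {
            /* Need room for one character plus the terminator. */
            if (n + 1 >= dst_len)
                return (size_t)-1;
            dst[n++] = upper_char(*src);
        }
    }

    if (n >= dst_len)
        return (size_t)-1;
    dst[n] = L'\0';
    return n;
}
```

Because the result can be longer than the input, the conversion cannot be done in place, so this sketch writes to a separate buffer. The Greek final-sigma case would need context-sensitive handling that is not attempted here.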
 
If you use CharUpper/CharLower on a single character, it returns the converted character; that's what I did in post #4. But CharUpper and CharLower can do a whole string in one call. That's done in-place and the function returns the same pointer_to_string that you supplied as its argument. I couldn't figure out a way to wrap that up in @WINAPI. Did I miss something?
 
I like this approach better than my own. @UPPER and @LOWER become practically one-liners. @CAPS is still kind of a mess, though.
 
Here's the PureBasic code that I used in a plugin for a new @upper function:
Code:
ProcedureCDLL.i f_Upper(*lpszString)
  Static theString.s
  Static theUpperString.s
  
  theString = PeekS(*lpszString)
  
  If Len(theString) < 1
    WriteConsoleN("USAGE: @Upper[" + "Les élèves français apprennent où placer les caractères accentués" + "]")
    ProcedureReturn #Null
  Else
    theString = Trim(theString)
  EndIf
  
  theUpperString = PeekS(CharUpper_(theString))
  
  PokeS(*lpszString,theUpperString)
  
  ProcedureReturn #Null
EndProcedure

This produces:
Code:
E:\...\plugin>echo %@upper[Les élèves français apprennent où placer les caractères accentués]
LES ÉLÈVES FRANÇAIS APPRENNENT OÙ PLACER LES CARACTÈRES ACCENTUÉS

Joe
Code:
     _x64: 1
   _admin: 1
_elevated: 1

TCC  29.00.17 x64   Windows 10 [Version 10.0.19044.2604]
 
Please find attached a plugin, TCCUtils, which has the modified @upper function using the code from post #9.

plugin /i TCCUtils.dll for more info.

Joe
 

Attachments: tccutils.7z (3.5 KB)

Tough ones ... eh? These work for me.

Code:
INT WINAPI f_UCASE(LPWSTR psz)
{
    CharUpper(psz);
    return 0;
}

INT WINAPI f_LCASE(LPWSTR psz)
{
    CharLower(psz);
    return 0;
}
 
Thank you to all! Your solutions work well. Charles and Joe, thanks for your plugins.
And all your answers have taught me a lot about the possibilities of plugins and @WINAPI.
 
