How to use TCC in UTF-8 mode?

From what I read in previous threads, this is not possible. Is that really so? Windows 7 Notepad seems to save in ANSI by default. I'm not sure what encoding most text files use in general, but mine are mostly UTF-8. What I'm asking for is support in TCC (and why not TCCLE, 4NT and even cmd.exe) for commands such as "type" to display the umlauts in my text files correctly. Just the umlauts would be enough (it's not as crucial that other special characters display correctly, though that would be splendid too).

I can only get umlauts to display correctly if the text file was saved as Unicode. How can I fix this?
 
By the way, Google Docs seems to save plain .txt files in UTF-8. That is one reason why I think UTF-8 is important.
 
I have
Code:
alias utf8toansi tpipe /unicode=utf-8,ansi
I can then run
Code:
myutf8command | utf8toansi
utf8toansi /input=myutf8file
Also
Code:
%@UTF8DECODE[s,string]
decodes a string from UTF-8 to the current code page,
Code:
%@UTF8ENCODE[s,string]
encodes from the current code page to UTF-8; the latter is not documented in the help, however.

For example
Code:
ffind /s /t"%@UTF8ENCODE[s,string]" files… | utf8toansi
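
To see what a conversion like the utf8toansi alias has to do at the byte level, here is a small Python sketch. It is not TCC code and says nothing about how TPIPE works internally; the function name utf8_to_ansi, the sample word and the choice of cp1252 as the "ANSI" code page are all assumptions made for illustration.

```python
# Illustration only: mimics the byte-level effect of a UTF-8 to ANSI
# conversion. It is not how TPIPE is implemented.

ANSI_CODEPAGE = "cp1252"  # assumption: Western European Windows ANSI code page


def utf8_to_ansi(data: bytes, codepage: str = ANSI_CODEPAGE) -> bytes:
    """Re-encode UTF-8 bytes in a single-byte ANSI code page.

    Characters the code page cannot represent are replaced with '?'.
    """
    return data.decode("utf-8").encode(codepage, errors="replace")


text = "käännös"                    # sample word with umlauts
utf8_bytes = text.encode("utf-8")   # what a UTF-8 text file contains

# What a tool shows when it reads the UTF-8 bytes as if they were ANSI:
# every umlaut turns into two wrong characters.
garbled = utf8_bytes.decode(ANSI_CODEPAGE)

# After conversion, the bytes decode correctly as ANSI.
ansi_bytes = utf8_to_ansi(utf8_bytes)
fixed = ansi_bytes.decode(ANSI_CODEPAGE)

# Each umlaut takes two bytes in UTF-8 but one byte in cp1252.
print(len(utf8_bytes), len(ansi_bytes))
```

Here garbled holds the mangled text you would see from "type" on a UTF-8 file, and fixed is the original word again. Anything outside the chosen code page comes out as "?", which is why a conversion like this is fine for umlauts but lossy for arbitrary Unicode text.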
 
The UTF8 support in TCC / TCMD (which is waaay more than anything from Microsoft) is fundamentally limited due to the lack of any significant UTF8 support in Windows itself. What UTF8 support there is in TCC consists primarily of hacks to work around the limitations in the Windows APIs.

Microsoft is committed to UTF16 -- I don't think it's likely they're going to scrap it (and break most existing Windows apps) in order to support native UTF8.
 
