Unicode anomaly

Thread starter vefatica
Start date Oct 10, 2009

vefatica

Oct 10, 2009

#1

If I start TCC with /U ...

Code:

v:\> for /l %i in (1,1,1000) ( echo abc >> abc.txt )

v:\> echo %@lines[abc.txt]
999

That's as expected.

Now I use a hex-editor to remove the BOM.

Code:

v:\> hexe abc.txt

v:\> echo %@lines[abc.txt]
1000
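
For anyone who wants to reproduce this without a hex editor, here is a minimal sketch in C of the BOM removal step. It assumes the file has already been read into a byte buffer and that the BOM is the UTF-16LE one (bytes FF FE); `strip_utf16le_bom` is a hypothetical helper name, not anything provided by TCC or Windows.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: if buf starts with the UTF-16LE BOM (bytes FF FE),
   shift the remaining bytes down over it and return the new length.
   Otherwise leave buf untouched and return len unchanged. */
size_t strip_utf16le_bom(unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) {
        memmove(buf, buf + 2, len - 2);  /* regions overlap, so memmove */
        return len - 2;
    }
    return len;
}
```

Reading the file into the buffer and writing the shortened buffer back are left out; only the first two bytes are ever inspected.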

While I'm not sure why the result is different, I am confident TCC doesn't identify the new file as Unicode. The IsTextUnicode test IS_TEXT_UNICODE_ASCII16 (The text is Unicode, and contains only zero-extended ASCII values/characters) is useless for this. Here's part of a query I made (without satisfactory results) in microsoft.public.vc.language.

Code:

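#include <windows.h>
#include <stdio.h>
#include <wchar.h>
int main(void)
{
    LPCWSTR szStr[6] = {L"A", L"A ", L"A b", L"A bu", L"A bug", L"A bug!"};
    for ( INT i=0; i<6; i++ )
    {
        INT test = IS_TEXT_UNICODE_ASCII16;  /* in: tests to run; out: tests passed */
        /* size is in bytes and excludes the terminating NUL */
        BOOL bResult = IsTextUnicode(szStr[i], (int)(2*wcslen(szStr[i])), &test);
        wprintf(L"L\"%s\" is %sUnicode", szStr[i], bResult ? L"" : L"not ");
        wprintf(L" (0x%X)\n", test);
    }
    return 0;
}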

L"A" is not Unicode (0x5)
L"A " is Unicode (0x1)
L"A b" is Unicode (0x1)
L"A bu" is not Unicode (0x0)
L"A bug" is not Unicode (0x0)
L"A bug!" is not Unicode (0x0)

The results are different (but equally confusing) if the terminating NUL
is included in the test:

/* as above but with */
BOOL bResult = IsTextUnicode(szStr[i], 2*wcslen(szStr[i])+2, &test);

L"A" is Unicode (0x1)
L"A " is Unicode (0x1)
L"A b" is not Unicode (0x0)
L"A bu" is not Unicode (0x0)
L"A bug" is not Unicode (0x0)
L"A bug!" is not Unicode (0x0)

FWIW, a BOM-less Unicode file like this is the kind of file you get from CMD started with /U.
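
For comparison, here is a rough sketch of the kind of check I would have expected an "ASCII16" test to amount to: every 16-bit unit has a zero high byte and a non-zero 7-bit low byte. This is my own hand-rolled heuristic under that assumption, not what IsTextUnicode or TCC actually does, and `looks_like_ascii_utf16le` is a hypothetical name. It assumes little-endian byte order and no BOM.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical heuristic: return 1 if buf looks like BOM-less UTF-16LE
   holding only zero-extended ASCII (low byte 0x01..0x7F, high byte 0x00
   for every 16-bit unit), else 0. Empty or odd-length input returns 0. */
int looks_like_ascii_utf16le(const unsigned char *buf, size_t len)
{
    assert(buf != NULL);
    if (len == 0 || len % 2 != 0)
        return 0;
    for (size_t i = 0; i < len; i += 2) {
        unsigned char lo = buf[i];
        unsigned char hi = buf[i + 1];
        if (hi != 0 || lo == 0 || lo > 0x7F)
            return 0;
    }
    return 1;
}
```

A check this strict rejects any Unicode text with non-ASCII characters, so it could only ever be one test among several, but it does not depend on the length of the string the way the results above seem to.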
