Finding the offset of a string in a file

I have a BTM which will start DEVENV with the most recently used .SLN file as an argument (avoiding the pick_a_project dialog). AFAIK, that info is only recorded in a file named "ApplicationPrivateSettings.xml", buried deep in the %APPDATA folder. I can get the file name with FFIND (or WHERE.EXE). After that I must find the first occurrence of the string "FullPath" in the file. I can do that with @FILEREAD (et al.) but searching about 10,000 bytes into the file is rather slow (about 3 seconds). For now I use SED.EXE to change "FullPath" to "NullPath" and send the output to a temp file ... then use CMP.EXE on the original file and the temp file and pick the offset out of CMP's output. That's pretty fast. But I'd like a more direct way to get the offset. Can anyone suggest another way, built-in or otherwise?
I've tried the @XML* functions without success. That may be because of my almost non-existent knowledge of XML. Looking at the file is difficult because its 44K bytes are all on one line. Is there a Windows (or free 3rd party) tool to make looking at the file easier? If anyone wants to try the @XML functions, I can post the file.

Thanks!
 
If you're willing to install Python (3.8 or higher) on your machine, you can use the attached script, which, for instance, performs an offset search in a 1MB source-code file in about 0.1s (which is mostly interpreter startup time -- for instance, a 100K file offset search takes 0.09s, so I assume whatever you throw at it will be fast).

Code:
Usage: python find-offset.py <filename> "string to find"

Prints the character offset of the string, or -1 if not found.
 
Huh, apparently the forum doesn't like attaching Python files. Here's the entirety of find-offset.py inline (it's very short).

Code:
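import sys

argv = sys.argv

# Expect exactly two arguments: the file to search and the string to look for.
if len(argv) != 3:
    print("""\
Usage: python find-offset.py <filename> "string to find"

Prints the character offset of the string, or -1 if not found.""")
    sys.exit(1)

filename = argv[1]
needle = argv[2]

# Read the whole file as text; the result is a character offset, not a byte offset.
with open(filename, 'r', encoding="utf-8") as f:
    contents = f.read()

# str.find returns the 0-based index of the first match, or -1 if there is none.
idx = contents.find(needle)
print(idx)
sys.exit(0)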
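
A note on the script above: it reports a character offset. If the number is going to be handed to something that seeks by bytes, the two can differ whenever non-ASCII text precedes the match. Here is a minimal sketch of a byte-oriented variant, assuming the file is small enough to read in one go and that the search string should be encoded as UTF-8. The function name find_byte_offset is made up for illustration.

```python
def find_byte_offset(path, needle, encoding="utf-8"):
    """Return the 0-based byte offset of the first match, or -1 if absent."""
    # Encode the search string so it can be compared against raw bytes.
    pattern = needle.encode(encoding)
    # Open in binary mode so no decoding or newline translation happens.
    with open(path, "rb") as f:
        data = f.read()
    # bytes.find mirrors str.find: index of the first match, or -1.
    return data.find(pattern)
```

Calling find_byte_offset(filename, "FullPath") would then give a value comparable to the offset that CMP reports, apart from any difference in whether counting starts at 0 or 1.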
 
HTML Tidy can reformat and pretty-print XML too. It's a CLI program, easy to use from scripts.

Code:
tidy --quiet yes -xml -i input-file.xml > pretty-file.xml
 
If you have Microsoft Edge (the new Chromium edition) or even the older IE11, you can drag the XML file into it. It has a built-in XML viewer.
 
I do have Edge, but I never use it. I tried it. The file looks better in Edge than it does in MS's "XML Notepad".

But the bottom line is I want to get the offset of the first "FullPath" (or the actual value of it) in a script. Edge won't help with that.
 
It looks like that file is a mixture of XML and JSON. I was able to extract the JSON fairly easily.
Code:
echo %@xmlopen[C:\Users\mintz\AppData\Local\Microsoft\VisualStudio\16.0_66903569\ApplicationPrivateSettings.xml]
echo %@xmlxpath[/content/indexed/collection[@name="CodeContainers.Offline"]/value]
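
For anyone going the Python route instead, the same lookup can probably be done with the standard library's ElementTree. This is only a sketch: it assumes the layout implied by the XPath above (a root element named content, with indexed, collection and value below it) and no XML namespaces, neither of which I've checked against a real file. The function name extract_offline_json is hypothetical.

```python
import xml.etree.ElementTree as ET


def extract_offline_json(xml_text):
    """Return the text of the CodeContainers.Offline value element, or None."""
    # Assumption: the root element is <content>, as the XPath above suggests.
    root = ET.fromstring(xml_text)
    # ElementTree paths are relative to the root, so /content is dropped here.
    node = root.find(
        "./indexed/collection[@name='CodeContainers.Offline']/value"
    )
    return None if node is None else node.text
```

To run it on the real file, read the file into a string first, or use ET.parse(path).getroot() in place of ET.fromstring.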
 
It looks like that file is a mixture of XML and JSON. I was able to extract the JSON fairly easily.
Code:
echo %@xmlopen[C:\Users\mintz\AppData\Local\Microsoft\VisualStudio\16.0_66903569\ApplicationPrivateSettings.xml]
echo %@xmlxpath[/content/indexed/collection[4]/value]
Yes, but that shows a list of used projects (and properties) that has no newlines. I have 53 entries (recently whittled down from 102, manually and tediously). Here's such a list with only 2 entries. Do you know how to go inside that list and pick 1 of the properties of one entry in the list?

Code:
[{"Key":"D:\\Projects2019\\filestr\\filestr.sln","Value":{"LocalProperties":{"FullPath":"D:\\Projects2019\\filestr\\filestr.sln","Type":0,"SourceControl":null},"Remote":null,"IsFavorite":false,"LastAccessed":"2022-05-24T21:50:03.5798427+00:00","IsLocal":true,"HasRemote":false,"IsSourceControlled":false}},{"Key":"P:\\4Utils\\4Utils.sln","Value":{"LocalProperties":{"FullPath":"P:\\4Utils\\4Utils.sln","Type":0,"SourceControl":null},"Remote":null,"IsFavorite":false,"LastAccessed":"2022-05-24T21:48:20.5397892+00:00","IsLocal":true,"HasRemote":false,"IsSourceControlled":false}},{"Key":"P:\\lastproject\\lastproject.sln","Value":{"LocalProperties":{"FullPath":"P:\\lastproject\\lastproject.sln","Type":0,"SourceControl":null},"Remote":null,"IsFavorite":false,"LastAccessed":"2022-05-24T15:48:24.3668938+00:00","IsLocal":true,"HasRemote":false,"IsSourceControlled":false}}, ...]

I'm after the first FullPath.
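
Since that list is plain JSON, one way to go inside it is to let a JSON parser do the work. A minimal Python sketch, assuming the text has the shape shown above (a list of objects, each with Value, LocalProperties and FullPath keys); first_full_path is just a name picked for the example.

```python
import json


def first_full_path(json_text):
    """Return FullPath from the first entry of the list, or None if it is empty."""
    entries = json.loads(json_text)
    if not entries:
        return None
    # Assumption: every entry has the Value/LocalProperties/FullPath nesting
    # seen in the sample list.
    return entries[0]["Value"]["LocalProperties"]["FullPath"]
```

The parser also takes care of the doubled backslashes, so the result comes back as an ordinary Windows path.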
 
I pasted the JSON into jsonpath.com and it displays it nicely. But I can't figure out the secret to parse it.
 
If I did that right, it doesn't look all that good.

1653446821126.png


Here it is in Firefox. Each record in the list starts on a new line. Edge is similar.

1653446929745.png
 
No. But that was not what you asked for:

<Is there a Windows (or free 3rd party) tool to make looking at the file easier?>
OK. The first paragraph was the main question but I didn't actually ask a question.
 
But the bottom line is I want to get the offset of the first "FullPath" (or the actual value of it) in a script.
This returns the "actual value of it".
Code:
perl -p -e 's/.*FullPath":"(.*?)".*/\1/;' -e 's/\\\\/\\/g;' ApplicationPrivateSettings.xml

Edit:
Here it is again if you're running it from Windows command line (CMD/TCC) instead of from Cygwin.
Code:
perl -p -e "s/.*FullPath\":\"(.*?)\".*/\1/;" -e "s/\\\\/\\/g;" ApplicationPrivateSettings.xml
 
Thanks @JohnQSmith. I don't have perl but I'll try making a GnuWin32 sed.exe version of that (which may be a chore). This is what I'm using now to get the offset (then I use the @FILE* functions to get the string).

Code:
sed -e "s/FullPath/NullPath/g" %filename > %tmpfile

set offset=%@word[" ,",4,%@execstr[cmp %filename %tmpfile]]
 
Whew! Almost ...

Code:
v:\> sed -e "s/.*\"FullPath\":\"\([^^\"]*\).*/\1/;" -e "s/\\\\/\\/g;" %file
P:\Linux2\Linux2.sln

But that's the last one in the file. That's because SED is greedy. And SED doesn't have a non-greedy operator. But if I change only the first "FullPath" to "NullPath" and look for "NullPath" I get what I want.

Code:
v:\> sed -e "s/FullPath/NullPath/" -e "s/.*\"NullPath\":\"\([^^\"]*\).*/\1/;" -e "s/\\\\/\\/g;" %file
D:\Projects2019\filestr\filestr.sln
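
For comparison, the greedy-match problem doesn't come up with a regex search that stops at the first hit. A rough Python sketch, assuming the quotes around FullPath appear literally in the file (as the sed patterns above imply); first_full_path_regex is a made-up name.

```python
import json
import re

# Capture the quoted JSON string that follows "FullPath":, allowing
# backslash escapes such as \\ inside it.
_FULLPATH = re.compile(r'"FullPath":("(?:[^"\\]|\\.)*")')


def first_full_path_regex(text):
    """Return the first FullPath value in text, or None if there is none."""
    # search() scans left to right and returns the leftmost match.
    m = _FULLPATH.search(text)
    if m is None:
        return None
    # Decode the captured JSON string literal, which turns \\ into \.
    return json.loads(m.group(1))
```

Because search() returns the leftmost match, there is no need for the FullPath-to-NullPath substitution.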

Thanks again @JohnQSmith.
 
Interesting! This fails because the opening '(' is outside quotes (from TCC's point of view) and the ')' is inside quotes (from TCC's point of view) ... thus not considered the closing ')'.

1653501561424.png


That's fixed by escaping (a la TCC) the '('.

Code:
v:\> echo %@execstr[sed -e "s/FullPath/NullPath/" -e "s/.*\"NullPath\":\"\^([^^\"]*\).*/\1/" -e "s/\\\\/\\/g" %file]
D:\Projects2019\filestr\filestr.sln

Whew! (again)
 
I've installed perl many times, usually ActiveState. But I need it so seldom that I never wind up learning even the basics. I always manage to survive on BTMs, C, and VBscript (and I know darn little about VBscript).
 
I use it to run other people's scripts (awstats, sendEmail, ack) and as a SED replacement.
Edit: I don't know if that's a good enough reason to keep 620MB of files on my hard drive, but it comes in handy when I need it.
 
I got that one to work also, @samintz. Thanks!

Code:
v:\> echo %@execstr[echo %@xmlxpath[/content/indexed/collection[4]/value] | cut -c9- | cut -d "," -f1 | sed -e "s/\\\\/\\/g"]
"D:\Projects2019\filestr\filestr.sln"
 
Code:
jq -r .[0].Value.LocalProperties.FullPath < xx.json
You can get a native version of the jq JSON query tool from the project homepage (Download jq) or through the Cygwin setup launcher.
 
