I am trying to read a shortcut file to see what file it points to. If I use FileGet then (after parsing) I get the following:
C:\Documents and Settings\bjf\Desktop\Dyrevelferd, et vitenskapelig perspektiv (Bj?rn Forkman).ppt
If on the other hand I use FileGetByte I get the following:
C:\Documents and Settings\bjf\Desktop\Dyrevelferd, et vitenskapelig perspektiv (Björn Forkman).ppt
which is the correct filename. But FileGet is of course much faster, so I would like to use it (it is not possible to use FileReadToEnd etc.; shortcut files are very strange to work with).
Sorry, here it is. (Although I don't think that is the problem, and possibly "parsing" is an overly ambitious word.)
B4X:
FileOpen(c1,Ftxt,cRandom)
' Read the whole file as one string
txt=FileGet(c1,0,FileSize(Ftxt))
FileClose(c1)
' StrIndexOf returns -1 when the text is not found
StartPos=StrIndexOf(txt,"C:\",0)
If StartPos>-1 Then
txt=SubString(txt,StartPos,StrLength(txt)-StartPos)
Else
txt="can't find the c:\"
End If
textbox1.Text=txt
("1000" is just used to be certain that I have read the whole file)
B4X:
FileOpen(c1,Ftxt,cRandom)
Do Until i=1000
i=i+1
x=FileGetByte(c1,i)
textbox2.Text=textbox2.Text&Chr(x)
Loop
It looks like a character encoding problem and is probably by design.
FileGet reads the string with ASCII encoding, which only comprises the characters Chr(0) to Chr(127). It therefore doesn't recognise "ö" as part of the ASCII character set and replaces it with a question mark. FilePut also writes only ASCII, so at least the two are symmetrical.
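To make the substitution concrete, here is a small sketch in Python rather than Basic4ppc (only the byte-level behaviour matters here). `lossy_ascii` is a hypothetical helper that mimics an ASCII-only reader as described above; it is not how FileGet is actually implemented. The sample bytes assume the shortcut stores the name in a single-byte Windows code page where "ö" is byte 246.

```python
def lossy_ascii(data: bytes) -> str:
    """Mimic an ASCII-only reader: every byte above 127 becomes '?'."""
    return "".join(chr(b) if b < 128 else "?" for b in data)


# Assumed sample: 0xF6 is 'ö' in code page 1252 / Latin-1.
raw = b"(Bj\xf6rn Forkman).ppt"
print(lossy_ascii(raw))
```

Every byte outside 0 to 127 collapses to the same "?", so the original character cannot be recovered afterwards.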
Reading it a byte at a time preserves the "ö" because each byte of the file is read as a number and added to the text as a character, so all the characters from 0 to 255 come through. This would probably fail to return the correct characters if the string were UTF-8 encoded and contained any multi-byte characters.
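A rough Python illustration of both cases (again not Basic4ppc). `bytes_to_text` is a hypothetical stand-in for the Chr(x) loop, and it assumes that Chr maps the values 0 to 255 onto the same characters as Latin-1:

```python
def bytes_to_text(data: bytes) -> str:
    """One character per byte, like appending Chr(x) for every byte read."""
    return "".join(chr(b) for b in data)


single_byte = "Bj\u00f6rn".encode("cp1252")  # 'ö' stored as one byte, 0xF6
multi_byte = "Bj\u00f6rn".encode("utf-8")    # 'ö' stored as two bytes, 0xC3 0xB6

print(bytes_to_text(single_byte))  # the name survives intact
print(bytes_to_text(multi_byte))   # 'ö' turns into two wrong characters
```

With the single-byte encoding the name comes back unchanged. With UTF-8 the two bytes of "ö" are each turned into a separate character, which is the failure mode mentioned above.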
You should be OK reading the string as bytes; the worst case is that the filename you get back isn't valid. It might be worth reading the filename as bytes and testing each byte for a value of zero, since zero is usually used as the string terminator in native Windows strings. I wouldn't worry about the speed of reading it a byte at a time. The difference won't matter for your application.
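A minimal Python sketch of the zero-terminator idea (not Basic4ppc). The helper name `read_until_zero` and the sample `blob` are made up for illustration; the blob is not a real shortcut file layout, only some bytes with a zero-terminated path in the middle:

```python
def read_until_zero(data: bytes, start: int) -> str:
    """Collect bytes from `start` up to the next zero byte (or end of data)."""
    end = data.find(b"\x00", start)
    if end == -1:
        end = len(data)
    # latin-1 maps each byte 0..255 to one character, like a Chr(x) loop.
    return data[start:end].decode("latin-1")


# Made-up sample data: junk, a zero-terminated path, then more junk.
blob = b"\x01\x02junk\x00C:\\Temp\\Bj\xf6rn.ppt\x00more\x00\x00\x00\x00"

start = blob.find(b"C:\\")
if start != -1:
    print(read_until_zero(blob, start))
```

Stopping at the first zero byte means the loop never has to guess a fixed length, and nothing after the filename is appended to the result.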
I do get valid filenames when reading them with FileGetByte, so no problem there. It does take longer: unfortunately the shortcut files contain a lot of other information, and there are also a number of them (around 250 in my Recent Files folder), but it isn't critical in any way.
The idea of looking for a zero byte is good, I'll use that for the end of the filename (the file is always finished with four zero bytes in a row so that should give me the end of the file instead of the "1000" I used in the example).