Android Question: The first character in a text file

khwarizmi

Active Member
Licensed User
Longtime User
Hi all

When I read a text file and split it into separate words, the String length of the first word is one more than its number of letters.
For example, if the first line is:

The changing of the seasons is caused by the changing position of the Earth in relation to the Sun.

the reported length of "The" is 4.

Where is the problem?
 

khwarizmi

Active Member
Licensed User
Longtime User
This is a sample file.
 

Attachments

  • textfile.zip
    375 KB

klaus

Expert
Licensed User
Longtime User
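The problem is that the file has a BOM (byte order mark) at the beginning!
It is often added when you save a file with UTF-8 encoding in Windows. When the file is read, it shows up as one extra invisible character (U+FEFF) in front of the first word, which is why the length of "The" is 4.
Attached is the same text file saved without the BOM using Notepad++.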
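One way to confirm this is to log the code point of the very first character of the file. The following is only a minimal sketch: the Sub name LogFirstCharCode is made up for illustration, and it assumes the file is named rtf1.txt and sits in the assets folder. A logged value of 65279 (0xFEFF) would indicate a BOM.
B4X:
'Hypothetical helper, for illustration only.
'Assumes rtf1.txt is in the Files (assets) folder.
Sub LogFirstCharCode
   Dim lines As List = File.ReadList(File.DirAssets, "rtf1.txt")
   If lines.Size = 0 Then Return
   Dim first As String = lines.Get(0)
   If first.Length = 0 Then Return
   'Asc returns the Unicode code point of the character.
   Log(Asc(first.CharAt(0)))
   'Logs true if the first line starts with the BOM character.
   Log(first.StartsWith(Chr(0xFEFF)))
End Sub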
 

Attachments

  • rtf1new.txt
    244 bytes

Erel

B4X founder
Staff member
Licensed User
Longtime User
Complete code that also removes the BOM character:
B4X:
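Sub Activity_Create(FirstTime As Boolean)
   'File.ReadList reads the file as UTF-8, one list item per line.
   For Each line As String In File.ReadList(File.DirAssets, "rtf1.txt")
     'Drop the BOM character (U+FEFF) if the line starts with it.
     If line.StartsWith(Chr(0xFEFF)) Then line = line.SubString(1)
     'Split the line into words on single spaces.
     Dim ls1 As List = Regex.Split(" ", line)
     Log(line)
     For Each s As String In ls1
       'Log each word followed by its length on the next line.
       Log(s & CRLF & s.Length)
     Next
   Next
End Sub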
Note that you should never use TextReader (or TextWriter) unless you need to read (or write) a non-UTF-8 file.
 

khwarizmi

Active Member
Licensed User
Longtime User
Great information!!
Thank you very much, Klaus and Erel.
 