Android Question Detecting New Line

khwarizmi

Hello all

I used this code to split a sentence into words and detect the words that begin or end with (.) or (,), as well as a standalone (.) or (,):

B4X:
Dim sf As StringFunctions
sf.Initialize
dim m,n as string
m="Charles  Dickens  was  born  on 7th February 1812. The Dickens family lived near Portsmouth, on the south coast of England. Later the family lived in London."
m=m & crlf & "Dickens  had  three brothers and three sisters. He was a small, thin boy. And he was often ill."
Dim ls As List
ls = sf.Split(m, " ")
for i=0 to ls.size-1
n=ls.get(i)
If n="." Or n="," Or n.StartsWith(".") Or n.StartsWith(",") Or n.EndsWith(",") Or n.EndsWith(".") Then
msgbox(n,"")
End If
next


But this code doesn't recognize a word if it is at the end of a line.
 

klaus

This is 'normal'.
You split your string on spaces.
But, between the last word of the first line and the first word in the second line there is no space, there is only the CRLF character.
You have quite a few double spaces in your sentence; is this intentional?
I suggest this code, which uses the standard B4A string functions without the StringFunctions library (the Logs are for testing).

B4X:
Dim m As String
m = "Charles  Dickens  was  born  on 7th February 1812. The Dickens family lived near Portsmouth, on the south coast of England. Later the family lived in London."
m = m & CRLF & "Dickens  had  three brothers and three sisters. He was a small, thin boy. And he was often ill."

m = m.Replace("  ", " ")    'remove double spaces
m = m.Replace(CRLF, " ")    'replace CRLF by a space
Dim n() As String
n = Regex.Split(" ", m)
For i = 0 To n.Length - 1
    'Log(i & " " & n(i))
    If n(i) = "." Or n(i) = "," Or n(i).StartsWith(".") Or n(i).StartsWith(",") Or n(i).EndsWith(",") Or n(i).EndsWith(".") Then
        'Msgbox(n(i), "")
        Log("D " & n(i))
    End If
Next
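
A possible variant, as a sketch only: splitting on the pattern "\s+" (one or more whitespace characters) should handle single spaces, runs of spaces and the line break in one step, so the two Replace calls are not needed. The shortened test string is just for illustration, and the equality tests for "." and "," are dropped because StartsWith already covers them.

B4X:
Dim m As String
m = "Later the family lived in London." & CRLF & "Dickens  had  three brothers. He was a small, thin boy."
Dim n() As String
n = Regex.Split("\s+", m)    'split on any run of spaces, tabs or line breaks
For i = 0 To n.Length - 1
    If n(i).StartsWith(".") Or n(i).StartsWith(",") Or n(i).EndsWith(",") Or n(i).EndsWith(".") Then
        Log(n(i))    'logs: London.  brothers.  small,  boy.
    End If
Next

If the text can start with whitespace, the first array element would be an empty string, so calling m.Trim before the split is probably a good idea.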
 
Upvote 0

khwarizmi

Thank you very much :) I was looking for this:
m = m.Replace(CRLF, " ")
I also used (.Contains) instead of (.StartsWith) and (.EndsWith).
thank you again
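
For reference, a minimal sketch of what the (.Contains) version might look like. The test string is made up for illustration. One thing to keep in mind: Contains also matches punctuation in the middle of a token, such as "3.5", which StartsWith and EndsWith would skip.

B4X:
Dim m As String = "He was a small, thin boy." & CRLF & "He was 3.5 feet tall"
m = m.Replace(CRLF, " ")    'replace the line break by a space
Dim n() As String = Regex.Split(" ", m)
For i = 0 To n.Length - 1
    If n(i).Contains(".") Or n(i).Contains(",") Then
        Log(n(i))    'logs: small,  boy.  3.5
    End If
Next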
 
Upvote 0

udg

Hi,
I am not an expert on Regex, but you may want to try out:
B4X:
Dim m As String
m = "Charles Dickens was born on 7th February 1812. The Dickens family lived near Portsmouth, on the south coast of England. Later the family lived in London."
m = m & CRLF & "Dickens had three brothers and three sisters. He was a small, thin boy. And he was often ill."
Dim matcher1 As Matcher
matcher1 = Regex.Matcher("\w*\b", m)
Do While matcher1.Find = True
  Log(matcher1.Match)
Loop
That will return words and separators, alternating between the two (e.g. word: Charles; separator: ""; word: Dickens; ...), so every other item is a word without punctuation.
If you're only looking for words ending with commas and full stops you may use "\w*,|\w*\." in your matcher expression.
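
A short sketch of that second expression, with a shortened test string chosen only for illustration. Because the matcher scans the whole string, the line break should not need any special handling here.

B4X:
Dim m As String = "He was a small, thin boy." & CRLF & "And he was often ill."
Dim matcher1 As Matcher
matcher1 = Regex.Matcher("\w*,|\w*\.", m)    'word characters followed by a comma or a full stop
Do While matcher1.Find
    Log(matcher1.Match)    'logs: small,  boy.  ill.
Loop

Note that this pattern only finds words that end with the punctuation; a word that starts with (.) or (,) would only produce the punctuation mark itself as a match.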

udg
 
Upvote 0

khwarizmi

Thank you Klaus and Udg :)
It seems that it is time to read more about Regex.
 
Upvote 0
Top