B4J Question [SOLVED] Request for a code snippet to generate sentences of equal lengths without breaking words...

Mashiane

Expert
Licensed User
Longtime User
Ola

I kindly request a code snippet that, given a very long paragraph, returns the paragraph with each sentence equal in length, without breaking words.

For example..

B4X:
Sub BreakStringInEqualLength(para as string, paralen as int) as string
....
...
End Sub

Then

B4X:
Dim eqLen as string = BreakStringInEqualLength("12 andizi kule ndawo 1234567899 a value is a string jhjkd hfkjhksj  khshfks",5)

will return

B4X:
12
andizi
kule
ndawo
1234567899
a
value
is a
string
.....

As you can see, "is a" is two words, but because together they are no longer than five characters, they form their own sentence. 1234567899 is longer than 5 characters, but it should not be broken because it is one word.

Thanks..
 

Erel

B4X founder
Staff member
Licensed User
Longtime User
B4X:
Sub BreakStringInEqualLength(Text As String, RowLen As Int) As String
   Dim sb As StringBuilder
   sb.Initialize
   Dim CurrentRow As Int 'number of characters already on the current row
   For Each s As String In Regex.Split(" ", Text)
       'The + 1 accounts for the space that would join s to the current row.
       If CurrentRow > 0 And CurrentRow + 1 + s.Length > RowLen Then
           sb.Append(CRLF) 'start a new row; a single long word is never split
           CurrentRow = 0
       Else If CurrentRow > 0 Then
           sb.Append(" ")
           CurrentRow = CurrentRow + 1
       End If
       sb.Append(s)
       CurrentRow = CurrentRow + s.Length
   Next
   Return sb.ToString
End Sub

This code works for simple cases where you only split on spaces. A real solution needs slightly more complicated parsing: use Regex.Matcher to find the next separator, then handle both the "word" and the specific separator.
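
A minimal sketch of the Regex.Matcher approach described above, not a definitive implementation. The Sub name BreakStringWithMatcher, the pattern, and the choice to treat a line break inside a separator as a forced row break are all assumptions made for illustration. It also counts the joining space toward the row length.

B4X:
'Sketch only. BreakStringWithMatcher is a hypothetical name, not from this thread.
Sub BreakStringWithMatcher(Text As String, RowLen As Int) As String
   Dim sb As StringBuilder
   sb.Initialize
   Dim CurrentRow As Int = 0
   Dim ForceBreak As Boolean = False
   'Group 1 is the word, group 2 is the separator that follows it (may be empty).
   Dim m As Matcher = Regex.Matcher("(\S+)(\s*)", Text)
   Do While m.Find
       Dim word As String = m.Group(1)
       Dim sep As String = m.Group(2)
       If CurrentRow > 0 Then
           If ForceBreak Or CurrentRow + 1 + word.Length > RowLen Then
               sb.Append(CRLF)
               CurrentRow = 0
           Else
               sb.Append(" ")
               CurrentRow = CurrentRow + 1
           End If
       End If
       sb.Append(word)
       CurrentRow = CurrentRow + word.Length
       'Assumption: a line break inside the separator ends the current row.
       ForceBreak = sep.Contains(CRLF)
   Loop
   Return sb.ToString
End Sub

Because each match carries its own separator, other rules (for example, breaking after a tab) could be added at the point where ForceBreak is set.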
 

Derek Johnson

Active Member
Licensed User
Longtime User
Just tested this in B4A: if you replace the split expression in Erel's code with this, it works with one or more whitespace characters.

NB: This may be different in B4J!

B4X:
  For Each s As String In Regex.Split("\s+", Text)

Derek
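
To illustrate why the pattern matters, here is a small sketch comparing the two split expressions on a string with a double space, like the one in the original sample. The Sub name CompareSplits is hypothetical. B4A and B4J both rely on the Java regex engine, so the counts should be the same on both.

B4X:
'Sketch only. CompareSplits is a hypothetical name.
Sub CompareSplits
   Dim sample As String = "hfkjhksj  khshfks" 'two spaces between the words
   Dim bySpace() As String = Regex.Split(" ", sample)
   Dim byWhitespace() As String = Regex.Split("\s+", sample)
   Log(bySpace.Length)      '3: the middle element is an empty string
   Log(byWhitespace.Length) '2
End Sub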
 

KMatle

Expert
Licensed User
Longtime User
because they are less or equal to five characters, they build their own sentence.

"is a" is less than or equal to five chars (= 4)
"is" is less than or equal to five chars (= 2)
"a" is less than or equal to five chars (= 1)

So by your rule it should be two "sentences", not one as in your example. It would help if you gave us more information about what you want to achieve (or it's too warm here and I just don't get it).
 

Mashiane

Expert
Licensed User
Longtime User
My post said...
I kindly request a code snippet that, given a very long paragraph, returns the paragraph with each sentence equal in length, without breaking words.

I also gave an example method definition and the output of what this can do.

Think of a long paragraph, then think of that paragraph broken into sentences (i.e. a collection of words), where no sentence should exceed 5 characters and yet no word should be broken.

"is a" will be a sentence: it is no longer than 5 characters and thus fits the criteria. The same goes for "1234567899": although it is longer than 5 characters, it's acceptable because no word should break.

As Erel said (with code provided), it's complex but doable. The trick is figuring it out.
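
The rule as restated here can be checked mechanically. The following is a sketch under that reading of the rule: a row is acceptable if it fits within the limit or consists of a single unbroken word. RowsRespectLimit is a hypothetical helper, not something posted in this thread.

B4X:
'Sketch only. Returns True if every row fits in RowLen or is a single word.
Sub RowsRespectLimit(Wrapped As String, RowLen As Int) As Boolean
   For Each row As String In Regex.Split(CRLF, Wrapped)
       'A row longer than the limit is only allowed when it has no space in it.
       If row.Length > RowLen And row.Contains(" ") Then Return False
   Next
   Return True
End Sub

An input such as "ab cde" with a limit of 5 is a useful probe, because it shows whether the space that joins two words is counted toward the row length.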
 

Derek Johnson

Active Member
Licensed User
Longtime User
What I am failing to understand here is what you are trying to achieve (and why!). If you split a collection of words into a group, they are not going to form a "Sentence" as most people would understand it. Is this some kind of layout effect you are trying to achieve? The example of a sentence with a maximum length of 5 characters seems very unrealistic.

As far as I can tell, Erel's code, with the change shown to the Regex.Split call, will do what you ask.

B4X:
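Sub BreakStringInEqualLength(Text As String, RowLen As Int) As String
   Dim sb As StringBuilder
   sb.Initialize
   Dim CurrentRow As Int 'number of characters already on the current row
   '"\s+" treats any run of whitespace as a single separator.
   For Each s As String In Regex.Split("\s+", Text)
       'The + 1 accounts for the space that would join s to the current row.
       If CurrentRow > 0 And CurrentRow + 1 + s.Length > RowLen Then
           sb.Append(CRLF) 'start a new row; a single long word is never split
           CurrentRow = 0
       Else If CurrentRow > 0 Then
           sb.Append(" ")
           CurrentRow = CurrentRow + 1
       End If
       sb.Append(s)
       CurrentRow = CurrentRow + s.Length
   Next
   Return sb.ToString
End Sub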

Derek
 

Mashiane

Expert
Licensed User
Longtime User
What was I trying to achieve, and why? See the toast at the bottom of the LEFT screen: the word "value" is broken. Passing a length of 40 solves my problem, as it ensures that things display properly, as in the RIGHT screen. This is just one of the issues I wanted this for. Again, thank you all for your help!

BreakString.gif
BreakStringFix.gif
 