Android Question Regex Questions

bocker77

Active Member
Licensed User
Longtime User
I am trying to massage string text that I get from a website and pipe it to the text-to-speech engine. I would like to understand the difference between B4A regex and regex with the reflector. What I am doing is removing the period from abbreviations and replacing with a blank. Text-to-speech does a nasty pause on periods where you don't want it to. I get it to work with the reflector sort of but not with the other. I could just call it a day since I am getting almost the results that I am looking for. But I would like to know how to make it work with B4A's regex. Here is my code that almost works. I do have something wrong where " U.S. " changes to " U S " but U.S.S.R. gets changed to "USSR" which is what I want. I'll figure that one out eventually though.
B4X:
    Pattern = "\.(?![A-Z]{2})"
    Replacement = ""
    m = Regex.Matcher(Pattern, Text)
    Dim r As Reflector
    r.Target = m
    Text = r.RunMethod2("replaceAll", Replacement, "java.lang.String")

But if I choose to use B4A I can't get it to work. I am not sure what the Template does even after looking at the docs.
B4X:
Regex.Replace("\.(?![A-Z]{2})", Text, " $")

If anyone can enlighten me it would be nice. BTW disregard the template value for that is the last value that I tried. Also regex is not my forte along with a lot of other things.
 

drgottjr

Expert
Licensed User
Longtime User
not a direct answer, but there is no "b4a's" regex. it's java's regex. and regex isn't the only way to replace periods. the pattern looks like you're trying to replace a period unless it's followed by 2 capital letters? in any case, the replace statement looks screwy to me (you're not replacing the period, that is to say, you're not telling the engine to replace the period...)

also, your b4a example, is missing the assignment. putting aside for the moment whether or not the statement is supposed to do what you want, you need to assign text, the way you did in the reflection example:
B4X:
 text = Regex.Replace("\.(?![A-Z]{2})", Text, " $")
if that's what you meant by it not working, that would do it.

are you just looking to replace periods with a space? or are you trying to fine-grain it more? the finer you go, the trickier it gets as regex does exactly what you tell it to do.
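On the Template question: as far as I can tell, Regex.Replace hands the pattern and the template to Java's Matcher.replaceAll, so the template is simply the replacement text. In it, `$1`, `$2`, ... refer to capture groups, and a `$` with nothing after it is rejected. The Java sketch below shows that behaviour on the desktop JVM. The class name, method names and sample text are made up for illustration, and the claim that B4A forwards the template unchanged is an assumption.

```java
import java.util.regex.Pattern;

public class TemplateDemo {
    // Same pattern as in the first post: a period NOT followed by two capitals.
    static final Pattern P = Pattern.compile("\\.(?![A-Z]{2})");

    // Replace every match with the template, exactly as Matcher.replaceAll does.
    static String replace(String text, String template) {
        return P.matcher(text).replaceAll(template);
    }

    // "$1" in a template inserts whatever capture group 1 matched.
    static String keepLetterDropPeriod(String text) {
        return text.replaceAll("([A-Z])\\.", "$1");
    }

    // True when the regex engine refuses the template.
    static boolean templateRejected(String text, String template) {
        try {
            replace(text, template);
            return false;
        } catch (IllegalArgumentException | IndexOutOfBoundsException e) {
            // A "$" with no group number after it is not a valid template.
            return true;
        }
    }

    public static void main(String[] args) {
        String text = "the U.S. and U.S.S.R.";
        System.out.println(replace(text, ""));
        System.out.println(replace(text, " "));
        System.out.println(keepLetterDropPeriod(text));
        System.out.println(templateRejected(text, " $"));
    }
}
```

If that assumption holds, the one-line B4X equivalent of the reflection code would be an assignment with an empty template, i.e. Text = Regex.Replace("\.(?![A-Z]{2})", Text, ""), and a template of " $" would be expected to fail rather than replace anything.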
 

bocker77

Active Member
Licensed User
Longtime User
OK, the Java regex statement is what I really am trying to understand. The reflection statements work. What I am having problems with is what goes into the Template field to get it to work like the reflection if the regex statements are the same. Like I said, this is not a big deal, but it would be nice to use one line of code instead of six.
 

Jeffrey Cameron

Well-Known Member
Licensed User
Longtime User
As far as I know, the B4A compiler just passes the expression to the underlying Java engine. I get the same results when I try it:
Replacements:
    Dim psString As String = "I'm in the U.S.    not the U.S.S.R."
    Dim psStripped As String = psString.Replace(".", " ")
    Dim psPattern As String = "[\s.]"  'Whitespace or period. B4X strings have no escapes, so one backslash. Same output as "\.(?![A-Z]{2})" for this sample
    Log("Repl Stripped: " & psStripped)
    psStripped = Regex.Replace(psPattern, psString, " ")
    Log("Regx Stripped: " & psStripped)
  
    Dim poMatch As Matcher = Regex.Matcher(psPattern, psString)
    Dim r As Reflector
    r.Target = poMatch
    Log("Refl Stripped: " & r.RunMethod2("replaceAll", " ", "java.lang.String"))
Output:
Log:
Repl Stripped: I'm in the U S     not the U S S R
Regx Stripped: I'm in the U S     not the U S S R
Refl Stripped: I'm in the U S     not the U S S R

Also, @Erel has a good post on RegEx: [Server] Regex Tool | B4X Programming Forum
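One thing to watch when carrying patterns between Java and B4X: to my knowledge B4X string literals have no backslash escapes, so the regex engine receives exactly the characters typed between the quotes. A Java-style doubled backslash inside a character class then means "a literal backslash", and the s after it becomes an ordinary letter. The Java sketch below shows the difference. Java source needs its own escaping, so the comment above each method gives the regex as the engine sees it. Class name, method names and sample strings are hypothetical.

```java
public class BackslashDemo {
    // Regex as the engine sees it: [\s.]  -> any whitespace character or a period.
    static String whitespaceOrPeriod(String text) {
        return text.replaceAll("[\\s.]", " ");
    }

    // Regex as the engine sees it: [\\s.] -> a backslash, the letter s, or a period.
    static String backslashSOrPeriod(String text) {
        return text.replaceAll("[\\\\s.]", " ");
    }

    public static void main(String[] args) {
        String text = "Miss. U.S.";
        System.out.println("[" + whitespaceOrPeriod(text) + "]");
        System.out.println("[" + backslashSOrPeriod(text) + "]");
    }
}
```

The two variants only give identical output when the text contains no lowercase s and no backslash, which happens to be true for a sample such as "I'm in the U.S.    not the U.S.S.R.".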
 

bocker77

Active Member
Licensed User
Longtime User
I can't use any of those because of the end-of-sentence period. I need to use the one in my first post. As stated, the reflection works and the Java one-liner doesn't. If I knew what to put in the Template to get the Java one to work, that would be nice, but it's not a big deal. I will just use the reflection method.

Working with text-to-speech is quite the conundrum. I am almost there, I think, but on to another problem. This is just in passing and I do not need anyone to look into it, but here is what looks to be insurmountable.

" St. Francis St. " gets said as "street Francis street". Take out the period on the first and it will say saint correctly. After a whole day of searching for hits on the web it looks as if there is no solution. At least I can't find one. Oh well, if this is all that I have to worry about then my life is pretty good!

I thank you Jeffrey and drgottjr for being so kind to look into this for me.

Greg
 

drgottjr

Expert
Licensed User
Longtime User
i think it's unrealistic to believe that everything can be reduced to a 1-liner (if for no other reason than it becomes a maintenance nightmare should yet another exception to some rule be discovered.) and there is no law that an abbreviation can't have more than 1 meaning, or that a sentence can't end with an abbreviation. big jobs are often best broken into smaller jobs.
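The "smaller jobs" idea could look something like the Java sketch below: an ordered list of small replacement rules, each easy to read and to extend when the next exception turns up. The rules themselves are only guesses at useful heuristics (for example, "St." directly before a capitalized word is treated as Saint), and the class and method names are hypothetical. It will misread cases such as a sentence ending in "Main St." followed by a new sentence, and it also drops a sentence-ending period that follows a single capital letter.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SpeechCleanup {
    // Ordered rules: regex -> replacement template. Order matters, hence LinkedHashMap.
    static final Map<String, String> RULES = new LinkedHashMap<>();
    static {
        // Heuristic: "St." directly before a capitalized word is read as "Saint".
        RULES.put("\\bSt\\.(?=\\s+[A-Z])", "Saint");
        // Any "St." still left after the first rule is read as "Street".
        RULES.put("\\bSt\\.", "Street");
        // Drop the period after a stand-alone capital letter, as in U.S. or U.S.S.R.
        RULES.put("(?<![A-Za-z])([A-Z])\\.", "$1");
    }

    // Apply every rule in insertion order, feeding each result into the next rule.
    static String clean(String text) {
        String result = text;
        for (Map.Entry<String, String> rule : RULES.entrySet()) {
            result = result.replaceAll(rule.getKey(), rule.getValue());
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(clean(" St. Francis St. "));
        System.out.println(clean("I'm in the U.S. not the U.S.S.R."));
    }
}
```

Each rule can be tested on its own, and a new exception becomes one more entry instead of a change to a single large pattern.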
 

bocker77

Active Member
Licensed User
Longtime User
It's the old IBM assembler programmer in me that has me think that way. "Can't teach an old dog new tricks" I guess!
 

drgottjr

Expert
Licensed User
Longtime User
then you of all people should know it takes a thousand tiny steps just to put a character on a terminal screen:)
what was that link that gave rise to your issue? it was in your other post about the webview and speechutterance. i remember seeing the various little stories you were dealing with. who doesn't enjoy a brisk regex outing? i also use text to speech, but whereas you try to eliminate periods, i add them!
 

Jeffrey Cameron

Well-Known Member
Licensed User
Longtime User
bocker77 said: "I can't use any of those because of the end of sentence period."

Ok, this is as close as I can get:
Split multiple periods.:
    Dim psString As String = "I'm in the U.S.    not the U.S.S.R."
    Dim psStripped As String = psString.SubString2(0, psString.LastIndexOf(".")).Replace(".", " ") & "."
    Dim psPattern As String = "\.(?=.*\.)"
    Log("Repl Stripped: " & psStripped)
    psStripped = Regex.Replace(psPattern, psString, " ")
    Log("Regx Stripped: " & psStripped)

    Dim poMatch As Matcher = Regex.Matcher(psPattern, psString)
    Dim r As Reflector
    r.Target = poMatch
    Log("Refl Stripped: " & r.RunMethod2("replaceAll", " ", "java.lang.String"))
This code makes some assumptions, such as that each string you're replacing is one sentence. This method would not work on a paragraph containing multiple sentences.
Output:
Repl Stripped: I'm in the U S     not the U S S R.
Regx Stripped: I'm in the U S     not the U S S R.
Refl Stripped: I'm in the U S     not the U S S R.
This still won't help you with "St. = Saint" versus "St. = Street". You're on your own for that ;)
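The one-sentence assumption can be checked in plain Java with the same lookahead pattern. The sketch below (hypothetical class and method names) shows the single-sentence case, the paragraph case where an inner sentence loses its period, and one detail that may help: by default `.` does not match a line break, so the lookahead only scans to the end of the current line. Putting one sentence per line is therefore a possible workaround, though that is an untested suggestion for real text.

```java
public class LastPeriodDemo {
    // Replace every period that is followed, later on the same line, by another period.
    static String stripAllButLastPeriod(String text) {
        return text.replaceAll("\\.(?=.*\\.)", " ");
    }

    public static void main(String[] args) {
        // One sentence: only the final period survives.
        System.out.println(stripAllButLastPeriod("I'm in the U.S.    not the U.S.S.R."));
        // Two sentences on one line: the first sentence loses its period.
        System.out.println(stripAllButLastPeriod("Go left. Then stop."));
        // Two sentences on separate lines: each line keeps its last period.
        System.out.println(stripAllButLastPeriod("Go left.\nThen stop."));
    }
}
```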
 

Jeffrey Cameron

Well-Known Member
Licensed User
Longtime User
drgottjr said: "i think it's unrealistic to believe that everything can be reduced to a 1-liner (if for no other reason than it becomes a maintenance nightmare should yet another exception to some rule be discovered.) and there is no law that an abbreviation can't have more than 1 meaning, or that a sentence can't end with an abbreviation. big jobs are often best broken into smaller jobs."
One of my favorite comics:
[Attached image: 1614113789381.png]
 

bocker77

Active Member
Licensed User
Longtime User
Just for fun, a "Hello World" in IBM assembler. I cheated and found the code on the internet. It's been a while but I still understand it.

B4X:
HELLO    CSECT               The name of this program is 'HELLO'
*                            Register 15 points here on entry from OPSYS or caller.
         STM   14,12,12(13)  Save registers 14,15, and 0 thru 12 in caller's Save area
         LR    12,15         Set up base register with program's entry point address
         USING HELLO,12      Tell assembler which register we are using for pgm. base
         LA    15,SAVE       Now Point at our own save area
         ST    15,8(13)      Set forward chain
         ST    13,4(15)      Set back chain
         LR    13,15         Set R13 to address of new save area
*                            -end of housekeeping (similar for most programs) -
         WTO   'Hello World' Write To Operator  (Operating System macro)
*
         L     13,4(13)      restore address to caller-provided save area
         XC    8(4,13),8(13) Clear forward chain
         LM    14,12,12(13)  Restore registers as on entry
         DROP  12            The opposite of 'USING'
         SR    15,15         Set register 15 to 0 so that the return code (R15) is Zero
         BR    14            Return to caller
*
SAVE     DS    18F           Define 18 fullwords to save calling program registers
         END  HELLO          This is the end of the program
 

Jeffrey Cameron

Well-Known Member
Licensed User
Longtime User
I grew up on an IBM 3740 (it still had a punch-card reader for COBOL programs!) so you're preaching to the choir here :D
 

bocker77

Active Member
Licensed User
Longtime User
I believe that was MFT. That was before my time. I was an operator on IBM 360/30s and 360/40s mainframes. So I remember punch cards. I started with MVS R1.8 and up to MVS/ESA as a systems programmer working with the OS. Great fun! Then I went into all kinds of directions having to learn Open Systems and Windows because of supporting middleware, MQSeries later branded WebSphere MQ. As an old cohort said which resonated with me "I am just a blue collar worker in a white collar world". Back then you could get away with that.
 

jerry07

Member
Licensed User
Longtime User
Jeffrey, sorry but you can't compare COBOL to Assembler. :) Bocker77 has "cool title" and not you and I. :cool:

On a positive note, now during covid when my kids see me working from home using a telnet session to UNIX or a mainframe session with JCL, they think I'm hacking some system. I think since then I've become a little cooler than mom.

Back to assembler: I tried to pick it up but never learned it. I was amazed back in the days of Windows 95 - 98 by Steve Gibson from grc.com writing Windows utilities in assembler. All his Windows programs were around 25k (size) including GUI. ;)
 

Jeffrey Cameron

Well-Known Member
Licensed User
Longtime User
jerry07 said: "Jeffrey, sorry but you can't compare COBOL to Assembler. :)"
I never said I didn't program in mainframe assembler, heck I even wrote some low-level drivers for an early piece of digital camera equipment in 8086 assembler on a windows 95 machine. I just don't like assembly programming :D One of my colleagues once paraphrased Robert Heinlein in a quote that still makes me chuckle to this day, "Assembly programming is like masturbating; it should be done in private and you should wash your hands when you're done."
 

bocker77

Active Member
Licensed User
Longtime User
So much for reminiscing. Being a system programmer, you needed assembler to write exits to the OS. BTW I love your buddy's quote.
 