Share My Creation StringExtractPaths Snippet

Hello everyone.

I converted a regular expression that I wrote a while ago in the AutoIt language. The purpose of this snippet is to extract all the network paths, addresses, and links from a string and return a list populated with the results.

Here is the snippet:
B4X:
'StringExtractPaths by Jmon:
'Extracts network paths from a string
'and returns all the separate paths in a list.
Private Sub StringExtractPaths(s As String) As List
    Dim lst As List : lst.Initialize
    Dim ptrn As String = "(?i)((?:(?:https?|rtp|mms|rtsp|file|ftps?)\:)?[\\|\/]{2}[\w-_]*[\w\\\/?&=.~;\-+!*_#%]*)|([a-z]:[\\\/][\w\.-_\\\/]*)|(w{3}\.[\w\\\/?&=.~;\-+!*_#%]*)"   
    Dim Matcher As Matcher = Regex.Matcher(ptrn, s)
    Do While Matcher.Find
        lst.Add(Matcher.Match)
    Loop
    Return lst
End Sub

Comments and criticism are welcome. If you find a way to improve this regular expression, please share it.

I also attached a project that demonstrates how to use it. I have only tested this project on Windows, so if someone could test it on Linux or Macintosh, that would be great.
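For readers without the attachment, here is a minimal usage sketch. It is not taken from the attached project: the Sub name `DemoExtractPaths` and the sample text are made up for illustration, and it assumes `StringExtractPaths` lives in the same module.

```b4x
'Hypothetical demo, not taken from the attached project.
Sub DemoExtractPaths
    'Sample text is an assumption made for illustration.
    'B4X strings have no escape sequences, so backslashes are literal.
    Dim sample As String = "Docs are on \\server\share\docs and at http://www.example.com/index.html"
    Dim paths As List = StringExtractPaths(sample)
    'Log each extracted path on its own line.
    For Each p As String In paths
        Log(p)
    Next
End Sub
```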

jmon.
 

Attachments

  • StringExtractPaths.zip
    1.2 KB · Views: 588

Theera

Hi Jmon,

Why can't I download it? Thank you in advance.
 

Theera

OK, I can download it now.
 

peacemaker

Thanks for the sample, but how can it also handle a situation like this?

B4X:
one two http://link1.com and second,http://link2.ru;and 3rd:https://link3.biz, end.
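A note on the sample above: the original pattern lists ";" among the characters allowed inside a link, so the second link would appear to come back as `http://link2.ru;and`, while spaces and commas already end a match. One possible workaround, sketched here under the assumption that ";" only ever separates links and is never part of one, is to cut each match at the first semicolon. The helper name `TrimAtSemicolon` is hypothetical.

```b4x
'Hypothetical helper: assumes ";" separates links and is never part of one.
Sub TrimAtSemicolon(path As String) As String
    Dim i As Int = path.IndexOf(";")
    'Keep only the text before the first semicolon, if there is one.
    If i > -1 Then Return path.SubString2(0, i)
    Return path
End Sub
```

Inside the loop of `StringExtractPaths`, the call would then read `lst.Add(TrimAtSemicolon(Matcher.Match))`. Links that legitimately contain ";" would be truncated by this, so it is a trade-off rather than a general fix.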
 

peacemaker

It seems that "[" and "]" are not supported in links such as:
B4X:
&lmi_shoppingcart.items[1].qty=1"

Could you help update the pattern, please?
 

jmon

Please try with this pattern:
B4X:
"(?i)((?:(?:https?|rtp|mms|rtsp|file|ftps?)\:)?[\\|\/]{2}[\w-_]*[\w\\\/?&=.~;\-+!*_#%\[\]]*)|([a-z]:[\\\/][\w\.-_\\\/]*)|(w{3}\.[\w\\\/?&=.~;\-+!*_#%]*)"
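A quick way to check the updated pattern is sketched below. The Sub name `DemoBrackets` and the `http://example.com/cart?` prefix are assumptions added for illustration; only the `lmi_shoppingcart.items[1].qty=1` part comes from the question above. Note that the brackets were added to the first alternative only, so a link that starts with a bare `www.` (the third alternative) would still seem to stop at "[".

```b4x
'Quick check of the updated pattern; the URL prefix is a made-up example.
Sub DemoBrackets
    Dim ptrn As String = "(?i)((?:(?:https?|rtp|mms|rtsp|file|ftps?)\:)?[\\|\/]{2}[\w-_]*[\w\\\/?&=.~;\-+!*_#%\[\]]*)|([a-z]:[\\\/][\w\.-_\\\/]*)|(w{3}\.[\w\\\/?&=.~;\-+!*_#%]*)"
    Dim url As String = "http://example.com/cart?lmi_shoppingcart.items[1].qty=1"
    Dim m As Matcher = Regex.Matcher(ptrn, url)
    'Log every match so the effect of the added brackets is visible.
    Do While m.Find
        Log(m.Match)
    Loop
End Sub
```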
 