Android Question regex catch first line only

invocker

Active Member
Hello,

I need to extract a group from some text, but I am not familiar with regex. Could someone please help me? This is what I have.
The HTML is here:
HTML:
<hr />
<ul>
<li><strong>خالد المشيقح:</strong></li>
</ul>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/01.mp3">شرح التسهيل 1 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/02.mp3">شرح التسهيل 2 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/03.mp3">شرح التسهيل 3</a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/04.mp3">شرح التسهيل 4</a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/05.mp3">شرح التسهيل 5 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/06.mp3">شرح التسهيل 6</a></p>
<ul>
<li><strong>عبدالسلام الشويعر:</strong></li>
</ul>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/07.mp3">شرح التسهيل 7</a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/08.mp3">شرح التسهيل 8 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/09.mp3">شرح التسهيل 9 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/10.mp3">شرح التسهيل 10</a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/11.mp3">شرح التسهيل 11 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/12.mp3">شرح التسهيل 12 </a></p>
<ul>
<li><strong>سامي الصقير:</strong></li>
</ul>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/13.mp3">شرح التسهيل 13 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/14.mp3">شرح التسهيل 14 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/15.mp3">شرح التسهيل 15</a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/16.mp3">شرح التسهيل 16</a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/17.mp3">شرح التسهيل 17 </a></p>
<hr />
 

Attachments

  • Untitled.png.05d1a35e4ba4a11788be9cc9d2a8764e.png
    Untitled.png.05d1a35e4ba4a11788be9cc9d2a8764e.png
    55.5 KB · Views: 46

epiCode

Active Member
Licensed User
What exactly do you want to extract? Give an example.
 

emexes

Expert
Licensed User
epiCode said: What exactly do you want to extract? Give an example.
+1

But in the meantime, this random guess might be enough to get you pointed in the right direction:

B4X:
Dim LookIn As String = $"<hr />
<ul>
<li><strong>خالد المشيقح:</strong></li>
</ul>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/01.mp3">شرح التسهيل 1 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/02.mp3">شرح التسهيل 2 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/03.mp3">شرح التسهيل 3</a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/04.mp3">شرح التسهيل 4</a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/05.mp3">شرح التسهيل 5 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/06.mp3">شرح التسهيل 6</a></p>
<ul>
<li><strong>عبدالسلام الشويعر:</strong></li>
</ul>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/07.mp3">شرح التسهيل 7</a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/08.mp3">شرح التسهيل 8 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/09.mp3">شرح التسهيل 9 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/10.mp3">شرح التسهيل 10</a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/11.mp3">شرح التسهيل 11 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/12.mp3">شرح التسهيل 12 </a></p>
<ul>
<li><strong>سامي الصقير:</strong></li>
</ul>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/13.mp3">شرح التسهيل 13 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/14.mp3">شرح التسهيل 14 </a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/15.mp3">شرح التسهيل 15</a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/16.mp3">شرح التسهيل 16</a></p>
<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/17.mp3">شرح التسهيل 17 </a></p>
<hr />"$

Dim LookFor As String = $"href\=\"([^\"]*)\""$
Dim m As Matcher = Regex.Matcher(LookFor, LookIn)
Do While m.Find
    Log(m.Group(1)) 
Loop

Log output:
Waiting for debugger to connect...
Program started.
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/01.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/02.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/03.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/04.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/05.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/06.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/07.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/08.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/09.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/10.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/11.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/12.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/13.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/14.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/15.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/16.mp3
https://archive.org/download/Kitab_Tasheel_fi_Fiqh/17.mp3
Program terminated (StartMessageLoop was not called).
 

emexes

Expert
Licensed User
href\=\"([^\"]*)\"

looks like the cat walked across the keyboard, but is actually a regex pattern that looks for:

href=" (each \ says: the next character is a literal character to look for, don't interpret it as a special command)
followed by any run of characters other than ", which the ( ) brackets collect into Matcher.Group(1)
followed by a closing "

https://regex101.com/r/qTRD84/1
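
For comparison, here is a minimal sketch of the same search without the backslashes. As far as I know, neither = nor " is a special character in the Java regex engine that B4A's Regex is built on, so escaping them should be optional. Html and Plain are names I made up for this example, and Html is a one-line sample, not the whole page.

```b4x
' Sketch only: Html is a made-up one-line sample, not the full page from the thread.
Dim Html As String = $"<p><a class="stealth download-pill" href="https://archive.org/download/Kitab_Tasheel_fi_Fiqh/01.mp3">شرح التسهيل 1 </a></p>"$

' Same pattern as above with the optional backslashes removed:
'   href="     literal text
'   ([^"]*)    capture group 1: any run of characters that are not "
'   "          the closing quote
Dim Plain As String = $"href="([^"]*)""$
Dim m As Matcher = Regex.Matcher(Plain, Html)
Do While m.Find
    Log(m.Group(1))   ' the URL without the surrounding quotes
Loop
```

Keeping the backslashes does no harm; dropping them only makes the pattern a little easier to read.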
 

invocker

Active Member
Thanks for the reply. I need to extract each author together with the MP3 links and titles of all their lessons.
Example:

a- سامي الصقير
1 "https://archive.org/download/Kitab_Tasheel_fi_Fiqh/13.mp3" شرح التسهيل 13
2 "https://archive.org/download/Kitab_Tasheel_fi_Fiqh/14.mp3" شرح التسهيل 14
3 "https://archive.org/download/Kitab_Tasheel_fi_Fiqh/15.mp3" شرح التسهيل 15
4 "https://archive.org/download/Kitab_Tasheel_fi_Fiqh/16.mp3" شرح التسهيل 16
5 "https://archive.org/download/Kitab_Tasheel_fi_Fiqh/17.mp3" شرح التسهيل 17

I can get it using two patterns, but I am asking whether I can get it using one pattern.
 

emexes

Expert
Licensed User
invocker said: I can get it using two patterns, but I am asking whether I can get it using one pattern.

Not easily. Getting both author name and lessons with one pattern is doable, but not straightforward.

If you already have working code (using two patterns?) then I'd stick with that.

PS: the right-to-left text causes the track numbers to be physically in the string AFTER the track titles... that took me a while to work out 🤣
 

drgottjr

Expert
Licensed User
Longtime User
this:
B4X:
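    ' One pattern, two alternatives: group 1 captures a lesson link plus its title,
    ' group 2 captures an author name. Only one of the two groups is set per match.
    Dim pat As String = $"href=(".*?)</a></p>|<strong>(.+?):</str"$
    Dim mat As Matcher = Regex.Matcher2(pat, Regex.MULTILINE, LookIn)
    Do While mat.Find
        If mat.Group(1) <> Null Then
            ' Group 1 looks like "url">title, so swap the ">" for a space.
            Log(mat.Group(1).Replace(">"," "))
        Else If mat.Group(2) <> Null Then
            Log(mat.Group(2))
        End If
    Loop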

will get you this:

خالد المشيقح
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/01.mp3" شرح التسهيل 1
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/02.mp3" شرح التسهيل 2
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/03.mp3" شرح التسهيل 3
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/04.mp3" شرح التسهيل 4
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/05.mp3" شرح التسهيل 5
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/06.mp3" شرح التسهيل 6
عبدالسلام الشويعر
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/07.mp3" شرح التسهيل 7
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/08.mp3" شرح التسهيل 8
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/09.mp3" شرح التسهيل 9
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/10.mp3" شرح التسهيل 10
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/11.mp3" شرح التسهيل 11
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/12.mp3" شرح التسهيل 12
سامي الصقير
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/13.mp3" شرح التسهيل 13
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/14.mp3" شرح التسهيل 14
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/15.mp3" شرح التسهيل 15
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/16.mp3" شرح التسهيل 16
"https://archive.org/download/Kitab_Tasheel_fi_Fiqh/17.mp3" شرح التسهيل 17

is this not what you want?
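
If the lessons also need to be numbered per author, as in the example that was asked for, the same loop can carry a counter. This is a rough sketch, not tested against the real page: it assumes LookIn holds the HTML from the first post, and LessonNo is a name I made up. It uses plain Regex.Matcher, because the MULTILINE option only changes how ^ and $ behave and this pattern uses neither, so the result should be the same.

```b4x
' Sketch only: LookIn is assumed to hold the HTML from the first post.
Dim pat As String = $"href=(".*?)</a></p>|<strong>(.+?):</str"$
Dim mat As Matcher = Regex.Matcher(pat, LookIn)
Dim LessonNo As Int = 0   ' hypothetical counter, restarted for every author
Do While mat.Find
    If mat.Group(2) <> Null Then
        ' An author heading starts a new section.
        LessonNo = 0
        Log(mat.Group(2))
    Else If mat.Group(1) <> Null Then
        ' A lesson link: number it within the current author's section.
        LessonNo = LessonNo + 1
        Log(LessonNo & " " & mat.Group(1).Replace(">", " ").Trim)
    End If
Loop
```

The counter works because the matcher returns matches in document order, so every lesson found after an author heading belongs to that author until the next heading turns up.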
 
Upvote 2
Solution

emexes

Expert
Licensed User

Lol every day I learn something new, and today I learned a nifty way to conduct two searches in parallel. 🏆
 

invocker

Active Member
Thank you, drgottjr, it works fine. I modified the pattern to get everything I need:

B4X:
href="([^"]*)">([^"]*)</a></p>|<strong>([^>]*):</str

Untitled.png
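
A short sketch of how the modified pattern might be used, since it splits each match into separate groups: group 1 is the URL, group 2 the lesson title and group 3 the author name. It assumes LookIn holds the HTML from the first post. The Url and Title variables and the log format are only for illustration.

```b4x
' Sketch only: LookIn is assumed to hold the HTML from the first post.
' Group 1 = URL, group 2 = lesson title, group 3 = author name.
Dim pat As String = $"href="([^"]*)">([^"]*)</a></p>|<strong>([^>]*):</str"$
Dim mat As Matcher = Regex.Matcher(pat, LookIn)
Do While mat.Find
    If mat.Group(3) <> Null Then
        Log("Author: " & mat.Group(3))
    Else
        ' Some titles in the sample end with a space, hence Trim.
        Dim Url As String = mat.Group(1)
        Dim Title As String = mat.Group(2)
        Log(Title.Trim & " -> " & Url)
    End If
Loop
```

With the URL and title in separate groups there is no need for the Replace(">", " ") step from the earlier version.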
 