Android Question Need some help parsing html string

Discussion in 'Android Questions' started by qsrtech, May 30, 2015.

  1. qsrtech

    qsrtech Active Member Licensed User

    I need some help extracting/removing some tags within HTML. The specific one I need help with (and then I can probably take it from there for any others) is removing the "<a href>" tag and keeping only its content, for example:
    <a href="somelink">content</a>

    So I only want 'content' left within the string, and this has to work for multiple occurrences within the HTML string.

    Thanks :)

    EDIT: I kinda solved it with this, but if you have a better way please feel free to share it:

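    iPos = newHTML.IndexOf("<a href=" & QUOTE)
    Do While iPos <> -1
        'find the position of the quote-plus-">" that closes the opening tag
        endPos = newHTML.IndexOf2(QUOTE & ">", iPos)
        'assumed completion: the rest of the loop was cut off in the post
        If endPos = -1 Then Exit
        newHTML = newHTML.SubString2(0, iPos) & newHTML.SubString(endPos + 2)
        iPos = newHTML.IndexOf("<a href=" & QUOTE)
    Loop
    newHTML = newHTML.Replace("</a>", "")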
    Last edited: May 30, 2015
  2. NJDude

    NJDude Expert Licensed User

  3. qsrtech

    qsrtech Active Member Licensed User

    Thanks. I already looked at the first post but couldn't figure out how to translate it to my situation. The second post works great for removing all tags.
  4. sorex

    sorex Expert Licensed User

    you could use

    links=Regex.Matcher("<a href=""(.*?)"">(.*?)</a>",myHTML)
    to grab and loop through all hyperlinks and pick only the title (the second capture group).
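
    A minimal sketch of how that pattern might be used, assuming the HTML sits in a String. The Sub name StripAnchors and the sample input are made up for illustration; Regex.Matcher, Matcher.Find, Matcher.Group and Regex.Replace are the standard B4A calls. It logs each link and then replaces every whole anchor with its link text:

    ```b4x
    'Sketch only: StripAnchors is a hypothetical helper name, not from the thread.
    Sub StripAnchors(myHTML As String) As String
        'List every link first: Group(1) is the href value, Group(2) is the link text.
        Dim links As Matcher = Regex.Matcher("<a href=""(.*?)"">(.*?)</a>", myHTML)
        Do While links.Find
            Log(links.Group(2) & " -> " & links.Group(1))
        Loop
        'Then swap each whole <a href="...">...</a> match for its link text ($2).
        Return Regex.Replace("<a href=""(.*?)"">(.*?)</a>", myHTML, "$2")
    End Sub
    ```

    For example, StripAnchors("See <a href=""somelink"">content</a> and <a href=""other"">more</a>.") should return "See content and more." Note that the pattern is case-sensitive, only matches anchors whose opening tag is exactly <a href="...">, and will not match link text that spans a line break, so it may need adjusting for real-world HTML.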
  5. qsrtech

    qsrtech Active Member Licensed User

    Ok thanks :)
    I will check it out later. "Regular expressions" are powerful, but I don't use them enough to really spend a lot of time learning the ins and outs. Maybe one day...
  6. inakigarm

    inakigarm Well-Known Member Licensed User