I'm attempting to correct an HTML table that is incorrectly formatted. I do not have control over the source; my application just loads the contents of a downloaded file as a regular text file. The file contents are a simple HTML table that is missing the closing </tr> elements. I'm attempting to split the contents on <tr> to get an array, so that I can append </tr> to the elements that need it. When I split the string using fleContents.Split("<tr>").ToList, I get many more elements in the resulting List(Of String) than there should be.

Here is a short test that shows the same behavior:

Dim testSource As String = "<table><tr><td>8172745</td><tr><td>8172745</td></table>"
Dim testArr As String() = testSource.Split("<tr>")

'Maybe try splitting on a variable, in case a string literal containing "<>" can't be used in the Split method
Dim seper As String = "<tr>"
testArr = testSource.Split(seper)

'Feed it a new string directly
testArr = testSource.Split(New String("<tr>"))

I would expect that testArr should contain 3 elements, as follows:

  1. "<table>"
  2. "<td>8172745</td>"
  3. "<td>8172745</td></table>"

However, I am receiving the following array:

  1. ""
  2. "table>"
  3. "tr>"
  4. "td>8172745"
  5. "/td>"
  6. "tr>"
  7. "td>8172745"
  8. "/td>"
  9. "/table>"

Can someone please explain why the strings are being split the way they are and how I can go about getting the results I'm expecting?

2 Answers

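Your code is binding to a different overload of the Split method than you're expecting. Passing a single String selects the overload that takes Char separators, and the string is implicitly narrowed to its first character, "<", which is why every piece in your output starts right after a "<". You want the overload that takes a String() and a StringSplitOptions parameter: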

Dim testSource As String = "<table><tr><td>8172745</td><tr><td>8172745</td></table>"
Dim delimiter As String() = { "<tr>" }
Dim testArr As String() = _
    testSource.Split(delimiter, StringSplitOptions.RemoveEmptyEntries)

You can see it working at IDEOne:

http://ideone.com/pcw6aq
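
For the original goal of restoring the missing </tr> tags, here is a minimal sketch that builds on the String() overload. CloseRows is a hypothetical helper name, and the code assumes the input holds a single table, that no row already has a closing </tr>, and that the tags are lowercase.

```vb
Imports System

Module RepairRows
    ' Hypothetical helper: re-inserts the </tr> that is missing after each row.
    ' Assumes one table, lowercase tags, and no existing </tr> in the input.
    Function CloseRows(html As String) As String
        ' StringSplitOptions.None keeps every piece so the string can be rebuilt exactly
        Dim parts As String() = html.Split(New String() {"<tr>"}, StringSplitOptions.None)

        ' parts(0) is everything before the first <tr>; each later piece is one row body
        For i As Integer = 1 To parts.Length - 1
            If i = parts.Length - 1 Then
                ' The last row is followed by </table>, so close the row just before it
                parts(i) = parts(i).Replace("</table>", "</tr></table>")
            Else
                parts(i) &= "</tr>"
            End If
        Next

        ' Put back the <tr> separators that Split removed
        Return String.Join("<tr>", parts)
    End Function

    Sub Main()
        Dim testSource As String = "<table><tr><td>8172745</td><tr><td>8172745</td></table>"
        Console.WriteLine(CloseRows(testSource))
        ' prints <table><tr><td>8172745</td></tr><tr><td>8172745</td></tr></table>
    End Sub
End Module
```

Splitting with StringSplitOptions.None rather than RemoveEmptyEntries is deliberate here: dropping empty pieces would make it impossible to rejoin the string faithfully if two <tr> tags were adjacent.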

Try using Regex.Split, like this:

Imports System.Text.RegularExpressions

Public Class Form1


    Private Sub Button1_Click(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles Button1.Click
        Dim testSource As String = "<table><tr><td>8172745</td><tr><td>8172745</td></table>"
        Dim testArr As String() = Regex.Split(testSource, "<tr>")

        'Show The Array in TextBox1
        TextBox1.Lines = testArr

    End Sub
End Class

All The Best
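
One possible reason to prefer the Regex route is that it can match the tag regardless of case. The sketch below is only an illustration: mixedSource is a made-up input (the original sample uses lowercase tags throughout), and it is written as a console module rather than a form handler so that it runs on its own.

```vb
Imports System
Imports System.Text.RegularExpressions

Module CaseInsensitiveSplit
    Sub Main()
        ' Hypothetical input: the same table, but with one upper-case row tag
        Dim mixedSource As String = "<table><TR><td>8172745</td><tr><td>8172745</td></table>"

        ' RegexOptions.IgnoreCase lets "<tr>" match both "<TR>" and "<tr>"
        Dim rows As String() = Regex.Split(mixedSource, "<tr>", RegexOptions.IgnoreCase)

        ' Prints three lines: <table>, <td>8172745</td>, <td>8172745</td></table>
        For Each row As String In rows
            Console.WriteLine(row)
        Next
    End Sub
End Module
```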
