I'm attempting to correct an HTML table that is incorrectly formatted. I do not have control over the source; my application just loads the contents of a downloaded file as a regular text file. The file contents are a simple HTML table that is missing the closing </tr> elements. I'm attempting to split the contents on <tr> to get an array, so that I can append a </tr> to the end of the elements that need it. When I split the string using fleContents.Split("<tr>").ToList, I get a lot more elements in the resulting List(Of String) than there should be.
Here is a short test snippet that shows the same behavior:
Dim testSource As String = "<table><tr><td>8172745</td><tr><td>8172745</td></table>"
Dim testArr As String() = testSource.Split("<tr>")
'Maybe try splitting on a variable, in case a string literal containing "<>" can't be used in the Split method
Dim seper As String = "<tr>"
testArr = testSource.Split(seper)
'feed it a new string directly
testArr = testSource.Split(New String("<tr>"))
I would expect testArr to contain 3 elements, as follows:
"<table>"
"<td>8172745</td>"
"<td>8172745</td></table>"
However, I am receiving the following array:
""
"table>"
"tr>"
"td>8172745"
"/td>"
"tr>"
"td>8172745"
"/td>"
"/table>"
Can someone please explain why the strings are being split the way they are and how I can go about getting the results I'm expecting?
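One likely explanation, offered as a sketch rather than a confirmed diagnosis: the received array is exactly what splitting on the single character "<" would produce. That is consistent with the code running on .NET Framework with Option Strict Off (both are assumptions, since the post does not state them), where Split("<tr>") can bind to the Split(ParamArray Char()) overload and the string argument is narrowed to its first character. Passing the separator as a String array together with a StringSplitOptions value selects the overload that splits on the whole string. The module name SplitSketch below is hypothetical.

Imports System

Module SplitSketch
    Sub Main()
        Dim testSource As String = "<table><tr><td>8172745</td><tr><td>8172745</td></table>"

        ' Passing a String() plus StringSplitOptions picks the overload
        ' that treats "<tr>" as one whole separator, not as characters.
        Dim testArr As String() = testSource.Split(New String() {"<tr>"}, StringSplitOptions.None)

        ' Prints one element per line:
        ' <table>
        ' <td>8172745</td>
        ' <td>8172745</td></table>
        For Each part As String In testArr
            Console.WriteLine(part)
        Next
    End Sub
End Module

Turning Option Strict On should also surface the problem at compile time, because the implicit String to Char narrowing would then be rejected instead of silently applied.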