Rather than "parsing" the HTML, you may want to work directly with the DOM.
If you start with a string of HTML, you can create a temporary DOM node and assign the HTML to its innerHTML:
var tmp = document.createElement("DIV");
tmp.innerHTML = myHtml;
Then you can walk the DOM tree and process each node:
var result = doStuff(tmp);
where doStuff() is defined like this (note the recursion):
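function doStuff(elem) {
    var string = '';
    switch (elem.nodeType) {
        case 1: // ELEMENT_NODE: recurse into each child and append its text
            for (var i = 0; i < elem.childNodes.length; i++) {
                string += doStuff(elem.childNodes[i]);
            }
            break;
        case 3: // TEXT_NODE: append the text itself
            string += elem.nodeValue;
            break;
        // other node types (comments, etc.) contribute nothing
    }
    return string;
}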
You can do whatever you need within there, such as adding spaces between the separate strings.
This is easy to build on and not specific to jQuery; it is just plain JavaScript DOM walking. It might seem a bit verbose for a simple problem like yours, but the technique is powerful and it is easy to see what is going on.
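As a rough sketch of the "adding spaces" idea, here is one way the same walk could collect the text pieces into an array and join them with a single space. The names collectText and textWithSpaces are made up for this illustration and are not part of any library; the choice to trim each piece and skip whitespace-only text nodes is also just an assumption about what you want.

```javascript
// Hypothetical helper (not from the answer above): gathers the text of every
// text node under `node`, in document order, into the `parts` array.
function collectText(node, parts) {
    switch (node.nodeType) {
        case 1: // ELEMENT_NODE: recurse into each child
            for (var i = 0; i < node.childNodes.length; i++) {
                collectText(node.childNodes[i], parts);
            }
            break;
        case 3: // TEXT_NODE: keep the trimmed text unless it is only whitespace
            var text = node.nodeValue.trim();
            if (text !== '') {
                parts.push(text);
            }
            break;
    }
    return parts;
}

// Hypothetical wrapper: joins the collected pieces with a single space.
function textWithSpaces(root) {
    return collectText(root, []).join(' ');
}
```

In a browser you would call it the same way as doStuff(), e.g. textWithSpaces(tmp). With tmp.innerHTML set to "<p>Hello</p><p>world</p>", this should give "Hello world", where the plain concatenation in doStuff() would give "Helloworld".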