I'm trying to read an XML feed. I'm not sure its encoding is correct, but it is declared as UTF-8, and when I try to parse it in PHP via SimpleXML, it errors on "BöðVar" (note the special characters "ö" and "ð").

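libxml_use_internal_errors(true);
$XMLOutputXMLObj = simplexml_load_string($xml_string);
if ($XMLOutputXMLObj !== false) {
    // do stuff
} else {
    // Print the parser errors collected by libxml
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), PHP_EOL;
    }
    libxml_clear_errors();
}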

This is all I get for an error:

Entity 'ouml' not defined

Entity 'eth' not defined

I tried using mb_convert_encoding in various ways, but that did not help.

How can I resolve this issue for any character, i.e. WITHOUT manually replacing each one with its numeric character reference (for example, ö with &#246;)?

Even better... is there a way to make it so SimpleXML doesn't care what it is parsing, as long as the tags are intact?

Thanks

1 Answer

Have you tried wrapping the node's data in a CDATA section, i.e. putting <![CDATA[ before and ]]> after the node's text/value? E.g.

<?xml version="1.0" encoding="UTF-8"?>
<fmsdata>
  <result><![CDATA[Success !@#$%^&*()]]></result>
</fmsdata>
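
On the PHP side, a minimal sketch of what this buys you, assuming the feed can be changed so that the problem text sits inside a CDATA section. The feed fragment and the variable names $doc, $raw and $name are made up for illustration. The parser treats everything inside CDATA as literal text, so it no longer complains about undefined entities, but the entities are not decoded either; html_entity_decode() is one way to turn them into real characters afterwards.

```php
<?php
// Hypothetical feed fragment: the named HTML entities sit inside a CDATA
// section, so libxml treats them as literal text, not as XML entities.
$xml_string = <<<'XML'
<?xml version="1.0" encoding="UTF-8"?>
<fmsdata>
  <result><![CDATA[B&ouml;&eth;Var]]></result>
</fmsdata>
XML;

libxml_use_internal_errors(true);
$doc = simplexml_load_string($xml_string);

if ($doc === false) {
    // Print whatever libxml collected while parsing
    foreach (libxml_get_errors() as $error) {
        echo trim($error->message), PHP_EOL;
    }
    libxml_clear_errors();
} else {
    // The cast returns the CDATA content unchanged: "B&ouml;&eth;Var"
    $raw = (string) $doc->result;

    // Decode the HTML named entities into UTF-8 characters
    $name = html_entity_decode($raw, ENT_QUOTES | ENT_HTML5, 'UTF-8');
    echo $name, PHP_EOL;
}
```

If the text really needs to reach the consumer as "ö" and "ð", the decode step has to happen somewhere; CDATA only stops the parser from rejecting the document.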

2 Comments

Oh, dur I forgot that I could do that! Man I'm out of touch. Still wish I could figure out how to do it without it though. Ahhh it won't let me accept your answer yet.
Know what you mean about being out of touch... Last week I had to write an XML proxy/web service and it took me ages to remember all the restrictions that XML docs have. The only let-down is that CDATA sections only work on a node's content/value and not on tag attributes... There you will have to escape the strings.
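
For the attribute case, a rough sketch of one way to do the escaping, assuming you build the XML string yourself. The sample value, the "name" attribute and the variable names are invented for illustration. The idea is to decode HTML named entities into UTF-8 characters first, then escape only the characters XML itself reserves.

```php
<?php
// Hypothetical attribute value with HTML named entities and XML-special characters
$raw = 'B&ouml;&eth;Var & "friends" <test>';

// Step 1: turn HTML named entities into real UTF-8 characters
$decoded = html_entity_decode($raw, ENT_QUOTES | ENT_HTML5, 'UTF-8');

// Step 2: escape only what XML requires (&, <, > and quotes)
$escaped = htmlspecialchars($decoded, ENT_QUOTES | ENT_XML1, 'UTF-8');

// Build a small document that carries the value in an attribute
$xml_string = '<?xml version="1.0" encoding="UTF-8"?>'
    . '<fmsdata><result name="' . $escaped . '"/></fmsdata>';

libxml_use_internal_errors(true);
$doc = simplexml_load_string($xml_string);

if ($doc !== false) {
    // The parser undoes the XML escaping, so this prints the decoded value
    $attr = (string) $doc->result['name'];
    echo $attr, PHP_EOL;
}
```

Because the document is declared as UTF-8, "ö" and "ð" can stay as plain characters in the attribute; only the XML-reserved characters need entity references.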
