I have to read some tags and attributes from XML files that share a defined structure, but because the files can be generated by different sources, they can declare their namespaces with different prefixes.

This is the first XML sample:

<Order xmlns="urn:oasis:names:specification:ubl:schema:xsd:Order-2" xmlns:cac="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:cbc="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <cbc:UBLVersionID>2.1</cbc:UBLVersionID>
    <cbc:CustomizationID>urn:www.cenbii.eu:transaction:biitrns001:ver2.0:extended:urn:www.peppol.eu:bis:peppol3a:ver2.0:extended:urn:www.ubl-italia.org:spec:ordine:ver2.1</cbc:CustomizationID>
    <cbc:ID>ORD-001</cbc:ID>
    <cbc:IssueDate>2016-10-01</cbc:IssueDate>
    <cbc:OrderTypeCode listID="UNCL1001">221</cbc:OrderTypeCode>
    <cac:ValidityPeriod>
        <cbc:EndDate>2024-10-19</cbc:EndDate>
    </cac:ValidityPeriod>
    <cac:BuyerCustomerParty>
        <cac:Party>
            <cbc:EndpointID schemeID="IT:IPA">ITAK12MH</cbc:EndpointID>
            <cac:PartyIdentification>
                <cbc:ID schemeID="IT:VAT">01567570254</cbc:ID>
            </cac:PartyIdentification>
            <cac:PartyName>
                <cbc:Name>A Custom Name</cbc:Name>
            </cac:PartyName>
        </cac:Party>
    </cac:BuyerCustomerParty>
</Order>

This is the second XML sample, with different namespace declarations and prefixes but the same structure (tags and attributes):

<ns10:Order xmlns="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:ns2="urn:oasis:names:specification:ubl:schema:xsd:CommonExtensionComponents-2" xmlns:ns3="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2" xmlns:ns4="http://www.w3.org/2000/09/xmldsig#" xmlns:ns5="http://uri.etsi.org/01903/v1.3.2#" xmlns:ns6="urn:oasis:names:specification:ubl:schema:xsd:SignatureBasicComponents-2" xmlns:ns7="urn:oasis:names:specification:ubl:schema:xsd:SignatureAggregateComponents-2" xmlns:ns8="http://uri.etsi.org/01903/v1.4.1#" xmlns:ns9="urn:oasis:names:specification:ubl:schema:xsd:CommonSignatureComponents-2" xmlns:ns10="urn:oasis:names:specification:ubl:schema:xsd:Order-2">
    <UBLVersionID>2.1</UBLVersionID>
    <CustomizationID>urn:www.cenbii.eu:transaction:biitrns001:ver2.0:extended:urn:www.peppol.eu:bis:peppol3a:ver2.0:extended:urn:www.ubl-italia.org:spec:ordine:ver2.1</CustomizationID>
    <ID>ORD-001</ID>
    <IssueDate>2016-10-01</IssueDate>
    <OrderTypeCode listID="UNCL1001">221</OrderTypeCode>
    <ns3:ValidityPeriod>
        <EndDate>2024-10-19</EndDate>
    </ns3:ValidityPeriod>
    <ns3:BuyerCustomerParty>
        <ns3:Party>
            <EndpointID schemeID="IT:IPA">ITAK12MH</EndpointID>
            <ns3:PartyIdentification>
                <ID schemeID="IT:VAT">01567570254</ID>
            </ns3:PartyIdentification>
            <ns3:PartyName>
                <Name>A Custom Name</Name>
            </ns3:PartyName>
        </ns3:Party>
    </ns3:BuyerCustomerParty>
</ns10:Order>

These two files must be treated as equivalent, so both are valid.

A third example is a file similar to the second, where the namespaces are the same but their prefixes are different. What matters is that the prefix on each tag resolves to the correct namespace for that tag.

I have no way of knowing in advance which prefixes will be associated with the namespaces.

<aaa:Order xmlns="urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2" xmlns:aaa="urn:oasis:names:specification:ubl:schema:xsd:Order-2" xmlns:bbb="urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2">
    <UBLVersionID>2.1</UBLVersionID>
    <CustomizationID>urn:www.cenbii.eu:transaction:biitrns001:ver2.0:extended:urn:www.peppol.eu:bis:peppol3a:ver2.0:extended:urn:www.ubl-italia.org:spec:ordine:ver2.1</CustomizationID>
    <ID>ORD-001</ID>
    <IssueDate>2016-10-01</IssueDate>
    <OrderTypeCode listID="UNCL1001">221</OrderTypeCode>
    <bbb:ValidityPeriod>
        <EndDate>2024-10-19</EndDate>
    </bbb:ValidityPeriod>
    <bbb:BuyerCustomerParty>
        <bbb:Party>
            <EndpointID schemeID="IT:IPA">ITAK12MH</EndpointID>
            <bbb:PartyIdentification>
                <ID schemeID="IT:VAT">01567570254</ID>
            </bbb:PartyIdentification>
            <bbb:PartyName>
                <Name>A Custom Name</Name>
            </bbb:PartyName>
        </bbb:Party>
    </bbb:BuyerCustomerParty>
</aaa:Order>

This last file must be considered as valid as the others.

As you can see, the association between the tags and their namespaces is always the same. The only things that change are the prefixes.

My current code uses the XDocument and XElement classes to read the XML, but this can't be the right approach: the queries need the exact prefix for each tag, and since the prefixes can vary, the code works only with the first sample file.

XDocument doc;
XmlNamespaceManager manager;

using (XmlReader reader = XmlReader.Create(stream))
{
    doc = XDocument.Load(reader);

    // Retrieving namespaces of XML file
    XPathNavigator navigator = doc.CreateNavigator();
    navigator.MoveToFollowing(XPathNodeType.Element);
    IDictionary<string, string> namespaces = navigator.GetNamespacesInScope(XmlNamespaceScope.All);

    // Add namespaces to an XmlNamespaceManager to read nodes
    manager = new XmlNamespaceManager(reader.NameTable);
    foreach (KeyValuePair<string, string> ns in namespaces)
    {
        manager.AddNamespace(ns.Key, ns.Value);
    }
}

XElement currentNode;

currentNode = doc.Root.XPathSelectElement("cbc:ID", manager);
if (currentNode != null)
    item.DespatchAdviceId = currentNode.Value;

currentNode = doc.Root.XPathSelectElement("cbc:IssueDate", manager);
if (currentNode != null)
{
    DateTime dataEmissione;
    if (DateTime.TryParseExact(currentNode.Value, validDateFormats, CultureInfo.InvariantCulture, DateTimeStyles.None, out dataEmissione))
        item.OrderIssueDate = dataEmissione;
}

currentNode = doc.Root.XPathSelectElement("cac:BuyerCustomerParty/cac:Party/cac:PartyIdentification/cbc:ID", manager);
if (currentNode != null)
{
    item.BuyerPartyId = currentNode.Value;
    if (currentNode.Attribute("schemeID") != null)
        item.BuyerPartySchemeId = currentNode.Attribute("schemeID").Value;
}

// ... and so on...

How can I read the XML files without having to specify the namespace prefixes? Should I use another .NET library, or maybe a third-party one?

2 Answers

Using Name.LocalName, you can query the elements with LINQ without specifying the namespace.

// This matches <cbc:ID>ORD-001</cbc:ID>, whatever prefix the document uses
var element = doc.Root.Elements().FirstOrDefault(x => x.Name.LocalName == "ID");

If you want to go into nested elements:

var endDate = doc.Root.Elements().Where(x => x.Name.LocalName == "ValidityPeriod")
                 .Elements().FirstOrDefault(x => x.Name.LocalName == "EndDate");
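
For deeper paths, the same LocalName idea could be wrapped in a small helper. This is only a sketch: FindByLocalPath is a hypothetical name, and the inline XML is a trimmed-down version of the third sample from the question. Keep in mind that matching on LocalName alone ignores the namespace entirely, so an element with the same local name from an unrelated namespace would also match.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper: walks down one child per step, comparing only local names.
// Returns null as soon as a step has no matching child.
XElement FindByLocalPath(XElement start, params string[] localNames)
{
    var current = start;
    foreach (var name in localNames)
    {
        if (current == null) return null;
        current = current.Elements().FirstOrDefault(e => e.Name.LocalName == name);
    }
    return current;
}

// Trimmed-down sample (assumption: same shape as the third file in the question).
const string xml = @"<aaa:Order xmlns='urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2'
    xmlns:aaa='urn:oasis:names:specification:ubl:schema:xsd:Order-2'
    xmlns:bbb='urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2'>
  <ID>ORD-001</ID>
  <bbb:ValidityPeriod>
    <EndDate>2024-10-19</EndDate>
  </bbb:ValidityPeriod>
</aaa:Order>";

var root = XDocument.Parse(xml).Root;
var endDateElement = FindByLocalPath(root, "ValidityPeriod", "EndDate");
Console.WriteLine(endDateElement?.Value); // prints 2024-10-19
```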

Comments

0

I need to know the exact prefix for each tag.

No, you don't. The prefixes are entirely irrelevant to the qualified name of an element or attribute; only the namespace URI and the local name matter. If you want to go the XPath route, don't read the namespaces and prefixes from the document to build your namespace manager. Specify them yourself so you know what they are, then use those prefixes in your query. For example, this will work with any of your XML documents:

var manager = new XmlNamespaceManager(new NameTable());

manager.AddNamespace("cbc", 
    "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2");

var id = doc.Root.XPathSelectElement("cbc:ID", manager);
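
To make that concrete, here is a minimal sketch that runs the same query against two inputs declaring the namespaces differently. The inline XML strings are trimmed-down stand-ins for the samples in the question, and ReadOrderId is a hypothetical helper name, not part of any library.

```csharp
using System;
using System.Xml;
using System.Xml.Linq;
using System.Xml.XPath;

// Trimmed-down stand-ins for the question's samples: same namespaces, different prefixes.
const string prefixedSample = @"<Order xmlns='urn:oasis:names:specification:ubl:schema:xsd:Order-2'
    xmlns:cbc='urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2'>
  <cbc:ID>ORD-001</cbc:ID>
</Order>";

const string defaultNsSample = @"<aaa:Order xmlns='urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2'
    xmlns:aaa='urn:oasis:names:specification:ubl:schema:xsd:Order-2'>
  <ID>ORD-001</ID>
</aaa:Order>";

// The prefix "cbc" is chosen by this code; it only has to map to the right namespace URI.
var manager = new XmlNamespaceManager(new NameTable());
manager.AddNamespace("cbc",
    "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2");

// Hypothetical helper: parses the text and selects the order ID below the root element.
string ReadOrderId(string xml) =>
    XDocument.Parse(xml).Root.XPathSelectElement("cbc:ID", manager)?.Value;

Console.WriteLine(ReadOrderId(prefixedSample));
Console.WriteLine(ReadOrderId(defaultNsSample));
```

Both calls resolve the ID element, because the query prefix is looked up in the manager rather than in the document.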

What I would encourage, though, is that you ditch XPath. LINQ to XML is so much nicer. And another quick hint, there is an overload of XDocument.Load that accepts a stream. There's no need to create the XmlReader. So:

XNamespace order = "urn:oasis:names:specification:ubl:schema:xsd:Order-2";
XNamespace cbc = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2";
XNamespace cac = "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2";

var doc = XDocument.Load(stream);

var id = (string) doc.Elements(order + "Order")
    .Elements(cbc + "ID")
    .Single();

var issueDate = (DateTime) doc.Elements(order + "Order")
    .Elements(cbc + "IssueDate")
    .Single();

var buyerPartySchemeId = (string) doc.Descendants(cac + "BuyerCustomerParty")
    .Descendants(cbc + "ID")
    .Attributes("schemeID")
    .Single();
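
As a rough illustration of why the prefixes stop mattering with this approach, the sketch below runs one LINQ to XML query against two differently prefixed inputs. The inline XML is a trimmed-down assumption based on the question's samples, and FindBuyerPartyId is a hypothetical helper name. It uses Elements rather than Descendants so that only the exact path is matched.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

XNamespace cbc = "urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2";
XNamespace cac = "urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2";

// Trimmed-down stand-ins for the first and third samples in the question.
const string firstStyle = @"<Order xmlns='urn:oasis:names:specification:ubl:schema:xsd:Order-2'
    xmlns:cac='urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2'
    xmlns:cbc='urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2'>
  <cac:BuyerCustomerParty><cac:Party><cac:PartyIdentification>
    <cbc:ID schemeID='IT:VAT'>01567570254</cbc:ID>
  </cac:PartyIdentification></cac:Party></cac:BuyerCustomerParty>
</Order>";

const string thirdStyle = @"<aaa:Order xmlns='urn:oasis:names:specification:ubl:schema:xsd:CommonBasicComponents-2'
    xmlns:aaa='urn:oasis:names:specification:ubl:schema:xsd:Order-2'
    xmlns:bbb='urn:oasis:names:specification:ubl:schema:xsd:CommonAggregateComponents-2'>
  <bbb:BuyerCustomerParty><bbb:Party><bbb:PartyIdentification>
    <ID schemeID='IT:VAT'>01567570254</ID>
  </bbb:PartyIdentification></bbb:Party></bbb:BuyerCustomerParty>
</aaa:Order>";

// Hypothetical helper: follows the exact element path by namespace-qualified name.
// Returns null when the path is absent from the document.
XElement FindBuyerPartyId(string xml) =>
    XDocument.Parse(xml).Root
        .Elements(cac + "BuyerCustomerParty")
        .Elements(cac + "Party")
        .Elements(cac + "PartyIdentification")
        .Elements(cbc + "ID")
        .SingleOrDefault();

foreach (var xml in new[] { firstStyle, thirdStyle })
{
    var partyId = FindBuyerPartyId(xml);
    var value = (string)partyId;                          // null if the element is missing
    var scheme = (string)partyId?.Attribute("schemeID");  // null if the attribute is missing
    Console.WriteLine($"{value} ({scheme})");
}
```

The explicit casts to string return null for a missing element or attribute instead of throwing, which replaces the repeated null checks in the original code.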

2 Comments

With your approach I need to initialize all the namespaces manually. I don't know which namespaces are needed, and if they change for any reason (a new version or so...), I have to change the code again.
@CheshireCat you know what you want to read from it, so you know the namespaces. If a new version changes the namespaces, what's to say it doesn't change what the elements or attributes are called or what their types are? You'll likely have to update the code either way.
