Using Python or XSLT, I would like to know how to convert highly complex, hierarchical nested XML file to CSV including all the sub-elements and without hard coding as few element nodes as possible or is rational/effective?
Please find attached simplified XML example and the output CSV to get a better understanding of what I’m trying to achieve.
The actual XML file has much more elements but the data hierarchy and the nesting is like in the example. <InvoiceRow> element and its sub-elements are the only repeating elements in the XML file, all the other elements are static that are repeated in the output CSV as many times as there are <InvoiceRow> elements in the XML file.
It’s the repeating <InvoiceRow> element that is causing trouble for me. Elements that don’t repeat are easy to convert to CSV without hard coding any elements.
Complex XML scenarios, with hierarchical data structures and multiple one-to-many relationships all being stored in a single XML file. Structured text file.
Example XML input:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<Invoice>
<SellerDetails>
<Identifier>1234-1</Identifier>
<SellerAddress>
<SellerStreet>Street1</SellerStreet>
<SellerTown>Town1</SellerTown>
</SellerAddress>
</SellerDetails>
<BuyerDetails>
<BuyerIdentifier>1234-2</BuyerIdentifier>
<BuyerAddress>
<BuyerStreet>Street2</BuyerStreet>
<BuyerTown>Town2</BuyerTown>
</BuyerAddress>
</BuyerDetails>
<BuyerNumber>001234</BuyerNumber>
<InvoiceDetails>
<InvoiceNumber>0001</InvoiceNumber>
</InvoiceDetails>
<InvoiceRow>
<ArticleName>Article1</ArticleName>
<RowText>Product Text1</RowText>
<RowText>Product Text2</RowText>
<RowAmount AmountCurrencyIdentifier="EUR">10.00</RowAmount>
</InvoiceRow>
<InvoiceRow>
<ArticleName>Article2</ArticleName>
<RowText>Product Text11</RowText>
<RowText>Product Text22</RowText>
<RowAmount AmountCurrencyIdentifier="EUR">20.00</RowAmount>
</InvoiceRow>
<InvoiceRow>
<ArticleName>Article3</ArticleName>
<RowText>Product Text111</RowText>
<RowText>Product Text222</RowText>
<RowAmount AmountCurrencyIdentifier="EUR">30.00</RowAmount>
</InvoiceRow>
<EpiDetails>
<EpiPartyDetails>
<EpiBfiPartyDetails>
<EpiBfiIdentifier IdentificationSchemeName="BIC">XXXXX</EpiBfiIdentifier>
</EpiBfiPartyDetails>
</EpiPartyDetails>
</EpiDetails>
<InvoiceUrlText>Some text</InvoiceUrlText>
</Invoice>
Example CSV output:
Identifier,SellerStreet,SellerTown,BuyerIdentifier,BuyerStreet,BuyerTown,BuyerNumber,InvoiceNumber,ArticleName,RowText,RowText,RowAmount,EpiBfiIdentifier,InvoiceUrlText
1234-1,Street1,Town1,1234-2,Street2,Town2,1234,1,Article1,Product Text1,Product Text2,10,XXXXX,Some text
1234-1,Street1,Town1,1234-2,Street2,Town2,1234,1,Article2,Product Text11,Product Text22,20,XXXXX,Some text
1234-1,Street1,Town1,1234-2,Street2,Town2,1234,1,Article3,Product Text111,Product Text222,30,XXXXX,Some text