I have the XML below:

<modelingOutput>
    <listOfTopics>
        <topic id="1">
            <token id="354">wish</token>
        </topic>
    </listOfTopics>
    <rankedDocs>
        <topic id="1">
            <documents>
                <document id="1" numWords="0"/>
                <document id="2" numWords="1"/>
                <document id="3" numWords="2"/>
            </documents>
        </topic>
    </rankedDocs>
    <listOfDocs>
        <documents>
            <document id="1">
                <topic id="1" percentage="4.790644689978203%"/>
                <topic id="2" percentage="11.427632949428334%"/>
                <topic id="3" percentage="17.86913349249596%"/>
            </document>
        </documents>
    </listOfDocs>
</modelingOutput>

I want to parse this XML file and get the topic id and percentage from listOfDocs.

My first approach is to get all document elements from the XML and then check whether each one's grandparent node is listOfDocs. But document elements exist in both rankedDocs and listOfDocs, so I end up with a very large list.

So I wonder whether there is a better way to parse this XML that avoids the if statement?

My code:

public void parse() throws Exception {
    Document dom = null;
    DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
    DocumentBuilder db = dbf.newDocumentBuilder();
    InputSource is = new InputSource(new StringReader(xml));

    dom = db.parse(is);

    Element doc = dom.getDocumentElement();
    NodeList documentnl = doc.getElementsByTagName("document");
    // NodeList indices run from 0 to getLength() - 1
    for (int i = 0; i < documentnl.getLength(); i++) {
        Node item = documentnl.item(i);
        Node parentNode = item.getParentNode();
        Node grandpNode = parentNode.getParentNode();
        if (grandpNode.getNodeName() == "listOfDocs") {
            //get value
        }
    }
}

3 Answers

2

First, when checking the node name you shouldn't compare Strings using ==. Always use the equals method instead.
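
As a small illustration of why this matters, the sketch below compares a node name both ways. The class and method names are made up for this example, and `new String(...)` stands in for a name object handed back by a parser, which is not guaranteed to be the same object as the literal.

```java
public class NodeNameCheck {

    // Returns true when the given node name is exactly "listOfDocs".
    // Putting the literal first avoids a NullPointerException for a null name.
    static boolean isListOfDocs(String nodeName) {
        return "listOfDocs".equals(nodeName);
    }

    public static void main(String[] args) {
        // A distinct String object with the same characters as the literal.
        String name = new String("listOfDocs");

        // == compares object references, so this prints false here.
        System.out.println(name == "listOfDocs");

        // equals compares the characters, so this prints true.
        System.out.println(isListOfDocs(name));
    }
}
```

With `==` the check may appear to work when both sides happen to be the same interned object, which is what makes the bug easy to miss.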

You can use XPath to select only the topic elements of the document elements under listOfDocs:

XPathFactory xPathFactory = XPathFactory.newInstance();
XPath xPath = xPathFactory.newXPath();
XPathExpression xPathExpression = xPath.compile("//listOfDocs//document/topic");

NodeList topicnl = (NodeList) xPathExpression.evaluate(dom, XPathConstants.NODESET);
for(int i = 0; i < topicnl.getLength(); i++) {
   ...
}

1

If you do not want to use the if statement, you can use XPath to get the elements you need directly.

DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder = factory.newDocumentBuilder();
Document doc = builder.parse("source.xml");
XPathFactory xPathfactory = XPathFactory.newInstance();
XPath xpath = xPathfactory.newXPath();
XPathExpression expr = xpath.compile("/*/listOfDocs/documents/document/topic");
NodeList nodes = (NodeList) expr.evaluate(doc, XPathConstants.NODESET);

for (int i = 0; i < nodes.getLength(); i++) {
    System.out.println(nodes.item(i).getAttributes().getNamedItem("id"));
    System.out.println(nodes.item(i).getAttributes().getNamedItem("percentage"));
}
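
For anyone who wants to try this without a source.xml file, here is a self-contained sketch that runs the same kind of expression against a trimmed copy of the XML from the question, parsed from a string. The class name, the `topics` helper and the `id=percentage` output format are assumptions made for this example only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ListOfDocsTopics {

    // Trimmed copy of the question's XML: document elements appear in both
    // rankedDocs and listOfDocs, but only the latter contain topic children.
    static final String XML =
        "<modelingOutput>"
      + "<rankedDocs><topic id=\"1\"><documents>"
      + "<document id=\"1\" numWords=\"0\"/>"
      + "<document id=\"2\" numWords=\"1\"/>"
      + "</documents></topic></rankedDocs>"
      + "<listOfDocs><documents><document id=\"1\">"
      + "<topic id=\"1\" percentage=\"4.790644689978203%\"/>"
      + "<topic id=\"2\" percentage=\"11.427632949428334%\"/>"
      + "</document></documents></listOfDocs>"
      + "</modelingOutput>";

    // Returns one "id=percentage" string per topic element under listOfDocs.
    static List<String> topics(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                "/modelingOutput/listOfDocs/documents/document/topic",
                doc, XPathConstants.NODESET);

            List<String> result = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                // The expression selects elements only, so the cast is safe.
                Element topic = (Element) nodes.item(i);
                result.add(topic.getAttribute("id") + "="
                         + topic.getAttribute("percentage"));
            }
            return result;
        } catch (Exception e) {
            // Wrapped so callers of this sketch need no checked exceptions.
            throw new IllegalStateException("Could not read topics", e);
        }
    }

    public static void main(String[] args) {
        for (String line : topics(XML)) {
            System.out.println(line);
        }
    }
}
```

`Element.getAttribute` returns the attribute value as a plain string, whereas `getNamedItem` in the loop above returns the attribute node itself, so the value would still need to be read from it with `getNodeValue()`.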

Please check GitHub project here.

Hope this helps.

1 Comment

It happens to be basically the same solution as the one proposed by manouti, only a little bit more detailed. Probably working on it at the same time. I will leave it here for reference just in case you want to have a look.
0

I like to use XMLBeam for such tasks:

public class Answer {

    @XBDocURL("resource://data.xml")
    public interface DataProjection {

        public interface Topic {
            @XBRead("./@id")
            int getID();

            @XBRead("./@percentage")
            String getPercentage();
        }

        @XBRead("/modelingOutput/listOfDocs//document/topic")
        List<Topic> getTopics();
    }

    public static void main(final String[] args) throws IOException {
        final DataProjection dataProjection = new XBProjector().io().fromURLAnnotation(DataProjection.class);
        for (DataProjection.Topic topic : dataProjection.getTopics()) {
            System.out.println(topic.getID() + ": " + topic.getPercentage());
        }
    }
}

There is even a convenient way to convert the percentage to float or double. Let me know if you would like an example.
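
The XMLBeam-specific conversion is not shown here. Independent of that library, the percentage string can be converted with plain Java by dropping the trailing percent sign before parsing. This is only a sketch; `PercentageParser` and `parsePercentage` are hypothetical names, not part of XMLBeam.

```java
public class PercentageParser {

    // Converts a value such as "4.79%" to the double 4.79.
    // A value without a trailing '%' is parsed as-is.
    static double parsePercentage(String text) {
        String trimmed = text.trim();
        if (trimmed.endsWith("%")) {
            trimmed = trimmed.substring(0, trimmed.length() - 1);
        }
        // Throws NumberFormatException if the remainder is not a number.
        return Double.parseDouble(trimmed);
    }

    public static void main(String[] args) {
        // Uses one of the percentage values from the question's XML.
        System.out.println(parsePercentage("4.790644689978203%"));
    }
}
```

In the projection above, this could be applied to the result of `getPercentage()` inside the loop.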
