9

What is the best method to parse multiple, discrete, custom XML documents with Java?

  • Show us how far you have got; we aren't going to write it for you. Do you want to do it with DOM, SAX or something else? Which have you tried so far? Which tutorials/documentation have you looked at? Commented Mar 14, 2011 at 13:19
  • Use the Java DOM API or SAX for XML parsing. Give a more concrete XML structure. Commented Mar 14, 2011 at 13:20
  • Try Castor mapping. Commented Mar 14, 2011 at 13:20
  • What have you done so far? Do you have code to show us? Commented Mar 14, 2011 at 13:20
  • If efficiency is the concern, I would use SAX or, as a personal preference, StAX. Commented Mar 14, 2011 at 13:21

6 Answers

5

I would use StAX to parse XML; it's fast and easy to use. I used it on my last project to parse XML files of up to 24 MB. There's a nice introduction on java.net, which tells you everything you need to know to get started.
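
To give an idea of what StAX code looks like, here is a minimal sketch using the cursor API (javax.xml.stream.XMLStreamReader) that ships with the JDK. The element name "title", the sample document and the class and method names are made up for illustration; they are not from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class StaxSketch {

    // Collects the text of every <title> element (a hypothetical element name).
    static List<String> readTitles(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> titles = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                // The application asks for the next event when it is ready for it.
                if (reader.next() == XMLStreamConstants.START_ELEMENT
                        && "title".equals(reader.getLocalName())) {
                    // Reads the text up to the matching end tag.
                    titles.add(reader.getElementText());
                }
            }
        } finally {
            reader.close();
        }
        return titles;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Made-up sample document.
        String xml = "<books><book><title>First</title></book>"
                   + "<book><title>Second</title></book></books>";
        System.out.println(readTitles(xml));
    }
}
```

For real files you would pass a FileInputStream or FileReader to createXMLStreamReader instead of a StringReader; the loop stays the same.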

4

Basically, you have two main XML parsing methods in Java:

  • SAX, where you use a handler to grab only what you want from your XML and discard the rest
  • DOM, which parses the whole file into memory and lets you navigate all of its elements as a tree.

Another very useful XML parsing method, somewhat more recent than these and included in the JRE only since Java 6, is StAX. StAX was conceived as a middle ground between the tree-based approach of DOM and the event-based approach of SAX. Like SAX, it handles very large documents easily, but here the application "pulls" information from the parser instead of the parser "pushing" events to the application. You can find more explanation on this subject here.

So, depending on what you want to achieve, you can use one of these approaches.
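
For the tree-based (DOM) side of that choice, a minimal sketch with the JDK's built-in parser might look like the following. The element name "author", the sample XML and the class and method names are assumptions for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomSketch {

    // Loads the whole document into memory, then walks the tree.
    static List<String> readAuthors(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // "author" is a hypothetical element name.
        NodeList nodes = doc.getElementsByTagName("author");
        List<String> authors = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            authors.add(nodes.item(i).getTextContent());
        }
        return authors;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document.
        String xml = "<library><book><author>Ann</author></book>"
                   + "<book><author>Bob</author></book></library>";
        System.out.println(readAuthors(xml));
    }
}
```

Because the full tree sits in memory, this is convenient for small and medium documents; for very large files the streaming approaches are usually the better fit.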

1 Comment

Copied from my answer on a duplicate thread, to provide more information about the various methods.

3

You will want to use org.xml.sax.XMLReader (http://docs.oracle.com/javase/7/docs/api/org/xml/sax/XMLReader.html).
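
A minimal sketch of driving an XMLReader with a content handler, assuming you want the text of every element with a given name. The "author" element, the handler class and the sample XML are hypothetical; only the JDK's SAX classes are used.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class XmlReaderSketch {

    // Receives events pushed by the parser and keeps only <author> text.
    static class AuthorHandler extends DefaultHandler {
        final List<String> authors = new ArrayList<>();
        private StringBuilder text; // non-null only while inside <author>

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("author".equals(qName)) {
                text = new StringBuilder();
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // May be called several times for one text node, so append.
            if (text != null) {
                text.append(ch, start, length);
            }
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("author".equals(qName)) {
                authors.add(text.toString());
                text = null;
            }
        }
    }

    static List<String> readAuthors(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        AuthorHandler handler = new AuthorHandler();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.authors;
    }

    public static void main(String[] args) throws Exception {
        // Made-up sample document.
        System.out.println(readAuthors("<r><author>Ann</author><x/><author>Bob</author></r>"));
    }
}
```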

2

If you only need to parse, then I would recommend using an XPath library. Here is a nice reference: http://www.ibm.com/developerworks/library/x-javaxpathapi.html
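
As a rough illustration of the XPath route, here is a minimal sketch using the javax.xml.xpath API from the JDK. The sample document, the expression and the class and method names are invented for the example.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathSketch {

    // Evaluates an XPath expression and returns the string value of the first match.
    static String evaluate(String xml, String expression) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        return xpath.evaluate(expression, new InputSource(new StringReader(xml)));
    }

    public static void main(String[] args) throws XPathExpressionException {
        // Made-up sample document and expression.
        String xml = "<request><author>Jane</author><id>42</id></request>";
        System.out.println(evaluate(xml, "/request/author"));
    }
}
```

Each call here re-parses the input, which is fine for a one-off lookup. If you need many values from the same document, parsing it once into a DOM Document and evaluating the expressions against that is likely cheaper.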

But you may want to consider turning the XML into objects, and then the sky is the limit. For that you can use XStream, a great library that I use a lot.

2

Use the dom4j library.

First, read the document:

import java.net.URL;

import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.io.SAXReader;

public class Foo {

    public Document parse(URL url) throws DocumentException {
        SAXReader reader = new SAXReader();
        Document document = reader.read(url);
        return document;
    }
}

Then use XPath to get to the values you need:

// Requires: import org.dom4j.Node;
public String getAuthor(Document document) {
    Node node = document.selectSingleNode("//AppealRequestProcessRequest/author");
    // selectSingleNode returns null when nothing matches the expression
    return node == null ? null : node.getText();
}

0

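Below is code that extracts a value using vtd-xml.

import java.io.IOException;

import com.ximpleware.*;

public class ExtractValue {
    public static void main(String[] args) throws VTDException, IOException {
        VTDGen vg = new VTDGen();
        // parseFile returns false on failure; the second argument turns namespace awareness off
        if (!vg.parseFile("input.xml", false)) {
            System.err.println("Could not parse input.xml");
            return;
        }
        VTDNav vn = vg.getNav();
        AutoPilot ap = new AutoPilot(vn);
        // text() selects the text node, so toString(i) prints the value rather than the element name
        ap.selectXPath("/aa/bb[name='k1']/value/text()");
        int i;
        // evalXPath returns -1 once there are no more matches
        while ((i = ap.evalXPath()) != -1) {
            System.out.println(" value ===>" + vn.toString(i));
        }
    }
}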
