XML Parsers

Original post | Linux Operating System | Author: rainwly0819 | Date: 2008-03-28 11:09:35

There are a lot of XML parsers out there. You need to choose one based on your specific requirements. Here are some descriptions of them.

[1] Categories

  • Fully validating parsers
  • Parsers that do not validate, but do read the external DTD subset and external DTD parameter entity references, in order to supply entity replacement text and assign attribute types.
  • Parsers that read only internal DTD subset and do not validate
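
The difference between a validating and a non-validating parse can be seen with a small experiment. The sketch below assumes the JAXP SAX parser bundled with the JDK rather than Xerces-J, and the class name, method name and sample document are made up for illustration. The same document, which breaks its own internal DTD subset, is parsed once with validation off and once with it on.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original post.
final class ValidationDemo {

    // Well-formed but invalid: the DTD only allows <b/> inside <a>, the document uses <c/>.
    static final String INVALID_DOC =
        "<?xml version=\"1.0\"?>\n"
        + "<!DOCTYPE a [\n"
        + "  <!ELEMENT a (b)>\n"
        + "  <!ELEMENT b EMPTY>\n"
        + "]>\n"
        + "<a><c/></a>";

    /** Parses xml and returns the messages of all validity errors the parser reported. */
    static List<String> validityErrors(String xml, boolean validating) {
        final List<String> errors = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(validating);
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    // Validity errors are recoverable: record them and keep parsing.
                    errors.add(e.getMessage());
                }
            });
        } catch (Exception e) {
            // Reached for configuration problems or documents that are not well-formed.
            throw new IllegalArgumentException("Could not parse document", e);
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println("non-validating: " + validityErrors(INVALID_DOC, false));
        System.out.println("validating:     " + validityErrors(INVALID_DOC, true));
    }
}
```

A non-validating parse accepts the document because it is well-formed, while the validating parse reports the DTD violations through the error handler.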

[2] Available Parsers

Xerces-J is the one I chose, and I think it is good enough to get started with. You can get it from the Apache Xerces Java project.

[3] Parser APIs

SAX - Read-only; Quite fast; Extremely memory efficient; Not easy to use for complicated XML documents


First, you need to create your own handler by extending org.xml.sax.helpers.DefaultHandler.


import org.xml.sax.*;
import org.xml.sax.helpers.DefaultHandler;

public class FibonacciHandler extends DefaultHandler {

  private boolean inDouble = false;

  public void startElement(String namespaceURI, String localName,
   String qualifiedName, Attributes atts) throws SAXException {

    if (localName.equals("double")) inDouble = true;

  }

  public void endElement(String namespaceURI, String localName,
   String qualifiedName) throws SAXException {

    if (localName.equals("double")) inDouble = false;

  }

  public void characters(char[] ch, int start, int length)
  throws SAXException {

    if (inDouble) {
      for (int i = start; i < start+length; i++) {
        System.out.print(ch[i]);
      }
    }
  }
}


Then read the XML:

    // "connection" is assumed to be an already opened java.net.HttpURLConnection.
    try {
      XMLReader parser = XMLReaderFactory.createXMLReader(
        "org.apache.xerces.parsers.SAXParser"
      );
      // There's a name conflict with java.net.ContentHandler
      // so we have to use the fully package qualified name.
      org.xml.sax.ContentHandler handler
       = new FibonacciHandler();
      parser.setContentHandler(handler);

      InputStream in = connection.getInputStream(); // Input stream
      InputSource source = new InputSource(in);
      parser.parse(source);
      System.out.println();

      in.close();
      connection.disconnect();
    }
    catch (Exception e) {
      System.err.println(e);
    }
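
Because the fragment above depends on an open connection, here is a self-contained sketch of the same handler idea running against an in-memory string. It assumes the JAXP SAX parser bundled with the JDK instead of Xerces-J, and the class name, method name and sample response are hypothetical. The text is collected in a StringBuilder because a SAX parser may deliver the content of one element in several characters() calls.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class, not part of the original post.
final class SaxDoubleDemo {

    // Made-up XML-RPC style response used only for illustration.
    static final String SAMPLE =
        "<methodResponse><params><param><value>"
        + "<double>28657</double>"
        + "</value></param></params></methodResponse>";

    /** Returns the concatenated text of all <double> elements in xml. */
    static String doubleText(String xml) {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            private boolean inDouble = false;

            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (localName.equals("double")) inDouble = true;
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if (localName.equals("double")) inDouble = false;
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // May be called more than once per element, so append instead of assign.
                if (inDouble) text.append(ch, start, length);
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            // JAXP factories are not namespace aware by default; localName needs this.
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
        return text.toString();
    }

    public static void main(String[] args) {
        System.out.println(doubleText(SAMPLE));
    }
}
```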

DOM - Tree-based; Random-access; Quite memory intensive; Must use interfaces and factory methods


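    // DOMParser is org.apache.xerces.parsers.DOMParser from Xerces-J;
    // "connection" is assumed to be an already opened java.net.HttpURLConnection.
    try {
      DOMParser parser = new DOMParser();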
      InputStream in = connection.getInputStream();
      InputSource source = new InputSource(in);
      parser.parse(source);
      in.close();
      connection.disconnect();

      Document doc = parser.getDocument();
      NodeList doubles = doc.getElementsByTagName("double");
      Node datum = doubles.item(0);
      Text result = (Text) datum.getFirstChild();
      System.out.println(result.getNodeValue());
    }
    catch (Exception e) {
      System.err.println(e);
    }
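
The same lookup can be tried without a network connection. This is a minimal sketch that assumes the JDK's JAXP DocumentBuilder in place of the Xerces DOMParser class. The class name, method name and sample response are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original post.
final class DomDoubleDemo {

    // Made-up XML-RPC style response used only for illustration.
    static final String SAMPLE =
        "<methodResponse><params><param><value>"
        + "<double>28657</double>"
        + "</value></param></params></methodResponse>";

    /** Returns the text of the first <double> element, or null if there is none. */
    static String firstDouble(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // The whole document is read into an in-memory tree here.
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList doubles = doc.getElementsByTagName("double");
            if (doubles.getLength() == 0) {
                return null;
            }
            // getTextContent() joins all text below the element, so it also
            // works when the text is split across several Text nodes.
            return doubles.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse document", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(firstDouble(SAMPLE));
    }
}
```

Unlike the SAX version, the tree stays available after parsing, so other nodes can be visited in any order at the cost of holding the whole document in memory.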

JDOM - Tree-based; More intuitive

    try {
      // Read the response
      InputStream in = connection.getInputStream();
      SAXBuilder parser = new SAXBuilder();
      Document response = parser.build(in);
      in.close();
      connection.disconnect();

      // Walk down the tree
      String result = response.getRootElement()
                       .getChild("params")
                       .getChild("param")
                       .getChild("value")
                       .getChild("double")
                       .getText();
      System.out.println(result);
    }
    catch (Exception e) {
      System.err.println(e);
    }

dom4j - Tree-based; Pure Java; Integrates XPath and XSLT, with optional DOM compatibility

    try {
      // Read the response
      InputStream in = connection.getInputStream();
      SAXReader reader = new SAXReader();
      Document response = reader.read(in);
      in.close();
      connection.disconnect();

      // Use XPath to find the element we want
      Node node = response.selectSingleNode(
       "/methodResponse/params/param/value/double"
      );

      String result = node.getStringValue();
      System.out.println(result);

    }
    catch (Exception e) {
      System.err.println(e);
    }
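
For readers who want to try the XPath lookup without adding dom4j to the classpath, the sketch below uses the JDK's own javax.xml.xpath API instead. It is a substitute technique, not dom4j's API. The class name, method name and sample response are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the original post.
final class XPathDoubleDemo {

    // Made-up XML-RPC style response used only for illustration.
    static final String SAMPLE =
        "<methodResponse><params><param><value>"
        + "<double>28657</double>"
        + "</value></param></params></methodResponse>";

    /** Returns the string value of the selected element, or "" if nothing matches. */
    static String selectDouble(String xml) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            // evaluate(String, InputSource) parses the source and returns
            // the string value of the first matching node.
            return xpath.evaluate(
                "/methodResponse/params/param/value/double",
                new InputSource(new StringReader(xml)));
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Could not evaluate XPath", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(selectDouble(SAMPLE));
    }
}
```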

Hope this helps!

Source: ITPUB Blog, link: http://blog.itpub.net/8585387/viewspace-219073/. If you repost this article, please cite the source; otherwise legal action will be pursued.
