Package groovy.xml

Class XmlParser

java.lang.Object
groovy.xml.XmlParser
All Implemented Interfaces:
org.xml.sax.ContentHandler

public class XmlParser
extends java.lang.Object
implements org.xml.sax.ContentHandler
A helper class for parsing XML into a tree of Node instances, giving a simple way to process XML. This parser does not preserve the XML InfoSet; if you need that, use W3C DOM, dom4j, JDOM, XOM or a similar library. It ignores comments and processing instructions and converts each element in the XML into a Node that holds the element's attributes and its children, which are Nodes and Strings. This simple model is sufficient for most common XML processing tasks.

Example usage:

 import groovy.xml.XmlParser
 def xml = '<root><one a1="uno!"/><two>Some text!</two></root>'
 def rootNode = new XmlParser().parseText(xml)
 assert rootNode.name() == 'root'
 assert rootNode.one[0].@a1 == 'uno!'
 assert rootNode.two.text() == 'Some text!'
 rootNode.children().each { assert it.name() in ['one','two'] }
 
  • Constructor Summary

    Constructors
    Constructor Description
    XmlParser()
    Creates a non-validating and namespace-aware XmlParser which does not allow DOCTYPE declarations in documents.
    XmlParser​(boolean validating, boolean namespaceAware)
    Creates an XmlParser which does not allow DOCTYPE declarations in documents.
    XmlParser​(boolean validating, boolean namespaceAware, boolean allowDocTypeDeclaration)
    Creates an XmlParser.
    XmlParser​(javax.xml.parsers.SAXParser parser)  
    XmlParser​(org.xml.sax.XMLReader reader)  
  • Method Summary

    Modifier and Type Method Description
    protected void addTextToNode()  
    void characters​(char[] buffer, int start, int length)  
    protected Node createNode​(Node parent, java.lang.Object name, java.util.Map attributes)
    Creates a new node with the given parent, name, and attributes.
    void endDocument()  
    void endElement​(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName)  
    void endPrefixMapping​(java.lang.String prefix)  
    org.xml.sax.Locator getDocumentLocator()  
    org.xml.sax.DTDHandler getDTDHandler()  
    protected java.lang.Object getElementName​(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName)
    Return a name given the namespaceURI, localName and qName.
    org.xml.sax.EntityResolver getEntityResolver()  
    org.xml.sax.ErrorHandler getErrorHandler()  
    boolean getFeature​(java.lang.String uri)  
    java.lang.Object getProperty​(java.lang.String uri)  
    protected org.xml.sax.XMLReader getXMLReader()  
    void ignorableWhitespace​(char[] buffer, int start, int len)  
    boolean isKeepIgnorableWhitespace()
    Returns the current keep ignorable whitespace setting.
    boolean isNamespaceAware()
    Determine if namespace handling is enabled.
    boolean isTrimWhitespace()
    Returns the current trim whitespace setting.
    Node parse​(java.io.File file)
    Parses the content of the given file as XML turning it into a tree of Nodes.
    Node parse​(java.io.InputStream input)
    Parse the content of the specified input stream into a tree of Nodes.
    Node parse​(java.io.Reader in)
    Parse the content of the specified reader into a tree of Nodes.
    Node parse​(java.lang.String uri)
    Parse the content of the specified URI into a tree of Nodes.
    Node parse​(org.xml.sax.InputSource input)
    Parse the content of the specified input source into a tree of Nodes.
    Node parseText​(java.lang.String text)
    A helper method to parse the given text as XML.
    void processingInstruction​(java.lang.String target, java.lang.String data)  
    void setDocumentLocator​(org.xml.sax.Locator locator)  
    void setDTDHandler​(org.xml.sax.DTDHandler dtdHandler)  
    void setEntityResolver​(org.xml.sax.EntityResolver entityResolver)  
    void setErrorHandler​(org.xml.sax.ErrorHandler errorHandler)  
    void setFeature​(java.lang.String uri, boolean value)  
    void setKeepIgnorableWhitespace​(boolean keepIgnorableWhitespace)
    Sets the keep ignorable whitespace setting value.
    void setNamespaceAware​(boolean namespaceAware)
    Enables or disables namespace handling.
    void setProperty​(java.lang.String uri, java.lang.Object value)  
    void setTrimWhitespace​(boolean trimWhitespace)
    Sets the trim whitespace setting value.
    void skippedEntity​(java.lang.String name)  
    void startDocument()  
    void startElement​(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes list)  
    void startPrefixMapping​(java.lang.String prefix, java.lang.String namespaceURI)  

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

    Methods inherited from interface org.xml.sax.ContentHandler

    declaration
  • Constructor Details

    • XmlParser

      public XmlParser() throws javax.xml.parsers.ParserConfigurationException, org.xml.sax.SAXException
      Creates a non-validating and namespace-aware XmlParser which does not allow DOCTYPE declarations in documents.
      Throws:
      javax.xml.parsers.ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
      org.xml.sax.SAXException - for SAX errors.
    • XmlParser

      public XmlParser​(boolean validating, boolean namespaceAware) throws javax.xml.parsers.ParserConfigurationException, org.xml.sax.SAXException
      Creates an XmlParser which does not allow DOCTYPE declarations in documents.
      Parameters:
      validating - true if the parser should validate documents as they are parsed; false otherwise.
      namespaceAware - true if the parser should provide support for XML namespaces; false otherwise.
      Throws:
      javax.xml.parsers.ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
      org.xml.sax.SAXException - for SAX errors.
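
      To illustrate the namespaceAware flag, the sketch below parses the same document with both settings. The urn:example namespace and the element names are invented for the example, and the results assume the JDK's default SAX implementation.

```groovy
import groovy.xml.XmlParser

def xml = '<p:root xmlns:p="urn:example"><p:item>1</p:item></p:root>'

// Namespace-aware: element names come back as QName objects.
def aware = new XmlParser(false, true).parseText(xml)
assert aware.name().namespaceURI == 'urn:example'
assert aware.name().localPart == 'root'

// Not namespace-aware: element names are plain Strings that keep the prefix.
def plain = new XmlParser(false, false).parseText(xml)
assert plain.name() == 'p:root'
assert plain.'p:item'.text() == '1'
```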
    • XmlParser

      public XmlParser​(boolean validating, boolean namespaceAware, boolean allowDocTypeDeclaration) throws javax.xml.parsers.ParserConfigurationException, org.xml.sax.SAXException
      Creates an XmlParser with the given validation, namespace and DOCTYPE settings.
      Parameters:
      validating - true if the parser should validate documents as they are parsed; false otherwise.
      namespaceAware - true if the parser should provide support for XML namespaces; false otherwise.
      allowDocTypeDeclaration - true if the parser should provide support for DOCTYPE declarations; false otherwise.
      Throws:
      javax.xml.parsers.ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
      org.xml.sax.SAXException - for SAX errors.
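
      A minimal sketch of the allowDocTypeDeclaration flag follows. It assumes the underlying SAX implementation honours the DOCTYPE restriction (the JDK's default parser does); the sample document is invented for the example.

```groovy
import groovy.xml.XmlParser
import org.xml.sax.SAXException

def xml = '''<!DOCTYPE root [
  <!ELEMENT root (#PCDATA)>
]>
<root>hello</root>'''

// The no-arg constructor does not allow DOCTYPE declarations.
def rejected = false
try {
    new XmlParser().parseText(xml)
} catch (SAXException e) {
    rejected = true
}
assert rejected

// Passing true as the third argument allows them.
def root = new XmlParser(false, true, true).parseText(xml)
assert root.text() == 'hello'
```

      Only enable DOCTYPE support for documents you trust.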
    • XmlParser

      public XmlParser​(org.xml.sax.XMLReader reader)
    • XmlParser

      public XmlParser​(javax.xml.parsers.SAXParser parser) throws org.xml.sax.SAXException
      Throws:
      org.xml.sax.SAXException
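
      As a hedged example, this constructor lets you configure the SAXParserFactory yourself before handing the parser over. The factory settings below are only illustrative.

```groovy
import groovy.xml.XmlParser
import javax.xml.parsers.SAXParserFactory

// Configure the factory directly; these settings are just an example.
def factory = SAXParserFactory.newInstance()
factory.setNamespaceAware(true)
factory.setValidating(false)

def parser = new XmlParser(factory.newSAXParser())
def root = parser.parseText('<a><b/><b/></a>')
assert root.name() == 'a'
assert root.b.size() == 2
```

      The DOCTYPE restriction described for the other constructors is not documented for this one, so set any such restrictions on the factory if you need them.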
  • Method Details

    • isTrimWhitespace

      public boolean isTrimWhitespace()
      Returns the current trim whitespace setting.
      Returns:
      true if whitespace will be trimmed
    • setTrimWhitespace

      public void setTrimWhitespace​(boolean trimWhitespace)
      Sets the trim whitespace setting value.
      Parameters:
      trimWhitespace - the desired setting value
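
      The sketch below shows the effect of the setting on element text. Both values are set explicitly so the example does not depend on the default.

```groovy
import groovy.xml.XmlParser

def xml = '<msg>  hello  </msg>'

// Trimming on: leading and trailing whitespace is removed from text content.
def trimming = new XmlParser()
trimming.setTrimWhitespace(true)
assert trimming.parseText(xml).text() == 'hello'

// Trimming off: the text is kept exactly as written.
def preserving = new XmlParser()
preserving.setTrimWhitespace(false)
assert preserving.parseText(xml).text() == '  hello  '
```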
    • isKeepIgnorableWhitespace

      public boolean isKeepIgnorableWhitespace()
      Returns the current keep ignorable whitespace setting.
      Returns:
      true if ignorable whitespace will be kept (default false)
    • setKeepIgnorableWhitespace

      public void setKeepIgnorableWhitespace​(boolean keepIgnorableWhitespace)
      Sets the keep ignorable whitespace setting value.
      Parameters:
      keepIgnorableWhitespace - the desired new value
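
      A rough sketch of the difference, assuming whitespace trimming is switched off: with the setting disabled, whitespace-only text between elements is dropped, and with it enabled that text appears among the children as Strings. The document is invented for the example.

```groovy
import groovy.xml.XmlParser

def xml = '<root>\n  <a/>\n  <b/>\n</root>'

// Whitespace-only text between elements is discarded.
def dropping = new XmlParser()
dropping.setTrimWhitespace(false)
dropping.setKeepIgnorableWhitespace(false)
def dropped = dropping.parseText(xml).children()
assert dropped.every { it instanceof Node }

// Whitespace-only text is kept as String children next to the element Nodes.
def keeping = new XmlParser()
keeping.setTrimWhitespace(false)
keeping.setKeepIgnorableWhitespace(true)
def kept = keeping.parseText(xml).children()
assert kept.any { it instanceof String }
assert kept.findAll { it instanceof Node }.size() == 2
```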
    • parse

      public Node parse​(java.io.File file) throws java.io.IOException, org.xml.sax.SAXException
      Parses the content of the given file as XML turning it into a tree of Nodes.
      Parameters:
      file - the File containing the XML to be parsed
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
      java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
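
      For example, the following sketch writes a small document to a temporary file and parses it. The file name and contents are made up.

```groovy
import groovy.xml.XmlParser

// Create a throwaway file so the example is self-contained.
def file = File.createTempFile('books', '.xml')
file.deleteOnExit()
file.text = '<books><book title="Dune"/></books>'

def root = new XmlParser().parse(file)
assert root.name() == 'books'
assert root.book[0].@title == 'Dune'
```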
    • parse

      public Node parse​(org.xml.sax.InputSource input) throws java.io.IOException, org.xml.sax.SAXException
      Parse the content of the specified input source into a tree of Nodes.
      Parameters:
      input - the InputSource for the XML to parse
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
      java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
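
      An InputSource can carry a system ID alongside the character stream, which gives the parser a base for resolving relative references. In this sketch the system ID is a hypothetical path that is never opened, because the content comes from the reader.

```groovy
import groovy.xml.XmlParser
import org.xml.sax.InputSource

def source = new InputSource(new StringReader('<config><port>8080</port></config>'))
// Hypothetical location, used only as a base URI for relative references.
source.setSystemId('file:///tmp/config.xml')

def root = new XmlParser().parse(source)
assert root.port.text() == '8080'
```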
    • parse

      public Node parse​(java.io.InputStream input) throws java.io.IOException, org.xml.sax.SAXException
      Parse the content of the specified input stream into a tree of Nodes.

      Note that this method gives the parser no base URI from which to resolve DTDs and other external resources.

      Parameters:
      input - an InputStream containing the XML to be parsed
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
      java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
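
      A short sketch of parsing from a byte stream. The parser works out the character encoding from the XML declaration; the sample content is invented.

```groovy
import groovy.xml.XmlParser

def bytes = '<?xml version="1.0" encoding="UTF-8"?><greeting>h\u00e9llo</greeting>'.getBytes('UTF-8')

// withStream closes the stream once the closure returns.
def root = new ByteArrayInputStream(bytes).withStream { stream ->
    new XmlParser().parse(stream)
}
assert root.name() == 'greeting'
assert root.text() == 'h\u00e9llo'
```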
    • parse

      public Node parse​(java.io.Reader in) throws java.io.IOException, org.xml.sax.SAXException
      Parse the content of the specified reader into a tree of Nodes.

      Note that this method gives the parser no base URI from which to resolve DTDs and other external resources.

      Parameters:
      in - a Reader to read the XML to be parsed
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
      java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
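
      When the encoding is known up front, a Reader lets you decode the characters yourself. The sketch below uses a temporary file with invented contents.

```groovy
import groovy.xml.XmlParser

def file = File.createTempFile('notes', '.xml')
file.deleteOnExit()
file.setText('<notes><note>first</note><note>second</note></notes>', 'UTF-8')

// withReader closes the reader once the closure returns.
def root = file.withReader('UTF-8') { reader ->
    new XmlParser().parse(reader)
}
assert root.note*.text() == ['first', 'second']
```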
    • parse

      public Node parse​(java.lang.String uri) throws java.io.IOException, org.xml.sax.SAXException
      Parse the content of the specified URI into a tree of Nodes.
      Parameters:
      uri - a String containing the URI of the XML to be parsed
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
      java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
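
      As an illustration, the sketch below passes a file: URI built from a temporary file. Any URI the underlying parser can open should work the same way; the file contents are invented.

```groovy
import groovy.xml.XmlParser

def file = File.createTempFile('feed', '.xml')
file.deleteOnExit()
file.text = '<feed><entry id="1"/></feed>'

// Convert the file location to a URI string before passing it to parse.
def root = new XmlParser().parse(file.toURI().toString())
assert root.entry[0].@id == '1'
```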
    • parseText

      public Node parseText​(java.lang.String text) throws java.io.IOException, org.xml.sax.SAXException
      A helper method to parse the given text as XML.
      Parameters:
      text - the XML text to parse
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
      java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
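
      Malformed input surfaces as a SAXException. The sketch below catches the SAXParseException subtype, which also reports where the problem was found; the exact message text depends on the SAX implementation.

```groovy
import groovy.xml.XmlParser
import org.xml.sax.SAXParseException

def report = null
try {
    // The item element is never closed, so the document is not well-formed.
    new XmlParser().parseText('<root><item></root>')
} catch (SAXParseException e) {
    report = 'line ' + e.lineNumber + ', column ' + e.columnNumber + ': ' + e.message
}
assert report != null
```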
    • isNamespaceAware

      public boolean isNamespaceAware()
      Determine if namespace handling is enabled.
      Returns:
      true if namespace handling is enabled
    • setNamespaceAware

      public void setNamespaceAware​(boolean namespaceAware)
      Enables or disables namespace handling.
      Parameters:
      namespaceAware - the new desired value
    • getDTDHandler

      public org.xml.sax.DTDHandler getDTDHandler()
    • getEntityResolver

      public org.xml.sax.EntityResolver getEntityResolver()
    • getErrorHandler

      public org.xml.sax.ErrorHandler getErrorHandler()
    • getFeature

      public boolean getFeature​(java.lang.String uri) throws org.xml.sax.SAXNotRecognizedException, org.xml.sax.SAXNotSupportedException
      Throws:
      org.xml.sax.SAXNotRecognizedException
      org.xml.sax.SAXNotSupportedException
    • getProperty

      public java.lang.Object getProperty​(java.lang.String uri) throws org.xml.sax.SAXNotRecognizedException, org.xml.sax.SAXNotSupportedException
      Throws:
      org.xml.sax.SAXNotRecognizedException
      org.xml.sax.SAXNotSupportedException
    • setDTDHandler

      public void setDTDHandler​(org.xml.sax.DTDHandler dtdHandler)
    • setEntityResolver

      public void setEntityResolver​(org.xml.sax.EntityResolver entityResolver)
    • setErrorHandler

      public void setErrorHandler​(org.xml.sax.ErrorHandler errorHandler)
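
      A possible use, sketched under the assumption that the handler is passed through to the underlying XMLReader: collect parse problems in a list instead of relying on the parser's default reporting. The handler map and the problems list are names made up for this example.

```groovy
import groovy.xml.XmlParser
import org.xml.sax.ErrorHandler
import org.xml.sax.SAXException
import org.xml.sax.SAXParseException

def problems = []
// Groovy coerces a map of closures into the ErrorHandler interface.
def handler = [
    warning   : { SAXParseException e -> problems << ('warning: ' + e.message) },
    error     : { SAXParseException e -> problems << ('error: ' + e.message) },
    fatalError: { SAXParseException e -> problems << ('fatal: ' + e.message) }
] as ErrorHandler

def parser = new XmlParser()
parser.setErrorHandler(handler)
try {
    parser.parseText('<root><item></root>')
} catch (SAXException ignored) {
    // The handler has already recorded the problem.
}
assert problems.any { it.startsWith('fatal: ') }
```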
    • setFeature

      public void setFeature​(java.lang.String uri, boolean value) throws org.xml.sax.SAXNotRecognizedException, org.xml.sax.SAXNotSupportedException
      Throws:
      org.xml.sax.SAXNotRecognizedException
      org.xml.sax.SAXNotSupportedException
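
      The feature names are URIs defined by SAX2 or by the parser implementation. This sketch uses two standard SAX2 features and assumes the underlying parser supports changing them, as the JDK's default parser does.

```groovy
import groovy.xml.XmlParser

def parser = new XmlParser()

// The no-arg constructor creates a namespace-aware parser.
assert parser.getFeature('http://xml.org/sax/features/namespaces')

// Turn off loading of external general entities, then read the setting back.
parser.setFeature('http://xml.org/sax/features/external-general-entities', false)
assert !parser.getFeature('http://xml.org/sax/features/external-general-entities')
```

      An unknown feature URI raises SAXNotRecognizedException, and a known one that cannot be changed raises SAXNotSupportedException.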
    • setProperty

      public void setProperty​(java.lang.String uri, java.lang.Object value) throws org.xml.sax.SAXNotRecognizedException, org.xml.sax.SAXNotSupportedException
      Throws:
      org.xml.sax.SAXNotRecognizedException
      org.xml.sax.SAXNotSupportedException
    • startDocument

      public void startDocument() throws org.xml.sax.SAXException
      Specified by:
      startDocument in interface org.xml.sax.ContentHandler
      Throws:
      org.xml.sax.SAXException
    • endDocument

      public void endDocument() throws org.xml.sax.SAXException
      Specified by:
      endDocument in interface org.xml.sax.ContentHandler
      Throws:
      org.xml.sax.SAXException
    • startElement

      public void startElement​(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes list) throws org.xml.sax.SAXException
      Specified by:
      startElement in interface org.xml.sax.ContentHandler
      Throws:
      org.xml.sax.SAXException
    • endElement

      public void endElement​(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName) throws org.xml.sax.SAXException
      Specified by:
      endElement in interface org.xml.sax.ContentHandler
      Throws:
      org.xml.sax.SAXException
    • characters

      public void characters​(char[] buffer, int start, int length) throws org.xml.sax.SAXException
      Specified by:
      characters in interface org.xml.sax.ContentHandler
      Throws:
      org.xml.sax.SAXException
    • startPrefixMapping

      public void startPrefixMapping​(java.lang.String prefix, java.lang.String namespaceURI) throws org.xml.sax.SAXException
      Specified by:
      startPrefixMapping in interface org.xml.sax.ContentHandler
      Throws:
      org.xml.sax.SAXException
    • endPrefixMapping

      public void endPrefixMapping​(java.lang.String prefix) throws org.xml.sax.SAXException
      Specified by:
      endPrefixMapping in interface org.xml.sax.ContentHandler
      Throws:
      org.xml.sax.SAXException
    • ignorableWhitespace

      public void ignorableWhitespace​(char[] buffer, int start, int len) throws org.xml.sax.SAXException
      Specified by:
      ignorableWhitespace in interface org.xml.sax.ContentHandler
      Throws:
      org.xml.sax.SAXException
    • processingInstruction

      public void processingInstruction​(java.lang.String target, java.lang.String data) throws org.xml.sax.SAXException
      Specified by:
      processingInstruction in interface org.xml.sax.ContentHandler
      Throws:
      org.xml.sax.SAXException
    • getDocumentLocator

      public org.xml.sax.Locator getDocumentLocator()
    • setDocumentLocator

      public void setDocumentLocator​(org.xml.sax.Locator locator)
      Specified by:
      setDocumentLocator in interface org.xml.sax.ContentHandler
    • skippedEntity

      public void skippedEntity​(java.lang.String name) throws org.xml.sax.SAXException
      Specified by:
      skippedEntity in interface org.xml.sax.ContentHandler
      Throws:
      org.xml.sax.SAXException
    • getXMLReader

      protected org.xml.sax.XMLReader getXMLReader()
    • addTextToNode

      protected void addTextToNode()
    • createNode

      protected Node createNode​(Node parent, java.lang.Object name, java.util.Map attributes)
      Creates a new node with the given parent, name, and attributes. The default implementation returns an instance of groovy.util.Node.
      Parameters:
      parent - the parent node, or null if the node being created is the root node
      name - an Object representing the name of the node (typically an instance of QName)
      attributes - a Map of attribute names to attribute values
      Returns:
      a new Node instance representing the current node
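
      One way to use this hook, sketched here as an assumption rather than a documented recipe, is to subclass the parser and decorate each node as it is created. LineTrackingParser and the line attribute are hypothetical names; the line numbers come from getDocumentLocator() and depend on the SAX implementation supplying a locator.

```groovy
import groovy.xml.XmlParser

// Hypothetical subclass that records the source line of each element.
class LineTrackingParser extends XmlParser {
    @Override
    protected Node createNode(Node parent, Object name, Map attributes) {
        Node node = super.createNode(parent, name, attributes)
        def locator = getDocumentLocator()
        if (locator != null) {
            // Store the line as an extra attribute on the node.
            node.attributes().put('line', locator.getLineNumber())
        }
        return node
    }
}

def root = new LineTrackingParser().parseText('<root>\n  <a/>\n  <b/>\n</root>')
assert root.a[0].attribute('line') == 2
assert root.b[0].attribute('line') == 3
```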
    • getElementName

      protected java.lang.Object getElementName​(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName)
      Return a name given the namespaceURI, localName and qName.
      Parameters:
      namespaceURI - the namespace URI
      localName - the local name
      qName - the qualified name
      Returns:
      the newly created representation of the name
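
      As a hedged sketch, a subclass can override this method to change how names are represented, for instance to drop namespaces and keep only local names. LocalNameParser is a hypothetical class, and the namespace URI is invented.

```groovy
import groovy.xml.XmlParser

// Hypothetical subclass that names every node by its local name only.
class LocalNameParser extends XmlParser {
    @Override
    protected Object getElementName(String namespaceURI, String localName, String qName) {
        // Fall back to the qualified name if the parser supplied no local name.
        return localName ?: qName
    }
}

def xml = '<p:root xmlns:p="urn:example"><p:item>1</p:item></p:root>'
def root = new LocalNameParser().parseText(xml)
assert root.name() == 'root'
assert root.item.text() == '1'
```

      Dropping namespaces this way makes navigation simpler but can merge elements that share a local name across different namespaces.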