Package groovy.xml

Class XmlParser

java.lang.Object
groovy.xml.XmlParser
All Implemented Interfaces:
ContentHandler

public class XmlParser
extends Object
implements ContentHandler
A helper class for parsing XML into a tree of Node instances, offering a simple way to process XML. This parser does not preserve the XML InfoSet; if you need that, use W3C DOM, dom4j, JDOM, XOM or a similar library. It ignores comments and processing instructions, and converts each element in the XML into a Node holding its attributes, child Nodes and Strings. This model is sufficient for most simple XML processing tasks.

Example usage:

 import groovy.xml.XmlParser
 def xml = '<root><one a1="uno!"/><two>Some text!</two></root>'
 def rootNode = new XmlParser().parseText(xml)
 assert rootNode.name() == 'root'
 assert rootNode.one[0].@a1 == 'uno!'
 assert rootNode.two.text() == 'Some text!'
 rootNode.children().each { assert it.name() in ['one','two'] }
 
  • Constructor Details

    • XmlParser

      public XmlParser() throws ParserConfigurationException, SAXException
      Creates a non-validating and namespace-aware XmlParser which does not allow DOCTYPE declarations in documents.
      Throws:
      ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
      SAXException - for SAX errors.
    • XmlParser

      public XmlParser​(boolean validating, boolean namespaceAware) throws ParserConfigurationException, SAXException
      Creates an XmlParser which does not allow DOCTYPE declarations in documents.
      Parameters:
      validating - true if the parser should validate documents as they are parsed; false otherwise.
      namespaceAware - true if the parser should provide support for XML namespaces; false otherwise.
      Throws:
      ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
      SAXException - for SAX errors.
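      A minimal sketch of how the namespaceAware flag can change what Node.name() returns. The sample document and the urn:example namespace are made up for illustration, and the behaviour shown assumes the JDK's default SAX parser.

```groovy
import groovy.xml.XmlParser

// Hypothetical sample document with a prefixed root element.
def nsXml = '<p:root xmlns:p="urn:example"><p:child/></p:root>'

// validating = false, namespaceAware = true: the name is a QName-like object.
def awareName = new XmlParser(false, true).parseText(nsXml).name()

// validating = false, namespaceAware = false: the name is the raw qualified name.
def plainName = new XmlParser(false, false).parseText(nsXml).name()

assert !(awareName instanceof String)
assert awareName.getNamespaceURI() == 'urn:example'
assert awareName.getLocalPart() == 'root'
assert plainName == 'p:root'
```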
    • XmlParser

      public XmlParser​(boolean validating, boolean namespaceAware, boolean allowDocTypeDeclaration) throws ParserConfigurationException, SAXException
      Creates an XmlParser.
      Parameters:
      validating - true if the parser should validate documents as they are parsed; false otherwise.
      namespaceAware - true if the parser should provide support for XML namespaces; false otherwise.
      allowDocTypeDeclaration - true if the parser should provide support for DOCTYPE declarations; false otherwise.
      Throws:
      ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
      SAXException - for SAX errors.
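      The following sketch illustrates the allowDocTypeDeclaration flag. The inline DTD is a made-up example, and the rejection by the no-argument constructor is what is expected with the JDK's default SAX parser; other parser implementations may behave differently.

```groovy
import groovy.xml.XmlParser
import org.xml.sax.SAXException

// Hypothetical document carrying an internal DOCTYPE declaration.
def docTypeXml = '<!DOCTYPE root [<!ELEMENT root (#PCDATA)>]><root>hi</root>'

// validating = false, namespaceAware = true, allowDocTypeDeclaration = true
def allowedRoot = new XmlParser(false, true, true).parseText(docTypeXml)
assert allowedRoot.name() == 'root'
assert allowedRoot.text() == 'hi'

// The no-argument constructor disallows DOCTYPE declarations,
// so the same input is expected to be rejected.
def rejected = false
try {
    new XmlParser().parseText(docTypeXml)
} catch (SAXException e) {
    rejected = true
}
```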
    • XmlParser

      public XmlParser​(XMLReader reader)
    • XmlParser

      public XmlParser​(SAXParser parser) throws SAXException
      Throws:
      SAXException
  • Method Details

    • isTrimWhitespace

      public boolean isTrimWhitespace()
      Returns the current trim whitespace setting.
      Returns:
      true if whitespace will be trimmed
    • setTrimWhitespace

      public void setTrimWhitespace​(boolean trimWhitespace)
      Sets the trim whitespace setting value.
      Parameters:
      trimWhitespace - the desired setting value
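      As a rough illustration of the trim whitespace setting, the sketch below parses the same made-up document with the flag set explicitly to true and to false, so it does not rely on the default value.

```groovy
import groovy.xml.XmlParser

// Hypothetical sample with leading and trailing spaces in the text.
def paddedXml = '<greeting>  hello  </greeting>'

def trimming = new XmlParser()
trimming.setTrimWhitespace(true)
def trimmed = trimming.parseText(paddedXml).text()

def preserving = new XmlParser()
preserving.setTrimWhitespace(false)
def preserved = preserving.parseText(paddedXml).text()

assert trimmed == 'hello'          // surrounding whitespace removed
assert preserved == '  hello  '    // text kept exactly as written
```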
    • isKeepIgnorableWhitespace

      public boolean isKeepIgnorableWhitespace()
      Returns the current keep ignorable whitespace setting.
      Returns:
      true if ignorable whitespace will be kept (default false)
    • setKeepIgnorableWhitespace

      public void setKeepIgnorableWhitespace​(boolean keepIgnorableWhitespace)
      Sets the keep ignorable whitespace setting value.
      Parameters:
      keepIgnorableWhitespace - the desired new value
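      A small sketch of the keep ignorable whitespace setting, assuming it governs whether whitespace-only text between elements is kept as String children. The sample document is invented, and trimWhitespace is set to false explicitly because the two settings interact.

```groovy
import groovy.xml.XmlParser

// Hypothetical sample with whitespace-only text around the child element.
def spacedXml = '<root> <a/> </root>'

def dropping = new XmlParser()
dropping.setTrimWhitespace(false)
dropping.setKeepIgnorableWhitespace(false)
def droppedStrings = dropping.parseText(spacedXml).children().findAll { it instanceof String }

def keeping = new XmlParser()
keeping.setTrimWhitespace(false)
keeping.setKeepIgnorableWhitespace(true)
def keptStrings = keeping.parseText(spacedXml).children().findAll { it instanceof String }

assert droppedStrings.isEmpty()    // whitespace-only text was discarded
assert !keptStrings.isEmpty()      // whitespace-only text was retained
```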
    • parse

      public Node parse​(File file) throws IOException, SAXException
      Parses the content of the given file as XML turning it into a tree of Nodes.
      Parameters:
      file - the File containing the XML to be parsed
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
      IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
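      A minimal sketch of parsing from a File. The temporary file and its contents are placeholders created only for this example.

```groovy
import groovy.xml.XmlParser

// Write a small, made-up document to a temporary file.
def xmlFile = File.createTempFile('xmlparser-sample', '.xml')
xmlFile.deleteOnExit()
xmlFile.setText('<books><book title="Dune"/></books>', 'UTF-8')

def fileRoot = new XmlParser().parse(xmlFile)

assert fileRoot.name() == 'books'
assert fileRoot.book[0].@title == 'Dune'
```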
    • parse

      public Node parse​(InputSource input) throws IOException, SAXException
      Parse the content of the specified input source into a tree of Nodes.
      Parameters:
      input - the InputSource for the XML to parse
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
      IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
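      One possible way to use this overload is to wrap a character stream in an InputSource. The document content below is invented for illustration.

```groovy
import groovy.xml.XmlParser
import org.xml.sax.InputSource

// Wrap an in-memory character stream in a SAX InputSource.
def source = new InputSource(new StringReader('<config><port>8080</port></config>'))

def configRoot = new XmlParser().parse(source)

assert configRoot.name() == 'config'
assert configRoot.port.text() == '8080'
```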
    • parse

      public Node parse​(InputStream input) throws IOException, SAXException
      Parse the content of the specified input stream into a tree of Nodes.

      Note that this method gives the parser no base URI against which to resolve DTDs and other external references.

      Parameters:
      input - an InputStream containing the XML to be parsed
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
      IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
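      A short sketch of parsing from an InputStream, here an in-memory byte stream holding a made-up UTF-8 document. The parser reads the encoding from the XML declaration.

```groovy
import groovy.xml.XmlParser

// Encode a small sample document as UTF-8 bytes.
def bytes = '<?xml version="1.0" encoding="UTF-8"?><note>caf\u00e9</note>'.getBytes('UTF-8')

def streamRoot = new XmlParser().parse(new ByteArrayInputStream(bytes))

assert streamRoot.name() == 'note'
assert streamRoot.text() == 'caf\u00e9'
```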
    • parse

      public Node parse​(Reader in) throws IOException, SAXException
      Parse the content of the specified reader into a tree of Nodes.

      Note that this method gives the parser no base URI against which to resolve DTDs and other external references.

      Parameters:
      in - a Reader to read the XML to be parsed
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
      IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
    • parse

      public Node parse​(String uri) throws IOException, SAXException
      Parse the content of the specified URI into a tree of Nodes.
      Parameters:
      uri - a String containing a URI that points to the XML to be parsed
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
      IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
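      A sketch of the URI overload using a file: URI, so that the example needs no network access. The temporary file and its contents are placeholders.

```groovy
import groovy.xml.XmlParser

// Create a sample document on disk and refer to it by URI.
def feedFile = File.createTempFile('xmlparser-uri', '.xml')
feedFile.deleteOnExit()
feedFile.setText('<feed><entry id="1"/></feed>', 'UTF-8')

def uri = feedFile.toURI().toString()
def uriRoot = new XmlParser().parse(uri)

assert uriRoot.name() == 'feed'
assert uriRoot.entry.size() == 1
```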
    • parseText

      public Node parseText​(String text) throws IOException, SAXException
      A helper method to parse the given text as XML.
      Parameters:
      text - the XML text to parse
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
      IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
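      A brief sketch of handling malformed input. With the JDK's default SAX parser, a well-formedness error is expected to surface as a SAXParseException, which is a subclass of SAXException; the broken document is made up for the example.

```groovy
import groovy.xml.XmlParser
import org.xml.sax.SAXParseException

// The <item> element is never closed, so the text is not well-formed.
def malformed = '<root><item></root>'

def failure = null
try {
    new XmlParser().parseText(malformed)
} catch (SAXParseException e) {
    failure = e   // carries line and column details of the error
}

assert failure != null
```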
    • isNamespaceAware

      public boolean isNamespaceAware()
      Determine if namespace handling is enabled.
      Returns:
      true if namespace handling is enabled
    • setNamespaceAware

      public void setNamespaceAware​(boolean namespaceAware)
      Enables or disables namespace handling.
      Parameters:
      namespaceAware - the new desired value
    • getDTDHandler

      public DTDHandler getDTDHandler()
    • getEntityResolver

      public EntityResolver getEntityResolver()
    • getErrorHandler

      public ErrorHandler getErrorHandler()
    • getFeature

      public boolean getFeature​(String uri) throws SAXNotRecognizedException, SAXNotSupportedException
      Throws:
      SAXNotRecognizedException
      SAXNotSupportedException
    • getProperty

      public Object getProperty​(String uri) throws SAXNotRecognizedException, SAXNotSupportedException
      Throws:
      SAXNotRecognizedException
      SAXNotSupportedException
    • setDTDHandler

      public void setDTDHandler​(DTDHandler dtdHandler)
    • setEntityResolver

      public void setEntityResolver​(EntityResolver entityResolver)
    • setErrorHandler

      public void setErrorHandler​(ErrorHandler errorHandler)
    • setFeature

      public void setFeature​(String uri, boolean value) throws SAXNotRecognizedException, SAXNotSupportedException
      Throws:
      SAXNotRecognizedException
      SAXNotSupportedException
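      A minimal sketch of setting and reading back a SAX feature. The feature URI is the standard SAX2 name for external general entities; whether a given feature is recognized or supported depends on the underlying parser, so this assumes the JDK's default implementation.

```groovy
import groovy.xml.XmlParser

def featureParser = new XmlParser()

// Standard SAX2 feature name; support varies between parser implementations.
def feature = 'http://xml.org/sax/features/external-general-entities'
featureParser.setFeature(feature, false)

def enabled = featureParser.getFeature(feature)
assert !enabled
```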
    • setProperty

      public void setProperty​(String uri, Object value) throws SAXNotRecognizedException, SAXNotSupportedException
      Throws:
      SAXNotRecognizedException
      SAXNotSupportedException
    • startDocument

      public void startDocument() throws SAXException
      Specified by:
      startDocument in interface ContentHandler
      Throws:
      SAXException
    • endDocument

      public void endDocument() throws SAXException
      Specified by:
      endDocument in interface ContentHandler
      Throws:
      SAXException
    • startElement

      public void startElement​(String namespaceURI, String localName, String qName, Attributes list) throws SAXException
      Specified by:
      startElement in interface ContentHandler
      Throws:
      SAXException
    • endElement

      public void endElement​(String namespaceURI, String localName, String qName) throws SAXException
      Specified by:
      endElement in interface ContentHandler
      Throws:
      SAXException
    • characters

      public void characters​(char[] buffer, int start, int length) throws SAXException
      Specified by:
      characters in interface ContentHandler
      Throws:
      SAXException
    • startPrefixMapping

      public void startPrefixMapping​(String prefix, String namespaceURI) throws SAXException
      Specified by:
      startPrefixMapping in interface ContentHandler
      Throws:
      SAXException
    • endPrefixMapping

      public void endPrefixMapping​(String prefix) throws SAXException
      Specified by:
      endPrefixMapping in interface ContentHandler
      Throws:
      SAXException
    • ignorableWhitespace

      public void ignorableWhitespace​(char[] buffer, int start, int len) throws SAXException
      Specified by:
      ignorableWhitespace in interface ContentHandler
      Throws:
      SAXException
    • processingInstruction

      public void processingInstruction​(String target, String data) throws SAXException
      Specified by:
      processingInstruction in interface ContentHandler
      Throws:
      SAXException
    • getDocumentLocator

      public Locator getDocumentLocator()
    • setDocumentLocator

      public void setDocumentLocator​(Locator locator)
      Specified by:
      setDocumentLocator in interface ContentHandler
    • skippedEntity

      public void skippedEntity​(String name) throws SAXException
      Specified by:
      skippedEntity in interface ContentHandler
      Throws:
      SAXException
    • getXMLReader

      protected XMLReader getXMLReader()
    • addTextToNode

      protected void addTextToNode()
    • createNode

      protected Node createNode​(Node parent, Object name, Map attributes)
      Creates a new node with the given parent, name, and attributes. The default implementation returns an instance of groovy.util.Node.
      Parameters:
      parent - the parent node, or null if the node being created is the root node
      name - an Object representing the name of the node (typically an instance of QName)
      attributes - a Map of attribute names to attribute values
      Returns:
      a new Node instance representing the current node
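      Because createNode is protected, a subclass can hook into node creation. The sketch below uses a hypothetical CountingXmlParser that counts created nodes and otherwise delegates to the default implementation; the class, its field and its createdCount() method are invented for illustration.

```groovy
import groovy.xml.XmlParser

// Hypothetical subclass, not part of Groovy.
class CountingXmlParser extends XmlParser {
    private int created = 0

    @Override
    protected Node createNode(Node parent, Object name, Map attributes) {
        created++
        // Delegate to the default behaviour, which builds a groovy.util.Node.
        return super.createNode(parent, name, attributes)
    }

    // Plain accessor method, called explicitly rather than as a property.
    int createdCount() {
        return created
    }
}

def counting = new CountingXmlParser()
def countedRoot = counting.parseText('<a><b/><b/><c/></a>')

assert countedRoot.name() == 'a'
assert counting.createdCount() == 4   // a, b, b, c
```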
    • getElementName

      protected Object getElementName​(String namespaceURI, String localName, String qName)
      Return a name given the namespaceURI, localName and qName.
      Parameters:
      namespaceURI - the namespace URI
      localName - the local name
      qName - the qualified name
      Returns:
      the newly created representation of the name