Package groovy.xml

Class XmlParser

java.lang.Object
groovy.xml.XmlParser
All Implemented Interfaces:
ContentHandler

public class XmlParser extends Object implements ContentHandler
A helper class for parsing XML into a tree of Node instances, giving a simple way of processing XML. This parser does not preserve the XML InfoSet; if you need that, try W3C DOM, dom4j, JDOM, XOM, etc. It ignores comments and processing instructions and converts each element into a Node that holds the element's attributes together with its child Nodes and Strings. This simple model is sufficient for most simple use cases of processing XML.

Example usage:

 import groovy.xml.XmlParser
 def xml = '<root><one a1="uno!"/><two>Some text!</two></root>'
 def rootNode = new XmlParser().parseText(xml)
 assert rootNode.name() == 'root'
 assert rootNode.one[0].@a1 == 'uno!'
 assert rootNode.two.text() == 'Some text!'
 rootNode.children().each { assert it.name() in ['one','two'] }
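
The tree model described above can be illustrated with a short sketch. The sample document is hypothetical; the assertions only rely on comments and processing instructions being skipped and on mixed content being stored as an ordered list of Strings and Nodes.

```groovy
import groovy.xml.XmlParser

// Hypothetical sample: a comment, a processing instruction, and mixed content.
def xml = '<root><!-- ignored --><?render fast?><p>Hello <b>bold</b> world</p></root>'
def root = new XmlParser().parseText(xml)

// The comment and the processing instruction leave no trace in the tree.
assert root.children().size() == 1

// Mixed content becomes an ordered list of Strings and Nodes.
def parts = root.p[0].children()
assert parts.size() == 3
assert parts[0] instanceof String && parts[0].trim() == 'Hello'
assert parts[1] instanceof Node   && parts[1].name() == 'b'
assert parts[2] instanceof String && parts[2].trim() == 'world'
```

The text parts are compared after `trim()` so the sketch does not depend on the parser's whitespace settings.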
 
  • Constructor Details

    • XmlParser

      public XmlParser() throws ParserConfigurationException, SAXException
      Creates a non-validating and namespace-aware XmlParser which does not allow DOCTYPE declarations in documents.

      Parser options can be configured via setters before the first parse call:

       // Using Groovy named parameters:
       def parser = new XmlParser(namespaceAware: false, trimWhitespace: true)
       
      Throws:
      ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
      SAXException - for SAX errors.
    • XmlParser

      public XmlParser(boolean validating, boolean namespaceAware) throws ParserConfigurationException, SAXException
      Creates an XmlParser which does not allow DOCTYPE declarations in documents.
      Parameters:
      validating - true if the parser should validate documents as they are parsed; false otherwise.
      namespaceAware - true if the parser should provide support for XML namespaces; false otherwise.
      Throws:
      ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
      SAXException - for SAX errors.
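
      As a rough illustration of the namespaceAware flag, the sketch below parses the same document with both settings. The document and the urn:example namespace are made up; the example assumes a namespace-aware parse reports element names as qualified-name objects exposing namespaceURI and localPart, while a non-aware parse keeps the prefixed name as a String.

```groovy
import groovy.xml.XmlParser

// Hypothetical document using a made-up namespace URI.
def xml = '<p:root xmlns:p="urn:example"><p:child/></p:root>'

// Namespace-aware: the element name is a qualified-name object, not a plain String.
def aware = new XmlParser(false, true).parseText(xml)
def name = aware.name()
assert !(name instanceof String)
assert name.namespaceURI == 'urn:example'
assert name.localPart == 'root'

// Not namespace-aware: the prefixed name is kept as an ordinary String.
def plain = new XmlParser(false, false).parseText(xml)
assert plain.name() == 'p:root'
```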
    • XmlParser

      public XmlParser(boolean validating, boolean namespaceAware, boolean allowDocTypeDeclaration) throws ParserConfigurationException, SAXException
      Creates an XmlParser.
      Parameters:
      validating - true if the parser should validate documents as they are parsed; false otherwise.
      namespaceAware - true if the parser should provide support for XML namespaces; false otherwise.
      allowDocTypeDeclaration - true if the parser should provide support for DOCTYPE declarations; false otherwise.
      Throws:
      ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
      SAXException - for SAX errors.
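
      The effect of allowDocTypeDeclaration can be sketched as follows. The document is a made-up example with an internal DTD subset; the sketch assumes a rejected DOCTYPE surfaces as a SAXException from the parse call.

```groovy
import groovy.xml.XmlParser
import org.xml.sax.SAXException

// Hypothetical document carrying an internal DOCTYPE declaration.
def xml = '<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]><note>ok</note>'

// The no-argument constructor does not allow DOCTYPE declarations.
def rejected = false
try {
    new XmlParser().parseText(xml)
} catch (SAXException expected) {
    rejected = true
}
assert rejected

// Opting in through the three-argument constructor lets the same document parse.
def note = new XmlParser(false, true, true).parseText(xml)
assert note.text() == 'ok'
```

      Keeping DOCTYPE support off unless a document really needs it is the more defensive choice when the XML comes from an untrusted source.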
    • XmlParser

      public XmlParser(XMLReader reader)
    • XmlParser

      public XmlParser(SAXParser parser) throws SAXException
      Throws:
      SAXException
  • Method Details

    • isTrimWhitespace

      public boolean isTrimWhitespace()
      Returns the current trim whitespace setting.
      Returns:
      true if whitespace will be trimmed
    • setTrimWhitespace

      public void setTrimWhitespace(boolean trimWhitespace)
      Sets the trim whitespace setting value.
      Parameters:
      trimWhitespace - the desired setting value
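
      A minimal sketch of what the setting changes, using a made-up element whose text has leading and trailing spaces. Both values are set explicitly so the example does not rely on the default.

```groovy
import groovy.xml.XmlParser

// Hypothetical element whose text is padded with spaces.
def xml = '<root><item>  padded  </item></root>'

// Trimming off: the text is kept exactly as written.
def keeping = new XmlParser()
keeping.setTrimWhitespace(false)
assert keeping.parseText(xml).item[0].text() == '  padded  '

// Trimming on: leading and trailing whitespace is removed.
def trimming = new XmlParser()
trimming.setTrimWhitespace(true)
assert trimming.parseText(xml).item[0].text() == 'padded'
```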
    • isKeepIgnorableWhitespace

      public boolean isKeepIgnorableWhitespace()
      Returns the current keep ignorable whitespace setting.
      Returns:
      true if ignorable whitespace will be kept (default false)
    • setKeepIgnorableWhitespace

      public void setKeepIgnorableWhitespace(boolean keepIgnorableWhitespace)
      Sets the keep ignorable whitespace setting value.
      Parameters:
      keepIgnorableWhitespace - the desired new value
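
      The following sketch shows one way to observe the setting. It assumes that the whitespace in question is the whitespace-only text between elements of a pretty-printed document, and that trimWhitespace is switched off; the document and the whitespaceOnly helper are hypothetical.

```groovy
import groovy.xml.XmlParser

// Hypothetical pretty-printed document: only whitespace separates the elements.
def xml = '<list>\n  <item/>\n  <item/>\n</list>'

// Helper for this sketch: true for a child that is a whitespace-only String.
def whitespaceOnly = { child -> child instanceof String && child.trim().isEmpty() }

// Setting off: no whitespace-only Strings appear among the children.
def dropping = new XmlParser()
dropping.setTrimWhitespace(false)
dropping.setKeepIgnorableWhitespace(false)
def dropped = dropping.parseText(xml)
assert !dropped.children().any(whitespaceOnly)

// Setting on: the whitespace between elements is kept as String children.
def keeping = new XmlParser()
keeping.setTrimWhitespace(false)
keeping.setKeepIgnorableWhitespace(true)
def kept = keeping.parseText(xml)
assert kept.children().any(whitespaceOnly)
```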
    • parse

      public Node parse(File file) throws IOException, SAXException
      Parses the content of the given file as XML turning it into a tree of Nodes.
      Parameters:
      file - the File containing the XML to be parsed
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
      IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
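
      A short sketch of parsing from a File. The temporary file and its contents are invented for the example.

```groovy
import groovy.xml.XmlParser

// Hypothetical temporary file used only for this demonstration.
File file = File.createTempFile('xmlparser-demo', '.xml')
file.deleteOnExit()
file.setText('<config><port>8080</port></config>', 'UTF-8')

def config = new XmlParser().parse(file)
assert config.name() == 'config'
assert config.port.text() == '8080'

// Element text is a String; convert explicitly when a number is needed.
assert config.port.text().toInteger() == 8080
```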
    • parse

      public Node parse(Path path) throws IOException, SAXException
      Parses the content of the file at the given path as XML turning it into a tree of Nodes.
      Parameters:
      path - the path of the File containing the XML to be parsed
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
      IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
    • parse

      public Node parse(InputSource input) throws IOException, SAXException
      Parse the content of the specified input source into a tree of Nodes.
      Parameters:
      input - the InputSource for the XML to parse
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
      IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
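
      An InputSource is useful when the XML comes from a stream but should still carry a system identifier. The sketch below wraps a StringReader; the document and the file:///tmp/demo/doc.xml identifier are hypothetical and nothing is read from that location.

```groovy
import groovy.xml.XmlParser
import org.xml.sax.InputSource

// The character stream supplies the content to parse.
def source = new InputSource(new StringReader('<doc id="42"><title>Demo</title></doc>'))
// Hypothetical system identifier, serving as the base for relative references.
source.setSystemId('file:///tmp/demo/doc.xml')

def doc = new XmlParser().parse(source)
assert doc.@id == '42'
assert doc.title.text() == 'Demo'
```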
    • parse

      public Node parse(InputStream input) throws IOException, SAXException
      Parse the content of the specified input stream into a tree of Nodes.

      Note that this method does not give the parser a base URI, so it cannot resolve relative references to DTDs and other external resources.

      Parameters:
      input - an InputStream containing the XML to be parsed
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
      IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
    • parse

      public Node parse(Reader in) throws IOException, SAXException
      Parse the content of the specified reader into a tree of Nodes.

      Note that this method does not give the parser a base URI, so it cannot resolve relative references to DTDs and other external resources.

      Parameters:
      in - a Reader to read the XML to be parsed
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
      IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
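
      The InputStream and Reader overloads can be exercised with in-memory sources, as in this small sketch. The XML string is a made-up example.

```groovy
import groovy.xml.XmlParser

// Hypothetical document held in memory.
def xml = '<greeting lang="en">hello</greeting>'

// Character data: wrap the String in a Reader.
def fromReader = new XmlParser().parse(new StringReader(xml))
assert fromReader.name() == 'greeting'
assert fromReader.@lang == 'en'

// Byte data: the parser detects the encoding from the stream itself.
def fromStream = new XmlParser().parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))
assert fromStream.text() == 'hello'
```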
    • parse

      public Node parse(String uri) throws IOException, SAXException
      Parse the content of the specified URI into a tree of Nodes.
      Parameters:
      uri - a String containing a URI pointing to the XML to be parsed
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
      IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
    • parseText

      public Node parseText(String text) throws IOException, SAXException
      A helper method to parse the given text as XML.
      Parameters:
      text - the XML text to parse
      Returns:
      the root node of the parsed tree of Nodes
      Throws:
      SAXException - Any SAX exception, possibly wrapping another exception.
      IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
    • parseTextAs

      public <T> T parseTextAs(Class<T> type, String text)
      Parse the content of the specified XML text into a typed object. Requires jackson-databind on the classpath for type conversion. Supports @JsonProperty and @JsonFormat annotations.
      Type Parameters:
      T - the target type
      Parameters:
      type - the target type
      text - the XML text to parse
      Returns:
      a typed object
      Throws:
      XmlRuntimeException - if parsing or conversion fails, or jackson-databind is absent
      Since:
      6.0.0
    • parseAs

      public <T> T parseAs(Class<T> type, Reader reader)
      Parse XML from a reader into a typed object. Requires jackson-databind on the classpath for type conversion.
      Type Parameters:
      T - the target type
      Parameters:
      type - the target type
      reader - the reader of XML
      Returns:
      a typed object
      Throws:
      XmlRuntimeException - if parsing or conversion fails, or jackson-databind is absent
      Since:
      6.0.0
    • parseAs

      public <T> T parseAs(Class<T> type, InputStream stream)
      Parse XML from an input stream into a typed object. Requires jackson-databind on the classpath for type conversion.
      Type Parameters:
      T - the target type
      Parameters:
      type - the target type
      stream - the input stream of XML
      Returns:
      a typed object
      Throws:
      XmlRuntimeException - if parsing or conversion fails, or jackson-databind is absent
      Since:
      6.0.0
    • parseAs

      public <T> T parseAs(Class<T> type, File file) throws IOException
      Parse XML from a file into a typed object. Requires jackson-databind on the classpath for type conversion.
      Type Parameters:
      T - the target type
      Parameters:
      type - the target type
      file - the XML file
      Returns:
      a typed object
      Throws:
      IOException - if the file cannot be read
      XmlRuntimeException - if parsing or conversion fails, or jackson-databind is absent
      Since:
      6.0.0
    • parseAs

      public <T> T parseAs(Class<T> type, Path path) throws IOException
      Parse XML from a path into a typed object. Requires jackson-databind on the classpath for type conversion.
      Type Parameters:
      T - the target type
      Parameters:
      type - the target type
      path - the path to the XML file
      Returns:
      a typed object
      Throws:
      IOException - if the file cannot be read
      XmlRuntimeException - if parsing or conversion fails, or jackson-databind is absent
      Since:
      6.0.0
    • isNamespaceAware

      public boolean isNamespaceAware()
      Determine if namespace handling is enabled.
      Returns:
      true if namespace handling is enabled
    • setNamespaceAware

      public void setNamespaceAware(boolean namespaceAware)
      Enable or disable namespace handling. Must be set before the first parse call.
      Parameters:
      namespaceAware - the new desired value
      Throws:
      IllegalStateException - if called after parsing has started
    • isValidating

      public boolean isValidating()
      Determine if the parser validates documents.
      Returns:
      true if validation is enabled
      Since:
      6.0.0
    • setValidating

      public void setValidating(boolean validating)
      Enable or disable validation. Must be set before the first parse call.
      Parameters:
      validating - the new desired value
      Throws:
      IllegalStateException - if called after parsing has started
      Since:
      6.0.0
    • isAllowDocTypeDeclaration

      public boolean isAllowDocTypeDeclaration()
      Determine if DOCTYPE declarations are allowed.
      Returns:
      true if DOCTYPE declarations are allowed
      Since:
      6.0.0
    • setAllowDocTypeDeclaration

      public void setAllowDocTypeDeclaration(boolean allowDocTypeDeclaration)
      Enable or disable DOCTYPE declaration support. Must be set before the first parse call.
      Parameters:
      allowDocTypeDeclaration - the new desired value
      Throws:
      IllegalStateException - if called after parsing has started
      Since:
      6.0.0
    • getDTDHandler

      public DTDHandler getDTDHandler()
    • getEntityResolver

      public EntityResolver getEntityResolver()
    • getErrorHandler

      public ErrorHandler getErrorHandler()
    • getFeature

      public boolean getFeature(String uri) throws SAXNotRecognizedException, SAXNotSupportedException
      Throws:
      SAXNotRecognizedException
      SAXNotSupportedException
    • getProperty

      public Object getProperty(String uri) throws SAXNotRecognizedException, SAXNotSupportedException
      Throws:
      SAXNotRecognizedException
      SAXNotSupportedException
    • setDTDHandler

      public void setDTDHandler(DTDHandler dtdHandler)
    • setEntityResolver

      public void setEntityResolver(EntityResolver entityResolver)
    • setErrorHandler

      public void setErrorHandler(ErrorHandler errorHandler)
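
      As a sketch of how a custom ErrorHandler might be used, the example below records problems in a list instead of letting them go to the default handler. The problems list, the message format, and the malformed document are all invented for illustration; the sketch assumes the underlying SAX parser reports a fatal error to the handler and then aborts the parse with a SAXException.

```groovy
import groovy.xml.XmlParser
import org.xml.sax.ErrorHandler
import org.xml.sax.SAXException
import org.xml.sax.SAXParseException

// Collects every reported problem as a short message.
def problems = []
def handler = [
    warning   : { SAXParseException e -> problems << "warning: ${e.message}".toString() },
    error     : { SAXParseException e -> problems << "error: ${e.message}".toString() },
    fatalError: { SAXParseException e -> problems << "fatal: line ${e.lineNumber}".toString() }
] as ErrorHandler

def parser = new XmlParser()
parser.setErrorHandler(handler)

def failed = false
try {
    parser.parseText('<root><unclosed></root>')   // malformed on purpose
} catch (SAXException e) {
    failed = true
}

assert failed
assert problems.size() == 1
assert problems[0].startsWith('fatal')
```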
    • setFeature

      public void setFeature(String uri, boolean value) throws SAXNotRecognizedException, SAXNotSupportedException
      Throws:
      SAXNotRecognizedException
      SAXNotSupportedException
    • setProperty

      public void setProperty(String uri, Object value) throws SAXNotRecognizedException, SAXNotSupportedException
      Throws:
      SAXNotRecognizedException
      SAXNotSupportedException
    • startDocument

      public void startDocument() throws SAXException
      Specified by:
      startDocument in interface ContentHandler
      Throws:
      SAXException
    • endDocument

      public void endDocument() throws SAXException
      Specified by:
      endDocument in interface ContentHandler
      Throws:
      SAXException
    • startElement

      public void startElement(String namespaceURI, String localName, String qName, Attributes list) throws SAXException
      Specified by:
      startElement in interface ContentHandler
      Throws:
      SAXException
    • endElement

      public void endElement(String namespaceURI, String localName, String qName) throws SAXException
      Specified by:
      endElement in interface ContentHandler
      Throws:
      SAXException
    • characters

      public void characters(char[] buffer, int start, int length) throws SAXException
      Specified by:
      characters in interface ContentHandler
      Throws:
      SAXException
    • startPrefixMapping

      public void startPrefixMapping(String prefix, String namespaceURI) throws SAXException
      Specified by:
      startPrefixMapping in interface ContentHandler
      Throws:
      SAXException
    • endPrefixMapping

      public void endPrefixMapping(String prefix) throws SAXException
      Specified by:
      endPrefixMapping in interface ContentHandler
      Throws:
      SAXException
    • ignorableWhitespace

      public void ignorableWhitespace(char[] buffer, int start, int len) throws SAXException
      Specified by:
      ignorableWhitespace in interface ContentHandler
      Throws:
      SAXException
    • processingInstruction

      public void processingInstruction(String target, String data) throws SAXException
      Specified by:
      processingInstruction in interface ContentHandler
      Throws:
      SAXException
    • getDocumentLocator

      public Locator getDocumentLocator()
    • setDocumentLocator

      public void setDocumentLocator(Locator locator)
      Specified by:
      setDocumentLocator in interface ContentHandler
    • skippedEntity

      public void skippedEntity(String name) throws SAXException
      Specified by:
      skippedEntity in interface ContentHandler
      Throws:
      SAXException
    • getXMLReader

      protected XMLReader getXMLReader()
    • addTextToNode

      protected void addTextToNode()
    • createNode

      protected Node createNode(Node parent, Object name, Map attributes)
      Creates a new node with the given parent, name, and attributes. The default implementation returns an instance of groovy.util.Node.
      Parameters:
      parent - the parent node, or null if the node being created is the root node
      name - an Object representing the name of the node (typically an instance of QName)
      attributes - a Map of attribute names to attribute values
      Returns:
      a new Node instance representing the current node
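
      Because createNode is protected, a subclass can hook into node creation. The sketch below uses a hypothetical CountingParser that counts the nodes it creates and otherwise defers to the default implementation.

```groovy
import groovy.xml.XmlParser

// Hypothetical subclass for illustration: counts every node the parser creates.
class CountingParser extends XmlParser {
    private int created = 0

    @Override
    protected Node createNode(Node parent, Object name, Map attributes) {
        created++
        // Defer to the default implementation for the actual node.
        return super.createNode(parent, name, attributes)
    }

    int createdCount() {
        return created
    }
}

def parser = new CountingParser()
def root = parser.parseText('<a><b/><c><d/></c></a>')

assert root.name() == 'a'
assert parser.createdCount() == 4   // a, b, c and d
```

      The same hook could return a Node subclass or decorate the attributes map before delegating, depending on what the application needs.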
    • getElementName

      protected Object getElementName(String namespaceURI, String localName, String qName)
      Return a name given the namespaceURI, localName and qName.
      Parameters:
      namespaceURI - the namespace URI
      localName - the local name
      qName - the qualified name
      Returns:
      the newly created representation of the name