Package groovy.util

Class XmlParser

  • All Implemented Interfaces:
    ContentHandler

    public class XmlParser
    extends Object
    implements ContentHandler
    A helper class that parses XML into a tree of Node instances, giving a simple way to process XML. This parser does not preserve the XML InfoSet; if you need that, try W3C DOM, dom4j, JDOM, XOM, etc. It ignores comments and processing instructions and converts each element in the XML into a Node holding its attributes and its children (Nodes and Strings). This simple model is sufficient for most simple XML processing use cases.

    Example usage:

     def xml = '<root><one a1="uno!"/><two>Some text!</two></root>'
     def rootNode = new XmlParser().parseText(xml)
     assert rootNode.name() == 'root'
     assert rootNode.one[0].@a1 == 'uno!'
     assert rootNode.two.text() == 'Some text!'
     rootNode.children().each { assert it.name() in ['one','two'] }
     
    • Constructor Detail

      • XmlParser

        public XmlParser(boolean validating,
                         boolean namespaceAware)
                  throws ParserConfigurationException,
                         SAXException
        Creates an XmlParser which does not allow DOCTYPE declarations in documents.
        Parameters:
        validating - true if the parser should validate documents as they are parsed; false otherwise.
        namespaceAware - true if the parser should provide support for XML namespaces; false otherwise.
        Throws:
        ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
        SAXException - for SAX errors.
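
        Example: a minimal sketch of how the namespaceAware flag changes what name() returns. The namespace URI urn:example and the element names are illustrative assumptions, and the qualified name object is inspected only through its localPart and namespaceURI properties.

        ```groovy
        def xml = '<p:order xmlns:p="urn:example"><p:item/></p:order>'

        // Namespace-aware: the element name carries the namespace URI and the local part.
        def aware = new XmlParser(false, true).parseText(xml)
        def qualified = aware.name()
        assert qualified.localPart == 'order'
        assert qualified.namespaceURI == 'urn:example'

        // Not namespace-aware: the name is the raw prefixed name as a String.
        def plain = new XmlParser(false, false).parseText(xml)
        assert plain.name() == 'p:order'
        ```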
      • XmlParser

        public XmlParser(boolean validating,
                         boolean namespaceAware,
                         boolean allowDocTypeDeclaration)
                  throws ParserConfigurationException,
                         SAXException
        Creates an XmlParser.
        Parameters:
        validating - true if the parser should validate documents as they are parsed; false otherwise.
        namespaceAware - true if the parser should provide support for XML namespaces; false otherwise.
        allowDocTypeDeclaration - true if the parser should provide support for DOCTYPE declarations; false otherwise.
        Throws:
        ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
        SAXException - for SAX errors.
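
        Example: a sketch contrasting the two constructors on a document that carries a DOCTYPE declaration. The sample document is an assumption, and the rejection in the first half assumes the underlying JAXP parser honours the DOCTYPE restriction.

        ```groovy
        import org.xml.sax.SAXException

        def xml = '<!DOCTYPE note [<!ELEMENT note (#PCDATA)>]><note>hello</note>'

        // Two-argument constructor: DOCTYPE declarations are not allowed.
        def rejected = false
        try {
            new XmlParser(false, true).parseText(xml)
        } catch (SAXException expected) {
            rejected = true
        }
        assert rejected

        // allowDocTypeDeclaration = true: the same document is accepted.
        def note = new XmlParser(false, true, true).parseText(xml)
        assert note.name() == 'note'
        assert note.text() == 'hello'
        ```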
      • XmlParser

        public XmlParser(XMLReader reader)
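
        Example: a sketch of supplying a pre-configured XMLReader. Obtaining the reader from the JDK's SAXParserFactory is one possible choice, not something this class requires; the sample XML is an assumption.

        ```groovy
        import javax.xml.parsers.SAXParserFactory

        // Build and configure the XMLReader ourselves instead of letting XmlParser do it.
        def factory = SAXParserFactory.newInstance()
        factory.setNamespaceAware(true)
        def reader = factory.newSAXParser().getXMLReader()

        def root = new XmlParser(reader).parseText('<config><entry key="a"/></config>')
        assert root.name() == 'config'
        assert root.entry[0].@key == 'a'
        ```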
    • Method Detail

      • isTrimWhitespace

        public boolean isTrimWhitespace()
        Returns the current trim whitespace setting.
        Returns:
        true if whitespace will be trimmed
      • setTrimWhitespace

        public void setTrimWhitespace(boolean trimWhitespace)
        Sets the trim whitespace setting value.
        Parameters:
        trimWhitespace - the desired setting value
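
        Example: a sketch of the effect of the trim whitespace setting on element text. The setting is assigned explicitly in both cases, so the sketch does not rely on its default value; the sample XML is an assumption.

        ```groovy
        def xml = '<msg>  hello  </msg>'

        // With trimming on, leading and trailing whitespace is removed from text.
        def trimming = new XmlParser()
        trimming.setTrimWhitespace(true)
        assert trimming.isTrimWhitespace()
        def trimmedText = trimming.parseText(xml).text()
        assert trimmedText == 'hello'

        // With trimming off, the text is kept exactly as written.
        def preserving = new XmlParser()
        preserving.setTrimWhitespace(false)
        def rawText = preserving.parseText(xml).text()
        assert rawText == '  hello  '
        ```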
      • isKeepIgnorableWhitespace

        public boolean isKeepIgnorableWhitespace()
        Returns the current keep ignorable whitespace setting.
        Returns:
        true if ignorable whitespace will be kept (default false)
      • setKeepIgnorableWhitespace

        public void setKeepIgnorableWhitespace(boolean keepIgnorableWhitespace)
        Sets the keep ignorable whitespace setting value.
        Parameters:
        keepIgnorableWhitespace - the desired new value
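
        Example: a sketch of keeping whitespace-only text between elements. It assumes that trimming must also be switched off for such text to survive, so both settings are assigned explicitly. The closure textChildren is a hypothetical helper defined only for this example.

        ```groovy
        def xml = '<list> <item/> </list>'

        // Hypothetical helper: counts the String children directly under a node.
        def textChildren = { node -> node.children().count { it instanceof String } }

        // Whitespace-only text is dropped when it is not kept.
        def dropping = new XmlParser()
        dropping.setTrimWhitespace(false)
        dropping.setKeepIgnorableWhitespace(false)
        def droppedCount = textChildren(dropping.parseText(xml))
        assert droppedCount == 0

        // Whitespace-only text shows up as String children when it is kept.
        def keeping = new XmlParser()
        keeping.setTrimWhitespace(false)
        keeping.setKeepIgnorableWhitespace(true)
        def keptCount = textChildren(keeping.parseText(xml))
        assert keptCount > 0
        ```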
      • parse

        public Node parse(File file)
                   throws IOException,
                          SAXException
        Parses the content of the given file as XML turning it into a tree of Nodes.
        Parameters:
        file - the File containing the XML to be parsed
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        SAXException - Any SAX exception, possibly wrapping another exception.
        IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
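
        Example: a minimal sketch that parses a file. The temporary file and its contents are assumptions made so the example is self-contained.

        ```groovy
        // Write a small document to a temporary file, then parse it.
        def file = File.createTempFile('xmlparser-demo', '.xml')
        file.deleteOnExit()
        file.text = '<catalog><book id="1">Groovy</book></catalog>'

        def catalog = new XmlParser().parse(file)
        assert catalog.name() == 'catalog'
        assert catalog.book[0].@id == '1'
        assert catalog.book[0].text() == 'Groovy'
        ```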
      • parse

        public Node parse(InputSource input)
                   throws IOException,
                          SAXException
        Parses the content of the specified input source into a tree of Nodes.
        Parameters:
        input - the InputSource for the XML to parse
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        SAXException - Any SAX exception, possibly wrapping another exception.
        IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
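
        Example: a sketch that wraps a character stream in an InputSource and attaches a system ID, which gives the parser a base URI for relative references. The system ID shown is hypothetical and is never dereferenced here, because the sample document has no external references.

        ```groovy
        import org.xml.sax.InputSource

        def source = new InputSource(new StringReader('<feed><entry>first</entry></feed>'))
        // Hypothetical base URI for resolving relative references such as DTDs.
        source.setSystemId('file:///hypothetical/base/feed.xml')

        def feed = new XmlParser().parse(source)
        assert feed.name() == 'feed'
        assert feed.entry[0].text() == 'first'
        ```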
      • parse

        public Node parse(InputStream input)
                   throws IOException,
                          SAXException
        Parses the content of the specified input stream into a tree of Nodes.

        Note that this method does not give the parser a base URI from which to resolve DTDs and other external references.

        Parameters:
        input - an InputStream containing the XML to be parsed
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        SAXException - Any SAX exception, possibly wrapping another exception.
        IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
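
        Example: a sketch that parses from a byte stream and closes it afterwards. The in-memory stream stands in for any InputStream; the sample document and its UTF-8 encoding declaration are assumptions.

        ```groovy
        // \u00e9 is a non-ASCII character, encoded here as UTF-8 bytes.
        def bytes = '<?xml version="1.0" encoding="UTF-8"?><greeting>h\u00e9llo</greeting>'.getBytes('UTF-8')

        // withStream closes the stream once the closure returns.
        def greeting = new ByteArrayInputStream(bytes).withStream { stream ->
            new XmlParser().parse(stream)
        }
        assert greeting.name() == 'greeting'
        assert greeting.text() == 'h\u00e9llo'
        ```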
      • parse

        public Node parse(Reader in)
                   throws IOException,
                          SAXException
        Parses the content of the specified reader into a tree of Nodes.

        Note that this method does not give the parser a base URI from which to resolve DTDs and other external references.

        Parameters:
        in - a Reader to read the XML to be parsed
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        SAXException - Any SAX exception, possibly wrapping another exception.
        IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
      • parse

        public Node parse(String uri)
                   throws IOException,
                          SAXException
        Parses the content of the specified URI into a tree of Nodes.
        Parameters:
        uri - a String containing a URI that points to the XML to be parsed
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        SAXException - Any SAX exception, possibly wrapping another exception.
        IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
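
        Example: a sketch that parses from a URI string. A file: URI to a temporary file is used so the example needs no network access; the file and its contents are assumptions.

        ```groovy
        def file = File.createTempFile('xmlparser-uri', '.xml')
        file.deleteOnExit()
        file.text = '<settings><mode>fast</mode></settings>'

        // Pass the location as a String, not as a File or URI object.
        def uri = file.toURI().toString()
        def settings = new XmlParser().parse(uri)
        assert settings.mode[0].text() == 'fast'
        ```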
      • parseText

        public Node parseText(String text)
                       throws IOException,
                              SAXException
        A helper method to parse the given text as XML.
        Parameters:
        text - the XML text to parse
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        SAXException - Any SAX exception, possibly wrapping another exception.
        IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
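
        Example: a sketch of handling malformed input. SAXParseException is the standard SAX subclass of SAXException that carries position information; the broken sample text is an assumption.

        ```groovy
        import org.xml.sax.SAXParseException

        def parser = new XmlParser()
        def failure = null
        try {
            parser.parseText('<a><b></a>')   // <b> is never closed
        } catch (SAXParseException e) {
            failure = e
        }
        assert failure != null
        // The exception reports where in the text the problem was detected.
        assert failure.lineNumber >= 1
        ```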
      • isNamespaceAware

        public boolean isNamespaceAware()
        Determines whether namespace handling is enabled.
        Returns:
        true if namespace handling is enabled
      • setNamespaceAware

        public void setNamespaceAware(boolean namespaceAware)
        Enables or disables namespace handling.
        Parameters:
        namespaceAware - the new desired value
      • getDTDHandler

        public DTDHandler getDTDHandler()
      • setDTDHandler

        public void setDTDHandler(DTDHandler dtdHandler)
      • setEntityResolver

        public void setEntityResolver(EntityResolver entityResolver)
      • setErrorHandler

        public void setErrorHandler(ErrorHandler errorHandler)
      • getDocumentLocator

        public Locator getDocumentLocator()
      • getXMLReader

        protected XMLReader getXMLReader()
      • addTextToNode

        protected void addTextToNode()
      • createNode

        protected Node createNode(Node parent,
                                  Object name,
                                  Map attributes)
        Creates a new node with the given parent, name, and attributes. The default implementation returns an instance of groovy.util.Node.
        Parameters:
        parent - the parent node, or null if the node being created is the root node
        name - an Object representing the name of the node (typically an instance of QName)
        attributes - a Map of attribute names to attribute values
        Returns:
        a new Node instance representing the current node
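
        Example: a sketch of overriding createNode in a subclass. CountingParser and its created field are hypothetical names introduced for this example; the override only counts calls and then delegates to the default implementation.

        ```groovy
        // Hypothetical subclass that counts how many nodes the parser creates.
        class CountingParser extends XmlParser {
            int created = 0

            @Override
            protected Node createNode(Node parent, Object name, Map attributes) {
                created++
                return super.createNode(parent, name, attributes)
            }
        }

        def counting = new CountingParser()
        def tree = counting.parseText('<a><b/><c/></a>')
        assert tree.name() == 'a'
        assert counting.created == 3   // one call per element: a, b and c
        ```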
      • getElementName

        protected Object getElementName(String namespaceURI,
                                        String localName,
                                        String qName)
        Returns a name given the namespaceURI, localName and qName.
        Parameters:
        namespaceURI - the namespace URI
        localName - the local name
        qName - the qualified name
        Returns:
        the newly created representation of the name
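
        Example: a sketch of overriding getElementName so that every name becomes a plain local name, which discards namespace information. LocalNameParser is a hypothetical subclass, and the namespace URI urn:example is an illustrative assumption.

        ```groovy
        // Hypothetical subclass that names nodes by local name only.
        class LocalNameParser extends XmlParser {
            @Override
            protected Object getElementName(String namespaceURI, String localName, String qName) {
                // Fall back to the qualified name if no local name was reported.
                return localName ?: qName
            }
        }

        def doc = new LocalNameParser().parseText('<p:a xmlns:p="urn:example"><p:b/></p:a>')
        assert doc.name() == 'a'
        assert doc.b.size() == 1
        ```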