Package groovy.util

Class XmlParser

  • All Implemented Interfaces:
    org.xml.sax.ContentHandler

    public class XmlParser
    extends java.lang.Object
    implements org.xml.sax.ContentHandler
    A helper class that parses XML into a tree of Node instances, offering a simple way to process XML. This parser does not preserve the XML InfoSet; if you need that, try W3C DOM, dom4j, JDOM, XOM, etc. It ignores comments and processing instructions, and converts each element into a Node that holds the element's attributes, child Nodes and Strings. This simple model is sufficient for most simple XML processing use cases.

    Example usage:

     def xml = '<root><one a1="uno!"/><two>Some text!</two></root>'
     def rootNode = new XmlParser().parseText(xml)
     assert rootNode.name() == 'root'
     assert rootNode.one[0].@a1 == 'uno!'
     assert rootNode.two.text() == 'Some text!'
     rootNode.children().each { assert it.name() in ['one','two'] }
     
    • Constructor Summary

      Constructors 
      Constructor Description
      XmlParser()
      Creates a non-validating and namespace-aware XmlParser which does not allow DOCTYPE declarations in documents.
      XmlParser​(boolean validating, boolean namespaceAware)
      Creates an XmlParser which does not allow DOCTYPE declarations in documents.
      XmlParser​(boolean validating, boolean namespaceAware, boolean allowDocTypeDeclaration)
      Creates an XmlParser.
      XmlParser​(javax.xml.parsers.SAXParser parser)  
      XmlParser​(org.xml.sax.XMLReader reader)  
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      protected void addTextToNode()  
      void characters​(char[] buffer, int start, int length)  
      protected Node createNode​(Node parent, java.lang.Object name, java.util.Map attributes)
      Creates a new node with the given parent, name, and attributes.
      void endDocument()  
      void endElement​(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName)  
      void endPrefixMapping​(java.lang.String prefix)  
      org.xml.sax.Locator getDocumentLocator()  
      org.xml.sax.DTDHandler getDTDHandler()  
      protected java.lang.Object getElementName​(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName)
      Returns a name for the given namespaceURI, localName and qName.
      org.xml.sax.EntityResolver getEntityResolver()  
      org.xml.sax.ErrorHandler getErrorHandler()  
      boolean getFeature​(java.lang.String uri)  
      java.lang.Object getProperty​(java.lang.String uri)  
      protected org.xml.sax.XMLReader getXMLReader()  
      void ignorableWhitespace​(char[] buffer, int start, int len)  
      boolean isKeepIgnorableWhitespace()
      Returns the current keep ignorable whitespace setting.
      boolean isNamespaceAware()
      Determine if namespace handling is enabled.
      boolean isTrimWhitespace()
      Returns the current trim whitespace setting.
      Node parse​(java.io.File file)
      Parses the content of the given file as XML turning it into a tree of Nodes.
      Node parse​(java.io.InputStream input)
      Parse the content of the specified input stream into a tree of Nodes.
      Node parse​(java.io.Reader in)
      Parse the content of the specified reader into a tree of Nodes.
      Node parse​(java.lang.String uri)
      Parse the content of the specified URI into a tree of Nodes.
      Node parse​(org.xml.sax.InputSource input)
      Parse the content of the specified input source into a tree of Nodes.
      Node parseText​(java.lang.String text)
      A helper method to parse the given text as XML.
      void processingInstruction​(java.lang.String target, java.lang.String data)  
      void setDocumentLocator​(org.xml.sax.Locator locator)  
      void setDTDHandler​(org.xml.sax.DTDHandler dtdHandler)  
      void setEntityResolver​(org.xml.sax.EntityResolver entityResolver)  
      void setErrorHandler​(org.xml.sax.ErrorHandler errorHandler)  
      void setFeature​(java.lang.String uri, boolean value)  
      void setKeepIgnorableWhitespace​(boolean keepIgnorableWhitespace)
      Sets the keep ignorable whitespace setting value.
      void setNamespaceAware​(boolean namespaceAware)
      Enables or disables namespace handling.
      void setProperty​(java.lang.String uri, java.lang.Object value)  
      void setTrimWhitespace​(boolean trimWhitespace)
      Sets the trim whitespace setting value.
      void skippedEntity​(java.lang.String name)  
      void startDocument()  
      void startElement​(java.lang.String namespaceURI, java.lang.String localName, java.lang.String qName, org.xml.sax.Attributes list)  
      void startPrefixMapping​(java.lang.String prefix, java.lang.String namespaceURI)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • XmlParser

        public XmlParser()
                  throws javax.xml.parsers.ParserConfigurationException,
                         org.xml.sax.SAXException
        Creates a non-validating and namespace-aware XmlParser which does not allow DOCTYPE declarations in documents.
        Throws:
        javax.xml.parsers.ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
        org.xml.sax.SAXException - for SAX errors.
      • XmlParser

        public XmlParser​(boolean validating,
                         boolean namespaceAware)
                  throws javax.xml.parsers.ParserConfigurationException,
                         org.xml.sax.SAXException
        Creates an XmlParser which does not allow DOCTYPE declarations in documents.
        Parameters:
        validating - true if the parser should validate documents as they are parsed; false otherwise.
        namespaceAware - true if the parser should provide support for XML namespaces; false otherwise.
        Throws:
        javax.xml.parsers.ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
        org.xml.sax.SAXException - for SAX errors.
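
        As an illustration of the namespaceAware flag, the sketch below parses the same document with a namespace-aware and a non-namespace-aware parser and compares the root names. It assumes a Groovy version in which groovy.util.XmlParser is available and the JDK's default SAX parser; the urn:example namespace is made up for the example.

```groovy
import groovy.util.XmlParser
import groovy.xml.QName

// Sample document using a made-up namespace (assumption for illustration).
def xml = '<p:root xmlns:p="urn:example"><p:item>x</p:item></p:root>'

// The no-argument constructor is non-validating and namespace-aware.
def aware = new XmlParser().parseText(xml)

// Non-validating, not namespace-aware.
def unaware = new XmlParser(false, false).parseText(xml)

def awareName = aware.name()      // a groovy.xml.QName carrying URI, local part and prefix
def unawareName = unaware.name()  // the raw qualified name as a String
```

        With namespace support enabled, element names in a namespace are reported as QName objects; without it, the prefixed name is kept as plain text.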
      • XmlParser

        public XmlParser​(boolean validating,
                         boolean namespaceAware,
                         boolean allowDocTypeDeclaration)
                  throws javax.xml.parsers.ParserConfigurationException,
                         org.xml.sax.SAXException
        Creates an XmlParser.
        Parameters:
        validating - true if the parser should validate documents as they are parsed; false otherwise.
        namespaceAware - true if the parser should provide support for XML namespaces; false otherwise.
        allowDocTypeDeclaration - true if the parser should provide support for DOCTYPE declarations; false otherwise.
        Throws:
        javax.xml.parsers.ParserConfigurationException - if no parser which satisfies the requested configuration can be created.
        org.xml.sax.SAXException - for SAX errors.
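
        The effect of allowDocTypeDeclaration can be sketched as follows. The example assumes the JDK's default SAX parser, which honors the DOCTYPE restriction; the sample document is invented for illustration.

```groovy
import groovy.util.XmlParser
import org.xml.sax.SAXException

// A small document with an internal DTD subset (illustrative only).
def xml = '''<!DOCTYPE note [ <!ELEMENT note (#PCDATA)> ]>
<note>remember</note>'''

// The default parser does not allow DOCTYPE declarations, so parsing should fail.
def rejected = false
try {
    new XmlParser().parseText(xml)
} catch (SAXException expected) {
    rejected = true
}

// Non-validating, namespace-aware, DOCTYPE declarations allowed.
def note = new XmlParser(false, true, true).parseText(xml)
def noteText = note.text()
```

        Allowing DOCTYPE declarations is usually only advisable for trusted input, since DTDs are a common vector for entity-expansion and external-entity attacks.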
      • XmlParser

        public XmlParser​(org.xml.sax.XMLReader reader)
      • XmlParser

        public XmlParser​(javax.xml.parsers.SAXParser parser)
                  throws org.xml.sax.SAXException
        Throws:
        org.xml.sax.SAXException
    • Method Detail

      • isTrimWhitespace

        public boolean isTrimWhitespace()
        Returns the current trim whitespace setting.
        Returns:
        true if whitespace will be trimmed
      • setTrimWhitespace

        public void setTrimWhitespace​(boolean trimWhitespace)
        Sets the trim whitespace setting value.
        Parameters:
        trimWhitespace - the desired setting value
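
        A minimal sketch of the trim whitespace setting, assuming a Groovy version in which groovy.util.XmlParser is available. The setting is applied explicitly in both cases so the example does not depend on the default value.

```groovy
import groovy.util.XmlParser

def xml = '<root><msg>  padded  </msg></root>'

// Leading and trailing whitespace in text content is removed.
def trimming = new XmlParser()
trimming.setTrimWhitespace(true)
def trimmed = trimming.parseText(xml).msg.text()

// Text content is kept exactly as written.
def preserving = new XmlParser()
preserving.setTrimWhitespace(false)
def preserved = preserving.parseText(xml).msg.text()
```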
      • isKeepIgnorableWhitespace

        public boolean isKeepIgnorableWhitespace()
        Returns the current keep ignorable whitespace setting.
        Returns:
        true if ignorable whitespace will be kept (default false)
      • setKeepIgnorableWhitespace

        public void setKeepIgnorableWhitespace​(boolean keepIgnorableWhitespace)
        Sets the keep ignorable whitespace setting value.
        Parameters:
        keepIgnorableWhitespace - the desired new value
      • parse

        public Node parse​(java.io.File file)
                   throws java.io.IOException,
                          org.xml.sax.SAXException
        Parses the content of the given file as XML turning it into a tree of Nodes.
        Parameters:
        file - the File containing the XML to be parsed
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
        java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
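
        For example, parsing from a file might look like the sketch below. The temporary file and its contents are created only for the illustration.

```groovy
import groovy.util.XmlParser

// Write a small sample document to a temporary file (illustrative data).
def tmp = File.createTempFile('xmlparser-demo', '.xml')
tmp.deleteOnExit()
tmp.setText('<catalog><book id="b1">Groovy in Action</book></catalog>', 'UTF-8')

// Parse the file into a tree of Nodes and read values from it.
def catalog = new XmlParser().parse(tmp)
def rootName = catalog.name()
def firstId = catalog.book[0].@id
def firstTitle = catalog.book[0].text()
```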
      • parse

        public Node parse​(org.xml.sax.InputSource input)
                   throws java.io.IOException,
                          org.xml.sax.SAXException
        Parse the content of the specified input source into a tree of Nodes.
        Parameters:
        input - the InputSource for the XML to parse
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
        java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
      • parse

        public Node parse​(java.io.InputStream input)
                   throws java.io.IOException,
                          org.xml.sax.SAXException
        Parse the content of the specified input stream into a tree of Nodes.

        Note that this method does not provide the parser with a URI from which to locate DTDs and other external resources.

        Parameters:
        input - an InputStream containing the XML to be parsed
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
        java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
      • parse

        public Node parse​(java.io.Reader in)
                   throws java.io.IOException,
                          org.xml.sax.SAXException
        Parse the content of the specified reader into a tree of Nodes.

        Note that this method does not provide the parser with a URI from which to locate DTDs and other external resources.

        Parameters:
        in - a Reader to read the XML to be parsed
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
        java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
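
        The Reader and InputStream overloads are convenient for XML that is already in memory or arrives over a stream. A short sketch with invented sample data:

```groovy
import groovy.util.XmlParser

def xml = '<greeting lang="en">hello</greeting>'

// Parse from a character stream.
def fromReader = new XmlParser().parse(new StringReader(xml))

// Parse from a byte stream; the bytes are encoded as UTF-8 here.
def fromStream = new XmlParser().parse(new ByteArrayInputStream(xml.getBytes('UTF-8')))

def readerText = fromReader.text()
def streamLang = fromStream.@lang
```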
      • parse

        public Node parse​(java.lang.String uri)
                   throws java.io.IOException,
                          org.xml.sax.SAXException
        Parse the content of the specified URI into a tree of Nodes.
        Parameters:
        uri - a String containing the URI of the XML to be parsed
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
        java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
      • parseText

        public Node parseText​(java.lang.String text)
                       throws java.io.IOException,
                              org.xml.sax.SAXException
        A helper method to parse the given text as XML.
        Parameters:
        text - the XML text to parse
        Returns:
        the root node of the parsed tree of Nodes
        Throws:
        org.xml.sax.SAXException - Any SAX exception, possibly wrapping another exception.
        java.io.IOException - An IO exception from the parser, possibly from a byte stream or character stream supplied by the application.
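
        The sketch below shows how a caller might handle text that is not well-formed. It assumes the underlying SAX parser reports the problem as a SAXException, as the Throws clause above describes; the malformed input is invented for the example.

```groovy
import groovy.util.XmlParser
import org.xml.sax.SAXException

def caught = null
try {
    // The <unclosed> element is never closed, so the text is not well-formed.
    new XmlParser().parseText('<root><unclosed></root>')
} catch (SAXException e) {
    caught = e
}
def failed = (caught != null)
```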
      • isNamespaceAware

        public boolean isNamespaceAware()
        Determine if namespace handling is enabled.
        Returns:
        true if namespace handling is enabled
      • setNamespaceAware

        public void setNamespaceAware​(boolean namespaceAware)
        Enables or disables namespace handling.
        Parameters:
        namespaceAware - the new desired value
      • getDTDHandler

        public org.xml.sax.DTDHandler getDTDHandler()
      • getEntityResolver

        public org.xml.sax.EntityResolver getEntityResolver()
      • getErrorHandler

        public org.xml.sax.ErrorHandler getErrorHandler()
      • getFeature

        public boolean getFeature​(java.lang.String uri)
                           throws org.xml.sax.SAXNotRecognizedException,
                                  org.xml.sax.SAXNotSupportedException
        Throws:
        org.xml.sax.SAXNotRecognizedException
        org.xml.sax.SAXNotSupportedException
      • getProperty

        public java.lang.Object getProperty​(java.lang.String uri)
                                     throws org.xml.sax.SAXNotRecognizedException,
                                            org.xml.sax.SAXNotSupportedException
        Throws:
        org.xml.sax.SAXNotRecognizedException
        org.xml.sax.SAXNotSupportedException
      • setDTDHandler

        public void setDTDHandler​(org.xml.sax.DTDHandler dtdHandler)
      • setEntityResolver

        public void setEntityResolver​(org.xml.sax.EntityResolver entityResolver)
      • setErrorHandler

        public void setErrorHandler​(org.xml.sax.ErrorHandler errorHandler)
      • setFeature

        public void setFeature​(java.lang.String uri,
                               boolean value)
                        throws org.xml.sax.SAXNotRecognizedException,
                               org.xml.sax.SAXNotSupportedException
        Throws:
        org.xml.sax.SAXNotRecognizedException
        org.xml.sax.SAXNotSupportedException
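
        As a hedged illustration, a feature can be set and read back as shown below. The feature URI is the standard SAX2 external-general-entities flag; the sketch assumes the underlying parser (here the JDK default) recognizes it.

```groovy
import groovy.util.XmlParser

def parser = new XmlParser()

// Standard SAX2 feature URI controlling external general entities.
def feature = 'http://xml.org/sax/features/external-general-entities'

// Turn the feature off, then read the setting back.
parser.setFeature(feature, false)
def enabled = parser.getFeature(feature)
```

        A parser that does not know or cannot change the feature signals this with SAXNotRecognizedException or SAXNotSupportedException.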
      • setProperty

        public void setProperty​(java.lang.String uri,
                                java.lang.Object value)
                         throws org.xml.sax.SAXNotRecognizedException,
                                org.xml.sax.SAXNotSupportedException
        Throws:
        org.xml.sax.SAXNotRecognizedException
        org.xml.sax.SAXNotSupportedException
      • startDocument

        public void startDocument()
                           throws org.xml.sax.SAXException
        Specified by:
        startDocument in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • endDocument

        public void endDocument()
                         throws org.xml.sax.SAXException
        Specified by:
        endDocument in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • startElement

        public void startElement​(java.lang.String namespaceURI,
                                 java.lang.String localName,
                                 java.lang.String qName,
                                 org.xml.sax.Attributes list)
                          throws org.xml.sax.SAXException
        Specified by:
        startElement in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • endElement

        public void endElement​(java.lang.String namespaceURI,
                               java.lang.String localName,
                               java.lang.String qName)
                        throws org.xml.sax.SAXException
        Specified by:
        endElement in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • characters

        public void characters​(char[] buffer,
                               int start,
                               int length)
                        throws org.xml.sax.SAXException
        Specified by:
        characters in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • startPrefixMapping

        public void startPrefixMapping​(java.lang.String prefix,
                                       java.lang.String namespaceURI)
                                throws org.xml.sax.SAXException
        Specified by:
        startPrefixMapping in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • endPrefixMapping

        public void endPrefixMapping​(java.lang.String prefix)
                              throws org.xml.sax.SAXException
        Specified by:
        endPrefixMapping in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • ignorableWhitespace

        public void ignorableWhitespace​(char[] buffer,
                                        int start,
                                        int len)
                                 throws org.xml.sax.SAXException
        Specified by:
        ignorableWhitespace in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • processingInstruction

        public void processingInstruction​(java.lang.String target,
                                          java.lang.String data)
                                   throws org.xml.sax.SAXException
        Specified by:
        processingInstruction in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • getDocumentLocator

        public org.xml.sax.Locator getDocumentLocator()
      • setDocumentLocator

        public void setDocumentLocator​(org.xml.sax.Locator locator)
        Specified by:
        setDocumentLocator in interface org.xml.sax.ContentHandler
      • skippedEntity

        public void skippedEntity​(java.lang.String name)
                           throws org.xml.sax.SAXException
        Specified by:
        skippedEntity in interface org.xml.sax.ContentHandler
        Throws:
        org.xml.sax.SAXException
      • getXMLReader

        protected org.xml.sax.XMLReader getXMLReader()
      • addTextToNode

        protected void addTextToNode()
      • createNode

        protected Node createNode​(Node parent,
                                  java.lang.Object name,
                                  java.util.Map attributes)
        Creates a new node with the given parent, name, and attributes. The default implementation returns an instance of groovy.util.Node.
        Parameters:
        parent - the parent node, or null if the node being created is the root node
        name - an Object representing the name of the node (typically an instance of QName)
        attributes - a Map of attribute names to attribute values
        Returns:
        a new Node instance representing the current node
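
        Because createNode is protected, a subclass can hook into node creation. The sketch below uses a hypothetical RecordingXmlParser (not part of Groovy) that records element names in the order they are created and then delegates to the default implementation.

```groovy
import groovy.util.Node
import groovy.util.XmlParser

// Hypothetical subclass for illustration: records each element name as its node is created.
class RecordingXmlParser extends XmlParser {
    final List<String> seen = []

    @Override
    protected Node createNode(Node parent, Object name, Map attributes) {
        seen << name.toString()
        // Delegate to the default implementation, which returns a groovy.util.Node.
        return super.createNode(parent, name, attributes)
    }
}

def parser = new RecordingXmlParser()
def root = parser.parseText('<a><b/><c x="1"/></a>')
def names = parser.seen
```

        The same hook could return a custom Node subclass instead of delegating to super.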
      • getElementName

        protected java.lang.Object getElementName​(java.lang.String namespaceURI,
                                                  java.lang.String localName,
                                                  java.lang.String qName)
        Returns a name for the given namespaceURI, localName and qName.
        Parameters:
        namespaceURI - the namespace URI
        localName - the local name
        qName - the qualified name
        Returns:
        the newly created representation of the name
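
        One possible use of this hook is to discard namespace information. The sketch below uses a hypothetical LocalNameXmlParser (not part of Groovy) that names every element by its local name, falling back to the qualified name when no local name is reported; the urn:example namespace is invented.

```groovy
import groovy.util.XmlParser

// Hypothetical subclass for illustration: names nodes by local name only.
class LocalNameXmlParser extends XmlParser {
    @Override
    protected Object getElementName(String namespaceURI, String localName, String qName) {
        // Fall back to the qualified name if the local name is null or empty.
        return localName ?: qName
    }
}

def root = new LocalNameXmlParser().parseText(
        '<p:root xmlns:p="urn:example"><p:item>x</p:item></p:root>')
def rootName = root.name()
def itemText = root.item.text()
```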