edu.utexas.its.eis.tools.qwicap.template.xml.immutable
Class XMLDocument

java.lang.Object
  extended by edu.utexas.its.eis.tools.qwicap.template.xml.Markup
      extended by edu.utexas.its.eis.tools.qwicap.template.xml.immutable.ImmutableMarkup
          extended by edu.utexas.its.eis.tools.qwicap.template.xml.immutable.XMLDocument
All Implemented Interfaces:
Cloneable, Iterable<Range>

public class XMLDocument
extends ImmutableMarkup

XMLDocument creates a body of immutable markup from the XML document specified by a File or URL object or, at a lower level, by any InputStream or Reader object. Whatever its source, the XML markup should be well-formed, in the sense that every "start" tag has a corresponding "end" tag. Beyond that, XMLDocument accepts anything from tiny fragments of XML to entire documents. Be aware, however, that fragments typically lack the markup needed to identify their character set, so they can be read reliably only through a constructor that lets you specify the character set, either as a parameter or by way of a Reader that you have created with an explicit character set. If you have any doubts about the significance of character set issues, please read Joel Spolsky's article The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!).
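
The character set warning above can be illustrated with the JDK alone. The following is a minimal sketch, not part of Qwicap: the class name FragmentCharsetSketch and its decode helper are hypothetical, and the XMLDocument call appears only in a comment because it assumes Qwicap is on the class path.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; not part of Qwicap.
final class FragmentCharsetSketch {
    /** Decodes the bytes through a Reader created with an explicit character set. */
    static String decode(byte[] bytes, Charset charset) {
        try (Reader in = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            StringBuilder text = new StringBuilder();
            char[] buffer = new char[256];
            int count;
            while ((count = in.read(buffer)) != -1) {
                text.append(buffer, 0, count);
            }
            return text.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A fragment with no XML declaration: nothing in it names its encoding.
        byte[] fragment = "<p>caf\u00e9</p>".getBytes(StandardCharsets.UTF_8);

        // The same bytes produce different text depending on the character set used.
        String right = decode(fragment, StandardCharsets.UTF_8);
        String wrong = decode(fragment, StandardCharsets.ISO_8859_1);
        System.out.println(right.equals(wrong)); // the two decodings differ

        // With Qwicap available, a Reader built the same way would be handed to
        // the Reader-based constructor, for example:
        //   new XMLDocument(new InputStreamReader(stream, "UTF-8"), 0);
    }
}
```

The design point is that the caller, not the fragment, supplies the character set, which is what the Reader-based constructors and the XMLDocument(File, String) constructor allow.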

Author:
Chris W. Johnson

Constructor Summary
XMLDocument(File XDoc)
          Reads an XML document from a file.
XMLDocument(File XDoc, String Charset)
          Reads an XML document from a file.
XMLDocument(InputStream In, long LengthHint)
          Reads an XML document from an InputStream.
XMLDocument(InputStream In, long LengthHint, String DocumentName)
          Reads an XML document from an InputStream.
XMLDocument(Reader In, long LengthHint)
          Reads an XML document from a Reader.
XMLDocument(Reader In, long LengthHint, String DocumentName)
          Reads an XML document from a Reader.
XMLDocument(URL DocURL)
          Reads an XML document from a URL.
XMLDocument(URLConnection Conn)
          Reads an XML document from a URLConnection.
 
Method Summary
 XMLDocument clone()
           
 XMLDocument getLatest()
          If this XMLDocument read its markup from a file, and that file has changed since this immutable representation was created, this method returns a new object representing the current contents of the file.
 
Methods inherited from class edu.utexas.its.eis.tools.qwicap.template.xml.immutable.ImmutableMarkup
createMatch, getChangeCount, getCharacterSet, getCSSPatterns, getImmutable, getList, getMutable, toString, write
 
Methods inherited from class edu.utexas.its.eis.tools.qwicap.template.xml.Markup
checkHierarchy, enumerate, first, get, get, get, getCDATA, getComments, getDeclarations, getMarkupName, isEmpty, iterator, print, setMarkupName, size
 
Methods inherited from class java.lang.Object
equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

XMLDocument

public XMLDocument(File XDoc)
            throws TagException,
                   IOException
Reads an XML document from a file. The character set of the document read by this constructor is determined by either the XML declaration (for example: "<?xml version="1.0" encoding="UTF-8"?>") at the beginning of the document, or by a "Content-Type" meta tag (for example: "<meta http-equiv='Content-Type' content='text/html; charset=UTF-8'/>") within the "head" tag, assuming the document is XHTML. If both character set specifications are present, the one in the XML declaration takes precedence. To programmatically specify the document's character set, use the XMLDocument(File, String) constructor.

Parameters:
XDoc - The XML file that should be read.
Throws:
TagException - If there's a parsing error.
IOException - If there's an I/O error.
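
The precedence rule described for this constructor (the XML declaration wins over the "Content-Type" meta tag) can be sketched as follows. This is an illustration only, not Qwicap's actual detection code: the class DeclaredCharsetSketch, its regular expressions, and the declaredCharset method are assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration class; Qwicap's real detection logic may differ.
final class DeclaredCharsetSketch {
    // Matches encoding="..." inside an XML declaration at the very start of the text.
    private static final Pattern XML_DECLARATION =
        Pattern.compile("^<\\?xml[^>]*?encoding\\s*=\\s*[\"']([^\"']+)[\"']");

    // Matches charset=... inside a meta tag, as in a Content-Type meta tag.
    private static final Pattern META_TAG =
        Pattern.compile("<meta[^>]*charset=([A-Za-z0-9._:-]+)", Pattern.CASE_INSENSITIVE);

    /**
     * Returns the character set named by the markup, or null if none is named.
     * The XML declaration is consulted first, so it takes precedence.
     */
    static String declaredCharset(String markup) {
        Matcher declaration = XML_DECLARATION.matcher(markup);
        if (declaration.find()) {
            return declaration.group(1);
        }
        Matcher meta = META_TAG.matcher(markup);
        if (meta.find()) {
            return meta.group(1);
        }
        return null;
    }

    public static void main(String[] args) {
        String both = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<html><head><meta http-equiv='Content-Type' "
            + "content='text/html; charset=ISO-8859-1'/></head></html>";
        System.out.println(declaredCharset(both));
    }
}
```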

XMLDocument

public XMLDocument(File XDoc,
                   String Charset)
            throws TagException,
                   IOException
Reads an XML document from a file. The file is interpreted as though it is encoded in the specified character set, no matter what the document may, or may not, claim. If your document specifies its character set in its markup, and you prefer to have that specification honored, use the XMLDocument(File) constructor instead.

Parameters:
XDoc - The XML file that should be read.
Charset - The character set of the XML file, or null if the character set should be determined by examining the XML file.
Throws:
TagException - If there's a parsing error.
IOException - If there's an I/O error.

XMLDocument

public XMLDocument(URL DocURL)
            throws TagException,
                   IOException
Reads an XML document from a URL. The URL object's internal connection to the document is closed after the document has been read. This constructor does not check the MIME type of the document, so it never rejects a document because of a MIME type mismatch. If you pass arbitrary URLs to this constructor, however, be aware that feeding it a document and seeing whether it throws an exception is a needlessly expensive way to discover that the document does not contain XML. In that situation, check the MIME type first and use the XMLDocument(URLConnection) constructor if the type is appropriate (most XHTML documents arrive as type "text/html").

Parameters:
DocURL - The URL of the XML document to read.
Throws:
TagException - If there's a parsing error.
IOException - If there's an I/O error.
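
The advice to check the MIME type before parsing an arbitrary URL might look like the sketch below. The class MimeTypeCheckSketch and its looksLikeMarkup method are hypothetical, and the set of accepted types is an assumption chosen for illustration; only "text/html" is named by this documentation.

```java
import java.util.Locale;

// Hypothetical illustration class; the accepted MIME types are an assumption.
final class MimeTypeCheckSketch {
    /**
     * Returns true if the Content-Type value names a type that plausibly
     * carries XML or XHTML markup. Parameters such as "charset" are ignored.
     */
    static boolean looksLikeMarkup(String contentType) {
        if (contentType == null) {
            return false;
        }
        int semicolon = contentType.indexOf(';');
        String mimeType = (semicolon < 0 ? contentType : contentType.substring(0, semicolon))
            .trim()
            .toLowerCase(Locale.ROOT);
        return mimeType.equals("text/html")
            || mimeType.equals("text/xml")
            || mimeType.equals("application/xml")
            || mimeType.endsWith("+xml");
    }

    public static void main(String[] args) {
        System.out.println(looksLikeMarkup("text/html; charset=UTF-8"));
        System.out.println(looksLikeMarkup("image/png"));
        // With Qwicap available, the check would guard the constructor call:
        //   URLConnection conn = url.openConnection();
        //   if (looksLikeMarkup(conn.getContentType())) { new XMLDocument(conn); }
    }
}
```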

XMLDocument

public XMLDocument(URLConnection Conn)
            throws TagException,
                   IOException
Reads an XML document from a URLConnection. The URLConnection object's InputStream is closed after the document has been read. The character set of the document read by this constructor is, ideally, determined by the connection's content type (for example: "text/html; charset=UTF-8"), as returned by the URLConnection.getContentType() method. If the character set is not specified by the content type, then it is determined by the XML byte-order mark (BOM), by the XML declaration (for example: "<?xml version="1.0" encoding="UTF-8"?>") at the beginning of the document, or by a "Content-Type" meta tag (for example: "<meta http-equiv='Content-Type' content='text/html; charset=UTF-8'/>") within the "head" tag, assuming the document is XHTML. If both character set specifications are present, the one in the XML declaration takes precedence. To programmatically specify the document's character set, get the URLConnection instance's InputStream instance and wrap it in an instance of InputStreamReader(InputStream, String), then pass that Reader to the XMLDocument(Reader, long) constructor.

Parameters:
Conn - A URLConnection from which the XML document can be read.
Throws:
TagException - If there's a parsing error.
IOException - If there's an I/O error.
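
Extracting the character set from a content type such as "text/html; charset=UTF-8" can be sketched as below. This is not Qwicap's parsing code: the class ContentTypeCharsetSketch and its charsetOf method are hypothetical names, and the handling of quoted values is an assumption about what a tolerant parser would do.

```java
import java.util.Locale;

// Hypothetical illustration class; not Qwicap's implementation.
final class ContentTypeCharsetSketch {
    /**
     * Returns the value of the "charset" parameter in a Content-Type value,
     * or null if the parameter is absent or empty.
     */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        for (String part : contentType.split(";")) {
            String parameter = part.trim();
            if (parameter.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = parameter.substring("charset=".length()).trim();
                // Strip one pair of surrounding double quotes, if present.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=UTF-8"));
        // When this returns null, the documentation above says the BOM, the XML
        // declaration, or a Content-Type meta tag is consulted instead.
        System.out.println(charsetOf("text/html"));
    }
}
```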

XMLDocument

public XMLDocument(InputStream In,
                   long LengthHint)
            throws TagException,
                   IOException
Reads an XML document from an InputStream. The InputStream is not closed after the document has been read. The character set of the document read by this constructor is determined by either the XML declaration (for example: "<?xml version="1.0" encoding="UTF-8"?>") at the beginning of the document, or by a "Content-Type" meta tag (for example: "<meta http-equiv='Content-Type' content='text/html; charset=UTF-8'/>") within the "head" tag, assuming the document is XHTML. If both character set specifications are present, the one in the XML declaration takes precedence. To programmatically specify the document's character set, wrap the InputStream instance in an instance of InputStreamReader(InputStream, String) and pass that Reader to the XMLDocument(Reader, long) constructor.

Parameters:
In - An InputStream from which the XML document can be read.
LengthHint - The length of the XML document, or your best guess. If you can't even guess, pass zero or a negative value.
Throws:
TagException - If there's a parsing error.
IOException - If there's an I/O error.

XMLDocument

public XMLDocument(InputStream In,
                   long LengthHint,
                   String DocumentName)
            throws TagException,
                   IOException
Reads an XML document from an InputStream. The InputStream is not closed after the document has been read. The character set of the document read by this constructor is determined by either the XML declaration (for example: "<?xml version="1.0" encoding="UTF-8"?>") at the beginning of the document, or by a "Content-Type" meta tag (for example: "<meta http-equiv='Content-Type' content='text/html; charset=UTF-8'/>") within the "head" tag, assuming the document is XHTML. If both character set specifications are present, the one in the XML declaration takes precedence. To programmatically specify the document's character set, wrap the InputStream instance in an instance of InputStreamReader(InputStream, String) and pass that InputStreamReader to the XMLDocument(Reader, long) constructor.

Parameters:
In - An InputStream from which the XML document can be read.
LengthHint - The length of the XML document, or your best guess. If you can't even guess, pass zero or a negative value.
DocumentName - The name to be associated with this document, or null. If not null, this name is used to make error messages more informative, and is available from the Markup.getMarkupName() method.
Throws:
TagException - If there's a parsing error.
IOException - If there's an I/O error.
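
Because the InputStream constructors leave the stream open, closing it remains the caller's job. The sketch below shows one way to arrange that with try-with-resources. It uses only the JDK: the class CallerClosesStreamSketch, the TrackingStream wrapper, and the drain method (a stand-in for a consumer that reads to the end without closing) are all hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class; drain() stands in for a consumer that,
// like the constructors described above, does not close the stream.
final class CallerClosesStreamSketch {
    /** Wraps a stream and records whether close() has been called. */
    static final class TrackingStream extends FilterInputStream {
        boolean closed;

        TrackingStream(InputStream in) {
            super(in);
        }

        @Override
        public void close() throws IOException {
            closed = true;
            super.close();
        }
    }

    /** Reads the stream to its end and returns the byte count, without closing it. */
    static int drain(InputStream in) throws IOException {
        int total = 0;
        byte[] buffer = new byte[64];
        int count;
        while ((count = in.read(buffer)) != -1) {
            total += count;
        }
        return total;
    }

    /** Returns { closed right after reading, closed after the try block }. */
    static boolean[] demo() {
        byte[] bytes = "<a><b/></a>".getBytes(StandardCharsets.UTF_8);
        TrackingStream stream = new TrackingStream(new ByteArrayInputStream(bytes));
        boolean[] result = new boolean[2];
        try (TrackingStream in = stream) {
            // With Qwicap available: new XMLDocument(in, bytes.length, "example");
            drain(in);
            result[0] = in.closed; // still open: the consumer did not close it
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        result[1] = stream.closed; // closed by try-with-resources
        return result;
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println(result[0] + " " + result[1]);
    }
}
```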

XMLDocument

public XMLDocument(Reader In,
                   long LengthHint)
            throws TagException,
                   IOException
Reads an XML document from a Reader. The Reader is not closed after the document has been read. Note that when you create the Reader you should explicitly specify the document's character set (see InputStreamReader(InputStream, String)), in order to ensure consistent behavior across platforms.

Parameters:
In - A Reader from which the XML document can be read.
LengthHint - The number of characters in the XML document, or your best guess. If you can't even guess, pass zero or a negative value.
Throws:
TagException - If there's a parsing error.
IOException - If there's an I/O error.

XMLDocument

public XMLDocument(Reader In,
                   long LengthHint,
                   String DocumentName)
            throws TagException,
                   IOException
Reads an XML document from a Reader. The Reader is not closed after the document has been read. Note that when you create the Reader you should explicitly specify the document's character set (see InputStreamReader(InputStream, String)), in order to ensure consistent behavior across platforms.

Parameters:
In - A Reader from which the XML document can be read.
LengthHint - The number of characters in the XML document, or your best guess. If you can't even guess, pass zero or a negative value.
DocumentName - The name to be associated with this document, or null. If not null, this name is used to make error messages more informative, and is available from the Markup.getMarkupName() method.
Throws:
TagException - If there's a parsing error.
IOException - If there's an I/O error.

Method Detail

getLatest

public XMLDocument getLatest()
                      throws TagException,
                             IOException
If this XMLDocument read its markup from a file, and that file has changed since this immutable representation was created, this method returns a new object representing the current contents of the file. If the file has not been modified, this object is returned. If this XMLDocument read its markup from a URL, InputStream or Reader, this method does nothing and returns this object. In the case of URLs, checking for updates is presumed to be too expensive. In the case of InputStreams and Readers, this class has no way to determine the original source of the material in order to open a new InputStream or Reader and re-read it (and re-reading a document just to determine whether it has changed would not be efficient).

Returns:
The latest version of this XMLDocument.
Throws:
TagException - If there's a parsing error.
IOException - If there's an I/O error.
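
The behavior described for file-backed documents resembles a snapshot that is replaced only when the file changes. The sketch below illustrates that pattern with the JDK alone. It is not Qwicap's implementation: the class FileSnapshotSketch is hypothetical, and comparing last-modified timestamps is an assumption about how a change might be detected.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

// Hypothetical illustration class; the timestamp comparison is an assumption.
final class FileSnapshotSketch {
    private final File source;
    private final long lastModified;
    private final String text;

    FileSnapshotSketch(File source) {
        this.source = source;
        this.lastModified = source.lastModified();
        this.text = read(source);
    }

    private static String read(File file) {
        try {
            return new String(Files.readAllBytes(file.toPath()), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    String text() {
        return text;
    }

    /** Returns this snapshot if the file is unchanged, otherwise a fresh snapshot. */
    FileSnapshotSketch getLatest() {
        return source.lastModified() == lastModified ? this : new FileSnapshotSketch(source);
    }

    /** Returns { same object when unchanged, new object with new text when changed }. */
    static boolean[] demo() {
        try {
            File file = File.createTempFile("snapshot", ".xml");
            file.deleteOnExit();
            Files.write(file.toPath(), "<a/>".getBytes(StandardCharsets.UTF_8));
            FileSnapshotSketch first = new FileSnapshotSketch(file);
            boolean sameWhenUnchanged = first.getLatest() == first;

            Files.write(file.toPath(), "<b/>".getBytes(StandardCharsets.UTF_8));
            // Force a visibly different timestamp, since two quick writes can share one.
            file.setLastModified(first.lastModified + 10_000);
            FileSnapshotSketch second = first.getLatest();
            boolean newWhenChanged = second != first && second.text().equals("<b/>");
            return new boolean[] { sameWhenUnchanged, newWhenChanged };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] result = demo();
        System.out.println(result[0] + " " + result[1]);
    }
}
```

With XMLDocument itself, the corresponding usage would be to keep a reference to the document and reassign it from getLatest() before each use, so that unchanged files cost only a check rather than a re-parse.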

clone

public XMLDocument clone()
                  throws CloneNotSupportedException
Overrides:
clone in class ImmutableMarkup
Throws:
CloneNotSupportedException