org.jasig.portal.serialize
Class BaseMarkupSerializer

java.lang.Object
  extended by org.jasig.portal.serialize.BaseMarkupSerializer
All Implemented Interfaces:
IAnchoringSerializer, DOMSerializer, Serializer, org.xml.sax.ContentHandler, org.xml.sax.DocumentHandler, org.xml.sax.DTDHandler, org.xml.sax.ext.DeclHandler, org.xml.sax.ext.LexicalHandler
Direct Known Subclasses:
HTMLSerializer, TextSerializer, XHTMLSerializer, XMLSerializer

public abstract class BaseMarkupSerializer
extends java.lang.Object
implements org.xml.sax.ContentHandler, org.xml.sax.DocumentHandler, org.xml.sax.ext.LexicalHandler, org.xml.sax.DTDHandler, org.xml.sax.ext.DeclHandler, DOMSerializer, Serializer, IAnchoringSerializer

Base class for a serializer supporting both DOM and SAX pretty serializing of XML/HTML/XHTML documents. Derived classes perform the method-specific serializing; this class provides the common serializing mechanisms.

Before it can be used, the serializer must be initialized with the proper writer and output format by calling init. The serializer can be reused any number of times, but cannot be used concurrently by two threads.

If an output stream is used, the encoding is taken from the output format (defaults to UTF-8). If a writer is used, make sure the writer uses the same encoding (if applicable) as specified in the output format.

The serializer supports both DOM and SAX. DOM serializing is done by calling serialize(Document) and SAX serializing is done by firing SAX events and using the serializer as a document handler. This also applies to derived classes.

If an I/O exception occurs while serializing, the serializer will not throw an exception directly, but will throw it only at the end of serializing (either DOM or SAX's DocumentHandler.endDocument()).
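The deferred-error behaviour described above can be sketched with plain JDK classes. This is only an illustration of the pattern, not the serializer's actual code: the class DeferredErrorWriter and its methods print, finish and hasFailed are hypothetical names introduced here.

```java
import java.io.IOException;
import java.io.Writer;

/** Hypothetical helper (not part of uPortal): remembers the first write failure. */
final class DeferredErrorWriter {
    private final Writer out;
    private IOException firstError; // null while no write has failed

    DeferredErrorWriter(Writer out) {
        this.out = out;
    }

    /** Writes text unless an earlier write already failed; never throws. */
    void print(String text) {
        if (firstError != null) {
            return; // output is already broken, so skip further writes
        }
        try {
            out.write(text);
        } catch (IOException e) {
            firstError = e; // keep it for finish()
        }
    }

    /** Flushes, then reports the stored failure, if any, in one place. */
    void finish() throws IOException {
        if (firstError == null) {
            try {
                out.flush();
            } catch (IOException e) {
                firstError = e;
            }
        }
        if (firstError != null) {
            throw firstError;
        }
    }

    boolean hasFailed() {
        return firstError != null;
    }
}
```

A caller prints freely and checks for failure once, at the point corresponding to the end of serializing. A SAX-facing method could wrap the stored IOException in a SAXException at that point.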

For elements that are not specified as whitespace preserving, the serializer will potentially break long text lines at space boundaries, indent lines, and serialize elements on separate lines. Line terminators will be regarded as spaces, and spaces at beginning of line will be stripped.
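As a rough illustration of this whitespace handling, the sketch below breaks text at space boundaries once a line would exceed a given width, treats line terminators as spaces, and drops spaces at the start of a line. The class LineBreakSketch and its wrap method are hypothetical. Unlike the serializer, this simplified version also collapses runs of spaces into one.

```java
/** Hypothetical sketch of breaking non-preserved text at space boundaries. */
final class LineBreakSketch {

    /** Re-flows text so that no line exceeds width, unless a single word is longer. */
    static String wrap(String text, int width) {
        StringBuilder out = new StringBuilder();
        int column = 0; // characters already printed on the current line
        // Splitting on any whitespace treats line terminators as spaces.
        for (String word : text.trim().split("\\s+")) {
            if (word.isEmpty()) {
                continue; // only happens for empty input
            }
            if (column > 0 && column + 1 + word.length() > width) {
                out.append('\n'); // break at the space boundary
                column = 0;       // the new line starts with no leading space
            } else if (column > 0) {
                out.append(' ');
                column++;
            }
            out.append(word);
            column += word.length();
        }
        return out.toString();
    }
}
```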

When indenting, the serializer is capable of detecting what appears to be element content, and serializing these elements indented on separate lines. An element is serialized indented when it is the first or last child of an element, or immediately following or preceding another element.

Version:
$Revision: 1.13.2.1 $ $Date: 2005/08/05 18:39:26 $
Author:
Assaf Arkin
See Also:
Serializer, DOMSerializer

Field Summary
private  boolean _allowDisableOutputEscaping
          A portal property indicating whether or not to allow the disabling of output escaping.
protected  java.lang.String _docTypePublicId
          The public identifier of the document type, if known.
protected  java.lang.String _docTypeSystemId
          The system identifier of the document type, if known.
private  int _elementStateCount
          The index of the next state to place in the array, or one plus the index of the current state.
private  ElementState[] _elementStates
          Holds array of all element states that have been entered.
private  EncodingInfo _encodingInfo
           
protected  OutputFormat _format
          The output format associated with this serializer.
protected  boolean _indenting
          True if the printer is an indenting printer.
private  java.io.OutputStream _output
          The output stream.
protected  java.util.Hashtable _prefixes
          Association between namespace URIs (keys) and prefixes (values).
private  boolean _prepared
          True if the serializer has been prepared.
private  java.util.Vector _preRoot
          Vector holding comments and PIs that occur outside the root element (before it, or even after it); see serializePreRoot().
protected  Printer _printer
          The printer used for printing text parts.
protected  boolean _started
          If the document has been started (header serialized), this flag is set to true so it's not started twice.
private  java.io.Writer _writer
          The underlying writer.
protected  java.lang.String anchorId
           
 
Constructor Summary
protected BaseMarkupSerializer(OutputFormat format)
          Protected constructor; can only be used by a derived class.
 
Method Summary
protected  java.lang.String appendAnchorIfNecessary(java.lang.String elementName, java.lang.String attributeName, java.lang.String attributeValue)
           
 org.xml.sax.ContentHandler asContentHandler()
          Return a ContentHandler interface into this serializer.
 org.xml.sax.DocumentHandler asDocumentHandler()
          Return a DocumentHandler interface into this serializer.
 DOMSerializer asDOMSerializer()
          Return a DOMSerializer interface into this serializer.
 void attributeDecl(java.lang.String eName, java.lang.String aName, java.lang.String type, java.lang.String valueDefault, java.lang.String value)
           
 void characters(char[] chars, int start, int length)
           
protected  void characters(java.lang.String text)
          Called to print the text contents in the prevailing element format.
 void comment(char[] chars, int start, int length)
           
 void comment(java.lang.String text)
           
protected  ElementState content()
          Must be called by a method about to print any type of content.
 void elementDecl(java.lang.String name, java.lang.String model)
           
 void endCDATA()
           
 void endDocument()
          Called at the end of the document to wrap it up.
 void endDTD()
           
 void endEntity(java.lang.String name)
           
 void endNonEscaping()
           
 void endPrefixMapping(java.lang.String prefix)
           
 void endPreserving()
           
protected  ElementState enterElementState(java.lang.String namespaceURI, java.lang.String localName, java.lang.String rawName, boolean preserveSpace)
          Enter a new element state for the specified element.
 void externalEntityDecl(java.lang.String name, java.lang.String publicId, java.lang.String systemId)
           
protected  ElementState getElementState()
          Return the state of the current element.
protected abstract  java.lang.String getEntityRef(int ch)
          Returns the suitable entity reference for this character value, or null if no such entity exists.
protected  java.lang.String getPrefix(java.lang.String namespaceURI)
          Returns the namespace prefix for the specified URI.
 void ignorableWhitespace(char[] chars, int start, int length)
           
 void internalEntityDecl(java.lang.String name, java.lang.String value)
           
protected  boolean isDocumentState()
          Returns true if in the state of the document.
protected  ElementState leaveElementState()
          Leave the current element state and return to the state of the parent element.
 void notationDecl(java.lang.String name, java.lang.String publicId, java.lang.String systemId)
           
protected  void prepare()
           
protected  void printDoctypeURL(java.lang.String url)
          Print a document type public or system identifier URL.
protected  void printEscaped(int ch)
           
protected  void printEscaped(java.lang.String source)
          Escapes a string so it may be printed as text content or attribute value.
protected  void printText(char[] chars, int start, int length, boolean preserveSpace, boolean unescaped)
          Called to print additional text with whitespace handling.
protected  void printText(java.lang.String text, boolean preserveSpace, boolean unescaped)
           
 void processingInstruction(java.lang.String target, java.lang.String code)
           
 void processingInstructionIO(java.lang.String target, java.lang.String code)
           
 boolean reset()
           
 void serialize(org.w3c.dom.Document doc)
          Serializes the DOM document using the previously specified writer and output format.
 void serialize(org.w3c.dom.DocumentFragment frag)
          Serializes the DOM document fragment using the previously specified writer and output format.
 void serialize(org.w3c.dom.Element elem)
          Serializes the DOM element using the previously specified writer and output format.
protected abstract  void serializeElement(org.w3c.dom.Element elem)
          Called to serialize the DOM element.
protected  void serializeNode(org.w3c.dom.Node node)
          Serialize the DOM node.
protected  void serializePreRoot()
          Comments and PIs cannot be serialized before the root element, because the root element serializes the document type, which generally comes first.
 void setDocumentLocator(org.xml.sax.Locator locator)
           
 void setOutputByteStream(java.io.OutputStream output)
          Specifies an output stream to which the document should be serialized.
 void setOutputCharStream(java.io.Writer writer)
          Specifies a writer to which the document should be serialized.
 void setOutputFormat(OutputFormat format)
          Specifies an output format for this serializer.
 void skippedEntity(java.lang.String name)
           
 void startAnchoring(java.lang.String anchorId)
          Signify that the serializer should begin to append the anchor ID to URLs of its choosing.
 void startCDATA()
           
 void startDocument()
           
 void startDTD(java.lang.String name, java.lang.String publicId, java.lang.String systemId)
           
 void startEntity(java.lang.String name)
           
 void startNonEscaping()
           
 void startPrefixMapping(java.lang.String prefix, java.lang.String uri)
           
 void startPreserving()
           
 void stopAnchoring()
          Signify that anchoring is no longer desired by the serializer.
 void unparsedEntityDecl(java.lang.String name, java.lang.String publicId, java.lang.String systemId, java.lang.String notationName)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface org.xml.sax.ContentHandler
endElement, startElement
 
Methods inherited from interface org.xml.sax.DocumentHandler
endElement, startElement
 

Field Detail

anchorId

protected java.lang.String anchorId

_encodingInfo

private EncodingInfo _encodingInfo

_elementStates

private ElementState[] _elementStates
Holds array of all element states that have been entered. The array is automatically resized. When leaving an element, its state is not removed but reused when later returning to the same nesting level.


_elementStateCount

private int _elementStateCount
The index of the next state to place in the array, or one plus the index of the current state. When zero, we are in no state.
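The interplay of the state array and its counter can be sketched as follows. This follows the field descriptions above rather than the actual implementation. StateStackSketch, its nested State class and the method names enter and leave are hypothetical stand-ins for ElementState, enterElementState and leaveElementState.

```java
import java.util.Arrays;

/** Hypothetical sketch of a growable state array whose slots are reused. */
final class StateStackSketch {

    /** Stand-in for ElementState; the real class is assumed to carry more fields. */
    static final class State {
        String rawName;
        boolean preserveSpace;
        boolean empty;
    }

    private State[] states = new State[2];
    private int count; // index of the next slot; zero means "no element entered"

    State enter(String rawName, boolean preserveSpace) {
        if (count == states.length) {
            states = Arrays.copyOf(states, states.length * 2); // resize automatically
        }
        if (states[count] == null) {
            states[count] = new State(); // allocated once per nesting level
        }
        State state = states[count++];
        state.rawName = rawName;
        state.preserveSpace = preserveSpace;
        state.empty = true; // a freshly entered element has no content yet
        return state;
    }

    /** Returns the parent's state, or null when back at document level. */
    State leave() {
        if (count == 0) {
            throw new IllegalStateException("no element to leave");
        }
        count--; // the slot stays in the array for later reuse
        return count == 0 ? null : states[count - 1];
    }

    boolean isDocumentState() {
        return count == 0;
    }
}
```

Keeping the slot after leaving an element avoids allocating a new state object for every sibling at the same depth.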


_preRoot

private java.util.Vector _preRoot
Vector holding comments and PIs that occur outside the root element (before it, or even after it); see serializePreRoot().


_started

protected boolean _started
If the document has been started (header serialized), this flag is set to true so it's not started twice.


_prepared

private boolean _prepared
True if the serializer has been prepared. This flag is set to false when the serializer is reset prior to using it, and to true after it has been prepared for usage.


_prefixes

protected java.util.Hashtable _prefixes
Association between namespace URIs (keys) and prefixes (values). Accumulated here prior to starting an element and placing this list in the element state.


_docTypePublicId

protected java.lang.String _docTypePublicId
The public identifier of the document type, if known.


_docTypeSystemId

protected java.lang.String _docTypeSystemId
The system identifier of the document type, if known.


_format

protected OutputFormat _format
The output format associated with this serializer. This will never be a null reference. If no format was passed to the constructor, the default one for this document type will be used. The format object is never changed by the serializer.


_printer

protected Printer _printer
The printer used for printing text parts.


_indenting

protected boolean _indenting
True if the printer is an indenting printer.


_writer

private java.io.Writer _writer
The underlying writer.


_output

private java.io.OutputStream _output
The output stream.


_allowDisableOutputEscaping

private boolean _allowDisableOutputEscaping
A portal property indicating whether or not to allow the disabling of output escaping. When allowed, XSLT stylesheets can request to disable output escaping, therefore enabling the direct pass-through of markup such as HTML.

Constructor Detail

BaseMarkupSerializer

protected BaseMarkupSerializer(OutputFormat format)
Protected constructor; can only be used by a derived class. The serializer must be initialized before serializing any document.

Method Detail

asDocumentHandler

public org.xml.sax.DocumentHandler asDocumentHandler()
                                              throws java.io.IOException
Description copied from interface: Serializer
Return a DocumentHandler interface into this serializer. If the serializer does not support the DocumentHandler interface, it should return null.

Specified by:
asDocumentHandler in interface Serializer
Throws:
java.io.IOException

asContentHandler

public org.xml.sax.ContentHandler asContentHandler()
                                            throws java.io.IOException
Description copied from interface: Serializer
Return a ContentHandler interface into this serializer. If the serializer does not support the ContentHandler interface, it should return null.

Specified by:
asContentHandler in interface Serializer
Throws:
java.io.IOException

asDOMSerializer

public DOMSerializer asDOMSerializer()
                              throws java.io.IOException
Description copied from interface: Serializer
Return a DOMSerializer interface into this serializer. If the serializer does not support the DOMSerializer interface, it should return null.

Specified by:
asDOMSerializer in interface Serializer
Throws:
java.io.IOException

setOutputByteStream

public void setOutputByteStream(java.io.OutputStream output)
Description copied from interface: Serializer
Specifies an output stream to which the document should be serialized. This method should not be called while the serializer is in the process of serializing a document.

Specified by:
setOutputByteStream in interface Serializer

setOutputCharStream

public void setOutputCharStream(java.io.Writer writer)
Description copied from interface: Serializer
Specifies a writer to which the document should be serialized. This method should not be called while the serializer is in the process of serializing a document.

Specified by:
setOutputCharStream in interface Serializer

setOutputFormat

public void setOutputFormat(OutputFormat format)
Description copied from interface: Serializer
Specifies an output format for this serializer. If the serializer has already been associated with an output format, it will switch to the new format. This method should not be called while the serializer is in the process of serializing a document.

Specified by:
setOutputFormat in interface Serializer
Parameters:
format - The output format to use

reset

public boolean reset()

prepare

protected void prepare()
                throws java.io.IOException
Throws:
java.io.IOException

serialize

public void serialize(org.w3c.dom.Element elem)
               throws java.io.IOException
Serializes the DOM element using the previously specified writer and output format. Throws an exception only if an I/O exception occurred while serializing.

Specified by:
serialize in interface DOMSerializer
Parameters:
elem - The element to serialize
Throws:
java.io.IOException - An I/O exception occurred while serializing

serialize

public void serialize(org.w3c.dom.DocumentFragment frag)
               throws java.io.IOException
Serializes the DOM document fragment using the previously specified writer and output format. Throws an exception only if an I/O exception occurred while serializing.

Specified by:
serialize in interface DOMSerializer
Parameters:
frag - the document fragment to serialize
Throws:
java.io.IOException - An I/O exception occurred while serializing

serialize

public void serialize(org.w3c.dom.Document doc)
               throws java.io.IOException
Serializes the DOM document using the previously specified writer and output format. Throws an exception only if an I/O exception occurred while serializing.

Specified by:
serialize in interface DOMSerializer
Parameters:
doc - The document to serialize
Throws:
java.io.IOException - An I/O exception occurred while serializing

startDocument

public void startDocument()
                   throws org.xml.sax.SAXException
Specified by:
startDocument in interface org.xml.sax.ContentHandler
Specified by:
startDocument in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException

characters

public void characters(char[] chars,
                       int start,
                       int length)
                throws org.xml.sax.SAXException
Specified by:
characters in interface org.xml.sax.ContentHandler
Specified by:
characters in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException

ignorableWhitespace

public void ignorableWhitespace(char[] chars,
                                int start,
                                int length)
                         throws org.xml.sax.SAXException
Specified by:
ignorableWhitespace in interface org.xml.sax.ContentHandler
Specified by:
ignorableWhitespace in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException

processingInstruction

public final void processingInstruction(java.lang.String target,
                                        java.lang.String code)
                                 throws org.xml.sax.SAXException
Specified by:
processingInstruction in interface org.xml.sax.ContentHandler
Specified by:
processingInstruction in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException

processingInstructionIO

public void processingInstructionIO(java.lang.String target,
                                    java.lang.String code)
                             throws java.io.IOException
Throws:
java.io.IOException

comment

public void comment(char[] chars,
                    int start,
                    int length)
             throws org.xml.sax.SAXException
Specified by:
comment in interface org.xml.sax.ext.LexicalHandler
Throws:
org.xml.sax.SAXException

comment

public void comment(java.lang.String text)
             throws java.io.IOException
Throws:
java.io.IOException

startCDATA

public void startCDATA()
Specified by:
startCDATA in interface org.xml.sax.ext.LexicalHandler

endCDATA

public void endCDATA()
Specified by:
endCDATA in interface org.xml.sax.ext.LexicalHandler

startNonEscaping

public void startNonEscaping()

endNonEscaping

public void endNonEscaping()

startPreserving

public void startPreserving()

endPreserving

public void endPreserving()

endDocument

public void endDocument()
                 throws org.xml.sax.SAXException
Called at the end of the document to wrap it up. Will flush the output stream and throw an exception if any I/O error occurred while serializing.

Specified by:
endDocument in interface org.xml.sax.ContentHandler
Specified by:
endDocument in interface org.xml.sax.DocumentHandler
Throws:
org.xml.sax.SAXException - An I/O exception occurred during serializing

startEntity

public void startEntity(java.lang.String name)
Specified by:
startEntity in interface org.xml.sax.ext.LexicalHandler

endEntity

public void endEntity(java.lang.String name)
Specified by:
endEntity in interface org.xml.sax.ext.LexicalHandler

setDocumentLocator

public void setDocumentLocator(org.xml.sax.Locator locator)
Specified by:
setDocumentLocator in interface org.xml.sax.ContentHandler
Specified by:
setDocumentLocator in interface org.xml.sax.DocumentHandler

skippedEntity

public void skippedEntity(java.lang.String name)
                   throws org.xml.sax.SAXException
Specified by:
skippedEntity in interface org.xml.sax.ContentHandler
Throws:
org.xml.sax.SAXException

startPrefixMapping

public void startPrefixMapping(java.lang.String prefix,
                               java.lang.String uri)
                        throws org.xml.sax.SAXException
Specified by:
startPrefixMapping in interface org.xml.sax.ContentHandler
Throws:
org.xml.sax.SAXException

endPrefixMapping

public void endPrefixMapping(java.lang.String prefix)
                      throws org.xml.sax.SAXException
Specified by:
endPrefixMapping in interface org.xml.sax.ContentHandler
Throws:
org.xml.sax.SAXException

startDTD

public final void startDTD(java.lang.String name,
                           java.lang.String publicId,
                           java.lang.String systemId)
                    throws org.xml.sax.SAXException
Specified by:
startDTD in interface org.xml.sax.ext.LexicalHandler
Throws:
org.xml.sax.SAXException

endDTD

public void endDTD()
Specified by:
endDTD in interface org.xml.sax.ext.LexicalHandler

elementDecl

public void elementDecl(java.lang.String name,
                        java.lang.String model)
                 throws org.xml.sax.SAXException
Specified by:
elementDecl in interface org.xml.sax.ext.DeclHandler
Throws:
org.xml.sax.SAXException

attributeDecl

public void attributeDecl(java.lang.String eName,
                          java.lang.String aName,
                          java.lang.String type,
                          java.lang.String valueDefault,
                          java.lang.String value)
                   throws org.xml.sax.SAXException
Specified by:
attributeDecl in interface org.xml.sax.ext.DeclHandler
Throws:
org.xml.sax.SAXException

internalEntityDecl

public void internalEntityDecl(java.lang.String name,
                               java.lang.String value)
                        throws org.xml.sax.SAXException
Specified by:
internalEntityDecl in interface org.xml.sax.ext.DeclHandler
Throws:
org.xml.sax.SAXException

externalEntityDecl

public void externalEntityDecl(java.lang.String name,
                               java.lang.String publicId,
                               java.lang.String systemId)
                        throws org.xml.sax.SAXException
Specified by:
externalEntityDecl in interface org.xml.sax.ext.DeclHandler
Throws:
org.xml.sax.SAXException

unparsedEntityDecl

public void unparsedEntityDecl(java.lang.String name,
                               java.lang.String publicId,
                               java.lang.String systemId,
                               java.lang.String notationName)
                        throws org.xml.sax.SAXException
Specified by:
unparsedEntityDecl in interface org.xml.sax.DTDHandler
Throws:
org.xml.sax.SAXException

notationDecl

public void notationDecl(java.lang.String name,
                         java.lang.String publicId,
                         java.lang.String systemId)
                  throws org.xml.sax.SAXException
Specified by:
notationDecl in interface org.xml.sax.DTDHandler
Throws:
org.xml.sax.SAXException

serializeNode

protected void serializeNode(org.w3c.dom.Node node)
                      throws java.io.IOException
Serialize the DOM node. This method is shared across XML, HTML and XHTML serializers and the differences are masked out in a separate serializeElement(org.w3c.dom.Element).

Parameters:
node - The node to serialize
Throws:
java.io.IOException - An I/O exception occurred while serializing
See Also:
serializeElement(org.w3c.dom.Element)

content

protected ElementState content()
                        throws java.io.IOException
Must be called by a method about to print any type of content. If the element was just opened, the opening tag is closed and will be matched to a closing tag. Returns the current element state with empty and afterElement set to false.

Returns:
The current element state
Throws:
java.io.IOException - An I/O exception occurred while serializing
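One way to picture the role of content() is a start tag that is left open until something is printed inside the element. The sketch below is an assumption-laden illustration, not the serializer's code: LazyTagSketch and its methods are hypothetical, attributes are omitted, and text is written without escaping.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: a start tag is closed only when content arrives. */
final class LazyTagSketch {
    private final StringBuilder out = new StringBuilder();
    private final Deque<String> open = new ArrayDeque<>();
    private boolean tagPending; // true while the innermost start tag lacks its '>'

    void startElement(String name) {
        content(); // a child element counts as content of its parent
        out.append('<').append(name);
        open.push(name);
        tagPending = true;
    }

    /** Call before printing any content: closes the pending start tag once. */
    void content() {
        if (tagPending) {
            out.append('>');
            tagPending = false;
        }
    }

    /** Prints text as-is; escaping is deliberately left out of this sketch. */
    void text(String text) {
        content();
        out.append(text);
    }

    void endElement() {
        String name = open.pop();
        if (tagPending) {
            out.append("/>"); // nothing was printed inside: use the empty-element form
            tagPending = false;
        } else {
            out.append("</").append(name).append('>');
        }
    }

    String result() {
        return out.toString();
    }
}
```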

characters

protected void characters(java.lang.String text)
                   throws java.io.IOException
Called to print the text contents in the prevailing element format. Since this method is capable of printing text as CDATA, it is used for that purpose as well. White space handling is determined by the current element state. In addition, the output format can dictate whether the text is printed as CDATA or unescaped.

Parameters:
text - The text to print
Throws:
java.io.IOException - An I/O exception occurred while serializing

getEntityRef

protected abstract java.lang.String getEntityRef(int ch)
Returns the suitable entity reference for this character value, or null if no such entity exists. Calling this method with '&' will return "&amp;".

Parameters:
ch - Character value
Returns:
Character entity name, or null

serializeElement

protected abstract void serializeElement(org.w3c.dom.Element elem)
                                  throws java.io.IOException
Called to serialize the DOM element. The element is serialized based on the serializer's method (XML, HTML, XHTML).

Parameters:
elem - The element to serialize
Throws:
java.io.IOException - An I/O exception occured while serializing

serializePreRoot

protected void serializePreRoot()
                         throws java.io.IOException
Comments and PIs cannot be serialized before the root element, because the root element serializes the document type, which generally comes first. Instead, such PIs and comments are accumulated inside a vector and serialized by calling this method. Will be called when the root element is serialized and when the document has finished serializing.

Throws:
java.io.IOException - An I/O exception occurred while serializing
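The buffering described for serializePreRoot() might look roughly like the sketch below. It is a simplified illustration under assumptions: PreRootSketch and all of its methods are hypothetical, only comments are handled, and the document type is reduced to a SYSTEM identifier.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: comments outside the root are held back until it is safe to print them. */
final class PreRootSketch {
    private final StringBuilder out = new StringBuilder();
    private final List<String> preRoot = new ArrayList<>(); // buffered markup
    private boolean inRoot;

    void comment(String text) {
        String markup = "<!--" + text + "-->";
        if (inRoot) {
            out.append(markup);  // inside the root: print immediately
        } else {
            preRoot.add(markup); // outside the root: defer
        }
    }

    void startRoot(String name, String systemId) {
        // The document type goes first, then whatever was buffered before the root.
        out.append("<!DOCTYPE ").append(name)
           .append(" SYSTEM \"").append(systemId).append("\">\n");
        flushPreRoot();
        out.append('<').append(name).append('>');
        inRoot = true;
    }

    void endRoot(String name) {
        out.append("</").append(name).append(">\n");
        inRoot = false;
    }

    /** Flushes items buffered after the root and returns the whole output. */
    String endDocument() {
        flushPreRoot();
        return out.toString();
    }

    private void flushPreRoot() {
        for (String markup : preRoot) {
            out.append(markup).append('\n');
        }
        preRoot.clear();
    }
}
```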

printText

protected final void printText(char[] chars,
                               int start,
                               int length,
                               boolean preserveSpace,
                               boolean unescaped)
                        throws java.io.IOException
Called to print additional text with whitespace handling. If spaces are preserved, the text is printed as if by calling printText(String, boolean, boolean) with a call to breakLine() for each new line. If spaces are not preserved, the text is broken at space boundaries if longer than the line width; multiple spaces are printed as such, but spaces at the beginning of a line are removed.

Parameters:
preserveSpace - Space preserving flag
unescaped - Print unescaped
Throws:
java.io.IOException

printText

protected final void printText(java.lang.String text,
                               boolean preserveSpace,
                               boolean unescaped)
                        throws java.io.IOException
Throws:
java.io.IOException

printDoctypeURL

protected void printDoctypeURL(java.lang.String url)
                        throws java.io.IOException
Print a document type public or system identifier URL. Encapsulates the URL in double quotes, escapes non-printing characters and prints it in a manner equivalent to printText(char[], int, int, boolean, boolean).

Parameters:
url - The document type url to print
Throws:
java.io.IOException

printEscaped

protected void printEscaped(int ch)
                     throws java.io.IOException
Throws:
java.io.IOException

printEscaped

protected void printEscaped(java.lang.String source)
                     throws java.io.IOException
Escapes a string so it may be printed as text content or attribute value. Non-printable characters are escaped using character references. Where the format specifies a default entity reference, that reference is used (e.g. &lt;).

Parameters:
source - The string to escape
Throws:
java.io.IOException
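A minimal sketch of this kind of escaping, using only the JDK, is given below. EscapeSketch, entityName and escape are hypothetical names. The sketch assumes a fixed table of four XML entities and treats everything outside printable ASCII (apart from tab, line feed and carriage return) as needing a character reference. The real serializer is expected to consult its output format and encoding instead.

```java
/** Hypothetical sketch of entity and character-reference escaping. */
final class EscapeSketch {

    /** Returns a bare entity name for the character, or null if none applies. */
    static String entityName(int ch) {
        switch (ch) {
            case '<': return "lt";
            case '>': return "gt";
            case '&': return "amp";
            case '"': return "quot";
            default:  return null;
        }
    }

    static String escape(String source) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < source.length(); ) {
            int ch = source.codePointAt(i);
            i += Character.charCount(ch); // step over surrogate pairs as one character
            String name = entityName(ch);
            if (name != null) {
                out.append('&').append(name).append(';');
            } else if ((ch >= ' ' && ch < 0x7F) || ch == '\n' || ch == '\r' || ch == '\t') {
                out.append((char) ch); // printable ASCII and common whitespace pass through
            } else {
                // Assumption: anything else is written as a hexadecimal character reference.
                out.append("&#x").append(Integer.toHexString(ch)).append(';');
            }
        }
        return out.toString();
    }
}
```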

getElementState

protected ElementState getElementState()
Return the state of the current element.

Returns:
Current element state

enterElementState

protected ElementState enterElementState(java.lang.String namespaceURI,
                                         java.lang.String localName,
                                         java.lang.String rawName,
                                         boolean preserveSpace)
Enter a new element state for the specified element. The tag name and space-preserving flag are specified; the element state is initially empty.

Returns:
Current element state, or null

leaveElementState

protected ElementState leaveElementState()
Leave the current element state and return to the state of the parent element. If this was the root element, return to the state of the document.

Returns:
Previous element state

isDocumentState

protected boolean isDocumentState()
Returns true if in the state of the document. Returns true before entering any element and after leaving the root element.

Returns:
True if in the state of the document

getPrefix

protected java.lang.String getPrefix(java.lang.String namespaceURI)
Returns the namespace prefix for the specified URI. If the URI has been mapped to a prefix, returns the prefix, otherwise returns null.

Parameters:
namespaceURI - The namespace URI
Returns:
The namespace prefix if known, or null
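The lookup can be pictured as a search through URI-to-prefix tables, starting with mappings accumulated for the next element and then walking outward through enclosing elements. The sketch below is an assumption about how such scoping could work, not the class's implementation. PrefixScopeSketch, enterElement and leaveElement are hypothetical, and HashMap is used where the class itself declares a Hashtable.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of scoped namespace URI to prefix lookup. */
final class PrefixScopeSketch {
    // URI -> prefix mappings declared since the last element was entered.
    private Map<String, String> pending = new HashMap<>();
    // One table per open element, innermost first.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    void startPrefixMapping(String prefix, String uri) {
        pending.put(uri, prefix);
    }

    /** Moves the accumulated mappings into the scope of the element being entered. */
    void enterElement() {
        scopes.push(pending);
        pending = new HashMap<>();
    }

    void leaveElement() {
        scopes.pop();
    }

    /** Returns the prefix mapped to the URI, or null if the URI is unknown. */
    String getPrefix(String namespaceURI) {
        String prefix = pending.get(namespaceURI);
        if (prefix != null) {
            return prefix;
        }
        for (Map<String, String> scope : scopes) { // iterates innermost scope first
            prefix = scope.get(namespaceURI);
            if (prefix != null) {
                return prefix;
            }
        }
        return null;
    }
}
```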

startAnchoring

public void startAnchoring(java.lang.String anchorId)
Description copied from interface: IAnchoringSerializer
Signify that the serializer should begin to append the anchor ID to URLs of its choosing.

Specified by:
startAnchoring in interface IAnchoringSerializer
Parameters:
anchorId - the anchor identifier

stopAnchoring

public void stopAnchoring()
Description copied from interface: IAnchoringSerializer
Signify that anchoring is no longer desired by the serializer.

Specified by:
stopAnchoring in interface IAnchoringSerializer

appendAnchorIfNecessary

protected java.lang.String appendAnchorIfNecessary(java.lang.String elementName,
                                                   java.lang.String attributeName,
                                                   java.lang.String attributeValue)