org.openpipeline.pipeline.docfilter
Class PlainTextFilter

java.lang.Object
  extended by org.openpipeline.pipeline.docfilter.PlainTextFilter
All Implemented Interfaces:
DocFilter

public class PlainTextFilter
extends Object
implements DocFilter

A filter for plain text files. The entire file is loaded into an attribute with an attributeId of "text".


Constructor Summary
PlainTextFilter()
           
 
Method Summary
 String getDescription()
          Return a description of this filter suitable for display in the admin interface.
 String getErrorMessage()
          Return any error message that occurs during the parse process.
 Throwable getException()
          Return any exception that occurred during parsing.
 String[] getExtensions()
          Return an array of file extensions that this filter can handle.
 List getLinks()
          Returns a List of any links found in the document.
 String[] getMimeTypes()
          Return an array of mimetypes that this filter can handle.
 String getName()
          Return the name of this filter.
 boolean getNextItem(Item item)
          Reads data from the input, parses it, and adds it to the specified item.
 boolean hasError()
          Returns true if the last call to getNextItem() generated an error.
 void setEncoding(String encoding)
          Set the encoding of the data in the input stream.
 void setExtensions(String[] exts)
          Set the extensions that this filter handles.
 void setInputStream(InputStream inStream)
          Set the input stream which contains the document to be added.
 void setMimeTypes(String[] mimeTypes)
          Set the mimetypes that this filter handles.
 
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

PlainTextFilter

public PlainTextFilter()
Method Detail

setInputStream

public void setInputStream(InputStream inStream)
Description copied from interface: DocFilter
Set the input stream which contains the document to be added.

Specified by:
setInputStream in interface DocFilter

setEncoding

public void setEncoding(String encoding)
Description copied from interface: DocFilter
Set the encoding of the data in the input stream. Optional; may apply in cases where the input is plain text or HTML, but will not apply in cases where the document specifies its own encoding.

Specified by:
setEncoding in interface DocFilter
Parameters:
encoding - an encoding string, for example, "UTF-8" or "ISO-8859-1". Must be one supported by the JVM.
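
Because the encoding must be one the JVM supports, a caller may want to validate the name before passing it to setEncoding. The following is a minimal sketch using the standard java.nio.charset.Charset API. The class and method names (EncodingCheck, isSupportedEncoding) are illustrative assumptions and are not part of OpenPipeline.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper, not part of OpenPipeline.
class EncodingCheck {

    /** Returns true if this JVM can decode the named encoding. */
    static boolean isSupportedEncoding(String encoding) {
        if (encoding == null) {
            return false; // Charset.isSupported rejects null
        }
        try {
            return Charset.isSupported(encoding);
        } catch (IllegalCharsetNameException e) {
            // The name contains characters that are not legal in a charset name.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isSupportedEncoding("UTF-8"));
        System.out.println(isSupportedEncoding("ISO-8859-1"));
        System.out.println(isSupportedEncoding("no-such-charset"));
    }
}
```

A caller could run this check first and fall back to a default such as "UTF-8" when it returns false.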

getNextItem

public boolean getNextItem(Item item)
Description copied from interface: DocFilter
Reads data from the input, parses it, and adds it to the specified item. This method can be called repeatedly until all items in the stream have been exhausted. For normal documents (like HTML or PDF), there will be only one item. This interface can handle streams of items, though, of the kind seen in multi-item XML files or zip files.

Specified by:
getNextItem in interface DocFilter
Returns:
true if there was data in the input stream, false if the input stream was at the end. Also returns true if there was data but it generated an error; check hasError() in that case.
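
The calling pattern implied by this contract can be sketched as a loop that checks for an error after each call. To stay self-contained, the sketch below does not use the real OpenPipeline classes: SketchItem and SketchTextFilter are hypothetical stand-ins that mirror only the method signatures listed on this page, and the way the stand-in stores the "text" attribute is an assumption, not the library's implementation. It requires Java 9 or later for InputStream.readAllBytes.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

class FilterLoopSketch {

    /** Runs the set-input, set-encoding, getNextItem loop and returns the collected text. */
    static String extractText(byte[] data, String encoding) {
        SketchTextFilter filter = new SketchTextFilter();
        filter.setInputStream(new ByteArrayInputStream(data));
        filter.setEncoding(encoding);

        StringBuilder out = new StringBuilder();
        SketchItem item = new SketchItem();
        while (filter.getNextItem(item)) {
            // true can also mean "there was data, but it generated an error".
            if (filter.hasError()) {
                throw new IllegalStateException(filter.getErrorMessage(), filter.getException());
            }
            out.append(item.attributes.get("text"));
            item = new SketchItem(); // fresh item for the next call
        }
        return out.toString();
    }

    public static void main(String[] args) {
        byte[] data = "hello, filter".getBytes(StandardCharsets.UTF_8);
        System.out.println(extractText(data, "UTF-8"));
    }
}

/** Hypothetical stand-in for Item: a map of attribute ids to values. */
class SketchItem {
    final Map<String, String> attributes = new HashMap<>();
}

/** Hypothetical stand-in that mirrors a few DocFilter method signatures. */
class SketchTextFilter {
    private InputStream in;
    private String encoding = "UTF-8";
    private Throwable exception;
    private boolean consumed;

    void setInputStream(InputStream inStream) {
        in = inStream;
        consumed = false;
        exception = null;
    }

    void setEncoding(String encoding) {
        this.encoding = encoding;
    }

    boolean getNextItem(SketchItem item) {
        exception = null;
        if (consumed) {
            return false; // the single item was already returned
        }
        consumed = true;
        try {
            byte[] bytes = in.readAllBytes();
            if (bytes.length == 0) {
                return false; // the stream held no data
            }
            // The whole input becomes one attribute with the id "text".
            item.attributes.put("text", new String(bytes, encoding));
        } catch (IOException e) {
            // Covers read failures and unsupported encodings; reported via hasError().
            exception = e;
        }
        return true;
    }

    boolean hasError() {
        return exception != null;
    }

    String getErrorMessage() {
        return exception == null ? null : exception.toString();
    }

    Throwable getException() {
        return exception;
    }
}
```

With the real classes, the same loop shape would apply: call getNextItem until it returns false, and consult hasError after every call that returns true.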

getLinks

public List getLinks()
Description copied from interface: DocFilter
Returns a List of any links found in the document. In HTML, for example, links are found in <a href=""> tags. The list contains Link objects.

Specified by:
getLinks in interface DocFilter
Returns:
a list of links. If none, returns an empty list.

hasError

public boolean hasError()
Description copied from interface: DocFilter
Returns true if the last call to getNextItem() generated an error. Check getErrorMessage() and getException() for more details.

Specified by:
hasError in interface DocFilter
Returns:
true if there was an error

getErrorMessage

public String getErrorMessage()
Description copied from interface: DocFilter
Return any error message that occurs during the parse process. If hasError() returns true, then this method should return the text of the error.

Specified by:
getErrorMessage in interface DocFilter
Returns:
the error message, or null if the parse was successful

getException

public Throwable getException()
Description copied from interface: DocFilter
Return any exception that occurred during parsing.

Specified by:
getException in interface DocFilter
Returns:
the exception

getName

public String getName()
Description copied from interface: DocFilter
Return the name of this filter.

Specified by:
getName in interface DocFilter
Returns:
a name

getDescription

public String getDescription()
Description copied from interface: DocFilter
Return a description of this filter suitable for display in the admin interface.

Specified by:
getDescription in interface DocFilter
Returns:
a String description

getExtensions

public String[] getExtensions()
Description copied from interface: DocFilter
Return an array of file extensions that this filter can handle. The extensions should be lower case and omit the dot, for example,

{"htm", "html", "jsp", "asp"}

Specified by:
getExtensions in interface DocFilter
Returns:
an array of extensions
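
The convention described above (lower case, no leading dot) can be enforced before extensions are handed to a filter. This is a minimal sketch; ExtensionNormalizer and normalize are hypothetical names introduced for illustration and are not part of OpenPipeline.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper, not part of OpenPipeline.
class ExtensionNormalizer {

    /** Returns a copy with each extension trimmed, lower-cased, and stripped of one leading dot. */
    static String[] normalize(String[] exts) {
        String[] out = new String[exts.length];
        for (int i = 0; i < exts.length; i++) {
            // Locale.ROOT avoids locale-specific case mappings.
            String e = exts[i].trim().toLowerCase(Locale.ROOT);
            out[i] = e.startsWith(".") ? e.substring(1) : e;
        }
        return out;
    }

    public static void main(String[] args) {
        String[] raw = {".HTM", "Html", "jsp", ".ASP"};
        // prints [htm, html, jsp, asp]
        System.out.println(Arrays.toString(normalize(raw)));
    }
}
```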

getMimeTypes

public String[] getMimeTypes()
Description copied from interface: DocFilter
Return an array of mimetypes that this filter can handle. For example,

{"text/html", "text/plain"}

Other common mimetypes include application/pdf, application/msword, application/vnd.ms-excel, etc.

Specified by:
getMimeTypes in interface DocFilter
Returns:
an array of mimetypes
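
One way a caller might test whether a document's content type is among the mimetypes a filter reports is sketched below. It strips any parameters such as "; charset=utf-8" and compares case-insensitively, since media type names are not case sensitive. MimeTypeMatch and handles are hypothetical names, and nothing here is claimed to reflect how OpenPipeline itself selects a filter.

```java
// Hypothetical helper, not part of OpenPipeline.
class MimeTypeMatch {

    /** Returns true if contentType, ignoring parameters and case, appears in mimeTypes. */
    static boolean handles(String[] mimeTypes, String contentType) {
        if (contentType == null) {
            return false;
        }
        // Drop parameters: "text/plain; charset=utf-8" becomes "text/plain".
        int semi = contentType.indexOf(';');
        String base = (semi >= 0 ? contentType.substring(0, semi) : contentType).trim();
        for (String m : mimeTypes) {
            if (m.equalsIgnoreCase(base)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] supported = {"text/html", "text/plain"};
        System.out.println(handles(supported, "text/plain; charset=utf-8"));
        System.out.println(handles(supported, "application/pdf"));
    }
}
```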

setExtensions

public void setExtensions(String[] exts)
Description copied from interface: DocFilter
Set the extensions that this filter handles. This value doesn't usually change the behavior of this class; it's just stored here and used for display purposes.

Specified by:
setExtensions in interface DocFilter
Parameters:
exts - extensions this class should handle

setMimeTypes

public void setMimeTypes(String[] mimeTypes)
Description copied from interface: DocFilter
Set the mimetypes that this filter handles. This value doesn't usually change the behavior of this class; it's just stored here and used for display purposes.

Specified by:
setMimeTypes in interface DocFilter
Parameters:
mimeTypes - mimetypes this class should handle