Developers’ Guide
This is a rough table of contents for the upcoming Developers’ Guide. We’ll start filling in these sections soon.
Using OpenPipeline
Introduction
Installation
A Quick Walkthrough
Define a connector
Stages, schedule, and so on
Configure a docfilter
Now what?
The Built-in Connectors
Discuss each
The Built-in DocFilters
The output of each, standard format
Reference the tokenizer
Any special considerations
The Built-in Stages
Discuss each, what it does
Tokenizer
UIMA wrapper
The Scheduler
The jobs config file
Different scheduling options
Logging
Where the logs go
Replacing a logger. Log4j.
The Server
Start, stop, admin page
How to run in WebSphere, Tomcat, etc.
Configure openpipeline.home
Running OpenPipeline in batch mode
No need for an app server: just define your job and run this handy batch file to execute the job on demand
Extending OpenPipeline
General Architecture
Show the picture, discuss
Plugins, generally
How to build a plugin, how to register it
The core classes: Items and Annotations
What they are
Logic behind the design
How to use them, sample code
How to write your own connector
What each method does
Creating a multi-threaded connector
Dealing with deletes, etc.
Creating a pipeline
See “Why were stages implemented this way” in the Stages section
How to write your own docfilter
Discuss the sample, lexer, etc.
How to write your own stage
Extend, don’t implement
Calling processItem
Why were stages implemented this way
Stages must be thread-safe.
How to write a multi-threaded stage that has a shared resource
Disk file, JDBC connection, socket, etc.
Traversing, dealing with annotations
Annotations
Extending the annotation class
Make this page editable?