FAQ

    Where’s the FAQ?

    It’s right here. Ask your own questions and answer those you know the answer to.

    How can I use Openpipeline in an existing system with its own pipeline?

    The ItemSender and ItemReceiver interfaces allow an API-like approach to interacting with Openpipeline. In a typical setup, a connector acts as an ItemReceiver: it receives objects from elsewhere and pushes them down the pipeline. The processed data can later be returned by an ItemSender stage.
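
    The receive-process-return round trip described above can be sketched roughly as follows. This is only an illustration: SketchReceiver, SketchSender, and connect are hypothetical names invented for this example, and they are not the real ItemReceiver and ItemSender signatures, which should be taken from the Openpipeline sources.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical illustration only; these types do NOT mirror the real
// ItemReceiver / ItemSender interfaces in Openpipeline.
class EmbeddingSketch {

    // Receiving side: the host system hands items in here.
    interface SketchReceiver {
        void receive(String item);
    }

    // Sending side: processed items are handed back to the host system here.
    interface SketchSender {
        void send(String processedItem);
    }

    // Wires a receiver to a processing step (standing in for the pipeline)
    // and a sender that returns the result.
    static SketchReceiver connect(UnaryOperator<String> processing, SketchSender sender) {
        return item -> sender.send(processing.apply(item));
    }

    // The host system pushes two items in and collects what comes back.
    static List<String> demo() {
        List<String> returned = new ArrayList<>();
        SketchReceiver entry = connect(text -> text.toUpperCase(), text -> returned.add(text));
        entry.receive("invoice-1");
        entry.receive("invoice-2");
        return returned;
    }
}
```

    The host system only ever sees the two small interfaces, which is what makes it possible to embed the pipeline in a system that already has its own.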

    How can I configure the application to be multithreaded as mentioned in the introductory presentation?

    In the current revision of Openpipeline, the connector has full control over how the pipeline is executed. To process items concurrently, implement multithreaded execution of the stages in the connector.
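
    One way a connector could do this is with a fixed thread pool from java.util.concurrent. The sketch below is a minimal illustration, assuming that items are plain strings and that stages are thread-safe functions. ThreadedConnectorSketch and runAll are hypothetical names, not part of Openpipeline.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.UnaryOperator;

// Hypothetical stand-in for a connector; not the Openpipeline API.
class ThreadedConnectorSketch {

    // Submits one task per item; each task runs the item through all stages in order.
    // Assumes every stage is safe to call from several threads at once.
    static List<String> runAll(List<String> items, List<UnaryOperator<String>> stages, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<String>> pending = new ArrayList<>();
            for (String item : items) {
                pending.add(pool.submit(() -> {
                    String current = item;
                    for (UnaryOperator<String> stage : stages) {
                        current = stage.apply(current);
                    }
                    return current;
                }));
            }
            // Collect results in submission order, so output order matches input order.
            List<String> results = new ArrayList<>();
            for (Future<String> future : pending) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for pipeline tasks", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a stage failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    // Example run: three items, two stages, two worker threads.
    static List<String> demo() {
        List<UnaryOperator<String>> stages = new ArrayList<>();
        stages.add(text -> text.trim());
        stages.add(text -> text.toUpperCase());
        return runAll(List.of(" alpha ", "beta", " gamma"), stages, 2);
    }
}
```

    Parallelism here is per item: each item still passes through its stages in order, while different items are processed at the same time. Stages that keep shared state would need their own synchronization.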

    What is a Connector?

    A connector is the starting point of any pipeline job. Its typical role is to acquire data, for example by traversing the filesystem or crawling the web, and push it down the pipeline.

    What is a DocFilter?

    A DocFilter is a specialized mechanism, for example a text extractor, for a particular document type such as PDF or HTML.

    What is a Stage?

    A stage can perform arbitrary operations, such as text processing. Stages can be configured to run in sequence or in parallel. The last stage in a pipeline typically stores the content to disk, an index, or a repository.

    How do Connectors, DocFilters, and Stages relate?

    A typical overview of a pipeline job is illustrated below:

    Connector → DocFilter → Stage → Stage
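
    The same flow can be sketched in code. The example below is only an illustration of how the three roles hand data to each other: the DocFilter and Stage interfaces and the runConnector method are hypothetical, simplified to work on strings, and do not reproduce the real Openpipeline signatures.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical types for illustration only; they do not mirror
// the real Connector, DocFilter, or Stage classes in Openpipeline.
class PipelineSketch {

    // Extracts plain text from one document type.
    interface DocFilter {
        String extractText(String rawDocument);
    }

    // Performs one processing step on the extracted text.
    interface Stage {
        String process(String text);
    }

    // Plays the connector role: takes acquired documents, runs each through
    // the DocFilter, then pushes the text through the stages in order.
    static void runConnector(List<String> rawDocuments, DocFilter filter, List<Stage> stages) {
        for (String raw : rawDocuments) {
            String text = filter.extractText(raw);
            for (Stage stage : stages) {
                text = stage.process(text);
            }
        }
    }

    // Example job: an HTML filter, a normalizing stage, and a final storing stage.
    static List<String> demo() {
        List<String> index = new ArrayList<>(); // stand-in for a disk, index, or repository

        DocFilter htmlFilter = raw -> raw.replaceAll("<[^>]*>", ""); // crude tag stripping
        List<Stage> stages = new ArrayList<>();
        stages.add(text -> text.trim().toLowerCase());
        stages.add(text -> { index.add(text); return text; }); // last stage stores the content

        runConnector(List.of("<p>Hello World</p>", "<h1> Second Doc </h1>"), htmlFilter, stages);
        return index;
    }
}
```

    In this sketch the connector drives everything, the DocFilter runs once per document, and the final stage is the one that persists the result.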
