ALPACA Programmer Guide

From Humanitarian FOSS Summer Institute 2008


This guide is for experienced Java programmers who would like to understand the inner workings of ALPACA, modify it, or add their own features to the program.


Main Features

ALPACA is divided into three main parts: Parsers, Classifiers, and the User Interface.

Parsers

All parsers in ALPACA implement a common Parser interface, which requires two parse methods. The first takes only a list of documents (as strings); the second takes a list of documents plus a list of attributes that the parser should extract. Both methods return ParsedArticle objects. A ParsedArticle holds the contents and the keywords of each parsed article and, if additional attributes were requested when the parse method was called, those attributes as well (e.g. Title, Date).
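The description above can be sketched roughly as follows. This is only an illustration: the method signatures, the field names of ParsedArticle, and the toy FirstLineTitleParser are assumptions made for this example and are not taken from the ALPACA source. Everything is nested in a hypothetical ParserSketch class so that the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: signatures and field names are assumptions, not ALPACA's actual API.
class ParserSketch {

    /** Assumed shape of a parsed article: contents, keywords, optional extra attributes. */
    static class ParsedArticle {
        final String contents;
        final List<String> keywords;
        final Map<String, String> attributes; // e.g. "Title", "Date"; empty if none requested

        ParsedArticle(String contents, List<String> keywords, Map<String, String> attributes) {
            this.contents = contents;
            this.keywords = keywords;
            this.attributes = attributes;
        }
    }

    /** Assumed shape of the common interface with its two parse methods. */
    interface Parser {
        List<ParsedArticle> parse(List<String> documents);
        List<ParsedArticle> parse(List<String> documents, List<String> attributes);
    }

    /** Toy parser: keywords are distinct lower-cased words longer than three letters. */
    static class FirstLineTitleParser implements Parser {
        public List<ParsedArticle> parse(List<String> documents) {
            // No extra attributes requested.
            return parse(documents, Collections.<String>emptyList());
        }

        public List<ParsedArticle> parse(List<String> documents, List<String> attributes) {
            List<ParsedArticle> result = new ArrayList<>();
            for (String doc : documents) {
                List<String> keywords = new ArrayList<>();
                for (String word : doc.toLowerCase().split("\\W+")) {
                    if (word.length() > 3 && !keywords.contains(word)) {
                        keywords.add(word);
                    }
                }
                Map<String, String> extras = new LinkedHashMap<>();
                if (attributes.contains("Title")) {
                    // This toy parser treats the first line as the title.
                    extras.put("Title", doc.split("\n", 2)[0].trim());
                }
                result.add(new ParsedArticle(doc, keywords, extras));
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Parser parser = new FirstLineTitleParser();
        List<String> docs = Arrays.asList("Flood warning\nRivers rising near the town");
        ParsedArticle article = parser.parse(docs, Arrays.asList("Title")).get(0);
        System.out.println(article.keywords);                // [flood, warning, rivers, rising, near, town]
        System.out.println(article.attributes.get("Title")); // Flood warning
    }
}
```

In this sketch the one-argument parse simply delegates to the two-argument version with an empty attribute list, which keeps the two methods consistent with each other.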

Classifiers

The classifiers in ALPACA are similarly grouped under a single Classifier interface. Classifiers are obtained via the ClassifierSelector, so each one needs a working default (no-argument) constructor. A classifier should have a train method, which takes the training parameters and the training articles, and a test method, which takes the testing parameters and the articles to classify. Classifiers also have methods that return the lists of parameters that can be passed to them.
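A minimal sketch of a classifier along these lines is shown below. The method names getTrainParameters and getTestParameters, the use of a Map for parameters, the Article class, and the KeywordVoteClassifier itself are all assumptions for illustration; the real Classifier interface may differ. The reflective instantiation in main only illustrates why a default constructor matters and is not a claim about how ClassifierSelector is implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names and signatures are assumptions, not ALPACA's actual API.
class ClassifierSketch {

    /** Assumed minimal article: keywords plus a label (null when the label is unknown). */
    static class Article {
        final List<String> keywords;
        final String label;

        Article(List<String> keywords, String label) {
            this.keywords = keywords;
            this.label = label;
        }
    }

    /** Assumed shape of the common classifier interface. */
    interface Classifier {
        void train(Map<String, String> parameters, List<Article> articles);
        List<String> test(Map<String, String> parameters, List<Article> articles);
        List<String> getTrainParameters();
        List<String> getTestParameters();
    }

    /** Toy classifier: each keyword votes for the labels it appeared with during training. */
    static class KeywordVoteClassifier implements Classifier {
        // keyword -> (label -> number of training occurrences)
        private final Map<String, Map<String, Integer>> counts = new HashMap<>();
        private boolean ignoreCase = true;

        // No explicit constructor: the implicit default constructor is what a selector would call.

        public List<String> getTrainParameters() { return Arrays.asList("ignoreCase"); }
        public List<String> getTestParameters() { return Arrays.asList("defaultLabel"); }

        private String normalize(String keyword) {
            return ignoreCase ? keyword.toLowerCase() : keyword;
        }

        public void train(Map<String, String> parameters, List<Article> articles) {
            ignoreCase = Boolean.parseBoolean(parameters.getOrDefault("ignoreCase", "true"));
            counts.clear();
            for (Article a : articles) {
                for (String k : a.keywords) {
                    counts.computeIfAbsent(normalize(k), x -> new HashMap<>())
                          .merge(a.label, 1, Integer::sum);
                }
            }
        }

        public List<String> test(Map<String, String> parameters, List<Article> articles) {
            String fallback = parameters.getOrDefault("defaultLabel", "unknown");
            List<String> predicted = new ArrayList<>();
            for (Article a : articles) {
                Map<String, Integer> votes = new HashMap<>();
                for (String k : a.keywords) {
                    Map<String, Integer> perLabel = counts.get(normalize(k));
                    if (perLabel != null) {
                        perLabel.forEach((label, n) -> votes.merge(label, n, Integer::sum));
                    }
                }
                String best = fallback;
                int bestVotes = 0;
                for (Map.Entry<String, Integer> e : votes.entrySet()) {
                    if (e.getValue() > bestVotes) {
                        best = e.getKey();
                        bestVotes = e.getValue();
                    }
                }
                predicted.add(best);
            }
            return predicted;
        }
    }

    public static void main(String[] args) throws Exception {
        // Instantiate through the default constructor, as a selector might do.
        Classifier c = KeywordVoteClassifier.class.getDeclaredConstructor().newInstance();
        c.train(new HashMap<>(), Arrays.asList(
                new Article(Arrays.asList("flood", "river"), "weather"),
                new Article(Arrays.asList("election", "vote"), "politics")));
        Map<String, String> testParams = new HashMap<>();
        testParams.put("defaultLabel", "other");
        System.out.println(c.test(testParams, Arrays.asList(
                new Article(Arrays.asList("River", "flood"), null),
                new Article(Arrays.asList("budget"), null)))); // [weather, other]
    }
}
```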


Adding New Parsers and Classifiers

ALPACA has a built-in plug-in mechanism that makes it easy to add new Parsers and Classifiers. Parsers and Classifiers are handed to ALPACA by Providers. Adding new Parsers or Classifiers to ALPACA takes two steps:

  • Write the new Parsers and Classifiers that you wish to add to ALPACA
    • New Parsers must implement the Parser interface located in parsers.Parser
    • New Classifiers must implement the Classifier interface located in classifiers.Classifier
  • Write a new trivial Provider class to hand your new classes to ALPACA
    • For every Parser or Classifier you wish to add to ALPACA, the constructor of the new Provider must call the putParser or the putClassifier method respectively.
    • New Providers must extend the Provider class located in providers.Provider
    • The compiled class file of a new Provider must have a name ending in "Provider.class" (as ExampleProvider.class does)
    • New Providers must be placed in the "plugins" folder.

NOTE: A stub for each of the classes mentioned above is already present in the "plugins" folder of ALPACA. The ExampleProvider class needs only trivial modification to hand over any new Classifiers or Parsers.
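The two steps above might look roughly like the following sketch. The source does not give the signatures of putParser and putClassifier, so the (name, instance) form used here is an assumption, as are the empty stand-in Parser, Classifier, and Provider types, which exist only so the example compiles on its own. A real plug-in would instead extend providers.Provider and implement parsers.Parser and classifiers.Classifier, and would start from the ExampleProvider stub. MyParser, MyClassifier, and MyPluginProvider are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: the stand-in types below only mimic what the guide describes.
class ProviderSketch {

    interface Parser { }      // stand-in for parsers.Parser
    interface Classifier { }  // stand-in for classifiers.Classifier

    /** Stand-in for providers.Provider; the put method signatures are assumed. */
    static abstract class Provider {
        private final Map<String, Parser> parsers = new LinkedHashMap<>();
        private final Map<String, Classifier> classifiers = new LinkedHashMap<>();

        protected void putParser(String name, Parser parser) {
            parsers.put(name, parser);
        }

        protected void putClassifier(String name, Classifier classifier) {
            classifiers.put(name, classifier);
        }

        Map<String, Parser> getParsers() { return parsers; }
        Map<String, Classifier> getClassifiers() { return classifiers; }
    }

    // Step 1: write the new Parser and Classifier (bodies omitted in this sketch).
    static class MyParser implements Parser { }
    static class MyClassifier implements Classifier { }

    // Step 2: a trivial Provider whose constructor registers each new class.
    // As a top-level class in a real plug-in, it would compile to MyPluginProvider.class,
    // which ends in "Provider.class", and would be placed in the "plugins" folder.
    static class MyPluginProvider extends Provider {
        public MyPluginProvider() {
            putParser("My parser", new MyParser());
            putClassifier("My classifier", new MyClassifier());
        }
    }

    public static void main(String[] args) {
        Provider provider = new MyPluginProvider();
        System.out.println(provider.getParsers().keySet());     // [My parser]
        System.out.println(provider.getClassifiers().keySet()); // [My classifier]
    }
}
```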
