ALPACA User Guide
From Humanitarian FOSS Summer Institute 2008
Getting Started
- Download the ALPACA executable files (alpaca_program.tar or alpaca_program.zip).
- Extract the downloaded files.
- To learn how to extract tar files from a Unix shell, see the GNU tutorial.
After extracting the files, run ALPACA:
- DOS (Windows) Users:
- Double-click the alpaca.bat file to run ALPACA.
- Unix (Mac or Linux) Users:
- Open a shell and change into the directory where you extracted ALPACA.
- Run ALPACA with the following command:
./alpaca
- The program will now run and show you which parsers and classifiers have been loaded.
- To open a new classifier, click on File >> New Classifier.
- A new classifier window will open in the middle of your screen.
Parsing A Document
- To parse a document, the input text area must first contain text to parse. To load text, do one of the following:
- Open a file
- Click on either the training or the testing input text area, then click File >> Open File.
- Browse for the file you'd like to open and click "OK".
- The text from the file will appear in the text area.
- Copy and paste the text into the text area
- Next, select the correct parser type from the selection box.
- By default, every parser looks for keywords and bodies as the attributes of a document. To have the parser look for specific attributes instead, click on the text area you are using, click the "Set Attributes" button, and enter all the attributes you'd like to search for. When you are done, press "OK".
- Finally, click on the text area you are using and press the parse button. The parsed information will appear in the parsed text area directly below the text area you are using.
Classifying A Document
- To classify a document, both a training set and a testing set of documents must first be parsed. Once that is done, choose the type of classifier to use on the documents.
- Training the classifier
- First, set the parameters for training on the training set of documents. Click on the train parsed input text area and click "Set Parameters." The parameters you can enter depend on what the selected classifier needs for training. If a keyword is asked for, one must be entered. For every other parameter, the default value is used if no value is entered.
- After setting the parameters, you can train the classifier. This might take a while, depending on how many documents you have entered.
- The model that the classifier generates will be written to the model output text area.
- Testing the classifier
- Once again, you must first set the parameters for testing. The same rules apply, but you must first click on the test parsed input text area to give that area focus.
- The test results, which include True Positives, False Positives, True Negatives, and False Negatives, are printed in the classified output text area, along with the sensitivity, specificity, and accuracy of the classification.
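The three summary figures can be derived from the four counts. The following is a minimal sketch in Python using the standard definitions of sensitivity, specificity, and accuracy; the function name classification_metrics and the sample counts are hypothetical and chosen only for illustration, and this guide does not state how ALPACA itself computes or rounds these values.

```python
def classification_metrics(tp, fp, tn, fn):
    """Return sensitivity, specificity, and accuracy from the four counts.

    Uses the standard textbook definitions. A ratio whose denominator is
    zero is reported as NaN instead of raising an error.
    """
    nan = float("nan")
    total = tp + fp + tn + fn
    # Sensitivity: share of actual positives that were classified positive.
    sensitivity = tp / (tp + fn) if (tp + fn) else nan
    # Specificity: share of actual negatives that were classified negative.
    specificity = tn / (tn + fp) if (tn + fp) else nan
    # Accuracy: share of all documents that were classified correctly.
    accuracy = (tp + tn) / total if total else nan
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from a real ALPACA run.
    metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

Comparing hand-computed values like these against the figures in the classified output text area is a quick way to confirm that the counts are being read correctly.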

