opennlp.tools.namefind
Class NameFinderME

java.lang.Object
  extended by opennlp.tools.namefind.NameFinderME
All Implemented Interfaces:
TokenNameFinder

public class NameFinderME
extends java.lang.Object
implements TokenNameFinder

Class for creating a maximum-entropy-based name finder.


Field Summary
protected  NameContextGenerator contextGenerator
           
static java.lang.String CONTINUE
           
protected  opennlp.maxent.MaxentModel model
           
static java.lang.String OTHER
           
static java.lang.String START
           
 
Constructor Summary
NameFinderME(opennlp.maxent.MaxentModel mod)
          Creates a new name finder with the specified model.
NameFinderME(opennlp.maxent.MaxentModel mod, NameContextGenerator cg)
          Creates a new name finder with the specified model and context generator.
NameFinderME(opennlp.maxent.MaxentModel mod, NameContextGenerator cg, int beamSize)
          Creates a new name finder with the specified model, context generator, and beam size.
 
Method Summary
 void clearAdaptiveData()
          Forgets all adaptive data which was collected during previous calls to one of the find methods.
 Span[] find(java.lang.String[] tokens)
          Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
 Span[] find(java.lang.String[] tokens, java.lang.String[][] additionalContext)
          Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.
static void main(java.lang.String[] args)
          Trains a new named entity model on the specified training file using the specified encoding to read it in.
 double[] probs()
          Returns an array with the probabilities of the last decoded sequence.
 void probs(double[] probs)
          Populates the specified array with the probabilities of the last decoded sequence.
 double[] probs(Span[] spans)
          Returns an array of probabilities for each of the specified spans; each value is the product of the probabilities of the outcomes that make up the span.
static opennlp.maxent.GISModel train(opennlp.maxent.EventStream es, int iterations, int cut)
           
static void usage()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

START

public static final java.lang.String START
See Also:
Constant Field Values

CONTINUE

public static final java.lang.String CONTINUE
See Also:
Constant Field Values

OTHER

public static final java.lang.String OTHER
See Also:
Constant Field Values

model

protected opennlp.maxent.MaxentModel model

contextGenerator

protected NameContextGenerator contextGenerator

Constructor Detail

NameFinderME

public NameFinderME(opennlp.maxent.MaxentModel mod)
Creates a new name finder with the specified model.

Parameters:
mod - The model to be used to find names.

NameFinderME

public NameFinderME(opennlp.maxent.MaxentModel mod,
                    NameContextGenerator cg)
Creates a new name finder with the specified model and context generator.

Parameters:
mod - The model to be used to find names.
cg - The context generator to be used with this name finder.

NameFinderME

public NameFinderME(opennlp.maxent.MaxentModel mod,
                    NameContextGenerator cg,
                    int beamSize)
Creates a new name finder with the specified model, context generator, and beam size.

Parameters:
mod - The model to be used to find names.
cg - The context generator to be used with this name finder.
beamSize - The size of the beam to be used in decoding this model.

Method Detail

find

public Span[] find(java.lang.String[] tokens)
Description copied from interface: TokenNameFinder
Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.

Specified by:
find in interface TokenNameFinder
Parameters:
tokens - an array of the tokens or words of the sequence, typically a sentence.
Returns:
an array of spans for each of the names identified.

find

public Span[] find(java.lang.String[] tokens,
                   java.lang.String[][] additionalContext)
Generates name tags for the given sequence, typically a sentence, returning token spans for any identified names.

Parameters:
tokens - an array of the tokens or words of the sequence, typically a sentence.
additionalContext - features which are based on context outside of the sentence but which should also be used.
Returns:
an array of spans for each of the names identified.
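
This page does not describe how per-token outcomes become token spans. The following is only a rough sketch of one plausible decoding, not the library's implementation. The class TagDecodeSketch, the method toSpans, the literal label values "start", "cont" and "other" (stand-ins for START, CONTINUE and OTHER), and the use of an exclusive end index are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of opennlp.tools.namefind.
final class TagDecodeSketch {

    // Assumed label values standing in for START, CONTINUE and OTHER.
    static final String START = "start";
    static final String CONTINUE = "cont";
    static final String OTHER = "other";

    /**
     * Converts one outcome per token into {begin, end} token index pairs,
     * with begin inclusive and end exclusive (an assumed convention).
     */
    static List<int[]> toSpans(String[] outcomes) {
        List<int[]> spans = new ArrayList<>();
        int begin = -1; // -1 means no name is currently open
        for (int i = 0; i < outcomes.length; i++) {
            String outcome = outcomes[i];
            if (START.equals(outcome)) {
                // A new name begins; close any name that was still open.
                if (begin != -1) {
                    spans.add(new int[] {begin, i});
                }
                begin = i;
            } else if (!CONTINUE.equals(outcome)) {
                // OTHER (or any unknown label) ends the open name, if any.
                if (begin != -1) {
                    spans.add(new int[] {begin, i});
                }
                begin = -1;
            }
            // CONTINUE extends the open name; a stray CONTINUE is ignored.
        }
        if (begin != -1) {
            // A name that runs to the end of the sequence.
            spans.add(new int[] {begin, outcomes.length});
        }
        return spans;
    }

    public static void main(String[] args) {
        String[] outcomes = {"other", "start", "cont", "other", "start"};
        for (int[] span : toSpans(outcomes)) {
            System.out.println(span[0] + ".." + span[1]);
        }
    }
}
```

Under these assumptions, a name is any run of one START outcome followed by zero or more CONTINUE outcomes.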

clearAdaptiveData

public void clearAdaptiveData()
Forgets all adaptive data which was collected during previous calls to one of the find methods. This method is typically called at the end of a document.
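
The call pattern described above can be sketched as follows. To keep the example self-contained, it uses a hypothetical Finder interface as a stand-in for the two NameFinderME methods involved, with find returning int[][] in place of Span[]. The names AdaptiveDataSketch, Finder and tagDocuments are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of opennlp.tools.namefind.
final class AdaptiveDataSketch {

    /** Stand-in for the two NameFinderME methods used in this sketch. */
    interface Finder {
        int[][] find(String[] tokens); // the real method returns Span[]
        void clearAdaptiveData();
    }

    /**
     * Runs the finder over every sentence of every document and clears
     * the adaptive data once after each document's last sentence.
     * documents[d][s] is the token array of sentence s in document d.
     */
    static List<int[][]> tagDocuments(Finder finder, String[][][] documents) {
        List<int[][]> results = new ArrayList<>();
        for (String[][] document : documents) {
            for (String[] sentence : document) {
                results.add(finder.find(sentence));
            }
            // End of document: forget what was collected from its sentences.
            finder.clearAdaptiveData();
        }
        return results;
    }
}
```

Clearing once per document, rather than once per sentence, lets data collected from earlier sentences remain available for later sentences of the same document.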


probs

public void probs(double[] probs)
Populates the specified array with the probabilities of the last decoded sequence. The sequence was determined by the previous call to find. The specified array should be at least as large as the number of tokens in the previous call to find.

Parameters:
probs - An array used to hold the probabilities of the last decoded sequence.

probs

public double[] probs()
Returns an array with the probabilities of the last decoded sequence. The sequence was determined by the previous call to find.

Returns:
An array with the same number of probabilities as there were tokens in the last call to find.

probs

public double[] probs(Span[] spans)
Returns an array of probabilities for each of the specified spans; each value is the product of the probabilities of the outcomes that make up the span.

Parameters:
spans - The spans of the names for which probabilities are desired.
Returns:
an array of probabilities for each of the specified spans.
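
The product described above can be illustrated with a minimal, self-contained sketch. It is not the library's code: the class SpanProbSketch, the method spanProbs, the representation of a span as a {start, end} index pair, and the exclusive end index are assumptions made for illustration.

```java
import java.util.Arrays;

// Hypothetical illustration; not part of opennlp.tools.namefind.
final class SpanProbSketch {

    /**
     * Multiplies the per-token outcome probabilities covered by each span.
     * Each span is {start, end}, start inclusive and end exclusive
     * (an assumed convention).
     */
    static double[] spanProbs(double[] tokenProbs, int[][] spans) {
        double[] result = new double[spans.length];
        for (int i = 0; i < spans.length; i++) {
            double product = 1.0;
            for (int t = spans[i][0]; t < spans[i][1]; t++) {
                product *= tokenProbs[t];
            }
            result[i] = product;
        }
        return result;
    }

    public static void main(String[] args) {
        // One probability per token, as probs() would supply.
        double[] tokenProbs = {0.9, 0.8, 0.99, 0.5};
        // Two names: tokens 0..1 and token 3.
        int[][] spans = {{0, 2}, {3, 4}};
        System.out.println(Arrays.toString(spanProbs(tokenProbs, spans)));
    }
}
```

Because each factor is at most 1, a longer span can never score higher than its least certain token.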

train

public static opennlp.maxent.GISModel train(opennlp.maxent.EventStream es,
                                            int iterations,
                                            int cut)
                                     throws java.io.IOException
Throws:
java.io.IOException

usage

public static void usage()

main

public static void main(java.lang.String[] args)
                 throws java.io.IOException
Trains a new named entity model on the specified training file using the specified encoding to read it in.

Parameters:
args - [-encoding encoding] training_file model_file
Throws:
java.io.IOException


Copyright 2008 Jason Baldridge, Gann Bierner, and Thomas Morton. All Rights Reserved.