java.lang.Object
  opennlp.tools.ngram.NGramModel
public class NGramModel
The NGramModel can be used to create ngrams and character ngrams.
See Also: StringList

| Field Summary | |
|---|---|
| protected static String | COUNT |

| Constructor Summary | |
|---|---|
| NGramModel() | Initializes an empty instance. |
| NGramModel(InputStream in) | Initializes the current instance. |

| Method Summary | | |
|---|---|---|
| void | add(String chars, int minLength, int maxLength) | Adds character NGrams to the current instance. |
| void | add(StringList ngram) | Adds one NGram; if it already exists, its count is increased by one. |
| void | add(StringList ngram, int minLength, int maxLength) | Adds NGrams up to the specified length to the current instance. |
| boolean | contains(StringList tokens) | Checks if the given tokens are contained by the current instance. |
| void | cutoff(int cutoffUnder, int cutoffOver) | Deletes all ngrams which appear less often than the cutoffUnder value or more often than the cutoffOver value. |
| boolean | equals(Object obj) | |
| int | getCount(StringList ngram) | Retrieves the count of the given ngram. |
| int | hashCode() | |
| Iterator<StringList> | iterator() | Retrieves an Iterator over all StringList entries. |
| int | numberOfGrams() | Retrieves the total count of all Ngrams. |
| void | remove(StringList tokens) | Removes the specified tokens from the NGram model; the entries are simply dropped. |
| void | serialize(OutputStream out) | Writes the ngram instance to the given OutputStream. |
| void | setCount(StringList ngram, int count) | Sets the count of an existing ngram. |
| int | size() | Retrieves the number of StringList entries in the current instance. |
| Dictionary | toDictionary() | Creates a dictionary which contains all StringLists which are in the current NGramModel. |
| Dictionary | toDictionary(boolean caseSensitive) | Creates a dictionary which contains all StringLists which are in the current NGramModel. |
| String | toString() | |

| Methods inherited from class java.lang.Object |
|---|
| clone, finalize, getClass, notify, notifyAll, wait, wait, wait |

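The counting behaviour listed in the Method Summary (add, getCount, size, numberOfGrams, cutoff) can be sketched in plain Java without OpenNLP on the classpath. This is a minimal stand-in and not the library's implementation: the class name NGramCountSketch, the use of List<String> in place of StringList, and returning 0 for an ngram that was never added are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for illustration; not OpenNLP's NGramModel.
final class NGramCountSketch {

    // Maps each ngram (a token list standing in for StringList) to its count.
    private final Map<List<String>, Integer> counts = new HashMap<>();

    // Adds one ngram; an ngram that is already present has its count increased by one.
    void add(List<String> ngram) {
        counts.merge(new ArrayList<>(ngram), 1, Integer::sum);
    }

    // Assumption of this sketch: an ngram that was never added has count 0.
    int getCount(List<String> ngram) {
        return counts.getOrDefault(ngram, 0);
    }

    // Number of distinct ngram entries.
    int size() {
        return counts.size();
    }

    // Sum of the counts of all entries.
    int numberOfGrams() {
        int total = 0;
        for (int count : counts.values()) {
            total += count;
        }
        return total;
    }

    // Drops every entry seen fewer than cutoffUnder times or more than cutoffOver times.
    void cutoff(int cutoffUnder, int cutoffOver) {
        counts.values().removeIf(count -> count < cutoffUnder || count > cutoffOver);
    }

    public static void main(String[] args) {
        NGramCountSketch model = new NGramCountSketch();
        model.add(Arrays.asList("new", "york"));
        model.add(Arrays.asList("new", "york"));
        model.add(Arrays.asList("city"));

        System.out.println(model.size());          // 2 distinct entries
        System.out.println(model.numberOfGrams()); // 3 occurrences in total

        model.cutoff(2, Integer.MAX_VALUE);        // keep only entries seen at least twice
        System.out.println(model.size());          // 1 entry remains
    }
}
```

In this sketch size() counts distinct entries while numberOfGrams() sums their counts, which is the distinction the summary draws between the two methods.
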
| Field Detail |
|---|
protected static final String COUNT
| Constructor Detail |
|---|
public NGramModel()
Initializes an empty instance.

public NGramModel(InputStream in) throws IOException, InvalidFormatException
Initializes the current instance.
Parameters:
  in
Throws:
  IOException
  InvalidFormatException

| Method Detail |
|---|
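public int getCount(StringList ngram)
Retrieves the count of the given ngram.
Parameters:
  ngram

public void setCount(StringList ngram, int count)
Sets the count of an existing ngram.
Parameters:
  ngram
  count

public void add(StringList ngram)
Adds one NGram; if it already exists, its count is increased by one.
Parameters:
  ngram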

public void add(StringList ngram, int minLength, int maxLength)
Adds NGrams up to the specified length to the current instance.
Parameters:
  ngram - the tokens to build the uni-grams, bi-grams, tri-grams, ... from
  minLength - minimal length
  maxLength - maximal length
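
To make the minLength and maxLength parameters concrete, the sketch below enumerates every contiguous token window whose length lies in that range. It illustrates the general technique only and is not OpenNLP's code: the class WordNGramSketch, the method wordNGrams, the List<String> token type, and the argument check are assumptions for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of OpenNLP.
final class WordNGramSketch {

    // Returns all contiguous token windows of length minLength..maxLength (inclusive),
    // shortest windows first, each in left-to-right order.
    static List<List<String>> wordNGrams(List<String> tokens, int minLength, int maxLength) {
        // Assumption of this sketch: lengths must be positive and ordered.
        if (minLength < 1 || maxLength < minLength) {
            throw new IllegalArgumentException("require 1 <= minLength <= maxLength");
        }
        List<List<String>> grams = new ArrayList<>();
        for (int length = minLength; length <= maxLength; length++) {
            for (int start = 0; start + length <= tokens.size(); start++) {
                grams.add(new ArrayList<>(tokens.subList(start, start + length)));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("the", "quick", "fox");
        // Prints the three uni-grams followed by the two bi-grams.
        for (List<String> gram : wordNGrams(tokens, 1, 2)) {
            System.out.println(gram);
        }
    }
}
```

With three tokens and lengths 1 to 2, the sketch yields three uni-grams and two bi-grams; each of them would then be counted as a separate entry.
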
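
public void add(String chars, int minLength, int maxLength)
Adds character NGrams to the current instance.
Parameters:
  chars
  minLength
  maxLength

public void remove(StringList tokens)
Removes the specified tokens from the NGram model; the entries are simply dropped.
Parameters:
  tokens

public boolean contains(StringList tokens)
Checks if the given tokens are contained by the current instance.
Parameters:
  tokens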
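
For the character variant, add(String chars, int minLength, int maxLength), the same windowing idea applies to substrings instead of token lists. The following is a rough sketch under stated assumptions, not the library's implementation: CharNGramSketch and charNGrams are hypothetical names, and any normalization the real method may apply to the characters is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of OpenNLP.
final class CharNGramSketch {

    // Returns all substrings of chars whose length is minLength..maxLength (inclusive),
    // shortest substrings first.
    static List<String> charNGrams(String chars, int minLength, int maxLength) {
        // Assumption of this sketch: lengths must be positive and ordered.
        if (minLength < 1 || maxLength < minLength) {
            throw new IllegalArgumentException("require 1 <= minLength <= maxLength");
        }
        List<String> grams = new ArrayList<>();
        for (int length = minLength; length <= maxLength; length++) {
            for (int start = 0; start + length <= chars.length(); start++) {
                grams.add(chars.substring(start, start + length));
            }
        }
        return grams;
    }

    public static void main(String[] args) {
        // Prints [ab, bc, abc]
        System.out.println(charNGrams("abc", 2, 3));
    }
}
```
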

public int size()
Retrieves the number of StringList entries in the current instance.

public Iterator<StringList> iterator()
Retrieves an Iterator over all StringList entries.
Specified by:
  iterator in interface Iterable<StringList>

public int numberOfGrams()
Retrieves the total count of all Ngrams.
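
public void cutoff(int cutoffUnder, int cutoffOver)
Deletes all ngrams which appear less often than the cutoffUnder value or more often than the cutoffOver value.
Parameters:
  cutoffUnder
  cutoffOver

public Dictionary toDictionary()
Creates a dictionary which contains all StringLists which are in the current NGramModel.
Entries which differ only in case are merged into one.
Calling this method is the same as calling toDictionary(false).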

public Dictionary toDictionary(boolean caseSensitive)
Creates a dictionary which contains all StringLists which are in the current NGramModel.
Parameters:
  caseSensitive - specifies whether case distinctions should be kept in the creation of the dictionary
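
The effect of the caseSensitive flag can be illustrated with a small plain-Java sketch. It is an assumption-laden stand-in rather than how Dictionary works internally: CaseMergeSketch and toEntries are hypothetical names, a Set of token lists stands in for Dictionary, and lowercasing is used here as one possible way to merge entries that differ only in case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper for illustration; not part of OpenNLP.
final class CaseMergeSketch {

    // Collects the distinct ngrams. When caseSensitive is false, tokens are
    // lowercased first, so entries differing only in case collapse into one.
    static Set<List<String>> toEntries(Collection<List<String>> ngrams, boolean caseSensitive) {
        Set<List<String>> entries = new LinkedHashSet<>();
        for (List<String> ngram : ngrams) {
            List<String> entry = new ArrayList<>();
            for (String token : ngram) {
                entry.add(caseSensitive ? token : token.toLowerCase(Locale.ROOT));
            }
            entries.add(entry);
        }
        return entries;
    }

    public static void main(String[] args) {
        List<List<String>> ngrams = Arrays.asList(
                Arrays.asList("New", "York"),
                Arrays.asList("new", "york"));

        System.out.println(toEntries(ngrams, true).size());  // 2: case distinctions kept
        System.out.println(toEntries(ngrams, false).size()); // 1: entries merged
    }
}
```
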

public void serialize(OutputStream out) throws IOException
Writes the ngram instance to the given OutputStream.
Parameters:
  out
Throws:
  IOException - if an I/O error occurs during writing

public boolean equals(Object obj)
Overrides:
  equals in class Object

public String toString()
Overrides:
  toString in class Object

public int hashCode()
Overrides:
  hashCode in class Object