java.lang.Object opennlp.tools.ngram.NGramModel
public class NGramModel
The NGramModel can be used to create ngrams and character ngrams.
See Also: StringList
Field Summary | |
---|---|
protected static String |
COUNT
|
Constructor Summary | |
---|---|
NGramModel()
Initializes an empty instance. |
|
NGramModel(InputStream in)
Initializes the current instance. |
Method Summary | |
---|---|
void |
add(String chars,
int minLength,
int maxLength)
Adds character NGrams to the current instance. |
void |
add(StringList ngram)
Adds one NGram; if it already exists, its count increases by one. |
void |
add(StringList ngram,
int minLength,
int maxLength)
Adds NGrams up to the specified length to the current instance. |
boolean |
contains(StringList tokens)
Checks if the given tokens are contained by the current instance. |
void |
cutoff(int cutoffUnder,
int cutoffOver)
Deletes all ngrams which appear less often than the cutoffUnder value or more often than the cutoffOver value. |
boolean |
equals(Object obj)
|
int |
getCount(StringList ngram)
Retrieves the count of the given ngram. |
int |
hashCode()
|
Iterator<StringList> |
iterator()
Retrieves an Iterator over all StringList entries. |
int |
numberOfGrams()
Retrieves the total count of all Ngrams. |
void |
remove(StringList tokens)
Removes the specified tokens from the NGram model; they are just dropped. |
void |
serialize(OutputStream out)
Writes the ngram instance to the given OutputStream. |
void |
setCount(StringList ngram,
int count)
Sets the count of an existing ngram. |
int |
size()
Retrieves the number of StringList entries in the current instance. |
Dictionary |
toDictionary()
Creates a dictionary which contains all StringLists which are in the current NGramModel. |
Dictionary |
toDictionary(boolean caseSensitive)
Creates a dictionary which contains all StringLists which are in the current NGramModel. |
String |
toString()
|
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
protected static final String COUNT
Constructor Detail |
---|
public NGramModel()
public NGramModel(InputStream in) throws IOException, InvalidFormatException
Initializes the current instance.
Parameters:
in
Throws:
IOException
InvalidFormatException
Method Detail |
---|
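public int getCount(StringList ngram)
Retrieves the count of the given ngram.
Parameters:
ngram
public void setCount(StringList ngram, int count)
Sets the count of an existing ngram.
Parameters:
ngram
count
public void add(StringList ngram)
Adds one NGram; if it already exists, its count increases by one.
Parameters:
ngram
public void add(StringList ngram, int minLength, int maxLength)
Adds NGrams up to the specified length to the current instance.
Parameters:
ngram - the tokens to build the uni-grams, bi-grams, tri-grams, ... from
minLength - minimal length
maxLength - maximal length
public void add(String chars, int minLength, int maxLength)
Adds character NGrams to the current instance.
Parameters:
chars
minLength
maxLength
public void remove(StringList tokens)
Removes the specified tokens from the NGram model; they are just dropped.
Parameters:
tokens
public boolean contains(StringList tokens)
Checks if the given tokens are contained by the current instance.
Parameters:
tokens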
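The two add overloads that take minLength and maxLength are documented only briefly, so the following is a minimal sketch of what expanding an input into ngrams of every length in that range could look like. It uses only the JDK; the class NGramExpansionSketch and its methods tokenNGrams and charNGrams are hypothetical names chosen for illustration, not part of OpenNLP. It also assumes that both bounds are inclusive and does not model any normalisation the library may apply to character ngrams.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: this is not the OpenNLP implementation.
final class NGramExpansionSketch {

    // Returns every contiguous run of tokens whose length lies in
    // [minLength, maxLength], shortest runs first.
    static List<List<String>> tokenNGrams(List<String> tokens, int minLength, int maxLength) {
        checkBounds(minLength, maxLength);
        List<List<String>> result = new ArrayList<>();
        for (int length = minLength; length <= maxLength; length++) {
            for (int start = 0; start + length <= tokens.size(); start++) {
                // Copy the window so the result does not depend on the input list.
                result.add(new ArrayList<>(tokens.subList(start, start + length)));
            }
        }
        return result;
    }

    // Returns every substring whose length lies in [minLength, maxLength],
    // shortest substrings first.
    static List<String> charNGrams(String chars, int minLength, int maxLength) {
        checkBounds(minLength, maxLength);
        List<String> result = new ArrayList<>();
        for (int length = minLength; length <= maxLength; length++) {
            for (int start = 0; start + length <= chars.length(); start++) {
                result.add(chars.substring(start, start + length));
            }
        }
        return result;
    }

    // Assumed validation: lengths start at 1 and the range must not be empty.
    private static void checkBounds(int minLength, int maxLength) {
        if (minLength < 1 || minLength > maxLength) {
            throw new IllegalArgumentException("need 1 <= minLength <= maxLength");
        }
    }
}
```

Under these assumptions, the tokens a, b, c with lengths 1 to 2 expand to a, b, c, "a b" and "b c", and the string "abc" with lengths 2 to 3 expands to "ab", "bc" and "abc".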
public int size()
Retrieves the number of StringList entries in the current instance.
public Iterator<StringList> iterator()
Retrieves an Iterator over all StringList entries.
Specified by:
iterator in interface Iterable<StringList>
public int numberOfGrams()
Retrieves the total count of all Ngrams.
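Because size() and numberOfGrams() are easy to confuse, here is a small sketch of the difference between the number of distinct entries and the total count, together with the increment-on-add behaviour of add(StringList). It is a JDK-only model: the class NGramCountSketch is hypothetical, token lists stand in for StringList, and returning 0 for an unseen ngram is an assumption of this sketch rather than something the page states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: a count map keyed by token lists.
final class NGramCountSketch {

    private final Map<List<String>, Integer> counts = new HashMap<>();

    // Adds one ngram; an existing ngram has its count increased by one.
    void add(String... tokens) {
        counts.merge(key(tokens), 1, Integer::sum);
    }

    // Count of the given ngram; 0 for an unseen ngram (assumption of this sketch).
    int getCount(String... tokens) {
        return counts.getOrDefault(key(tokens), 0);
    }

    // Number of distinct ngram entries.
    int size() {
        return counts.size();
    }

    // Sum of the counts of all entries.
    int numberOfGrams() {
        int total = 0;
        for (int count : counts.values()) {
            total += count;
        }
        return total;
    }

    // Defensive copy so later changes to the array cannot alter a stored key.
    private static List<String> key(String[] tokens) {
        return new ArrayList<>(Arrays.asList(tokens));
    }
}
```

In this model, adding the ngram "a b" twice and the ngram "c" once gives size() of 2 and numberOfGrams() of 3.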
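public void cutoff(int cutoffUnder, int cutoffOver)
Deletes all ngrams which appear less often than the cutoffUnder value or more often than the cutoffOver value.
Parameters:
cutoffUnder
cutoffOver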
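The cutoff rule can be illustrated with a short sketch over a plain count map. The class CutoffSketch is a hypothetical name, and the sketch assumes that entries whose count equals one of the two bounds are kept, which follows from reading "less than" and "more often than" as strict comparisons.

```java
import java.util.Map;

// Illustrative sketch only: applies a frequency cutoff to a count map.
final class CutoffSketch {

    // Removes every entry whose count is below cutoffUnder or above cutoffOver.
    // Counts equal to either bound are kept (assumed strict comparisons).
    static <K> void cutoff(Map<K, Integer> counts, int cutoffUnder, int cutoffOver) {
        // Removing from the values view also removes the mapping from the map.
        counts.values().removeIf(count -> count < cutoffUnder || count > cutoffOver);
    }
}
```

With counts of 1, 2, 5 and 6 and a call of cutoff(counts, 2, 5), only the entries with counts 2 and 5 remain.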
public Dictionary toDictionary()
Creates a dictionary which contains all StringLists which are in the current NGramModel.
Entries which differ only in case are merged into one.
Calling this method is the same as calling toDictionary(false).
public Dictionary toDictionary(boolean caseSensitive)
Creates a dictionary which contains all StringLists which are in the current NGramModel.
Parameters:
caseSensitive - Specifies whether case distinctions should be kept in the creation of the dictionary.
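The effect of the caseSensitive flag can be sketched as follows. This is a JDK-only approximation: the class DictionaryCaseSketch and its method toEntrySet are hypothetical, and lower-casing tokens is just one way to merge entries that differ only in case; the Dictionary class may compare entries differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only: models case handling when collecting entries.
final class DictionaryCaseSketch {

    // Collects the distinct ngrams. When caseSensitive is false, tokens are
    // lower-cased first, so entries differing only in case collapse into one.
    static Set<List<String>> toEntrySet(List<List<String>> ngrams, boolean caseSensitive) {
        Set<List<String>> entries = new LinkedHashSet<>();
        for (List<String> ngram : ngrams) {
            List<String> key = new ArrayList<>();
            for (String token : ngram) {
                key.add(caseSensitive ? token : token.toLowerCase(Locale.ROOT));
            }
            entries.add(key);
        }
        return entries;
    }
}
```

Given the two ngrams "New York" and "new york", this sketch yields two entries when caseSensitive is true and a single entry when it is false.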
public void serialize(OutputStream out) throws IOException
Writes the ngram instance to the given OutputStream.
Parameters:
out
Throws:
IOException - if an I/O error occurs during writing
public boolean equals(Object obj)
Overrides:
equals in class Object
public String toString()
Overrides:
toString in class Object
public int hashCode()
Overrides:
hashCode in class Object