java.lang.Object
  opennlp.tools.sentdetect.DefaultSDContextGenerator
public class DefaultSDContextGenerator
Generates event contexts for maximum-entropy (maxent) decisions in sentence detection.
Field Summary | |
---|---|
protected StringBuffer | buf - String buffer for generating features. |
protected List<String> | collectFeats - List for holding features as they are generated. |
Constructor Summary | |
---|---|
DefaultSDContextGenerator(char[] eosCharacters) | Creates a new SDContextGenerator instance with no induced abbreviations. |
DefaultSDContextGenerator(Set<String> inducedAbbreviations, char[] eosCharacters) | Creates a new SDContextGenerator instance which uses the set of induced abbreviations. |
Method Summary | |
---|---|
protected void | collectFeatures(String prefix, String suffix, String previous, String next) - Determines some of the features for the sentence detector and adds them to the feature list. |
String[] | getContext(CharSequence sb, int position) - Returns an array of contextual features for the potential sentence boundary at the specified position within the specified string buffer. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected StringBuffer buf
protected List<String> collectFeats
Constructor Detail |
---|
public DefaultSDContextGenerator(char[] eosCharacters)
Creates a new SDContextGenerator instance with no induced abbreviations.
eosCharacters -
public DefaultSDContextGenerator(Set<String> inducedAbbreviations, char[] eosCharacters)
Creates a new SDContextGenerator instance which uses the set of induced abbreviations.
inducedAbbreviations - a Set of Strings representing induced abbreviations in the training data. Example: "Mr."
eosCharacters -
Method Detail |
---|
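public String[] getContext(CharSequence sb, int position)
Returns an array of contextual features for the potential sentence boundary at the specified position within the specified string buffer.
Specified by getContext in interface SDContextGenerator.
sb - The String for which sentences are being determined.
position - An index into the specified string buffer where a sentence boundary may occur.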
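
The position argument is an index at which a sentence boundary may occur, and the constructors take the set of end-of-sentence (eos) characters. The sketch below shows one way such candidate positions could be collected before being passed to getContext. It is a minimal illustration that does not depend on OpenNLP: the class EosCandidateSketch and its method candidatePositions are hypothetical names, and the library's own scanning logic may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNLP: lists candidate boundary positions.
final class EosCandidateSketch {

    // Returns every index in text whose character is one of eosCharacters.
    static List<Integer> candidatePositions(CharSequence text, char[] eosCharacters) {
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            for (char eos : eosCharacters) {
                if (c == eos) {
                    positions.add(i);
                    break; // one match is enough for this index
                }
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        char[] eos = {'.', '!', '?'};
        String text = "Mr. Smith left. Why?";
        // Prints [2, 14, 19]; index 2 is the abbreviation "Mr.", not a real boundary.
        System.out.println(candidatePositions(text, eos));
    }
}
```

Each returned index is only a candidate. Deciding whether it is a true boundary is the job of the model that consumes the features produced by getContext.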
protected void collectFeatures(String prefix, String suffix, String previous, String next)
Determines some of the features for the sentence detector and adds them to the feature list.
prefix - String preceding the eos character in the eos token.
suffix - String following the eos character in the eos token.
previous - Space-delimited token preceding the token containing the eos character.
next - Space-delimited token following the token containing the eos character.
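
The four parameters can be read as a partition of the text around one candidate eos character. The following sketch derives them from a string and a position, following the parameter descriptions only. It is not the library's implementation: EosContextSketch and split are hypothetical names, and the sketch assumes tokens are separated by the space character alone and that the character at position is not itself a space.

```java
import java.util.Arrays;

// Hypothetical helper, not part of OpenNLP: derives the four context strings.
final class EosContextSketch {

    // Returns {prefix, suffix, previous, next} for the eos character at position.
    static String[] split(CharSequence text, int position) {
        String s = text.toString();

        // Bounds of the space-delimited token that contains the eos character.
        int tokenStart = s.lastIndexOf(' ', position) + 1;
        int tokenEnd = s.indexOf(' ', position);
        if (tokenEnd < 0) tokenEnd = s.length();

        String prefix = s.substring(tokenStart, position);
        String suffix = s.substring(position + 1, tokenEnd);

        // Previous token: skip spaces backwards, then take the token before them.
        int prevEnd = tokenStart;
        while (prevEnd > 0 && s.charAt(prevEnd - 1) == ' ') prevEnd--;
        int prevStart = s.lastIndexOf(' ', prevEnd - 1) + 1;
        String previous = s.substring(prevStart, prevEnd);

        // Next token: skip spaces forwards, then take the token after them.
        int nextStart = tokenEnd;
        while (nextStart < s.length() && s.charAt(nextStart) == ' ') nextStart++;
        int nextEnd = s.indexOf(' ', nextStart);
        if (nextEnd < 0) nextEnd = s.length();
        String next = s.substring(nextStart, nextEnd);

        return new String[] {prefix, suffix, previous, next};
    }

    public static void main(String[] args) {
        // The period at index 14 ends the token "left."
        // Prints [left, , Smith, Why?]
        System.out.println(Arrays.toString(split("Mr. Smith left. Why?", 14)));
    }
}
```

For a token with internal periods such as "U.S.A.", calling split at index 1 yields the prefix "U" and the suffix "S.A.", which shows why prefix and suffix are taken from within the eos token rather than from its neighbors.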