opennlp.tools.tokenize
Class SimpleTokenizer

java.lang.Object
  extended by opennlp.tools.tokenize.SimpleTokenizer
All Implemented Interfaces:
Tokenizer

public class SimpleTokenizer
extends Object

Performs tokenization using character classes.
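
The idea of tokenizing by character class can be sketched as follows. This is a minimal illustration and not the OpenNLP source: the class name CharClassTokenizerSketch, the four classes (whitespace, alphabetic, numeric, other), and the rule "start a new token whenever the class changes" are all assumptions made for the example, and the library's actual behavior may differ in details.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the OpenNLP implementation.
class CharClassTokenizerSketch {

    // Character classes assumed for this sketch.
    enum CharClass { WHITESPACE, ALPHABETIC, NUMERIC, OTHER }

    static CharClass classify(char c) {
        if (Character.isWhitespace(c)) return CharClass.WHITESPACE;
        if (Character.isLetter(c)) return CharClass.ALPHABETIC;
        if (Character.isDigit(c)) return CharClass.NUMERIC;
        return CharClass.OTHER;
    }

    // Closes the current token whenever the character class changes.
    // Whitespace separates tokens and is never part of one.
    static String[] tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        int start = -1; // start offset of the open token, or -1 if none
        CharClass current = CharClass.WHITESPACE;
        for (int i = 0; i < s.length(); i++) {
            CharClass next = classify(s.charAt(i));
            if (next != current) {
                if (start >= 0) {
                    tokens.add(s.substring(start, i));
                }
                start = (next == CharClass.WHITESPACE) ? -1 : i;
                current = next;
            }
        }
        if (start >= 0) {
            tokens.add(s.substring(start)); // flush the last open token
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(tokenize("Room 42b, please!")));
    }
}
```

Under these assumptions, letters, digits, and punctuation end up in separate tokens even when no whitespace separates them, so "42b," yields three tokens.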

Author:
tsmorton

Field Summary
static SimpleTokenizer INSTANCE
           
 
Constructor Summary
SimpleTokenizer()
          Deprecated. Use the INSTANCE field to obtain an instance; the constructor will be made private in the future.
 
Method Summary
static void main(String[] args)
          Deprecated. 
 String[] tokenize(String s)
          Splits a string into its atomic parts.
 Span[] tokenizePos(String s)
          Finds the boundaries of atomic parts in a string.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

INSTANCE

public static final SimpleTokenizer INSTANCE

Constructor Detail

SimpleTokenizer

@Deprecated
public SimpleTokenizer()
Deprecated. Use the INSTANCE field to obtain an instance; the constructor will be made private in the future.

Method Detail

tokenizePos

public Span[] tokenizePos(String s)
Description copied from interface: Tokenizer
Finds the boundaries of atomic parts in a string.

Parameters:
s - The string to be tokenized.
Returns:
The Span[] with the spans (offsets into s) for each token as the individual array elements.
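
To show what "offsets into s" means in practice, here is a small self-contained sketch. SpanSketch is a hypothetical stand-in for the library's Span type, it assumes half-open offsets (start inclusive, end exclusive), and for brevity it splits on whitespace only rather than on character classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a token span; not the OpenNLP Span class.
class SpanSketch {
    final int start; // inclusive offset into the input (assumed convention)
    final int end;   // exclusive offset into the input (assumed convention)

    SpanSketch(int start, int end) {
        this.start = start;
        this.end = end;
    }

    // Recovers the token text from the original string.
    String coveredText(String s) {
        return s.substring(start, end);
    }

    // Records the boundaries of each whitespace-delimited token.
    static SpanSketch[] whitespaceSpans(String s) {
        List<SpanSketch> spans = new ArrayList<>();
        int start = -1; // start of the open token, or -1 if none
        for (int i = 0; i <= s.length(); i++) {
            boolean boundary = i == s.length() || Character.isWhitespace(s.charAt(i));
            if (boundary) {
                if (start >= 0) {
                    spans.add(new SpanSketch(start, i));
                    start = -1;
                }
            } else if (start < 0) {
                start = i;
            }
        }
        return spans.toArray(new SpanSketch[0]);
    }

    public static void main(String[] args) {
        String s = "to be  or";
        for (SpanSketch span : whitespaceSpans(s)) {
            System.out.println(span.start + ".." + span.end + " " + span.coveredText(s));
        }
    }
}
```

Returning offsets instead of strings lets a caller map each token back to its exact position in the input, which a plain String[] cannot do when the same token occurs more than once.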

main

@Deprecated
public static void main(String[] args)
                 throws IOException
Deprecated. 

Parameters:
args -
Throws:
IOException

tokenize

public String[] tokenize(String s)
Description copied from interface: Tokenizer
Splits a string into its atomic parts.

Specified by:
tokenize in interface Tokenizer
Parameters:
s - The string to be tokenized.
Returns:
The String[] with the individual tokens as the array elements.


Copyright © 2010. All Rights Reserved.