opennlp.tools.formats
Class Conll02NameSampleStream

java.lang.Object
  extended by opennlp.tools.formats.Conll02NameSampleStream
All Implemented Interfaces:
ObjectStream<NameSample>

public class Conll02NameSampleStream
extends Object
implements ObjectStream<NameSample>

Parser for the Dutch and Spanish NER training files of the CoNLL 2002 shared task.

The Dutch data has a -DOCSTART- tag to mark article boundaries; adaptive data in the feature generators is cleared before every article.
The Spanish data does not contain article boundaries, so adaptive data is cleared for every sentence.
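
The boundary rules above can be illustrated with a small, self-contained sketch. It is not the OpenNLP implementation: the class ConllBoundarySketch, its Sentence holder, and the hasDocStart switch are hypothetical names, and the sketch assumes space-separated columns with the token in the first column and a blank line between sentences.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of the boundary rules; not the OpenNLP implementation. */
final class ConllBoundarySketch {

    /** One sentence plus a flag saying whether adaptive data is cleared before it. */
    static final class Sentence {
        final List<String> tokens;
        final boolean clearAdaptiveData;

        Sentence(List<String> tokens, boolean clearAdaptiveData) {
            this.tokens = tokens;
            this.clearAdaptiveData = clearAdaptiveData;
        }
    }

    /**
     * Groups lines into sentences. With hasDocStart (Dutch-style data) only the
     * first sentence after a -DOCSTART- line is flagged; without it
     * (Spanish-style data) every sentence is flagged.
     */
    static List<Sentence> parse(List<String> lines, boolean hasDocStart) {
        List<Sentence> sentences = new ArrayList<>();
        List<String> tokens = new ArrayList<>();
        boolean pendingClear = !hasDocStart;

        for (String line : lines) {
            boolean docStart = line.startsWith("-DOCSTART-");
            if (docStart || line.trim().isEmpty()) {
                // A blank line or an article marker ends the current sentence.
                if (!tokens.isEmpty()) {
                    sentences.add(new Sentence(tokens, pendingClear));
                    tokens = new ArrayList<>();
                    pendingClear = !hasDocStart;
                }
                if (docStart) {
                    pendingClear = true; // a new article begins
                }
            } else {
                tokens.add(line.split(" ")[0]); // assumed: token is the first column
            }
        }
        if (!tokens.isEmpty()) {
            sentences.add(new Sentence(tokens, pendingClear));
        }
        return sentences;
    }
}
```

Keeping the clear flag on the sentence, rather than acting on it while parsing, leaves the decision of what to reset to the consumer of the samples.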

The data contains four named entity types: Person, Organization, Location and Misc.

Data can be found on this web site:
http://www.cnts.ua.ac.be/conll2002/ner/

Note: Do not use this class; it is for internal use only!


Nested Class Summary
static class Conll02NameSampleStream.LANGUAGE
           
 
Field Summary
static int GENERATE_LOCATION_ENTITIES
           
static int GENERATE_MISC_ENTITIES
           
static int GENERATE_ORGANIZATION_ENTITIES
           
static int GENERATE_PERSON_ENTITIES
           
 
Constructor Summary
Conll02NameSampleStream(Conll02NameSampleStream.LANGUAGE lang, InputStream in, int types)
           
Conll02NameSampleStream(Conll02NameSampleStream.LANGUAGE lang, ObjectStream<String> lineStream, int types)
           
 
Method Summary
 void close()
          Closes the ObjectStream and releases all allocated resources.
 NameSample read()
          Returns the next object.
 void reset()
          Repositions the stream at the beginning so that the previously seen object sequence is repeated exactly.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

GENERATE_PERSON_ENTITIES

public static final int GENERATE_PERSON_ENTITIES
See Also:
Constant Field Values

GENERATE_ORGANIZATION_ENTITIES

public static final int GENERATE_ORGANIZATION_ENTITIES
See Also:
Constant Field Values

GENERATE_LOCATION_ENTITIES

public static final int GENERATE_LOCATION_ENTITIES
See Also:
Constant Field Values

GENERATE_MISC_ENTITIES

public static final int GENERATE_MISC_ENTITIES
See Also:
Constant Field Values
Constructor Detail

Conll02NameSampleStream

public Conll02NameSampleStream(Conll02NameSampleStream.LANGUAGE lang,
                               ObjectStream<String> lineStream,
                               int types)

Conll02NameSampleStream

public Conll02NameSampleStream(Conll02NameSampleStream.LANGUAGE lang,
                               InputStream in,
                               int types)
Parameters:
lang - the language of the data to parse (Dutch or Spanish).
in - an InputStream to read the data from.
Throws:
IOException
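
The types argument is not described on this page. Given the four GENERATE_*_ENTITIES int constants, one plausible reading is that it is a bit mask built by OR-ing those constants together. The sketch below illustrates that reading only. EntityTypeMaskSketch and isEnabled are hypothetical, and the bit values are assumptions; the actual values are those listed under Constant Field Values.

```java
/** Hypothetical stand-ins for the GENERATE_*_ENTITIES constants. */
final class EntityTypeMaskSketch {

    // Assumed values: one distinct bit per entity type.
    static final int GENERATE_PERSON_ENTITIES = 1;
    static final int GENERATE_ORGANIZATION_ENTITIES = 1 << 1;
    static final int GENERATE_LOCATION_ENTITIES = 1 << 2;
    static final int GENERATE_MISC_ENTITIES = 1 << 3;

    /** Returns true if the given flag is set in the types mask. */
    static boolean isEnabled(int types, int flag) {
        return (types & flag) != 0;
    }

    public static void main(String[] args) {
        // Request person and location entities only.
        int types = GENERATE_PERSON_ENTITIES | GENERATE_LOCATION_ENTITIES;
        System.out.println(isEnabled(types, GENERATE_PERSON_ENTITIES)); // prints "true"
        System.out.println(isEnabled(types, GENERATE_MISC_ENTITIES));   // prints "false"
    }
}
```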
Method Detail

read

public NameSample read()
                throws IOException
Description copied from interface: ObjectStream
Returns the next object. Calling this method repeatedly until it returns null will return each object from the underlying source exactly once.

Specified by:
read in interface ObjectStream<NameSample>
Returns:
the next object or null to signal that the stream is exhausted
Throws:
IOException

reset

public void reset()
           throws IOException,
                  UnsupportedOperationException
Description copied from interface: ObjectStream
Repositions the stream at the beginning so that the previously seen object sequence is repeated exactly. This method can be used to re-read the stream if multiple passes over the objects are required. The implementation of this method is optional.

Specified by:
reset in interface ObjectStream<NameSample>
Throws:
IOException
UnsupportedOperationException

close

public void close()
           throws IOException
Description copied from interface: ObjectStream
Closes the ObjectStream and releases all allocated resources. After close has been called, read and reset must not be called.

Specified by:
close in interface ObjectStream<NameSample>
Throws:
IOException
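
The read, reset and close contracts described above combine into a simple lifecycle: read until null, optionally reset for another pass, and close when done. The following is a minimal sketch of that lifecycle. To stay self-contained it uses a local stand-in interface, SampleStream, and an in-memory fromList helper; both are hypothetical and are not part of the library.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

final class StreamLifecycleSketch {

    /** Local stand-in with the three documented methods (not the library type). */
    interface SampleStream<T> {
        T read() throws IOException;
        void reset() throws IOException;
        void close() throws IOException;
    }

    /** In-memory stream that exists only to make this sketch runnable. */
    static <T> SampleStream<T> fromList(final List<T> items) {
        return new SampleStream<T>() {
            private Iterator<T> it = items.iterator();
            public T read() { return it.hasNext() ? it.next() : null; }
            public void reset() { it = items.iterator(); }
            public void close() { }
        };
    }

    /** Reads until null signals that the stream is exhausted. */
    static <T> List<T> drain(SampleStream<T> stream) throws IOException {
        List<T> seen = new ArrayList<>();
        T sample;
        while ((sample = stream.read()) != null) {
            seen.add(sample);
        }
        return seen;
    }

    /** Makes two passes over the stream and always closes it afterwards. */
    static <T> boolean secondPassMatches(SampleStream<T> stream) throws IOException {
        try {
            List<T> first = drain(stream);
            stream.reset(); // optional operation; may throw UnsupportedOperationException
            List<T> second = drain(stream);
            return first.equals(second);
        } finally {
            stream.close(); // no read() or reset() after this point
        }
    }
}
```

Closing in a finally block releases the resources even when reset is unsupported or a read fails partway through.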


Copyright © 2010. All Rights Reserved.