edu.mit.jwi
Class DataSourceDictionary

java.lang.Object
  extended by edu.mit.jwi.DataSourceDictionary
All Implemented Interfaces:
IClosable, IHasCharset, IHasLifecycle, IDataSourceDictionary, IDictionary, IHasVersion

public class DataSourceDictionary
extends java.lang.Object
implements IDataSourceDictionary

Basic implementation of the IDictionary interface. A path to the WordNet dictionary files must be provided. If no IDataProvider is specified, the dictionary uses the default implementation provided with the distribution.

Since:
JWI 2.2.0
Version:
2.4.0
Author:
Mark A. Finlayson

Nested Class Summary
 class DataSourceDictionary.DataFileIterator
          Iterates over data files.
 class DataSourceDictionary.ExceptionFileIterator
          Iterates over exception files.
 class DataSourceDictionary.FileIterator<T,N>
          Abstract class used for iterating over line-based files.
 class DataSourceDictionary.FileIterator2<T>
          A file iterator where the data type returned by the iterator is the same as that returned by the backing data source.
 class DataSourceDictionary.IndexFileIterator
          Iterates over index files.
 class DataSourceDictionary.SenseEntryFileIterator
          Iterates over the sense file.
 
Nested classes/interfaces inherited from interface edu.mit.jwi.data.IHasLifecycle
IHasLifecycle.LifecycleState, IHasLifecycle.ObjectClosedException, IHasLifecycle.ObjectOpenException
 
Constructor Summary
DataSourceDictionary(IDataProvider provider)
          Constructs a dictionary with a caller-specified IDataProvider.
 
Method Summary
protected  void checkOpen()
          An internal method that enforces the dictionary interface contract: methods throw ObjectClosedException if the dictionary has not yet been opened.
 void close()
          This closes the object by disposing of data backing objects or connections.
 java.nio.charset.Charset getCharset()
          Returns the character set associated with this object.
 IDataProvider getDataProvider()
          Returns the data provider for this dictionary.
 IExceptionEntry getExceptionEntry(IExceptionEntryID id)
          Retrieves the exception entry for the specified id from the database.
 IExceptionEntry getExceptionEntry(java.lang.String surfaceForm, POS pos)
          Retrieves the exception entry for the specified surface form and part of speech from the database.
 java.util.Iterator<IExceptionEntry> getExceptionEntryIterator(POS pos)
          Returns an iterator that will iterate over all exception entries of the specified part of speech.
 IIndexWord getIndexWord(IIndexWordID id)
          Retrieves the specified index word object from the database.
 IIndexWord getIndexWord(java.lang.String lemma, POS pos)
          This method is identical to getIndexWord(IIndexWordID) and is provided as a convenience.
 java.util.Iterator<IIndexWord> getIndexWordIterator(POS pos)
          Returns an iterator that will iterate over all index words of the specified part of speech.
 ISenseEntry getSenseEntry(ISenseKey key)
          Retrieves the sense entry for the specified sense key from the database.
 java.util.Iterator<ISenseEntry> getSenseEntryIterator()
          Returns an iterator that will iterate over all sense entries in the dictionary.
 ISynset getSynset(ISynsetID id)
          Retrieves the synset with the specified id from the database.
 java.util.Iterator<ISynset> getSynsetIterator(POS pos)
          Returns an iterator that will iterate over all synsets of the specified part of speech.
 IVersion getVersion()
          Returns the associated version for this object.
 IWord getWord(ISenseKey key)
          Retrieves the word with the specified sense key from the database.
 IWord getWord(IWordID id)
          Retrieves the word with the specified id from the database.
 boolean isOpen()
          Returns true if the dictionary is open, that is, ready to accept queries; returns false otherwise.
 boolean open()
          This opens the object by performing any required initialization steps.
 void setCharset(java.nio.charset.Charset charset)
          Sets the character set associated with this dictionary.
protected  void setHeadWord(ISynset synset)
          This method sets the head word on the specified synset by searching in the dictionary to find the head of its cluster.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

DataSourceDictionary

public DataSourceDictionary(IDataProvider provider)
Constructs a dictionary with a caller-specified IDataProvider.

Throws:
java.lang.NullPointerException - if the specified data provider is null
Method Detail

getDataProvider

public IDataProvider getDataProvider()
Description copied from interface: IDataSourceDictionary
Returns the data provider for this dictionary. Should never return null.

Specified by:
getDataProvider in interface IDataSourceDictionary
Returns:
the data provider for this dictionary

getVersion

public IVersion getVersion()
Description copied from interface: IHasVersion
Returns the associated version for this object. If this object is not associated with any particular version, this method may return null.

Specified by:
getVersion in interface IHasVersion
Returns:
The associated version, or null if none.

open

public boolean open()
             throws java.io.IOException
Description copied from interface: IHasLifecycle
This opens the object by performing any required initialization steps. If this method returns false, then subsequent calls to IHasLifecycle.isOpen() will return false.

Specified by:
open in interface IHasLifecycle
Returns:
true if there were no errors in initialization; false otherwise.
Throws:
java.io.IOException - if there was an I/O error while performing initialization

close

public void close()
Description copied from interface: IClosable
This closes the object by disposing of data backing objects or connections. If the object is already closed, or in the process of closing, this method does nothing (although, if the object is in the process of closing, it may block until closing is complete).

Specified by:
close in interface IClosable

isOpen

public boolean isOpen()
Description copied from interface: IHasLifecycle
Returns true if the dictionary is open, that is, ready to accept queries; returns false otherwise.

Specified by:
isOpen in interface IHasLifecycle
Returns:
true if the object is open; false otherwise

checkOpen

protected void checkOpen()
An internal method that enforces the dictionary interface contract: methods throw ObjectClosedException if the dictionary has not yet been opened.

Throws:
ObjectClosedException - if the dictionary is closed.
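
The lifecycle contract described for open(), close(), isOpen(), and checkOpen() can be illustrated with a small stand-in class. This is a sketch only: GuardedResource and its lookup method are hypothetical names that do not come from JWI, and IllegalStateException stands in for ObjectClosedException so that the example compiles without the library.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a dictionary-like object with an open/close lifecycle.
// None of these names come from JWI; IllegalStateException replaces ObjectClosedException.
class GuardedResource {
    private Map<String, String> data; // null while closed

    // Performs initialization; returns true when the object is ready for queries.
    public boolean open() {
        if (data == null) {
            data = new HashMap<>();
            data.put("dog", "a domesticated canine");
        }
        return true;
    }

    // Disposes of backing data; does nothing if already closed.
    public void close() {
        data = null;
    }

    public boolean isOpen() {
        return data != null;
    }

    // Guard called at the top of every query method.
    protected void checkOpen() {
        if (!isOpen()) {
            throw new IllegalStateException("resource is closed");
        }
    }

    // Query method: fails fast when closed, returns null when the key is absent.
    public String lookup(String key) {
        checkOpen();
        return data.get(key);
    }
}
```

Calling the guard first in every query method keeps the "closed" failure distinct from the "not found" result, which is reported as null.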

getCharset

public java.nio.charset.Charset getCharset()
Description copied from interface: IHasCharset
Returns the character set associated with this object. May be null.

Specified by:
getCharset in interface IHasCharset
Returns:
the Charset associated with this object, possibly null

setCharset

public void setCharset(java.nio.charset.Charset charset)
Description copied from interface: IDictionary
Sets the character set associated with this dictionary. The character set may be null.

Specified by:
setCharset in interface IDictionary
Parameters:
charset - the possibly null character set to use when decoding files.
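
Because the character set may be null, code that decodes file contents has to decide what null means. The sketch below assumes that null means "use the platform default"; that is an assumption of this example, not something this page states about JWI. CharsetSketch and decode are hypothetical names.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not part of JWI: decodes raw bytes with a possibly null charset.
class CharsetSketch {
    // Assumption: a null charset means "use the platform default".
    static String decode(byte[] bytes, Charset charset) {
        Charset effective = (charset != null) ? charset : Charset.defaultCharset();
        return new String(bytes, effective);
    }
}
```

Decoding the same bytes with a different character set can yield different text, so a character set is best chosen before any data is read.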

getIndexWord

public IIndexWord getIndexWord(java.lang.String lemma,
                               POS pos)
Description copied from interface: IDictionary
This method is identical to getIndexWord(IIndexWordID) and is provided as a convenience.

Specified by:
getIndexWord in interface IDictionary
Parameters:
lemma - the lemma for the index word requested; may not be null, empty, or all whitespace
pos - the part of speech; may not be null
Returns:
the index word corresponding to the specified lemma and part of speech, or null if none is found

getIndexWord

public IIndexWord getIndexWord(IIndexWordID id)
Description copied from interface: IDictionary
Retrieves the specified index word object from the database. If the specified lemma/part of speech combination is not found, returns null.

Note: This call does no stemming on the specified lemma; it is taken as specified. That is, if you submit the word "dogs", it will search for "dogs", not "dog"; in the standard WordNet distribution, there is no entry for "dogs" and therefore the call will return null. This is in contrast to the WordNet API provided by Princeton. If you want your searches to capture morphological variation, use an implementation of the IStemmer interface.

Specified by:
getIndexWord in interface IDictionary
Parameters:
id - the id of the index word to search for; may not be null
Returns:
the index word, if found; null otherwise
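
The exact-match behavior described in the note can be sketched with a stand-in index. ExactLookupSketch, find, and findWithNaiveStemming are hypothetical names, not JWI code, and the "strip a trailing s" rule is deliberately naive; it only shows where a stemming step would sit in front of an exact lookup.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for an index keyed by exact lemma; not JWI code.
class ExactLookupSketch {
    private final Set<String> lemmas = new HashSet<>(Arrays.asList("dog", "cat"));

    // Exact match only: returns the lemma if present, null otherwise.
    String find(String lemma) {
        return lemmas.contains(lemma) ? lemma : null;
    }

    // Tries the surface form as given, then a naive "strip trailing s" guess.
    // A real stemmer covers many more patterns plus the exception lists.
    String findWithNaiveStemming(String surfaceForm) {
        String hit = find(surfaceForm);
        if (hit == null && surfaceForm.length() > 1 && surfaceForm.endsWith("s")) {
            hit = find(surfaceForm.substring(0, surfaceForm.length() - 1));
        }
        return hit;
    }
}
```

In this sketch find("dogs") returns null while findWithNaiveStemming("dogs") falls back to "dog"; irregular forms such as "mice" still miss, which is the gap the exception entries exist to cover.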

getWord

public IWord getWord(IWordID id)
Description copied from interface: IDictionary
Retrieves the word with the specified id from the database. If the specified word is not found, returns null.

Specified by:
getWord in interface IDictionary
Parameters:
id - the id of the word to search for; may not be null
Returns:
the word, if found; null otherwise

getWord

public IWord getWord(ISenseKey key)
Description copied from interface: IDictionary
Retrieves the word with the specified sense key from the database. If the specified word is not found, returns null.

Specified by:
getWord in interface IDictionary
Parameters:
key - the sense key of the word to search for; may not be null
Returns:
the word, if found; null otherwise

getSenseEntry

public ISenseEntry getSenseEntry(ISenseKey key)
Description copied from interface: IDictionary
Retrieves the sense entry for the specified sense key from the database. If the specified sense key has no associated sense entry, returns null.

Specified by:
getSenseEntry in interface IDictionary
Parameters:
key - the sense key of the entry to search for; may not be null
Returns:
the entry, if found; null otherwise

getSynset

public ISynset getSynset(ISynsetID id)
Description copied from interface: IDictionary
Retrieves the synset with the specified id from the database. If the specified synset is not found, returns null.

Specified by:
getSynset in interface IDictionary
Parameters:
id - the id of the synset to search for; may not be null
Returns:
the synset, if found; null otherwise

setHeadWord

protected void setHeadWord(ISynset synset)
This method sets the head word on the specified synset by searching in the dictionary to find the head of its cluster. The head is assumed to be the first adjective head synset related to this synset by an '&' (SIMILAR_TO) pointer.
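
The search described above can be sketched with stand-in types. SynsetSketch, HeadWordSketch, and all of their fields are hypothetical and do not mirror JWI's classes; taking the head synset's first lemma as the head word is likewise an assumption of this example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a synset; field names are illustrative, not JWI's.
class SynsetSketch {
    final String firstLemma;
    final boolean adjectiveHead;
    final List<SynsetSketch> similarTo = new ArrayList<>(); // targets of '&' pointers
    String headWord; // stays null until a head is found

    SynsetSketch(String firstLemma, boolean adjectiveHead) {
        this.firstLemma = firstLemma;
        this.adjectiveHead = adjectiveHead;
    }
}

class HeadWordSketch {
    // Sets the head word from the first SIMILAR_TO target that is an adjective head.
    // Leaves the synset unchanged when no such target exists.
    static void setHeadWord(SynsetSketch satellite) {
        for (SynsetSketch related : satellite.similarTo) {
            if (related.adjectiveHead) {
                satellite.headWord = related.firstLemma;
                return;
            }
        }
    }
}
```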


getExceptionEntry

public IExceptionEntry getExceptionEntry(java.lang.String surfaceForm,
                                         POS pos)
Description copied from interface: IDictionary
Retrieves the exception entry for the specified surface form and part of speech from the database. If the specified surface form/part of speech pair has no associated exception entry, returns null.

Specified by:
getExceptionEntry in interface IDictionary
Parameters:
surfaceForm - the surface form to be looked up; may not be null, empty, or all whitespace
pos - the part of speech; may not be null
Returns:
the entry, if found; null otherwise

getExceptionEntry

public IExceptionEntry getExceptionEntry(IExceptionEntryID id)
Description copied from interface: IDictionary
Retrieves the exception entry for the specified id from the database. If the specified id is not found, returns null.

Specified by:
getExceptionEntry in interface IDictionary
Parameters:
id - the exception entry id of the entry to search for; may not be null
Returns:
the exception entry for the specified id, or null if none is found

getIndexWordIterator

public java.util.Iterator<IIndexWord> getIndexWordIterator(POS pos)
Description copied from interface: IDictionary
Returns an iterator that will iterate over all index words of the specified part of speech.

Specified by:
getIndexWordIterator in interface IDictionary
Parameters:
pos - the part of speech over which to iterate; may not be null
Returns:
an iterator that will iterate over all index words of the specified part of speech

getSynsetIterator

public java.util.Iterator<ISynset> getSynsetIterator(POS pos)
Description copied from interface: IDictionary
Returns an iterator that will iterate over all synsets of the specified part of speech.

Specified by:
getSynsetIterator in interface IDictionary
Parameters:
pos - the part of speech over which to iterate; may not be null
Returns:
an iterator that will iterate over all synsets of the specified part of speech

getExceptionEntryIterator

public java.util.Iterator<IExceptionEntry> getExceptionEntryIterator(POS pos)
Description copied from interface: IDictionary
Returns an iterator that will iterate over all exception entries of the specified part of speech.

Specified by:
getExceptionEntryIterator in interface IDictionary
Parameters:
pos - the part of speech over which to iterate; may not be null
Returns:
an iterator that will iterate over all exception entries of the specified part of speech

getSenseEntryIterator

public java.util.Iterator<ISenseEntry> getSenseEntryIterator()
Description copied from interface: IDictionary
Returns an iterator that will iterate over all sense entries in the dictionary.

Specified by:
getSenseEntryIterator in interface IDictionary
Returns:
an iterator that will iterate over all sense entries
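
The iterator methods above walk line-based files and hand back parsed objects. The general pattern can be sketched as an iterator that reads ahead one line, skips header lines, and parses on demand. LineParsingIterator is a hypothetical name and not the JWI FileIterator implementation; treating lines that start with two spaces as header lines is an assumption about the file layout.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical line-based iterator; not the JWI FileIterator implementation.
class LineParsingIterator<T> implements Iterator<T> {
    private final Iterator<String> lines;
    private final Function<String, T> parser;
    private String nextLine; // next data line, or null when exhausted

    LineParsingIterator(Iterator<String> lines, Function<String, T> parser) {
        this.lines = lines;
        this.parser = parser;
        advance();
    }

    // Moves to the next line that is not a header line.
    // Assumption: header lines start with two spaces.
    private void advance() {
        nextLine = null;
        while (lines.hasNext()) {
            String candidate = lines.next();
            if (!candidate.startsWith("  ")) {
                nextLine = candidate;
                return;
            }
        }
    }

    @Override
    public boolean hasNext() {
        return nextLine != null;
    }

    @Override
    public T next() {
        if (nextLine == null) {
            throw new NoSuchElementException();
        }
        T parsed = parser.apply(nextLine);
        advance();
        return parsed;
    }
}
```

Reading one line ahead lets hasNext() answer without parsing anything, and each line is parsed only when next() is called.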


Copyright © 2007-2013 Massachusetts Institute of Technology. All Rights Reserved.