edu.mit.jwi.data
Class FileProvider

java.lang.Object
  extended by edu.mit.jwi.data.FileProvider
All Implemented Interfaces:
IClosable, IDataProvider, IHasCharset, IHasLifecycle, ILoadable, ILoadPolicy, IHasVersion

public class FileProvider
extends java.lang.Object
implements IDataProvider, ILoadable, ILoadPolicy

Implementation of a data provider for Wordnet that uses files in the file system to back instances of its data sources. This implementation takes a URL to a file system directory as its path argument, and uses the resource hints from the data types and parts of speech for its content types to examine the filenames in that directory to determine which files contain which data.
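
The filename matching described above can be pictured with a small sketch. It is not JWI's code: the class HintMatchSketch, its methods, and the hint strings (data, dat, noun) are assumptions chosen for illustration. In the library the hints come from the data type and part-of-speech objects.

```java
import java.util.List;
import java.util.Locale;

/** Hypothetical illustration of hint-based filename matching; not part of JWI. */
final class HintMatchSketch {

    /** Returns true if the lower-cased file name contains at least one hint. */
    static boolean containsAny(String fileName, List<String> hints) {
        String name = fileName.toLowerCase(Locale.ROOT);
        for (String hint : hints) {
            if (name.contains(hint.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }

    /** A file is a candidate for a content type when it matches both hint sets. */
    static boolean matches(String fileName, List<String> typeHints, List<String> posHints) {
        return containsAny(fileName, typeHints) && containsAny(fileName, posHints);
    }
}
```

With the assumed hints, data.noun and NOUN.DAT both match, while index.noun does not.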

This implementation supports loading the Wordnet files into memory, but doing so yields little speed benefit. The implementation loads the file data into memory uninterpreted, and on modern machines the time to interpret a line of data (i.e., parse it into a Java object) is much larger than the time it takes to load the line from disk. Those wishing to achieve speed increases from loading Wordnet into memory should rely on the implementation in RAMDictionary, or something similar, which pre-processes the Wordnet data into objects before caching them.
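
To make the reasoning concrete, here is a minimal sketch of caching parsed objects rather than raw lines. It does not show how RAMDictionary works internally. The class ParsedCacheSketch and its two function parameters are hypothetical stand-ins: rawLookup plays the role of fetching an uninterpreted line, and parser plays the role of the costly interpretation step.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical illustration of caching parsed objects; not part of JWI. */
final class ParsedCacheSketch<K, V> {
    private final Map<K, V> cache = new HashMap<K, V>();
    private final Function<K, String> rawLookup; // stands in for reading a raw line
    private final Function<String, V> parser;    // stands in for the costly parse
    private int parseCount = 0;

    ParsedCacheSketch(Function<K, String> rawLookup, Function<String, V> parser) {
        this.rawLookup = rawLookup;
        this.parser = parser;
    }

    /** Returns the parsed object, parsing at most once per key; null if no line exists. */
    V get(K key) {
        V hit = cache.get(key);
        if (hit != null) {
            return hit;
        }
        String line = rawLookup.apply(key);
        if (line == null) {
            return null;
        }
        parseCount++;
        V parsed = parser.apply(line);
        cache.put(key, parsed);
        return parsed;
    }

    /** Number of times the parser has run, to show the effect of caching. */
    int parseCount() {
        return parseCount;
    }
}
```

Repeated lookups of the same key run the parser only once, which is the saving that caching uninterpreted lines cannot provide.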

Since:
JWI 1.0
Version:
2.4.0
Author:
Mark A. Finlayson

Nested Class Summary
protected  class FileProvider.JWIBackgroundLoader
          A thread class which tries to load each data source in this provider.
 
Nested classes/interfaces inherited from interface edu.mit.jwi.data.IHasLifecycle
IHasLifecycle.LifecycleState, IHasLifecycle.ObjectClosedException, IHasLifecycle.ObjectOpenException
 
Field Summary
 
Fields inherited from interface edu.mit.jwi.data.ILoadPolicy
BACKGROUND_LOAD, IMMEDIATE_LOAD, NO_LOAD
 
Constructor Summary
FileProvider(java.io.File file)
          Constructs the file provider pointing to the resource indicated by the path.
FileProvider(java.io.File file, int loadPolicy)
          Constructs the file provider pointing to the resource indicated by the path, with the specified load policy.
FileProvider(java.io.File file, int loadPolicy, java.util.Collection<? extends IContentType<?>> types)
          Constructs the file provider pointing to the resource indicated by the path, with the specified load policy, looking for the specified content types.
FileProvider(java.net.URL url)
          Constructs the file provider pointing to the resource indicated by the path.
FileProvider(java.net.URL url, int loadPolicy)
          Constructs the file provider pointing to the resource indicated by the path, with the specified load policy.
FileProvider(java.net.URL url, int loadPolicy, java.util.Collection<? extends IContentType<?>> types)
          Constructs the file provider pointing to the resource indicated by the path, with the specified load policy, looking for the specified content types.
 
Method Summary
protected  void checkOpen()
          Convenience method that throws an exception if the provider is closed.
 void close()
          This closes the object by disposing of data backing objects or connections.
protected
<T> ILoadableDataSource<T>
createBinarySearch(java.io.File file, IContentType<T> type)
          Creates a binary search data source for the specified type, using the specified file.
protected
<T> ILoadableDataSource<T>
createDataSource(java.io.File file, IContentType<T> type, int policy)
          Creates the actual data source implementations.
protected
<T> ILoadableDataSource<T>
createDirectAccess(java.io.File file, IContentType<T> type)
          Creates a direct access data source for the specified type, using the specified file.
protected  java.util.Map<IContentType<?>,ILoadableDataSource<?>> createSourceMap(java.util.List<java.io.File> files, int policy)
          Creates the map that contains the content types mapped to the data sources.
protected  IVersion determineVersion(java.util.Collection<? extends IDataSource<?>> srcs)
          Determines a version from the set of data sources, if possible; otherwise returns IVersion.NO_VERSION.
 java.nio.charset.Charset getCharset()
          Returns the character set associated with this object.
 int getLoadPolicy()
          Returns the load policy for this object, expressed as an integer.
 java.net.URL getSource()
          Returns the URL that points to the resource location; should never return null.
<T> ILoadableDataSource<T>
getSource(IContentType<T> type)
          Returns a data source object for the specified content type, if one is available; otherwise returns null.
 java.util.Set<? extends IContentType<?>> getTypes()
          Returns a set containing all the content types this provider looks for at the resource location.
 IVersion getVersion()
          Returns the associated version for this object.
 boolean isLoaded()
          Returns whether this object is loaded or not.
static boolean isLocalDirectory(java.io.File dir)
          A utility method for checking whether a file represents an existing local directory.
static boolean isLocalDirectory(java.net.URL url)
          A utility method for checking whether a URL represents an existing local directory.
 boolean isOpen()
          Returns true if the dictionary is open, that is, ready to accept queries; returns false otherwise.
 void load()
          Starts a simple, non-blocking load.
 void load(boolean block)
          Initiates the loading process.
 boolean open()
          This opens the object by performing any required initialization steps.
<T> IContentType<T>
resolveContentType(IDataType<T> dt, POS pos)
          Returns the first content type, if any, that matches the specified data type and pos object.
 void setCharset(java.nio.charset.Charset charset)
          Sets the character set associated with this dictionary.
 void setLoadPolicy(int policy)
          Sets the load policy for this object.
 void setSource(java.net.URL url)
          This method is used to set the source URL from which the provider accesses the data from which it instantiates data sources.
static java.io.File toFile(java.net.URL url)
          Transforms a URL into a File.
static java.net.URL toURL(java.io.File file)
          Transforms a file into a URL.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

FileProvider

public FileProvider(java.io.File file)
Constructs the file provider pointing to the resource indicated by the path. This file provider has an initial ILoadPolicy.NO_LOAD load policy.

Parameters:
file - A file pointing to the wordnet directory, may not be null
Throws:
java.lang.NullPointerException - if the specified file is null
Since:
JWI 1.0

FileProvider

public FileProvider(java.io.File file,
                    int loadPolicy)
Constructs the file provider pointing to the resource indicated by the path, with the specified load policy.

Parameters:
file - A file pointing to the wordnet directory, may not be null
loadPolicy - the load policy for this provider; this provider supports the three values defined in ILoadPolicy.
Throws:
java.lang.NullPointerException - if the specified file is null
Since:
JWI 2.2.0

FileProvider

public FileProvider(java.io.File file,
                    int loadPolicy,
                    java.util.Collection<? extends IContentType<?>> types)
Constructs the file provider pointing to the resource indicated by the path, with the specified load policy, looking for the specified content types.

Parameters:
file - A file pointing to the wordnet directory, may not be null
loadPolicy - the load policy for this provider; this provider supports the three values defined in ILoadPolicy.
types - the content types this provider will look for when it loads its data; may not be null or empty
Throws:
java.lang.NullPointerException - if the file or content type collection is null
java.lang.IllegalArgumentException - if the set of types is empty
Since:
JWI 2.2.0

FileProvider

public FileProvider(java.net.URL url)
Constructs the file provider pointing to the resource indicated by the path. This file provider has an initial ILoadPolicy.NO_LOAD load policy.

Parameters:
url - A file URL in UTF-8 decodable format, may not be null
Throws:
java.lang.NullPointerException - if the specified URL is null
Since:
JWI 1.0

FileProvider

public FileProvider(java.net.URL url,
                    int loadPolicy)
Constructs the file provider pointing to the resource indicated by the path, with the specified load policy.

Parameters:
url - A file URL in UTF-8 decodable format, may not be null
loadPolicy - the load policy for this provider; this provider supports the three values defined in ILoadPolicy.
Throws:
java.lang.NullPointerException - if the specified URL is null
Since:
JWI 2.2.0

FileProvider

public FileProvider(java.net.URL url,
                    int loadPolicy,
                    java.util.Collection<? extends IContentType<?>> types)
Constructs the file provider pointing to the resource indicated by the path, with the specified load policy, looking for the specified content types.

Parameters:
url - A file URL in UTF-8 decodable format, may not be null
loadPolicy - the load policy for this provider; this provider supports the three values defined in ILoadPolicy.
types - the content types this provider will look for when it loads its data; may not be null or empty
Throws:
java.lang.NullPointerException - if the url or content type collection is null
java.lang.IllegalArgumentException - if the set of types is empty
Since:
JWI 2.2.0

Method Detail

getSource

public java.net.URL getSource()
Description copied from interface: IDataProvider
Returns the URL that points to the resource location; should never return null.

Specified by:
getSource in interface IDataProvider
Returns:
the URL that points to the resource location; must not be null

getLoadPolicy

public int getLoadPolicy()
Description copied from interface: ILoadPolicy
Returns the load policy for this object, expressed as an integer.

Specified by:
getLoadPolicy in interface ILoadPolicy
Returns:
the load policy for this object

setSource

public void setSource(java.net.URL url)
Description copied from interface: IDataProvider
This method is used to set the source URL from which the provider accesses the data from which it instantiates data sources. The data at the specified location may be in an implementation-specific format. If the provider is currently open, this method throws an IllegalStateException.

Specified by:
setSource in interface IDataProvider
Parameters:
url - the location of the data, may not be null

setLoadPolicy

public void setLoadPolicy(int policy)
Description copied from interface: ILoadPolicy
Sets the load policy for this object. If the object is currently loaded, or in the process of loading, the load policy will not take effect until the next time the object is instantiated, initialized, or opened.

Specified by:
setLoadPolicy in interface ILoadPolicy
Parameters:
policy - the policy to implement; may be one of NO_LOAD, BACKGROUND_LOAD, IMMEDIATE_LOAD or an implementation-dependent value.

getVersion

public IVersion getVersion()
Description copied from interface: IHasVersion
Returns the associated version for this object. If this object is not associated with any particular version, this method may return null.

Specified by:
getVersion in interface IHasVersion
Returns:
The associated version, or null if none.

determineVersion

protected IVersion determineVersion(java.util.Collection<? extends IDataSource<?>> srcs)
Determines a version from the set of data sources, if possible; otherwise returns IVersion.NO_VERSION.

Parameters:
srcs - the data sources to be used to determine the version
Returns:
the single version that describes these data sources, or IVersion.NO_VERSION if there is none
Since:
JWI 2.1.0
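
One plausible reading of "the single version that describes these data sources" is sketched below. This is an assumption, not the library's confirmed logic: versions are modelled as plain strings, the constant NO_VERSION stands in for IVersion.NO_VERSION, and sources without version information are modelled as null entries that are skipped.

```java
import java.util.Collection;

/** Hypothetical illustration of picking a single agreed version; not part of JWI. */
final class VersionSketch {

    /** Stand-in for IVersion.NO_VERSION. */
    static final String NO_VERSION = "NO_VERSION";

    /**
     * Returns the one version shared by all entries that report a version,
     * or NO_VERSION if none report one or if two entries disagree.
     */
    static String determineVersion(Collection<String> versions) {
        String found = null;
        for (String v : versions) {
            if (v == null) {
                continue;              // this source carries no version information
            }
            if (found == null) {
                found = v;             // first version seen
            } else if (!found.equals(v)) {
                return NO_VERSION;     // conflicting versions, so no single answer
            }
        }
        return found == null ? NO_VERSION : found;
    }
}
```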

getCharset

public java.nio.charset.Charset getCharset()
Description copied from interface: IHasCharset
Returns the character set associated with this object. May be null.

Specified by:
getCharset in interface IHasCharset
Returns:
the Charset associated with this object, possibly null

setCharset

public void setCharset(java.nio.charset.Charset charset)
Description copied from interface: IDataProvider
Sets the character set associated with this dictionary. The character set may be null.

Specified by:
setCharset in interface IDataProvider
Parameters:
charset - the possibly null character set to use when decoding files.

resolveContentType

public <T> IContentType<T> resolveContentType(IDataType<T> dt,
                                              POS pos)
Description copied from interface: IDataProvider
Returns the first content type, if any, that matches the specified data type and pos object. Either parameter may be null.

Specified by:
resolveContentType in interface IDataProvider
Parameters:
dt - the data type, possibly null, of the desired content type
pos - the part of speech, possibly null, of the desired content type
Returns:
the first content type that matches the specified data type and part of speech.

open

public boolean open()
             throws java.io.IOException
Description copied from interface: IHasLifecycle
This opens the object by performing any required initialization steps. If this method returns false, then subsequent calls to IHasLifecycle.isOpen() will return false.

Specified by:
open in interface IHasLifecycle
Returns:
true if there were no errors in initialization; false otherwise.
Throws:
java.io.IOException - if there was an IO error while performing initialization

load

public void load()
Description copied from interface: ILoadable
Starts a simple, non-blocking load. If the object is already loaded, the method returns immediately and has no effect. If the object is in the process of loading, the method also returns immediately.

Specified by:
load in interface ILoadable

load

public void load(boolean block)
          throws java.lang.InterruptedException
Description copied from interface: ILoadable
Initiates the loading process. Depending on the flag, the method may return immediately (block is false), or return only when the loading process is complete. If the object is already loaded, the method returns immediately and has no effect. If the object is in the process of loading, and the method is called in blocking mode, the method blocks until loading is complete, even if that call of the method did not initiate the loading process. Some implementors of this interface may not support the immediate-return functionality.

Specified by:
load in interface ILoadable
Parameters:
block - if true, the method returns only when the loading process is complete; if false, the method returns immediately.
Throws:
java.lang.InterruptedException - if the method is blocking, and is interrupted while waiting for loading to complete
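
The blocking and non-blocking behaviour described above can be sketched with a single background thread. This is only an illustration under stated assumptions, not the provider's implementation: LoadableSketch and its Runnable parameter are hypothetical, and failure handling is left out (if the work throws, the object simply stays unloaded).

```java
/** Hypothetical illustration of load(boolean) semantics; not part of JWI. */
final class LoadableSketch {
    private final Object lock = new Object();
    private final Runnable work;           // stands in for loading the data sources
    private Thread loader;                 // guarded by lock
    private volatile boolean loaded = false;

    LoadableSketch(Runnable work) {
        this.work = work;
    }

    boolean isLoaded() {
        return loaded;
    }

    /** Starts loading if needed; waits for completion only when block is true. */
    void load(boolean block) throws InterruptedException {
        Thread t;
        synchronized (lock) {
            if (loaded) {
                return;                    // already loaded: no effect
            }
            if (loader == null) {          // nobody has started loading yet
                loader = new Thread(() -> {
                    work.run();
                    loaded = true;         // set only after the work completes
                }, "sketch-loader");
                loader.setDaemon(true);
                loader.start();
            }
            t = loader;
        }
        if (block) {
            t.join();                      // also waits on a load started by another caller
        }
    }
}
```

A caller that passes true waits for the loader thread even when a different call started it, matching the behaviour described for blocking mode.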

isLoaded

public boolean isLoaded()
Description copied from interface: ILoadable
Returns whether this object is loaded or not. This method should return true only if the loading process has completed and the object is actually loaded; if the object is still in the process of loading, or failed to load, the method should return false.

Specified by:
isLoaded in interface ILoadable
Returns:
true if the object has completed loading; false otherwise

createSourceMap

protected java.util.Map<IContentType<?>,ILoadableDataSource<?>> createSourceMap(java.util.List<java.io.File> files,
                                                                                int policy)
                                                                         throws java.io.IOException
Creates the map that contains the content types mapped to the data sources. The method should return a non-null result, but it may be empty if no data sources can be created. Subclasses may override this method.

Parameters:
files - the files from which the data sources should be created, may not be null
policy - the load policy of the provider
Returns:
a map, possibly empty, but not null, of content types mapped to data sources
Throws:
java.lang.NullPointerException - if the file list is null
java.io.IOException - if there is a problem creating the data source
Since:
JWI 2.2.0

createDataSource

protected <T> ILoadableDataSource<T> createDataSource(java.io.File file,
                                                      IContentType<T> type,
                                                      int policy)
                                           throws java.io.IOException
Creates the actual data source implementations.

Type Parameters:
T - the content type of the data source
Parameters:
file - the file from which the data source should be created, may not be null
type - the content type of the data source
policy - the load policy to follow when creating the data source
Returns:
the created data source
Throws:
java.lang.NullPointerException - if any argument is null
java.io.IOException - if there is an IO problem when creating the data source
Since:
JWI 2.2.0

createDirectAccess

protected <T> ILoadableDataSource<T> createDirectAccess(java.io.File file,
                                                        IContentType<T> type)
                                             throws java.io.IOException
Creates a direct access data source for the specified type, using the specified file.

Type Parameters:
T - the parameter of the content type
Parameters:
file - the file on which the data source is based; may not be null
type - the data type for the data source; may not be null
Returns:
the data source
Throws:
java.lang.NullPointerException - if either argument is null
java.io.IOException - if there is an IO problem when creating the data source object
Since:
JWI 2.2.0

createBinarySearch

protected <T> ILoadableDataSource<T> createBinarySearch(java.io.File file,
                                                        IContentType<T> type)
                                             throws java.io.IOException
Creates a binary search data source for the specified type, using the specified file.

Type Parameters:
T - the parameter of the content type
Parameters:
file - the file on which the data source is based; may not be null
type - the data type for the data source; may not be null
Returns:
the data source
Throws:
java.lang.NullPointerException - if either argument is null
java.io.IOException - if there is an IO problem when creating the data source object
Since:
JWI 2.2.0

isOpen

public boolean isOpen()
Description copied from interface: IHasLifecycle
Returns true if the dictionary is open, that is, ready to accept queries; returns false otherwise.

Specified by:
isOpen in interface IHasLifecycle
Returns:
true if the object is open; false otherwise

close

public void close()
Description copied from interface: IClosable
This closes the object by disposing of data backing objects or connections. If the object is already closed, or in the process of closing, this method does nothing (although, if the object is in the process of closing, it may block until closing is complete).

Specified by:
close in interface IClosable

checkOpen

protected void checkOpen()
Convenience method that throws an exception if the provider is closed.

Throws:
ObjectClosedException - if the provider is closed
Since:
JWI 1.1
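
A guard of this kind is typically called at the top of every method that needs an open provider. The sketch below shows the pattern only: LifecycleSketch and its query method are hypothetical, and IllegalStateException stands in for JWI's ObjectClosedException so that the example needs nothing beyond the standard library.

```java
/** Hypothetical illustration of an open-state guard; not part of JWI. */
final class LifecycleSketch {
    private volatile boolean open = false;

    boolean open() {
        open = true;
        return true;
    }

    boolean isOpen() {
        return open;
    }

    void close() {
        open = false;
    }

    /** Throws if the object is closed; IllegalStateException is a stand-in exception type. */
    void checkOpen() {
        if (!open) {
            throw new IllegalStateException("provider is closed");
        }
    }

    /** Example of a guarded operation: it fails fast unless the object is open. */
    String query(String key) {
        checkOpen();
        return "result for " + key;
    }
}
```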

getSource

public <T> ILoadableDataSource<T> getSource(IContentType<T> type)
Description copied from interface: IDataProvider
Returns a data source object for the specified content type, if one is available; otherwise returns null.

Specified by:
getSource in interface IDataProvider
Type Parameters:
T - the content type of the data source
Parameters:
type - the content type of the data source to be retrieved
Returns:
the data source for the specified content type, or null if this provider has no such data source

getTypes

public java.util.Set<? extends IContentType<?>> getTypes()
Description copied from interface: IDataProvider
Returns a set containing all the content types this provider looks for at the resource location. The returned collection may be unmodifiable, or may be a copy of an internal array; in any event modification of the returned collection should not affect the set of types used by the provider.

Specified by:
getTypes in interface IDataProvider
Returns:
a non-null, non-empty set of content types for this provider
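
The contract above (a non-null, non-empty set whose modification cannot affect the provider) can be met with a defensive copy wrapped in an unmodifiable view. The following is a sketch of that idea, not the library's code. TypesSketch is a hypothetical class, and its constructor checks mirror the null and empty checks documented for the constructors that take a collection of content types.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical illustration of a defensively copied type set; not part of JWI. */
final class TypesSketch<T> {
    private final Set<T> types;

    TypesSketch(Collection<? extends T> types) {
        if (types == null) {
            throw new NullPointerException("types");
        }
        if (types.isEmpty()) {
            throw new IllegalArgumentException("types may not be empty");
        }
        // Copy first, then wrap: later changes to the caller's collection are not seen,
        // and callers cannot modify the set through the returned view.
        this.types = Collections.unmodifiableSet(new LinkedHashSet<T>(types));
    }

    Set<T> getTypes() {
        return types;
    }
}
```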

toFile

public static java.io.File toFile(java.net.URL url)
Transforms a URL into a File. The URL must use the 'file' protocol and must be in a UTF-8 compatible format as specified in URLDecoder.

Returns:
a file pointing to the same place as the url
Throws:
java.lang.NullPointerException - if the url is null
java.lang.IllegalArgumentException - if the url does not use the 'file' protocol
Since:
JWI 1.0
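
A conversion with the documented behaviour could look roughly like the sketch below. It is an assumption about the approach, not the method's actual source: UrlToFileSketch is a hypothetical class that checks the protocol and then decodes the URL path with java.net.URLDecoder using UTF-8.

```java
import java.io.File;
import java.io.UnsupportedEncodingException;
import java.net.URL;
import java.net.URLDecoder;

/** Hypothetical illustration of converting a file URL to a File; not part of JWI. */
final class UrlToFileSketch {

    static File toFile(URL url) {
        if (url == null) {
            throw new NullPointerException("url");
        }
        if (!"file".equals(url.getProtocol())) {
            throw new IllegalArgumentException("URL must use the 'file' protocol: " + url);
        }
        try {
            // Decodes percent-escapes such as %20 using UTF-8.
            return new File(URLDecoder.decode(url.getPath(), "UTF-8"));
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }
}
```

Note that URLDecoder follows form-encoding rules and turns a literal + into a space, so paths containing + need care with this approach.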

toURL

public static java.net.URL toURL(java.io.File file)
Transforms a file into a URL.

Parameters:
file - the file to be transformed
Returns:
a URL representing the file
Throws:
java.lang.NullPointerException - if the specified file is null
Since:
JWI 2.2.0

isLocalDirectory

public static boolean isLocalDirectory(java.net.URL url)
A utility method for checking whether a URL represents an existing local directory.

Parameters:
url - the url object to check, may not be null
Returns:
true if the url object represents a local directory which exists; false otherwise.
Throws:
java.lang.NullPointerException - if the specified url object is null
Since:
JWI 2.4.0

isLocalDirectory

public static boolean isLocalDirectory(java.io.File dir)
A utility method for checking whether a file represents an existing local directory.

Parameters:
dir - the file object to check, may not be null
Returns:
true if the file object represents a local directory which exists; false otherwise.
Throws:
java.lang.NullPointerException - if the specified file object is null
Since:
JWI 2.4.0
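
Both overloads can be approximated with standard library calls, as in the sketch below. LocalDirSketch is hypothetical, and returning false for any URL that does not use the 'file' protocol is an assumption about the intended behaviour rather than something the documentation states.

```java
import java.io.File;
import java.io.UnsupportedEncodingException;
import java.net.URL;
import java.net.URLDecoder;

/** Hypothetical illustration of a local-directory check; not part of JWI. */
final class LocalDirSketch {

    static boolean isLocalDirectory(File dir) {
        if (dir == null) {
            throw new NullPointerException("dir");
        }
        // File.isDirectory() is false when the path does not exist.
        return dir.isDirectory();
    }

    static boolean isLocalDirectory(URL url) {
        if (url == null) {
            throw new NullPointerException("url");
        }
        if (!"file".equals(url.getProtocol())) {
            return false;                  // assumed: non-file URLs are never local
        }
        try {
            return isLocalDirectory(new File(URLDecoder.decode(url.getPath(), "UTF-8")));
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is required of every Java platform, so this is not expected.
            throw new IllegalStateException(e);
        }
    }
}
```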


Copyright © 2007-2013 Massachusetts Institute of Technology. All Rights Reserved.