edu.mit.jmwe.detect
Class Exhaustive

java.lang.Object
  extended by edu.mit.jmwe.index.HasMWEIndex
      extended by edu.mit.jmwe.detect.Exhaustive
All Implemented Interfaces:
IMWEDetector, IHasMWEIndex

public class Exhaustive
extends HasMWEIndex
implements IMWEDetector

Implements an exhaustive algorithm that detects all possible non-stop-word MWEs in a sentence, including MWEs that are out of order or discontinuous. A "stop-word MWE" is an MWE that consists only of stop words, as defined by the set of strings returned by the getStopWords() method.

To detect stop-word MWEs, use the StopWords or TrulyExhaustive detectors.
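
The idea of matching out-of-order or discontinuous MWEs while skipping stop-word MWEs can be illustrated with a minimal sketch. It does not reproduce the jMWE implementation: ExhaustiveSketch and its methods are hypothetical, plain strings stand in for IToken and IMWE objects, and the index is assumed to be a simple list of word lists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in, not a jMWE class: strings replace IToken/IMWE objects.
class ExhaustiveSketch {

    // An entry counts as a stop-word MWE when every one of its parts is a stop word.
    static boolean isStopWordMWE(List<String> entry, Set<String> stopWords) {
        return stopWords.containsAll(entry);
    }

    // Returns every index entry whose parts all occur somewhere in the sentence,
    // in any order and with any gaps, skipping entries made only of stop words.
    static List<List<String>> detect(List<String> sentence,
                                     List<List<String>> index,
                                     Set<String> stopWords) {
        List<List<String>> results = new ArrayList<>();
        for (List<String> entry : index) {
            if (isStopWordMWE(entry, stopWords)) {
                continue;
            }
            // Work on a copy so each sentence token is matched at most once per entry.
            List<String> remaining = new ArrayList<>(sentence);
            boolean allPartsFound = true;
            for (String part : entry) {
                if (!remaining.remove(part)) {
                    allPartsFound = false;
                    break;
                }
            }
            if (allPartsFound) {
                results.add(entry);
            }
        }
        return results;
    }
}
```

With this sketch, an entry such as "looked up" is still reported when the sentence contains "up" before "looked" or has other words between them, while an entry such as "as of" is skipped whenever both words are in the stop-word set.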

Since:
jMWE 1.0.0
Version:
$Id: Exhaustive.java 610 2011-05-06 20:05:20Z markaf $
Author:
N. Kulkarni

Constructor Summary
Exhaustive(IMWEIndex index)
          Constructs the exhaustive detector from the given index of multi-word expressions.
 
Method Summary
protected <T extends IToken> boolean containsDuplicate(Collection<? extends IMWE<T>> results, IMWE<T> mwe)
          Returns true if the given collection of MWEs already contains the given MWE.
<T extends IToken> List<IMWE<T>> detect(List<T> sentence)
          Given a list of tokens, the detector searches for the MWEs in the list.
protected  Set<String> getStopWords()
          Returns the stop words used by this detector.
 
Methods inherited from class edu.mit.jmwe.index.HasMWEIndex
getMWEIndex
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Exhaustive

public Exhaustive(IMWEIndex index)
Constructs the exhaustive detector from the given index of multi-word expressions.

Parameters:
index - An IMWEIndex that can be used by the detector to look up MWEs. May not be null.
Throws:
NullPointerException - if the index is null
Since:
jMWE 1.0.0
Method Detail

detect

public <T extends IToken> List<IMWE<T>> detect(List<T> sentence)
Description copied from interface: IMWEDetector
Given a list of tokens, the detector searches for the MWEs in the list. It returns a list of IMWE objects representing these multi-word expressions. The method returns an empty list if no MWEs are found; the method should never return null.

Specified by:
detect in interface IMWEDetector
Type Parameters:
T - the type of the tokens in the sentence
Parameters:
sentence - the sentence, given as a list of tokens, that the detector should search for multi-word expressions.
Returns:
a list of IMWE objects representing the multi-word expressions found in the sentence. Returns an empty list if no multi-word expressions are found; never returns null
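
A minimal sketch of what the never-null contract means for callers. DetectContractSketch and both of its methods are hypothetical and use plain lists instead of jMWE types. The stand-in detector only finds one contiguous expression, which is far simpler than what Exhaustive does, but it follows the same rule of returning an empty list rather than null.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in, not a jMWE class.
class DetectContractSketch {

    // Mirrors the shape of detect(List<T>): the token type T flows into the result.
    // Assumption: only the first contiguous occurrence of one expression is searched for.
    static <T> List<List<T>> detect(List<T> sentence, List<T> expression) {
        if (sentence.isEmpty() || expression.isEmpty()) {
            return Collections.emptyList();
        }
        int start = Collections.indexOfSubList(sentence, expression);
        if (start < 0) {
            // Nothing found: return an empty list, never null.
            return Collections.emptyList();
        }
        List<List<T>> found = new ArrayList<>();
        found.add(sentence.subList(start, start + expression.size()));
        return found;
    }

    // Because the result is never null, the caller can iterate without a null check.
    static <T> int countTokensInMWEs(List<T> sentence, List<T> expression) {
        int count = 0;
        for (List<T> mwe : detect(sentence, expression)) {
            count += mwe.size();
        }
        return count;
    }
}
```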

getStopWords

protected Set<String> getStopWords()
Returns the stop words used by this detector. Subclasses may override to provide their own set of stop words.

Returns:
the set of stop words for this detector
Since:
jMWE 1.0.0
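
The override pattern described above can be sketched as follows. BaseDetectorSketch and CustomStopWordsSketch are hypothetical stand-ins rather than jMWE classes, and the word lists are illustrative assumptions; the default stop words actually used by Exhaustive are not reproduced here.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a detector that exposes an overridable stop-word hook.
class BaseDetectorSketch {

    private static final Set<String> DEFAULT_STOP_WORDS =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("a", "of", "the")));

    // Illustrative default only; not the jMWE stop-word list.
    protected Set<String> getStopWords() {
        return DEFAULT_STOP_WORDS;
    }

    // True if every word is a stop word according to getStopWords().
    boolean isStopWordOnly(List<String> words) {
        return getStopWords().containsAll(words);
    }
}

// A subclass swaps in its own stop words by overriding the hook.
class CustomStopWordsSketch extends BaseDetectorSketch {

    private static final Set<String> STOP_WORDS =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList("in", "on", "up")));

    @Override
    protected Set<String> getStopWords() {
        return STOP_WORDS;
    }
}
```

Returning a shared unmodifiable set avoids rebuilding the set on every call and prevents callers from changing it by accident.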

containsDuplicate

protected <T extends IToken> boolean containsDuplicate(Collection<? extends IMWE<T>> results,
                                                       IMWE<T> mwe)
Returns true if the given collection of MWEs already contains the given MWE.

Type Parameters:
T - the type of tokens in the MWEs
Parameters:
results - the collection to be checked
mwe - the MWE being searched for
Returns:
true if the given collection of MWEs already contains the given MWE, false otherwise.
Since:
jMWE 1.0.0
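
A minimal sketch of a duplicate check of this kind. The documentation does not say what makes two MWEs duplicates, so the sketch assumes that two matches are duplicates when they refer to the same index entry and cover the same token positions. DuplicateCheckSketch and its nested Match class are hypothetical stand-ins for Exhaustive and IMWE.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in, not a jMWE class.
class DuplicateCheckSketch {

    // Stand-in for IMWE: an index entry id plus the sentence positions it covers.
    static final class Match {
        final String entryId;
        final List<Integer> tokenPositions;

        Match(String entryId, Integer... tokenPositions) {
            this.entryId = entryId;
            this.tokenPositions = Arrays.asList(tokenPositions);
        }
    }

    // Assumption: a duplicate has the same entry id and the same token positions.
    static boolean containsDuplicate(Collection<? extends Match> results, Match mwe) {
        for (Match existing : results) {
            if (existing.entryId.equals(mwe.entryId)
                    && existing.tokenPositions.equals(mwe.tokenPositions)) {
                return true;
            }
        }
        return false;
    }
}
```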


Copyright © 2011 Massachusetts Institute of Technology. All Rights Reserved.