edu.mit.jmwe.data
Class MWE<T extends IToken>

java.lang.Object
  extended by edu.mit.jmwe.data.MWE<T>
Type Parameters:
T - type of IToken objects that form the multi-word expression
All Implemented Interfaces:
IHasForm, IMWE<T>

public class MWE<T extends IToken>
extends Object
implements IMWE<T>

Default implementation of the IMWE interface.

Since:
jMWE 1.0.0
Version:
$Id: MWE.java 639 2011-09-26 21:03:51Z markaf $
Author:
Nidhi Kulkarni, M.A. Finlayson

Constructor Summary
MWE(Map<T,IMWEDesc.IPart> partMap)
          Constructs a new multi-word expression from a map of tokens to parts.
MWE(Map<T,IMWEDesc.IPart> partMap, boolean reallocate)
          Constructs a new multi-word expression from a map of tokens to parts.
 
Method Summary
static boolean equals(IMWE<?> one, IMWE<?> two)
          Returns true if the two MWEs use the same tokens and are assigned the same root entries.
 boolean equals(Object obj)
           
 IMWEDesc getEntry()
          Gets the MWE description object corresponding to this multi-word expression.
 String getForm()
          Returns the object's surface form text, exactly as it appears in its original context, with capitalization intact.
 Map<T,IMWEDesc.IPart> getPartMap()
          Gets the mapping from tokens to parts in this multi-word expression.
 List<T> getTokens()
          Gets the list of tokens identified as comprising the multi-word expression.
 int hashCode()
           
 boolean isInflected()
          Returns true if this MWE is inflected relative to its associated MWE description; false otherwise.
static double overlap(IMWE<?> one, IMWE<?> two)
          Returns a score which is the ratio of the number of tokens shared between the two MWEs to the total number of unique tokens in both MWEs together.
 String toString()
           
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

MWE

public MWE(Map<T,IMWEDesc.IPart> partMap)
Constructs a new multi-word expression from a map of tokens to parts. This constructor allocates a new internal map, and so subsequent changes to the source map will not affect this object.

Parameters:
partMap - the map of tokens to MWE parts that will make up this multi-word expression, may not be null or empty, nor contain null. Iterating over the map should return the tokens in the same order they are found in the original sentence.
Throws:
NullPointerException - if the map is null, or the map contains null
IllegalArgumentException - if the token map is empty
Since:
jMWE 1.0.0
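
The ordering requirement on partMap is easiest to meet with an insertion-ordered map. The following is a minimal sketch, not jMWE code: the class PartMapOrderSketch and the method buildPartMap are hypothetical, and String stands in for both the token type T and IMWEDesc.IPart so that the example runs without the library. Because strings are used as keys here, a repeated word would collide; real token objects are assumed to be distinct per sentence position.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; String stands in for T (token) and IMWEDesc.IPart.
class PartMapOrderSketch {

    // Builds a token-to-part map whose iteration order matches sentence order.
    static Map<String, String> buildPartMap(List<String> tokensInSentenceOrder,
                                            List<String> parts) {
        if (tokensInSentenceOrder.size() != parts.size())
            throw new IllegalArgumentException("tokens and parts must align");
        // LinkedHashMap iterates in insertion order; HashMap gives no such guarantee.
        Map<String, String> partMap = new LinkedHashMap<String, String>();
        for (int i = 0; i < tokensInSentenceOrder.size(); i++)
            partMap.put(tokensInSentenceOrder.get(i), parts.get(i));
        return partMap;
    }

    public static void main(String[] args) {
        Map<String, String> partMap = buildPartMap(
                Arrays.asList("kicked", "the", "bucket"),
                Arrays.asList("kick", "the", "bucket"));
        // Keys come back in the order they were inserted.
        System.out.println(new ArrayList<String>(partMap.keySet()));
    }
}
```

A map built this way could then be handed to the constructor, assuming real IToken and IMWEDesc.IPart instances in place of the strings.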

MWE

public MWE(Map<T,IMWEDesc.IPart> partMap,
           boolean reallocate)
Constructs a new multi-word expression from a map of tokens to parts. This constructor may or may not allocate a new internal map, depending on the value of the reallocation flag. If no reallocation is requested, this constructor reuses the given map, merely wrapping it to make it unmodifiable, and so subsequent changes to the source map will affect this object.

Parameters:
partMap - the map of tokens to MWE parts that will make up this multi-word expression, may not be null or empty, nor contain null. Iterating over the map should return the tokens in the same order they are found in the original sentence.
reallocate - if true, reallocate the specified map; otherwise, reuse the specified map
Throws:
NullPointerException - if the map is null, or the map contains null
IllegalArgumentException - if the part map is empty, or the MWE description does not match between the parts
Since:
jMWE 1.0.0
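
The difference between reallocating and reusing the map can be shown with plain java.util collections. This is a sketch of the documented behaviour only, not the jMWE implementation: ReallocateSketch and store are hypothetical names, and wrapping the copied map as unmodifiable is an assumption of the sketch (the documentation above only states that the reused map is wrapped).

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration of copy-versus-wrap semantics.
class ReallocateSketch {

    // Copies the source map when reallocate is true; otherwise wraps it in place.
    static <K, V> Map<K, V> store(Map<K, V> source, boolean reallocate) {
        Map<K, V> backing = reallocate ? new LinkedHashMap<K, V>(source) : source;
        // An unmodifiable view blocks writes through the view, but still
        // reflects later changes made directly to the backing map.
        return Collections.unmodifiableMap(backing);
    }

    public static void main(String[] args) {
        Map<String, String> source = new LinkedHashMap<String, String>();
        source.put("looked", "look");

        Map<String, String> copied = store(source, true);
        Map<String, String> wrapped = store(source, false);

        source.put("up", "up"); // mutate the caller's map afterwards

        System.out.println(copied.size());  // 1: the copy is isolated
        System.out.println(wrapped.size()); // 2: the view sees the change
    }
}
```

Skipping reallocation saves a copy, but is only safe when the caller does not modify the map after construction.
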
Method Detail

getForm

public String getForm()
Description copied from interface: IHasForm
Returns the object's surface form text, exactly as it appears in its original context, with capitalization intact. May be a single word or punctuation. The surface form may not contain whitespace or underscores. This method will never return null.

Specified by:
getForm in interface IHasForm
Returns:
the original text, never null.

getEntry

public IMWEDesc getEntry()
Description copied from interface: IMWE
Gets the MWE description object corresponding to this multi-word expression. Useful for retrieving the lemma, list of parts, and part of speech of the multi-word expression. This method should never return null.

Specified by:
getEntry in interface IMWE<T extends IToken>
Returns:
the non-null MWE description corresponding to the multi-word expression represented by this object.

getTokens

public List<T> getTokens()
Description copied from interface: IMWE
Gets the list of tokens identified as comprising the multi-word expression. The order of the tokens should correspond to the order of the words in the multi-word expression. This method should never return null or an empty list.

Specified by:
getTokens in interface IMWE<T extends IToken>
Returns:
the non-null, non-empty list of tokens that comprise the multi-word expression.

getPartMap

public Map<T,IMWEDesc.IPart> getPartMap()
Description copied from interface: IMWE
Gets the mapping from tokens to parts in this multi-word expression. Useful when determining which token corresponds to which part in the expression, especially when some parts of the expression are repeated or if the tokens are not in the canonical order. This method should never return null. Iteration order of the map should correspond to the order of tokens in the original sentence.

Specified by:
getPartMap in interface IMWE<T extends IToken>
Returns:
the non-null map from tokens to parts in this MWE object

isInflected

public boolean isInflected()
Description copied from interface: IMWE
Returns true if this MWE is inflected relative to its associated MWE description; false otherwise.

Specified by:
isInflected in interface IMWE<T extends IToken>
Returns:
true if this MWE is inflected relative to its associated MWE description; false otherwise.

toString

public String toString()
Overrides:
toString in class Object

hashCode

public int hashCode()
Overrides:
hashCode in class Object

equals

public boolean equals(Object obj)
Overrides:
equals in class Object

equals

public static boolean equals(IMWE<?> one,
                             IMWE<?> two)
Returns true if the two MWEs use the same tokens and are assigned the same root entries.

Parameters:
one - the first MWE to be compared; may be null
two - the second MWE to be compared; may be null
Returns:
true if the two MWEs use the same tokens and are assigned the same root entries.
Since:
jMWE 1.0.0

overlap

public static double overlap(IMWE<?> one,
                             IMWE<?> two)
Returns a score which is the ratio of the number of tokens shared between the two MWEs to the total number of unique tokens in both MWEs together.

If the two MWEs being compared do not come from the same sentence, or share no tokens, the score will be zero.

Returns:
an alignment score which is zero if the two MWEs don't overlap, one if they overlap perfectly, and somewhere in between otherwise
Throws:
NullPointerException - if either argument is null
Since:
jMWE 1.0.0
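
The score described above (shared tokens divided by unique tokens in both expressions) has the form of a Jaccard index. The following is a minimal sketch of that ratio over generic token collections; OverlapSketch is a hypothetical class, not the jMWE source, and it assumes that token equality (equals and hashCode) already distinguishes tokens from different sentences.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration of the shared-over-unique token ratio.
class OverlapSketch {

    // Returns |one ∩ two| / |one ∪ two|, or 0.0 when both collections are empty.
    // Throws NullPointerException if either argument is null.
    static <T> double overlap(Collection<T> one, Collection<T> two) {
        Set<T> shared = new HashSet<T>(one);
        shared.retainAll(two);            // tokens present in both
        Set<T> union = new HashSet<T>(one);
        union.addAll(two);                // unique tokens across both
        return union.isEmpty() ? 0.0 : (double) shared.size() / union.size();
    }

    public static void main(String[] args) {
        // Two of four unique tokens are shared: 2 / 4.
        System.out.println(overlap(Arrays.asList("a", "b", "c"),
                                   Arrays.asList("b", "c", "d"))); // 0.5
    }
}
```

Identical token sets score 1.0 and disjoint sets score 0.0, matching the bounds stated in the Returns clause.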


Copyright © 2011 Massachusetts Institute of Technology. All Rights Reserved.