java.lang.Object
  edu.mit.jmwe.data.AbstractMWEDesc<P>
Type Parameters:
P - the type of the part for this MWE description
public abstract class AbstractMWEDesc<P extends IMWEDesc.IPart>
A base class for MWE descriptions that can be used to construct a description from some combination of: a surface form, a list of parts, and counts relating to the MWE's appearance in a reference concordance.
Nested Class Summary | |
---|---|
protected class | AbstractMWEDesc.AbstractPart - Default implementation of the IPart interface. |
Nested classes/interfaces inherited from interface edu.mit.jmwe.data.IMWEDesc |
---|
IMWEDesc.IPart |
Field Summary | |
---|---|
protected int[] | counts |
Fields inherited from interface edu.mit.jmwe.data.IMWEDesc |
---|
boundaryUnderscores, comma, underscore, underscores |
Constructor Summary | |
---|---|
AbstractMWEDesc(List<String> parts) | Constructs a new MWE description object from the list of parts. |
AbstractMWEDesc(List<String> parts, int... counts) | Constructs a new MWE description object from the list of parts and counts relating to the MWE's appearance in a reference concordance. |
AbstractMWEDesc(String surfaceForm) | Constructs a new MWE description object, with no inflected forms, from the specified surface form. |
AbstractMWEDesc(String surfaceForm, int... counts) | Constructs a new MWE description object, with no inflected forms, from the specified surface form and counts relating to the MWE's appearance in a reference concordance. |
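The list-based constructors document a set of argument checks under Constructor Detail (null list or null element, fewer than two elements, underscores, empty strings, embedded whitespace). The following is a minimal standalone sketch of those checks, not the jMWE implementation. The class name PartsValidationSketch and the method validateParts are hypothetical, and both the order of the checks and the choice to return trimmed parts are assumptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of jMWE: mirrors the documented argument checks only.
final class PartsValidationSketch {

    // Returns an unmodifiable list of trimmed parts, or throws as the constructor docs describe.
    static List<String> validateParts(List<String> parts) {
        if (parts == null) {
            throw new NullPointerException("parts list is null");
        }
        if (parts.size() < 2) {
            throw new IllegalArgumentException("an MWE needs at least two parts");
        }
        List<String> cleaned = new ArrayList<>(parts.size());
        for (String part : parts) {
            if (part == null) {
                throw new NullPointerException("parts list contains a null");
            }
            String trimmed = part.trim();
            if (trimmed.isEmpty()) {
                throw new IllegalArgumentException("part is empty or all whitespace");
            }
            if (trimmed.indexOf('_') >= 0) {
                throw new IllegalArgumentException("part contains an underscore: " + trimmed);
            }
            for (int i = 0; i < trimmed.length(); i++) {
                if (Character.isWhitespace(trimmed.charAt(i))) {
                    throw new IllegalArgumentException("part contains whitespace: " + trimmed);
                }
            }
            cleaned.add(trimmed); // assumption: the trimmed form is what gets stored
        }
        return Collections.unmodifiableList(cleaned);
    }

    public static void main(String[] args) {
        System.out.println(validateParts(Arrays.asList("kick", "the", "bucket")));
    }
}
```

A subclass constructor would presumably run checks of this kind before building its parts with makePart.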
Method Summary | |
---|---|
protected static int | checkCount(int count) - Checks that the passed-in count is non-negative. |
int | compareTo(IMWEDesc id) |
static String | concatenate(Iterable<String> parts, String separator) - Utility method for concatenating collections of strings into a single string using a specified separator. |
static boolean | equalsRoots(IMWEDesc one, IMWEDesc two) - Returns true if the root descriptions associated with the two MWE descriptions are the same; false otherwise. |
int[] | getCounts() - Returns an array containing the marked split, marked continuous, unmarked exact, and unmarked pattern occurrences of this MWE in the reference concordance. |
protected abstract int | getExpectedCountLength() - Subclasses should implement this method to return the number of counts, relating to the MWE's appearance in a reference concordance, that the implementation expects. |
String | getForm() - Returns the object's surface form text, exactly as it appears in its original context, with capitalization intact. |
int | getMarkedContinuous() - Returns the number of times this MWE was marked on a continuous run of tokens in the reference concordance. |
int | getMarkedSplit() - Returns the number of times this MWE was marked on a non-continuous run of tokens in the reference concordance. |
List<P> | getParts() - Returns an unmodifiable list of parts that comprise the MWE. |
static IRootMWEDesc | getRoot(IMWEDesc desc) - Returns the root MWE description associated with the specified description. |
int | getUnmarkedExact() - Returns the number of times the exact surface form of this MWE description occurs in the reference concordance without being marked as an occurrence of the MWE. |
int | getUnmarkedPattern() - Returns the number of times a form of this MWE description that matches a known inflection pattern occurs in the reference concordance without being marked as an occurrence of the MWE. |
static boolean | isFillerForSlot(IToken token, IMWEDesc.IPart part) - Returns true if the part's lemma matches either the surface form of the given token or any of the token's stems, regardless of case. |
protected boolean | isStopWord(String text) - Helper method that calculates, for efficiency's sake, whether this MWE part is a stop word. |
protected abstract P | makePart(String form, int index) - Subclasses should implement this method to construct an IMWEDesc.IPart given the form and index of a part of an MWE. |
static List<String> | splitOnUnderscores(String str) - Splits a specified string into constituent strings that are separated by underscores. |
String | toString() |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Methods inherited from interface edu.mit.jmwe.data.IMWEDesc |
---|
getID |
Methods inherited from interface edu.mit.jmwe.data.IHasMWEPOS |
---|
getPOS |
Field Detail |
---|
protected final int[] counts
Constructor Detail |
---|
public AbstractMWEDesc(String surfaceForm)
Constructs a new MWE description object, with no inflected forms, from the specified surface form.
Parameters:
surfaceForm - a string representing the MWE with its words separated by underscores
Throws:
NullPointerException - if the argument is null
IllegalArgumentException - if the surface form does not contain underscores
public AbstractMWEDesc(String surfaceForm, int... counts)
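Constructs a new MWE description object, with no inflected forms, from the specified surface form and counts relating to the MWE's appearance in a reference concordance.
Parameters:
surfaceForm - a string representing the MWE with its words separated by underscores
counts - the implementation-specific counts relating to the MWE's appearance in a reference concordance
Throws:
NullPointerException - if either argument is null
IllegalArgumentException - if the surface form does not contain underscores
public AbstractMWEDesc(List<String> parts)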
Constructs a new MWE description object from the list of parts.
Parameters:
parts - the list of parts that will make up this MWE; may be neither null nor empty, and may not contain any nulls, empty or all-whitespace strings, or strings that contain the underscore character
Throws:
NullPointerException - if the specified list of parts is null or contains a null
IllegalArgumentException - if the specified list has fewer than two elements, or any trimmed string in the list contains an underscore, is empty, or contains whitespace
public AbstractMWEDesc(List<String> parts, int... counts)
Constructs a new MWE description object from the list of parts and counts relating to the MWE's appearance in a reference concordance.
Parameters:
parts - the list of parts that will make up this MWE; may be neither null nor empty, and may not contain any nulls, empty or all-whitespace strings, or strings that contain the underscore character
counts - the implementation-specific counts relating to the MWE's appearance in a reference concordance
Throws:
NullPointerException - if the specified list of parts is null or contains a null
IllegalArgumentException - if the specified list has fewer than two elements, or any trimmed string in the list contains an underscore, is empty, or contains whitespace
Method Detail |
---|
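protected abstract int getExpectedCountLength()
Subclasses should implement this method to return the number of counts, relating to the MWE's appearance in a reference concordance, that the implementation expects.
protected static int checkCount(int count)
Checks that the passed-in count is non-negative.
Parameters:
count - the count to be checked
Throws:
IllegalArgumentException - if the count is less than zero
protected abstract P makePart(String form, int index)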
Subclasses should implement this method to construct an IMWEDesc.IPart given the form and index of a part of an MWE.
Parameters:
form - the text of the part
index - the index of the part in the MWE
public String getForm()
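Description copied from interface: IHasForm
Returns the object's surface form text, exactly as it appears in its original context, with capitalization intact.
Specified by:
getForm in interface IHasForm
Returns:
the surface form text; never null
public int getMarkedContinuous()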
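Description copied from interface: IMWEDesc
Returns the number of times this MWE was marked on a continuous run of tokens in the reference concordance.
Specified by:
getMarkedContinuous in interface IMWEDesc
public int getMarkedSplit()
Description copied from interface: IMWEDesc
Returns the number of times this MWE was marked on a non-continuous run of tokens in the reference concordance.
Specified by:
getMarkedSplit in interface IMWEDesc
public int getUnmarkedExact()
Description copied from interface: IMWEDesc
Returns the number of times the exact surface form of this MWE description occurs in the reference concordance without being marked as an occurrence of the MWE.
Specified by:
getUnmarkedExact in interface IMWEDesc
public int getUnmarkedPattern()
Description copied from interface: IMWEDesc
Returns the number of times a form of this MWE description that matches a known inflection pattern occurs in the reference concordance without being marked as an occurrence of the MWE.
Specified by:
getUnmarkedPattern in interface IMWEDesc
public List<P> getParts()
Description copied from interface: IMWEDesc
Returns an unmodifiable list of parts that comprise the MWE.
Specified by:
getParts in interface IMWEDesc
public int[] getCounts()
Description copied from interface: IMWEDesc
Returns an array containing the marked split, marked continuous, unmarked exact, and unmarked pattern occurrences of this MWE in the reference concordance.
Specified by:
getCounts in interface IMWEDesc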
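The count accessors above all read from one array that is validated with checkCount. The following is a minimal standalone sketch of that arrangement, not the jMWE implementation. The class name CountsSketch is hypothetical. The index layout follows the order given in the getCounts() description, and the fixed length of four, the rejection of other lengths, and the defensive copy in getCounts() are all assumptions.

```java
import java.util.Arrays;

// Hypothetical stand-in for the counts handling; not the jMWE class.
final class CountsSketch {

    // Assumed index layout, taken from the order listed in the getCounts() description.
    static final int MARKED_SPLIT = 0;
    static final int MARKED_CONTINUOUS = 1;
    static final int UNMARKED_EXACT = 2;
    static final int UNMARKED_PATTERN = 3;

    private final int[] counts;

    CountsSketch(int... counts) {
        if (counts == null) {
            throw new NullPointerException("counts is null");
        }
        // Assumption: exactly four counts are expected (see getExpectedCountLength()).
        if (counts.length != 4) {
            throw new IllegalArgumentException("expected 4 counts, got " + counts.length);
        }
        this.counts = new int[counts.length];
        for (int i = 0; i < counts.length; i++) {
            this.counts[i] = checkCount(counts[i]);
        }
    }

    // Mirrors the documented contract: rejects negative counts, otherwise returns the count.
    static int checkCount(int count) {
        if (count < 0) {
            throw new IllegalArgumentException("count is negative: " + count);
        }
        return count;
    }

    int getMarkedSplit() { return counts[MARKED_SPLIT]; }
    int getMarkedContinuous() { return counts[MARKED_CONTINUOUS]; }
    int getUnmarkedExact() { return counts[UNMARKED_EXACT]; }
    int getUnmarkedPattern() { return counts[UNMARKED_PATTERN]; }

    // Assumption: callers receive a copy so the internal array cannot be modified.
    int[] getCounts() { return counts.clone(); }

    public static void main(String[] args) {
        CountsSketch sketch = new CountsSketch(1, 2, 3, 4);
        System.out.println(Arrays.toString(sketch.getCounts()));
    }
}
```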
public int compareTo(IMWEDesc id)
Specified by:
compareTo in interface Comparable<IMWEDesc>
public String toString()
Overrides:
toString in class Object
protected boolean isStopWord(String text)
Helper method that calculates, for efficiency's sake, whether this MWE part is a stop word. The default implementation relies on the Exhaustive.getStopWords() method. Subclasses may override this method to use a different set of stop words.
Parameters:
text - the text to be checked for being a stop word
Returns:
true if the verbatim text is a stop word; false otherwise
public static boolean equalsRoots(IMWEDesc one, IMWEDesc two)
Returns true if the root descriptions associated with the two MWE descriptions are the same; false otherwise.
Parameters:
one - the first MWE description
two - the second MWE description
Returns:
true if the root descriptions associated with the two MWE descriptions are the same; false otherwise
Throws:
NullPointerException - if either argument is null
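public static IRootMWEDesc getRoot(IMWEDesc desc)
Returns the root MWE description associated with the specified description.
Parameters:
desc - the MWE description object from which to extract the root
Throws:
NullPointerException - if the argument is null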
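public static List<String> splitOnUnderscores(String str)
Splits a specified string into constituent strings that are separated by underscores.
Parameters:
str - a string to be split into underscore-delimited parts
Throws:
NullPointerException - if the specified string is null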
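The documented behavior of splitOnUnderscores can be illustrated with a short standalone sketch. This is not the jMWE implementation, and the class name UnderscoreSplitSketch is hypothetical. How the real method treats leading, trailing, or repeated underscores is not stated above, so dropping empty segments here is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in; illustrates underscore splitting only.
final class UnderscoreSplitSketch {

    static List<String> splitOnUnderscores(String str) {
        if (str == null) {
            throw new NullPointerException("str is null");
        }
        List<String> parts = new ArrayList<>();
        for (String piece : str.split("_")) {
            // Assumption: empty segments from leading or repeated underscores are skipped.
            if (!piece.isEmpty()) {
                parts.add(piece);
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitOnUnderscores("kick_the_bucket"));
    }
}
```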
public static String concatenate(Iterable<String> parts, String separator)
Utility method for concatenating collections of strings into a single string using a specified separator.
Parameters:
parts - the parts to be concatenated; may not be null
separator - the string used to separate the parts in the result; may be null
Throws:
NullPointerException - if the specified iterable is null
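A standalone sketch of the concatenation contract follows. It is not the jMWE implementation, and the class name ConcatenateSketch is hypothetical. The documentation allows a null separator without saying what it means, so treating it as the empty string is an assumption.

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical stand-in; illustrates joining parts with a separator.
final class ConcatenateSketch {

    static String concatenate(Iterable<String> parts, String separator) {
        if (parts == null) {
            throw new NullPointerException("parts is null");
        }
        // Assumption: a null separator behaves like an empty one.
        String sep = (separator == null) ? "" : separator;
        StringBuilder sb = new StringBuilder();
        Iterator<String> it = parts.iterator();
        while (it.hasNext()) {
            sb.append(it.next());
            if (it.hasNext()) {
                sb.append(sep); // separator goes between parts, never after the last one
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(concatenate(Arrays.asList("kick", "the", "bucket"), "_"));
    }
}
```

Joining with an underscore is the inverse of splitting a surface form into its parts, which is presumably why the two utilities sit side by side.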
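public static boolean isFillerForSlot(IToken token, IMWEDesc.IPart part)
Returns true if the part's lemma matches either the surface form of the given token or any of the token's stems, regardless of case.
Parameters:
token - the token to be compared to the part's lemma
Throws:
NullPointerException - if either argument is null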
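The matching rule described for isFillerForSlot can be sketched without the jMWE types. The class name FillerMatchSketch is hypothetical, and the flattened signature is an assumption made for self-containment: plain strings stand in for the token's surface form, the token's stems, and the part's lemma, whereas the real method takes an IToken and an IMWEDesc.IPart.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Hypothetical stand-in; plain strings replace the IToken and IMWEDesc.IPart arguments.
final class FillerMatchSketch {

    static boolean isFillerForSlot(String tokenForm, List<String> tokenStems, String partLemma) {
        Objects.requireNonNull(tokenForm, "tokenForm");
        Objects.requireNonNull(tokenStems, "tokenStems");
        Objects.requireNonNull(partLemma, "partLemma");
        // Case-insensitive match against the token's surface form first...
        if (partLemma.equalsIgnoreCase(tokenForm)) {
            return true;
        }
        // ...then against each of the token's stems.
        for (String stem : tokenStems) {
            if (partLemma.equalsIgnoreCase(stem)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isFillerForSlot("Kicked", Arrays.asList("kick"), "kick"));
    }
}
```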