java.lang.Object
  edu.mit.jmwe.data.Token
    edu.mit.jmwe.data.concordance.ConcordanceToken
public class ConcordanceToken
Default implementation of IConcordanceToken.
This class requires JSemcor to be on the classpath.
Field Summary | |
---|---|
static Pattern | semcorTokenPattern: A compiled regular expression pattern that captures the string representation of tagged tokens. |
static Pattern | whitespaceDelimited: A compiled regular expression that matches a non-empty run of whitespace. |
Constructor Summary | |
---|---|
ConcordanceToken(String text, String tag, int tokenNum, int partNum, String... stems) | Constructs a new semcor token object with the specified text, tag, token number, part number and stems. |
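
The constructor's documented contract (the argument checks and the null-versus-empty distinction for stems, both listed under Constructor Detail) can be sketched with a small stand-in class. `TokenContractSketch` and its fields are hypothetical names used only for illustration; this is not the jMWE implementation, and the whitespace check via `trim()` is an assumption about how "all whitespace" might be tested.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for illustration only; not the jMWE ConcordanceToken class.
final class TokenContractSketch {

    final String text;
    final String tag;          // may be null if no tag was assigned
    final int tokenNum;
    final int partNum;
    final List<String> stems;  // null: no stems assigned; empty: unstemmable

    TokenContractSketch(String text, String tag, int tokenNum, int partNum, String... stems) {
        if (text == null) {
            throw new NullPointerException("text is null");
        }
        if (text.trim().isEmpty()) {
            throw new IllegalArgumentException("text is empty or all whitespace");
        }
        if (tokenNum < 0 || partNum < 0) {
            throw new IllegalArgumentException("token and part numbers must be >= 0");
        }
        this.text = text;
        this.tag = tag;
        this.tokenNum = tokenNum;
        this.partNum = partNum;
        // Keep the null-versus-empty distinction instead of normalizing it away.
        this.stems = (stems == null)
                ? null
                : Collections.unmodifiableList(Arrays.asList(stems.clone()));
    }

    public static void main(String[] args) {
        // Omitting the varargs passes an empty array: "unstemmable".
        TokenContractSketch unstemmable = new TokenContractSketch("the", "DT", 0, 0);
        // An explicit cast is needed to pass null: "no stems assigned".
        TokenContractSketch unassigned =
                new TokenContractSketch("dogs", "NNS", 1, 0, (String[]) null);
        TokenContractSketch stemmed = new TokenContractSketch("dogs", "NNS", 1, 0, "dog");

        System.out.println(unstemmable.stems); // prints []
        System.out.println(unassigned.stems);  // prints null
        System.out.println(stemmed.stems);     // prints [dog]
    }
}
```

With a `String...` parameter, leaving the stems out yields an empty array rather than null, so a caller who wants the "no stems assigned" state has to pass `(String[]) null` explicitly.
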
Method Summary | |
---|---|
boolean | equals(Object obj) |
int | getPartNumber(): Returns the index of the part in the Semcor token from which this part was extracted. |
int | getTokenNumber(): Returns the index of the token in the Semcor sentence from which it was extracted. |
int | hashCode() |
static ConcordanceToken | parse(String str): Parses a string of the form "test_NN_stem1_stem2_..._stemN_1_0" into a ConcordanceToken instance. |
static List<ConcordanceToken> | parseList(String str): Parses a string formed from the concatenation of strings of the form "test-1-0-NN-stem1:stem2 " into a list of corresponding ConcordanceToken instances. |
String | toString() |
static String | toString(IConcordanceToken token): Returns the String representation of the given token. |
static ConcordanceToken | toToken(int tokenNum, int partNum, edu.mit.jsemcor.element.IWordform wf, edu.mit.jsemcor.element.ISentence sent): Constructs a semcor token object from the given token number, part number, IWordform, and sentence drawn from the semcor corpus. |
static List<ConcordanceToken> | toTokens(edu.mit.jsemcor.element.IToken t, int tokenNum, edu.mit.jsemcor.element.ISentence sent): Returns a list of Concordance token objects if the token specified by the token number in the sentence is a continuous MWE. |
Methods inherited from class edu.mit.jmwe.data.Token |
---|
checkStems, checkString, getForm, getStems, getTag |
Methods inherited from class java.lang.Object |
---|
clone, finalize, getClass, notify, notifyAll, wait, wait, wait |
Methods inherited from interface edu.mit.jmwe.data.IToken |
---|
getStems, getTag |
Methods inherited from interface edu.mit.jmwe.data.IHasForm |
---|
getForm |
Field Detail |
---|
public static final Pattern semcorTokenPattern
public static final Pattern whitespaceDelimited
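
As a rough illustration of how a pattern like whitespaceDelimited might be used to break a concatenated token string into its pieces, the sketch below compiles its own pattern. The regex `\s+` and the class name `WhitespaceSplitSketch` are assumptions; the actual pattern source used by ConcordanceToken is not shown on this page.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of jMWE.
final class WhitespaceSplitSketch {

    // Assumption: "a non-empty run of whitespace" corresponds to the regex \s+.
    static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Splits a string on runs of whitespace, ignoring leading and trailing whitespace.
    static List<String> split(String str) {
        String trimmed = str.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(WHITESPACE.split(trimmed));
    }

    public static void main(String[] args) {
        System.out.println(split("one  two\tthree ")); // prints [one, two, three]
    }
}
```
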
Constructor Detail |
---|
public ConcordanceToken(String text, String tag, int tokenNum, int partNum, String... stems)
Constructs a new semcor token object with the specified text, tag, token number, part number and stems. The stems may be null or empty. If null, no stems have been assigned. If empty, the token is unstemmable.
Parameters:
text - the surface form of the token as it appears in the sentence, capitalization intact
tag - the tag of the token, if assigned, otherwise null
tokenNum - the token number. Must be greater than or equal to 0.
partNum - the part number representing the index of the token in a multi-word expression, 0 if it is not part of one. Must be greater than or equal to 0.
stems - the list of stems, possibly empty or null
Throws:
NullPointerException - if the text is null
IllegalArgumentException - if the text is empty or all whitespace, or if the token number or part number is less than 0
Method Detail |
---|
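public int getTokenNumber()
Returns the index of the token in the Semcor sentence from which it was extracted.
Specified by:
getTokenNumber in interface IConcordanceToken
public int getPartNumber()
Returns the index of the part in the Semcor token from which this part was extracted.
Specified by:
getPartNumber in interface IConcordanceToken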
public String toString()
Overrides:
toString in class Token
public int hashCode()
Overrides:
hashCode in class Object
public boolean equals(Object obj)
Overrides:
equals in class Object
public static String toString(IConcordanceToken token)
Returns the String representation of the given token.
Parameters:
token - the token to be represented as a string
public static ConcordanceToken parse(String str)
Parses a string of the form "test_NN_stem1_stem2_..._stemN_1_0" into a ConcordanceToken instance.
Parameters:
str - the string representing the tagged token
Throws:
NullPointerException - if the specified string is null
IllegalArgumentException - if the specified string does not match the expected format
public static List<ConcordanceToken> parseList(String str)
Parses a string formed from the concatenation of strings of the form "test-1-0-NN-stem1:stem2 " into a list of corresponding ConcordanceToken instances.
Parameters:
str - the concatenated string representing the tagged tokens
Throws:
NullPointerException - if the specified string is null
IllegalArgumentException - if the specified string does not conform to the expected format
public static List<ConcordanceToken> toTokens(edu.mit.jsemcor.element.IToken t, int tokenNum, edu.mit.jsemcor.element.ISentence sent)
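Returns a list of Concordance token objects if the token specified by the token number in the sentence is a continuous MWE.
Parameters:
t - the token specified by the token number in the given sentence
tokenNum - the token number of the token to be translated into a concordance token object
sent - the sentence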
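public static ConcordanceToken toToken(int tokenNum, int partNum, edu.mit.jsemcor.element.IWordform wf, edu.mit.jsemcor.element.ISentence sent)
Constructs a semcor token object from the given token number, part number, IWordform, and sentence drawn from the semcor corpus.
Parameters:
tokenNum - the token number
partNum - the part number
wf - the word form
sent - the JSemcor sentence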
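
The string layout that parse(String) is documented to accept, "test_NN_stem1_stem2_..._stemN_1_0", can be illustrated with a minimal standalone sketch. It assumes the fields are the text, the tag, zero or more stems, and then the token number followed by the part number; that trailing order is a guess based on the constructor's argument order. `ParseSketch` and `Parsed` are hypothetical names, and the real method presumably relies on semcorTokenPattern, whose regular expression is not shown on this page. The sketch also does not handle surface forms that themselves contain underscores.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not the jMWE parsing code.
final class ParseSketch {

    // Simple value holder for the parsed fields.
    static final class Parsed {
        final String text;
        final String tag;
        final List<String> stems;
        final int tokenNum;
        final int partNum;

        Parsed(String text, String tag, List<String> stems, int tokenNum, int partNum) {
            this.text = text;
            this.tag = tag;
            this.stems = stems;
            this.tokenNum = tokenNum;
            this.partNum = partNum;
        }
    }

    // Assumed layout: text_tag_stem1_..._stemN_tokenNum_partNum
    static Parsed parse(String str) {
        if (str == null) {
            throw new NullPointerException("str is null");
        }
        String[] fields = str.split("_");
        int n = fields.length;
        if (n < 4) {
            throw new IllegalArgumentException("unexpected format: " + str);
        }
        int tokenNum;
        int partNum;
        try {
            tokenNum = Integer.parseInt(fields[n - 2]);
            partNum = Integer.parseInt(fields[n - 1]);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("unexpected format: " + str, e);
        }
        if (tokenNum < 0 || partNum < 0) {
            throw new IllegalArgumentException("negative index in: " + str);
        }
        // Everything between the tag and the two trailing numbers is treated as a stem.
        List<String> stems = new ArrayList<>(Arrays.asList(fields).subList(2, n - 2));
        return new Parsed(fields[0], fields[1], stems, tokenNum, partNum);
    }

    public static void main(String[] args) {
        Parsed p = parse("test_NN_stem1_stem2_1_0");
        System.out.println(p.text + " " + p.tag + " " + p.stems + " " + p.tokenNum + " " + p.partNum);
        // prints test NN [stem1, stem2] 1 0
    }
}
```
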