java.lang.Object
  edu.mit.jmwe.detect.score.AbstractScorer<IMWE<T>>
    edu.mit.jmwe.detect.score.LeskScore<T>

Type Parameters:
T - the type of token used by this scorer

public class LeskScore<T extends IToken>
Scores an object by its Lesk overlap with dictionary glosses.
Field Summary | |
---|---|
protected Set<String> | contextWords |
protected edu.mit.jwi.IDictionary | dict |
protected static Pattern | punctuation |
protected edu.mit.jwi.morph.IStemmer | stemmer |
protected static Pattern | whitespace |
Constructor Summary | |
---|---|
LeskScore(List<T> sentence, edu.mit.jwi.IDictionary dict) | Constructs a new Lesk scorer for the specified sentence and dictionary. |
Method Summary | |
---|---|
protected List<String> | getContentWords(String str) - Given a string representation of a sentence, removes all punctuation and stop words. |
protected List<String> | getGlosses(String lemma, MWEPOS pos) - Returns a list of the glosses of a word or MWE by looking up its lemma and part of speech in the dictionary. |
protected Set<String> | getStemmedWords(Collection<String> words) - Returns a set of strings containing all the strings in the specified collection, as well as the stemmed versions of those strings. |
protected Set<String> | getStopWords() - Returns the set of stop words for this scorer. |
protected int | overlap(String gloss) - Returns the number of elements the gloss has in common with the stemmed word list. |
double | score(IMWE<T> mwe) - Scores the specified object. |
Methods inherited from class edu.mit.jmwe.detect.score.AbstractScorer |
---|
compare |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Methods inherited from interface java.util.Comparator |
---|
equals |
Field Detail |
---|
protected final Set<String> contextWords
protected final edu.mit.jwi.IDictionary dict
protected final edu.mit.jwi.morph.IStemmer stemmer
protected static final Pattern whitespace
protected static final Pattern punctuation
Constructor Detail |
---|
public LeskScore(List<T> sentence, edu.mit.jwi.IDictionary dict)
Constructs a new Lesk scorer for the specified sentence and dictionary.
Parameters:
sentence - the sentence for the scorer
dict - the dictionary to be used by the scorer; may not be null
Throws:
NullPointerException - if either argument is null
Method Detail |
---|
public double score(IMWE<T> mwe)
Description copied from interface: IScorer
Scores the specified object. The handling of a null argument depends on the implementation.
Parameters:
mwe - the object to be scored
protected List<String> getContentWords(String str)
Given a string representation of a sentence, removes all punctuation and stop words.
Parameters:
str - the string from which the content words will be extracted
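The punctuation and stop-word filtering described for getContentWords can be sketched as follows. This is only an illustration and not the jMWE implementation: the regular expressions, the stop-word list, the lowercasing step, and the class name ContentWordsSketch are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Pattern;

// Illustrative sketch only; not the jMWE implementation.
class ContentWordsSketch {

    // Assumed patterns, loosely mirroring the class's static
    // 'punctuation' and 'whitespace' fields.
    static final Pattern PUNCTUATION = Pattern.compile("\\p{Punct}+");
    static final Pattern WHITESPACE = Pattern.compile("\\s+");

    // Hypothetical stop-word list; the real one comes from getStopWords().
    static final Set<String> STOP_WORDS =
            Set.of("a", "an", "the", "of", "in", "on", "and", "to");

    static List<String> getContentWords(String str) {
        // Replace punctuation with spaces so "dog,cat" still yields two tokens.
        String cleaned = PUNCTUATION.matcher(str).replaceAll(" ").trim();
        List<String> result = new ArrayList<>();
        if (cleaned.isEmpty()) {
            return result;
        }
        for (String token : WHITESPACE.split(cleaned)) {
            // Lowercasing is an assumption of this sketch.
            String lower = token.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(lower)) {
                result.add(lower);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // prints [cat, sat, mat]
        System.out.println(getContentWords("The cat sat on the mat."));
    }
}
```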
protected Set<String> getStopWords()
Returns the set of stop words for this scorer.
protected List<String> getGlosses(String lemma, MWEPOS pos)
Returns a list of the glosses of a word or MWE by looking up its lemma and part of speech in the dictionary.
Parameters:
lemma - the lemma of the word or MWE
pos - the part of speech of the word. If it is a proper noun, this method will try looking up the word as a noun, just in case it is listed as such in the dictionary.
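The proper-noun fallback described for the pos parameter might look roughly like the sketch below. It uses a hypothetical in-memory map instead of the JWI dictionary, and a hypothetical Pos enum standing in for MWEPOS. Trying the given part of speech first and returning an empty list when nothing is found are assumptions of the sketch, not documented behavior.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only; not the jMWE implementation.
class GlossLookupSketch {

    // Hypothetical stand-in for MWEPOS.
    enum Pos { NOUN, PROPER_NOUN, VERB }

    // Hypothetical in-memory dictionary keyed by "lemma#POS".
    private final Map<String, List<String>> entries = new HashMap<>();

    void add(String lemma, Pos pos, String... glosses) {
        entries.put(key(lemma, pos), List.of(glosses));
    }

    List<String> getGlosses(String lemma, Pos pos) {
        List<String> glosses = entries.get(key(lemma, pos));
        // Fallback: a proper noun may be listed in the dictionary as a plain noun.
        if (glosses == null && pos == Pos.PROPER_NOUN) {
            glosses = entries.get(key(lemma, Pos.NOUN));
        }
        // Assumption: an unknown lemma yields an empty list rather than null.
        return glosses == null ? Collections.emptyList() : glosses;
    }

    private static String key(String lemma, Pos pos) {
        return lemma + "#" + pos;
    }
}
```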
protected int overlap(String gloss)
Returns the number of elements the gloss has in common with the stemmed word list.
Parameters:
gloss - the gloss
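A minimal sketch of a Lesk-style overlap count is shown below, assuming the gloss is tokenized on non-word characters and each distinct shared word is counted once. The tokenization, the lowercasing, and passing the context words as an explicit parameter (rather than reading the contextWords field) are assumptions of this example.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Illustrative sketch only; not the jMWE implementation.
class GlossOverlapSketch {

    static int overlap(String gloss, Set<String> contextWords) {
        Set<String> glossWords = new HashSet<>();
        // Assumed tokenization: lowercase, then split on runs of non-word characters.
        for (String token : gloss.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (!token.isEmpty()) {
                glossWords.add(token);
            }
        }
        // Keep only the gloss words that also occur in the context.
        glossWords.retainAll(contextWords);
        return glossWords.size();
    }
}
```

For example, with the context words {bank, deposits, financial}, the gloss "a financial institution that accepts deposits" shares two words with the context, so the sketch returns 2.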
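protected Set<String> getStemmedWords(Collection<String> words)
Returns a set of strings containing all the strings in the specified collection, as well as the stemmed versions of those strings.
Parameters:
words - the collection of strings to be stemmed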
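The union of original and stemmed forms described above can be illustrated with the following sketch. The stemmer is passed in as a plain function, and toyStems is a hypothetical placeholder that only strips a trailing "s"; the actual class uses its edu.mit.jwi.morph.IStemmer field, whose behavior is not reproduced here.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Illustrative sketch only; not the jMWE implementation.
class StemmedWordsSketch {

    // Toy stemmer (hypothetical): strips a trailing "s" from longer words.
    static List<String> toyStems(String word) {
        return word.length() > 3 && word.endsWith("s")
                ? List.of(word.substring(0, word.length() - 1))
                : List.of();
    }

    static Set<String> getStemmedWords(Collection<String> words,
                                       Function<String, List<String>> stemmer) {
        Set<String> result = new LinkedHashSet<>();
        for (String word : words) {
            result.add(word);                    // keep the original form
            result.addAll(stemmer.apply(word));  // add every stem the stemmer proposes
        }
        return result;
    }
}
```

Keeping the surface forms alongside the stems means a gloss word can match the context either way, at the cost of a slightly larger set.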