Package | Description |
---|---|
edu.mit.jmwe.data |
Provides the basic data structures used by the library and their default implementations.
|
edu.mit.jmwe.detect |
Provides MWE detector API, a baseline detector, plus numerous other detector implementations.
|
edu.mit.jmwe.detect.score |
Provides various scoring mechanisms that can be used by subclasses of the FilterByScore and ResolveByScore detectors.
|
edu.mit.jmwe.harness |
Provides testing harness infrastructure.
|
edu.mit.jmwe.harness.result |
Provides objects that encapsulate the results of a test harness run.
|
edu.mit.jmwe.harness.result.error |
Provides error detectors to evaluate the results of a test harness run.
|
edu.mit.jmwe.index |
Provides the MWE index interfaces and default implementations, which allow one to look up an MWE given one of its parts.
|
Modifier and Type | Class and Description |
---|---|
class |
MWE<T extends IToken>
Default implementation of the
IMWE interface. |
Modifier and Type | Method and Description |
---|---|
int |
MWEComparator.compare(IMWE<T> one,
IMWE<T> two) |
protected boolean |
MWEComparator.earlier(IMWE<T> one,
IMWE<T> two)
Internal method used to determine if one multi-word expression appears in
the sentence before another.
|
static boolean |
MWE.equals(IMWE<?> one,
IMWE<?> two)
Returns true if the two MWEs use the same tokens and are assigned the
same root entries.
|
static double |
MWE.overlap(IMWE<?> one,
IMWE<?> two)
Returns a score which is the ratio of the number of tokens shared between
the two MWEs and the total number of unique tokens in both MWEs together.
|
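
The ratio described for `MWE.overlap` (shared tokens over unique tokens in both MWEs) is a Jaccard-style score. The following is a minimal sketch of that arithmetic only, assuming tokens can be compared with `equals()`. The class `OverlapSketch` is hypothetical and not part of jMWE, and returning 0.0 for two empty inputs is an assumption, since the listing above does not say how the library handles that case.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration; not part of the jMWE API.
public class OverlapSketch {

    // Ratio of the number of shared tokens to the number of unique tokens overall.
    public static <T> double overlap(List<T> one, List<T> two) {
        Set<T> shared = new HashSet<>(one);
        shared.retainAll(two);              // tokens that occur in both inputs
        Set<T> all = new HashSet<>(one);
        all.addAll(two);                    // unique tokens across both inputs
        // Assumption: two empty inputs score 0.0 rather than dividing by zero.
        return all.isEmpty() ? 0.0 : (double) shared.size() / all.size();
    }

    public static void main(String[] args) {
        List<String> a = Arrays.asList("look", "up");
        List<String> b = Arrays.asList("look", "it", "up");
        System.out.println(overlap(a, b));
        System.out.println(overlap(new ArrayList<String>(), new ArrayList<String>()));
    }
}
```

A score of 1.0 means the two token sets are identical, and 0.0 means they share no tokens.
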
Modifier and Type | Method and Description |
---|---|
static <T extends IToken> |
LMLR.longest(IMWE<T> one,
IMWE<T> two,
java.util.Comparator<T> c)
Compares two MWEs and returns the longer one.
|
IMWE<T> |
MWEBuilder.toMWE()
Converts the tokens in a full record into an
IMWE object. |
Modifier and Type | Method and Description |
---|---|
<T extends IToken> |
StopWords.detect(java.util.List<T> sentence) |
<T extends IToken> |
ResolveByScore.detect(java.util.List<T> sentence) |
<T extends IToken> |
ProperNouns.detect(java.util.List<T> sentence) |
<T extends IToken> |
Perfect.detect(java.util.List<T> sentence) |
<T extends IToken> |
NoProperNouns.detect(java.util.List<T> sentence) |
<T extends IToken> |
NoInflection.detect(java.util.List<T> sentence) |
<T extends IToken> |
LMLR.detect(java.util.List<T> s) |
<T extends IToken> |
InOrder.detect(java.util.List<T> sentence) |
<T extends IToken> |
InflectionPattern.detect(java.util.List<T> sentence) |
<T extends IToken> |
InflectionLookup.detect(java.util.List<T> sentence) |
<T extends IToken> |
IMWEDetector.detect(java.util.List<T> sentence)
Given a list of tokens, the detector searches for the MWEs in the list.
|
<T extends IToken> |
HasMWEDetector.detect(java.util.List<T> sentence) |
<T extends IToken> |
FilterByScore.detect(java.util.List<T> sentence) |
<T extends IToken> |
Exhaustive.detect(java.util.List<T> sentence) |
<T extends IToken> |
Continuous.detect(java.util.List<T> sentence) |
<T extends IToken> |
Consecutive.detect(java.util.List<T> sent) |
<T extends IToken> |
CompositeDetector.detect(java.util.List<T> sentence) |
protected <T extends IToken> |
SmallestVariance.getScorer(java.util.List<T> sentence) |
protected abstract <T extends IToken> |
ResolveByScore.getScorer(java.util.List<T> sentence)
Returns the scoring function for this filter.
|
protected <T extends IToken> |
MoreFrequentAsMWE.getScorer(java.util.List<T> sentence) |
protected <T extends IToken> |
Longest.getScorer(java.util.List<T> scorer) |
protected <T extends IToken> |
LeskAtLeast.getScorer(java.util.List<T> sentence) |
protected <T extends IToken> |
Leftmost.getScorer(java.util.List<T> sentence) |
protected abstract <T extends IToken> |
FilterByScore.getScorer(java.util.List<T> sentence)
Returns a scoring function for the specified sentence.
|
protected <T extends IToken> |
ConstrainLength.getScorer(java.util.List<T> sentence) |
Modifier and Type | Method and Description |
---|---|
protected <T extends IToken> |
Exhaustive.containsDuplicate(java.util.Collection<? extends IMWE<T>> results,
IMWE<T> mwe)
Returns true if the given collection of MWEs already contains a
particular MWE.
|
static <T extends IToken> |
InflectionLookup.getSurfaceFormDescription(IRootMWEDesc root,
IMWE<T> mwe)
Returns a multi-word expression description with a lemma that is
constructed by concatenating the tokens of the MWE exactly as they appear
in the sentence with underscores.
|
<T extends IToken> |
InflectionRule.getTagPattern(IMWE<T> mwe)
Concatenates the tags of each token in the MWE, separating each by
underscores.
|
static <T extends IToken> |
InflectionRule.inflects(T token,
IMWE<T> mwe)
Returns true if the text of a token from an MWE does not equal the
corresponding part lemma.
|
static <T extends IToken> |
Continuous.isDiscontinuous(IMWE<T> mwe,
java.util.List<T> sentence)
Determines if the specified MWE is discontinuous, i.e., there are
interstitial tokens inside its boundaries that are not a part of the MWE.
|
static <T extends IToken> |
Continuous.isDiscontinuous(IMWE<T> mwe,
java.util.Map<T,java.lang.Integer> indexMap)
Determines if the specified MWE is discontinuous, i.e., there are
interstitial tokens inside its boundaries that are not a part of the MWE.
|
static boolean |
InflectionRule.isInflectedByPattern(IMWE<?> mwe)
Returns
true if and only if (1) the given multi-word
expression syntactically matches a rule listed in the enumeration
InflectionRule and (2) its parts inflect according to that rule. |
static boolean |
InflectionRule.isInflectedByPattern(IMWE<?> mwe,
java.util.Collection<? extends IInflectionRule> rules)
Returns
true if the specified MWE inflects according to some
rule in the specified collection; false otherwise. |
static <T extends IToken> |
InOrder.isOutOfOrder(IMWE<T> mwe)
Determines if the constituents of the specified MWE are out of order.
|
<T extends IToken> |
InflectionRule.isValid(IMWE<T> mwe) |
<T extends IToken> |
IInflectionRule.isValid(IMWE<T> mwe)
Returns
true if this MWE follows the rule;
false otherwise. |
static <T extends IToken> |
LMLR.longest(IMWE<T> one,
IMWE<T> two,
java.util.Comparator<T> c)
Compares two MWEs and returns the longer one.
|
<T extends IToken> |
InflectionRule.matches(IMWE<T> mwe) |
<T extends IToken> |
IInflectionRule.matches(IMWE<T> mwe)
Returns
true if the given MWE has the same syntax as this
rule. |
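
The continuity condition in the `Continuous.isDiscontinuous` descriptions (no interstitial tokens inside the MWE's boundaries) can be checked from token positions alone. The following is a minimal sketch, assuming each MWE token has already been mapped to a distinct index in the sentence, much as the `indexMap` parameter suggests. The class `ContinuitySketch` is hypothetical and not part of jMWE, and treating an empty MWE as continuous is an assumption.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of the jMWE API.
public class ContinuitySketch {

    // positions: sentence indices of the MWE's tokens, assumed to be distinct.
    public static boolean isDiscontinuous(List<Integer> positions) {
        if (positions.isEmpty()) {
            return false; // assumption: an empty MWE has no gaps
        }
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int p : positions) {
            min = Math.min(min, p);
            max = Math.max(max, p);
        }
        // The span [min, max] holds more slots than the MWE has tokens
        // exactly when some token inside the boundaries is not part of it.
        return (max - min + 1) > positions.size();
    }

    public static void main(String[] args) {
        System.out.println(isDiscontinuous(Arrays.asList(2, 3, 4)));
        System.out.println(isDiscontinuous(Arrays.asList(2, 4)));
    }
}
```

The check does not depend on the order in which positions are listed, so it says nothing about whether the constituents are out of order.
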
Modifier and Type | Method and Description |
---|---|
protected <T extends IToken> |
Exhaustive.containsDuplicate(java.util.Collection<? extends IMWE<T>> results,
IMWE<T> mwe)
Returns true if the given collection of MWEs already contains a
particular MWE.
|
Modifier and Type | Method and Description |
---|---|
double |
VarianceScore.score(IMWE<T> mwe) |
double |
StartingIndexScore.score(IMWE<T> mwe) |
double |
LeskScore.score(IMWE<T> mwe) |
double |
LengthScore.score(IMWE<T> mwe) |
double |
FractionAsMWEScore.score(IMWE<T> mwe) |
Modifier and Type | Method and Description |
---|---|
<T extends IToken> |
IAnswerKey.getAnswers(IMarkedSentence<T> sentence)
Gets the answer multi-word expressions from the given sentence.
|
<T extends IToken> |
ConcordanceAnswerKey.getAnswers(IMarkedSentence<T> sent) |
<T extends IToken> |
ConcordanceAnswerKey.getAnswers(IMarkedSentence<T> sent,
edu.mit.jsemcor.element.ISentence answers)
Extracts a set of MWE answers from a sentence and its corresponding
answer sentence.
|
protected <T extends IToken> |
ConcordanceAnswerKey.getContinuousMWEs(IMarkedSentence<T> sent,
edu.mit.jsemcor.element.ISentence answer,
java.util.Set<edu.mit.jsemcor.element.IWordform> used)
Gets the multi-word expressions from the given sentence that are marked
as single tokens.
|
protected <T extends IToken> |
ConcordanceAnswerKey.getNonContinuousMWEs(IMarkedSentence<T> sent,
edu.mit.jsemcor.element.ISentence answer,
java.util.Set<edu.mit.jsemcor.element.IWordform> used)
Gets the multi-word expressions from the given sentence that are
non-contiguous (e.g., have a distance value not equal to zero).
|
protected <T extends IToken,S extends IMarkedSentence<T>> |
TestHarness.runDetector(IMWEDetector detector,
IResultBuilder<T,S> builder,
S sent,
java.util.List<IMWE<T>> answers)
Runs the detector over a single sentence, storing the result as an
ISentenceResult in the given result builder. |
Modifier and Type | Method and Description |
---|---|
protected <T extends IToken,S extends IMarkedSentence<T>> |
TestHarness.runDetector(IMWEDetector detector,
IResultBuilder<T,S> builder,
S sent,
java.util.List<IMWE<T>> answers)
Runs the detector over a single sentence, storing the result as an
ISentenceResult in the given result builder. |
protected <T extends IToken,S extends IMarkedSentence<T>> |
TestHarness.runDetectors(java.util.Map<IMWEDetector,IResultBuilder<T,S>> detectors,
S sent,
java.util.List<IMWE<T>> answers)
Runs a set of detectors on the specified sentence, comparing the results
to the specified answers.
|
Modifier and Type | Method and Description |
---|---|
java.util.List<IMWE<T>> |
SentenceResult.getAnswers() |
java.util.List<IMWE<T>> |
ISentenceResult.getAnswers()
Returns the answer multi-word expressions in the sentence.
|
java.util.Map<java.lang.String,java.util.List<IMWE<T>>> |
IErrorResult.getDetails()
Returns a
Map that stores multi-word expressions under the ID of
the error class they belong to. |
java.util.Map<java.lang.String,java.util.List<IMWE<T>>> |
ErrorResult.getDetails() |
java.util.List<IMWE<T>> |
SentenceResult.getFalseNegatives() |
java.util.List<IMWE<T>> |
ISentenceResult.getFalseNegatives()
Returns a list of the false negatives.
|
java.util.List<IMWE<T>> |
SentenceResult.getFalsePositives() |
java.util.List<IMWE<T>> |
ISentenceResult.getFalsePositives()
Returns a list of the false positives.
|
java.util.List<IMWE<T>> |
SentenceResult.getFound() |
java.util.List<IMWE<T>> |
ISentenceResult.getFound()
Returns the multi-word expressions found by the detector in the sentence.
|
java.util.List<IMWE<T>> |
SentenceResult.getTruePositives() |
java.util.List<IMWE<T>> |
ISentenceResult.getTruePositives()
Returns a list of the true positives.
|
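
The relationship between the found list, the answer list, and the true positive, false positive, and false negative lists can be sketched with plain list operations. This is an illustration under stated assumptions, not the library's implementation: the class `SentenceResultSketch` is hypothetical, and items are matched with `equals()`, whereas jMWE may compare MWEs differently (for example via `MWE.equals`).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not part of the jMWE API.
public class SentenceResultSketch {

    // Items the detector found that are also in the answer key.
    public static <E> List<E> truePositives(List<E> found, List<E> answers) {
        List<E> result = new ArrayList<>(found);
        result.retainAll(answers);
        return result;
    }

    // Items the detector found that are not in the answer key.
    public static <E> List<E> falsePositives(List<E> found, List<E> answers) {
        List<E> result = new ArrayList<>(found);
        result.removeAll(answers);
        return result;
    }

    // Items in the answer key that the detector did not find.
    public static <E> List<E> falseNegatives(List<E> found, List<E> answers) {
        List<E> result = new ArrayList<>(answers);
        result.removeAll(found);
        return result;
    }

    public static void main(String[] args) {
        List<String> found = Arrays.asList("look_up", "by_and_large");
        List<String> answers = Arrays.asList("by_and_large", "kick_the_bucket");
        System.out.println(truePositives(found, answers));
        System.out.println(falsePositives(found, answers));
        System.out.println(falseNegatives(found, answers));
    }
}
```
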
Modifier and Type | Method and Description |
---|---|
void |
TokenResultBuilder.process(java.util.List<IMWE<T>> found,
java.util.List<IMWE<T>> answers) |
void |
MWEResultBuilder.process(java.util.List<IMWE<T>> found,
java.util.List<IMWE<T>> answers) |
void |
IResultBuilder.process(java.util.List<IMWE<T>> found,
java.util.List<IMWE<T>> answers)
Updates the internal data stored in this builder by comparing the
multi-word expressions found by an MWE detector to the answer
multi-word expressions.
|
Constructor and Description |
---|
ErrorResult(java.util.Map<java.lang.String,java.util.List<IMWE<T>>> details)
Constructs the error result from a map that stores MWEs under the ID of
the error class that they belong to.
|
ErrorResult(java.util.Map<java.lang.String,java.util.List<IMWE<T>>> details,
boolean reallocate)
Constructs the error result from a
Map that stores multi-word
expressions under the ID of the error class that they belong to. This
constructor may or may not allocate a new internal map, depending on the
value of the reallocation flag. |
ErrorResult(java.lang.String errorID,
java.util.List<IMWE<T>> errors)
Constructs the error result that stores the given multi-word expressions
under the given ID of the error class that they belong to.
|
SentenceResult(java.util.List<IMWE<T>> answer,
java.util.List<IMWE<T>> retrieved,
S sentence)
Constructs a sentence result from a list of answer multi-word expressions
and a list of multi-word expressions found by the detector.
|
SentenceResult(java.util.List<IMWE<T>> answer,
java.util.List<IMWE<T>> retrieved,
S sentence,
boolean reallocate)
Constructs a sentence result from a list of answer multi-word expressions and
a list of multi-word expressions found by the detector.
|
Modifier and Type | Method and Description |
---|---|
protected static <T extends IToken> |
ExtraPrep.findTag(IMWE<T> test,
java.lang.String tag)
Returns the index of the first token in the MWE with the specified tag.
|
static <T extends IToken> |
InterstitialTokens.hasParticle(IMWE<T> mwe,
java.util.List<T> sentence)
Returns true if the given MWE contains a token that is a particle and is
separated from the previous token in the MWE by one or more non-MWE
tokens in the sentence.
|
static <T extends IToken> |
VBDVBN.isProblem(IMWE<T> mwe)
Determines if the specified MWE is a problem according to this error
class.
|
static <T extends IToken> |
MissingFromIndex.isProblem(IMWE<T> mwe,
IMWEIndex index)
Determines if the specified MWE is a problem, relative to the specified
index, according to this error class.
|
static <T extends IToken,S extends IMarkedSentence<T>> |
DetectorDisagreement.isProblem(IMWE<T> mwe,
ISentenceResult<T,S> result,
IMWEDetector detector)
Determines if the specified MWE is a problem relative to the specified
sentence according to this error class.
|
Modifier and Type | Method and Description |
---|---|
protected boolean |
IndexBuilder.contains(java.util.List<IMWE<IConcordanceToken>> list,
IMWE<IConcordanceToken> mwe)
Returns whether the specified MWE is contained in the specified list.
|
protected <T extends IConcordanceToken> |
IndexBuilder.isSplit(IMWE<T> mwe)
Returns
true if this MWE is not continuous, that is, if it has
interstitial tokens that are not a part of it; false
otherwise. |
Modifier and Type | Method and Description |
---|---|
protected boolean |
IndexBuilder.contains(java.util.List<IMWE<IConcordanceToken>> list,
IMWE<IConcordanceToken> mwe)
Returns whether the specified MWE is contained in the specified list.
|
void |
IndexBuilder.countMarked(java.util.List<IMWE<IConcordanceToken>> answers,
java.util.Map<IMWEDescID,IndexBuilder.MutableRootMWEDesc> index)
Counts instances of marked MWEs.
|
void |
IndexBuilder.countUnmarked(IMWEDetector detector,
IConcordanceSentence sent,
java.util.List<IMWE<IConcordanceToken>> answers)
Counts the number of MWEs that are detected by the specified detector,
but not marked in the answer set as being MWEs.
|
<T extends IToken> |
IndexBuilder.findMissingMWEs(java.util.List<IMWE<T>> mwes,
java.util.Map<IMWEDescID,IndexBuilder.MutableRootMWEDesc> index,
java.util.Set<IndexBuilder.MutableRootMWEDesc> missing)
Finds MWEs that are marked in the specified list, but not in the
index.
|
Copyright © 2011 Massachusetts Institute of Technology. All Rights Reserved.