public class ConcordanceAnswerKey extends java.lang.Object implements IAnswerKey
An implementation of the IAnswerKey interface. Searches for the answer
multi-word expressions in an IConcordanceSentence by using a Semcor corpus,
which has multi-word expressions annotated.
This class requires JSemcor to be on the classpath.
Modifier and Type | Field and Description |
---|---|
static java.util.regex.Pattern | condordanceSentenceIDPattern - A compiled regular expression pattern that captures the string representation of a Semcor sentence ID. |
static java.util.regex.Pattern | lexSensePattern - A compiled regular expression pattern that captures the string representation of a sense key. |
Constructor and Description |
---|
ConcordanceAnswerKey(edu.mit.jsemcor.main.IConcordance c) - Constructs an answer key from a single concordance. |
ConcordanceAnswerKey(java.lang.Iterable<? extends edu.mit.jsemcor.main.IConcordance> i) - Constructs an answer key from the given Semcor concordance set. |
ConcordanceAnswerKey(java.util.Map<java.lang.String,edu.mit.jsemcor.main.IConcordance> concords) - Constructs an answer key from the given Semcor concordance set. |
Modifier and Type | Method and Description |
---|---|
protected MWEPOS | disambiguatePOS(java.util.List<edu.mit.jsemcor.element.IWordform> mwe) - Attempts to disambiguate the part of speech of a multi-word expression that does not have a semantic tag and whose parts are labeled with different part of speech tags. |
<T extends IToken> java.util.List<IMWE<T>> | getAnswers(IMarkedSentence<T> sent) - Gets the answer multi-word expressions from the given sentence. |
<T extends IToken> java.util.List<IMWE<T>> | getAnswers(IMarkedSentence<T> sent, edu.mit.jsemcor.element.ISentence answers) - Extracts a set of MWE answers from a sentence and its corresponding answer sentence. |
protected <T extends IToken> java.util.List<IMWE<T>> | getContinuousMWEs(IMarkedSentence<T> sent, edu.mit.jsemcor.element.ISentence answer, java.util.Set<edu.mit.jsemcor.element.IWordform> used) - Gets the multi-word expressions from the given sentence that are marked as single tokens. |
protected MWEPOS | getMWEPOS(java.lang.String lexSense) - Given the lexical sense of a word form, extracts the one-digit decimal integer representing the synset type of the sense and returns the corresponding part of speech. |
protected <T extends IToken> java.util.List<IMWE<T>> | getNonContinuousMWEs(IMarkedSentence<T> sent, edu.mit.jsemcor.element.ISentence answer, java.util.Set<edu.mit.jsemcor.element.IWordform> used) - Gets the multi-word expressions from the given sentence that are non-contiguous (e.g., have a distance value not equal to zero). |
static edu.mit.jsemcor.element.ISentence | getSentence(java.util.Map<java.lang.String,edu.mit.jsemcor.main.IConcordance> concords, IMarkedSentence<?> sent) - Returns the concordance sentence that corresponds to the specified marked sentence. |
boolean | isIgnoringProperNouns() - Returns true if this answer key includes proper nouns in its results; false otherwise. |
protected static boolean | isIllformattedLemma(edu.mit.jsemcor.element.ISemanticTag tag) - Returns true if the semantic tag of a multi-word expression is null, tags a proper noun, or if the lemma encoded in the semantic tag is not formatted properly, that is, with underscores separating the parts of the multi-word expression. |
void | setIgnoreProperNouns(boolean ignoreProperNouns) - Sets the flag that, if true, determines that the answer key will include proper nouns in its results. |
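The isIllformattedLemma entry above names three conditions: a null tag, a tag on a proper noun, and a lemma whose parts are not separated by underscores. The sketch below illustrates that kind of check in isolation. It does not use the library's ISemanticTag; the Tag class, its fields, and the exact well-formedness rule (two or more non-empty parts joined by single underscores) are assumptions made for illustration, and the actual method may apply different rules.

```java
public class LemmaCheckSketch {

    // Hypothetical stand-in for a semantic tag: just a lemma and a proper-noun flag.
    static final class Tag {
        final String lemma;
        final boolean properNoun;

        Tag(String lemma, boolean properNoun) {
            this.lemma = lemma;
            this.properNoun = properNoun;
        }
    }

    // Returns true if the tag is null, tags a proper noun, or carries a lemma
    // that is not a sequence of two or more parts joined by single underscores.
    static boolean isIllFormatted(Tag tag) {
        if (tag == null) {
            return true;
        }
        if (tag.properNoun) {
            return true;
        }
        String lemma = tag.lemma;
        // Assumed rule: each part is non-empty and contains no underscore or whitespace.
        return lemma == null || !lemma.matches("[^_\\s]+(_[^_\\s]+)+");
    }

    public static void main(String[] args) {
        System.out.println(isIllFormatted(new Tag("kick_the_bucket", false))); // prints "false"
        System.out.println(isIllFormatted(new Tag("kick the bucket", false))); // prints "true"
        System.out.println(isIllFormatted(new Tag("New_York", true)));         // prints "true"
    }
}
```

Treating a null tag as ill-formatted keeps the caller's logic to a single boolean test, which matches the way the summary groups the three conditions.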
public static final java.util.regex.Pattern condordanceSentenceIDPattern
public static final java.util.regex.Pattern lexSensePattern
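This page does not show the regular expression behind lexSensePattern, nor the mapping used by getMWEPOS. As a rough illustration of the idea, the sketch below matches a WordNet-style lexical sense (ss_type:lex_filenum:lex_id:head_word:head_id, optionally preceded by lemma%) and maps the one-digit synset type to a part of speech. The pattern, the Pos enum (a stand-in for MWEPOS), and the posOf helper are assumptions for illustration, not the library's definitions. The digit mapping follows the WordNet convention: 1 noun, 2 verb, 3 adjective, 4 adverb, 5 adjective satellite.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SenseKeySketch {

    // Hypothetical stand-in for the library's MWEPOS type; the real constants may differ.
    enum Pos { NOUN, VERB, ADJECTIVE, ADVERB }

    // Assumed pattern (not the library's): an optional "lemma%" prefix followed by
    // ss_type:lex_filenum:lex_id:head_word:head_id. Group 1 is the synset type digit.
    static final Pattern SENSE_KEY = Pattern.compile(
            "^(?:[^%\\s]+%)?([1-5]):(\\d{2}):(\\d{2}):([^:\\s]*):(\\d{2})?$");

    // Returns the part of speech encoded in the sense, or null if it does not match.
    static Pos posOf(String lexSense) {
        if (lexSense == null) {
            return null;
        }
        Matcher m = SENSE_KEY.matcher(lexSense);
        if (!m.matches()) {
            return null;
        }
        switch (m.group(1).charAt(0)) {
            case '1': return Pos.NOUN;
            case '2': return Pos.VERB;
            case '3': // adjective
            case '5': // adjective satellite, folded into ADJECTIVE here
                return Pos.ADJECTIVE;
            case '4': return Pos.ADVERB;
            default:  return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(posOf("2:31:00::"));         // prints "VERB"
        System.out.println(posOf("look_up%2:31:00::")); // prints "VERB"
        System.out.println(posOf("not a sense"));       // prints "null"
    }
}
```

The sketch handles a single sense per string; input that packs several senses into one value would need to be split before matching.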
public ConcordanceAnswerKey(edu.mit.jsemcor.main.IConcordance c)
c - the concordance that backs this answer key. May not be null.
public ConcordanceAnswerKey(java.lang.Iterable<? extends edu.mit.jsemcor.main.IConcordance> i)
i - the set of concordances that backs this answer key. May not be null.
java.lang.NullPointerException - if the specified concordance set is null
public ConcordanceAnswerKey(java.util.Map<java.lang.String,edu.mit.jsemcor.main.IConcordance> concords)
concords - the Semcor concordance that backs this answer key. May not be null.
java.lang.NullPointerException - if the specified concordance set is null
public boolean isIgnoringProperNouns()
Returns true if this answer key includes proper nouns in its results; false otherwise.
public void setIgnoreProperNouns(boolean ignoreProperNouns)
Sets the flag that, if true, determines that the answer key will include proper nouns in its results.
ignoreProperNouns - true if this answer key should include proper nouns in its results; false otherwise
public <T extends IToken> java.util.List<IMWE<T>> getAnswers(IMarkedSentence<T> sent)
Description copied from interface IAnswerKey: gets the answer multi-word expressions from the given sentence.
Specified by getAnswers in interface IAnswerKey.
T - type of tokens that are contained in the sentence.
sent - the sentence for which the answers should be retrieved. May not be null.
public <T extends IToken> java.util.List<IMWE<T>> getAnswers(IMarkedSentence<T> sent, edu.mit.jsemcor.element.ISentence answers)
T - the token type
sent - the sentence for which answers are needed
answers - the answers
protected <T extends IToken> java.util.List<IMWE<T>> getNonContinuousMWEs(IMarkedSentence<T> sent, edu.mit.jsemcor.element.ISentence answer, java.util.Set<edu.mit.jsemcor.element.IWordform> used)
T - the token type
sent - the unit for which the answers are being constructed
answer - the Semcor sentence from which the multi-token MWEs should be extracted
used - the set of wordforms already used
java.lang.NullPointerException - if either argument is null
protected <T extends IToken> java.util.List<IMWE<T>> getContinuousMWEs(IMarkedSentence<T> sent, edu.mit.jsemcor.element.ISentence answer, java.util.Set<edu.mit.jsemcor.element.IWordform> used)
T - the token type
sent - the unit for which the answers are being constructed
answer - the Semcor sentence from which the single-token MWEs should be extracted
used - the set of wordforms already used
java.lang.NullPointerException - if either argument is null
protected MWEPOS getMWEPOS(java.lang.String lexSense)
lexSense - the lexical sense of a word form.
protected MWEPOS disambiguatePOS(java.util.List<edu.mit.jsemcor.element.IWordform> mwe)
Returns MWEPOS.VERB where applicable. Otherwise, returns null.
mwe - the set of wordforms in the MWE
public static edu.mit.jsemcor.element.ISentence getSentence(java.util.Map<java.lang.String,edu.mit.jsemcor.main.IConcordance> concords, IMarkedSentence<?> sent)
concords - the concordances which should be searched for the sentence
sent - the sentence corresponding to the concordance sentence that should be retrieved
java.lang.IllegalArgumentException - if unable to find the sentence
protected static boolean isIllformattedLemma(edu.mit.jsemcor.element.ISemanticTag tag)
tag - the semantic tag of a wordform that is a part of a multi-word expression.
Copyright © 2011 Massachusetts Institute of Technology. All Rights Reserved.