edu.mit.jmwe.data.concordance
Class ConcordanceSentence

java.lang.Object
  extended by java.util.AbstractCollection<E>
      extended by java.util.AbstractList<IConcordanceToken>
          extended by edu.mit.jmwe.data.concordance.ConcordanceSentence
All Implemented Interfaces:
IConcordanceSentence, IHasID, IMarkedSentence<IConcordanceToken>, Iterable<IConcordanceToken>, Collection<IConcordanceToken>, List<IConcordanceToken>, RandomAccess

public class ConcordanceSentence
extends AbstractList<IConcordanceToken>
implements IConcordanceSentence, RandomAccess

Default implementation of IConcordanceSentence.

This class requires JSemcor to be on the classpath.

Since:
jMWE 1.0.0
Version:
$Id: ConcordanceSentence.java 620 2011-05-08 21:13:58Z markaf $
Author:
M.A. Finlayson

Field Summary
static Pattern taggedSemcorSentencePattern
          A compiled regular expression pattern that captures the string representation of a Semcor sentence.
 
Fields inherited from class java.util.AbstractList
modCount
 
Constructor Summary
ConcordanceSentence(edu.mit.jsemcor.element.IContextID cid, int sentNum, List<? extends IConcordanceToken> tokens)
          Constructs a new semcor sentence from the list of tokens.
ConcordanceSentence(edu.mit.jsemcor.element.IContextID cid, edu.mit.jsemcor.element.ISentence sent)
          Constructs a new semcor sentence from the specified context id and JSemcor sentence object.
 
Method Summary
 IConcordanceToken get(int index)
           
 edu.mit.jsemcor.element.IContextID getContextID()
          Returns the context id from which this sentence was drawn.
 String getID()
          Returns an ID string that uniquely identifies this object or object type.
 int getSentenceNumber()
          Returns the sentence number of this sentence in the specified Semcor context.
static String makeID(edu.mit.jsemcor.element.IContextID cid, int sentNum)
          Returns a string ID constructed from the given IContextID and sentence number.
static ConcordanceSentence parse(String toString)
          Parses a string of the form concordanceName/contextID/sentNumber [tok_tag_stems_num_part]+ into a ConcordanceSentence instance.
 int size()
           
 String toString()
           
 
Methods inherited from class java.util.AbstractList
add, add, addAll, clear, equals, hashCode, indexOf, iterator, lastIndexOf, listIterator, listIterator, remove, removeRange, set, subList
 
Methods inherited from class java.util.AbstractCollection
addAll, contains, containsAll, isEmpty, remove, removeAll, retainAll, toArray, toArray
 
Methods inherited from class java.lang.Object
clone, finalize, getClass, notify, notifyAll, wait, wait, wait
 
Methods inherited from interface java.util.List
add, add, addAll, addAll, clear, contains, containsAll, equals, hashCode, indexOf, isEmpty, iterator, lastIndexOf, listIterator, listIterator, remove, remove, removeAll, retainAll, set, subList, toArray, toArray
 

Field Detail

taggedSemcorSentencePattern

public static final Pattern taggedSemcorSentencePattern
A compiled regular expression pattern that captures the string representation of a Semcor sentence.
Pattern: ^\\s*([\\S&&[^/]]+)/([\\S&&[^/]]+)/(\\d+)\\s+(\\S.*)$
  1. ^ beginning of the line
  2. \\s* any amount of whitespace
  3. ([\\S&&[^/]]+)/ capturing group 1, concordance name (unbroken run of non-whitespace, non-forward-slash characters) followed by a forward slash
  4. ([\\S&&[^/]]+)/ capturing group 2, context name (unbroken run of non-whitespace, non-forward-slash characters) followed by a forward slash
  5. (\\d+) capturing group 3, sentence number (unbroken run of digits)
  6. \\s+ some amount of whitespace
  7. (\\S.*) capturing group 4, the first non-whitespace character plus the remainder of the characters to the end of the line
  8. $ end of the line

Since:
jMWE 1.0.0
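
The breakdown above can be tried out with a small standalone sketch. It assumes the doubled backslashes in the documented pattern are Java string-literal escapes, and it re-declares the expression locally so that jMWE is not needed on the classpath. The class name SentenceHeaderDemo, the helper split, and the placeholder token text are hypothetical and not part of jMWE.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; it only mirrors the documented pattern.
class SentenceHeaderDemo {

    // Same expression as documented for taggedSemcorSentencePattern,
    // copied here as an assumption about how the field is compiled.
    static final Pattern HEADER = Pattern.compile(
            "^\\s*([\\S&&[^/]]+)/([\\S&&[^/]]+)/(\\d+)\\s+(\\S.*)$");

    /**
     * Returns the four captured groups (concordance name, context name,
     * sentence number, remainder), or null if the line does not match.
     */
    static String[] split(String line) {
        Matcher m = HEADER.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        // The text after the ID is a placeholder, not real jMWE token output.
        String[] parts = split("  brown1/br-a01/1 placeholder token text");
        System.out.println(String.join(" | ", parts));
    }
}
```

Note that a line consisting of the ID alone does not match, because the pattern requires whitespace and at least one further character after the sentence number.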
Constructor Detail

ConcordanceSentence

public ConcordanceSentence(edu.mit.jsemcor.element.IContextID cid,
                           edu.mit.jsemcor.element.ISentence sent)
Constructs a new semcor sentence from the specified context id and JSemcor sentence object.

Parameters:
cid - the context id for the JSemcor sentence; may not be null
sent - the JSemcor sentence; may not be null
Throws:
NullPointerException - if either argument is null
Since:
jMWE 1.0.0

ConcordanceSentence

public ConcordanceSentence(edu.mit.jsemcor.element.IContextID cid,
                           int sentNum,
                           List<? extends IConcordanceToken> tokens)
Constructs a new semcor sentence from the list of tokens. This constructor allocates a new internal list, and so subsequent changes to the source list will not affect this object.

Parameters:
cid - the context id for the JSemcor sentence; may not be null
sentNum - the sentence number; must be positive
tokens - the list of tokens that will make up this sentence; may not be null or empty
Throws:
NullPointerException - if the context id or the list of source tokens is null, or if the list contains null
IllegalArgumentException - if the sentence number is non-positive, or the list is empty
Since:
jMWE 1.0.0
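
The copy-on-construction contract described for the token-list constructor can be illustrated with a minimal sketch. This is not the jMWE source: CopiedSentence is a hypothetical, simplified stand-in that omits the context id and uses a generic element type, and it only shows one way to get the documented behavior (defensive copy, null checks, and argument checks) on top of AbstractList.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;
import java.util.RandomAccess;

// Hypothetical stand-in for ConcordanceSentence; names are illustrative only.
class CopiedSentence<T> extends AbstractList<T> implements RandomAccess {

    private final int sentNum;
    private final List<T> tokens;

    CopiedSentence(int sentNum, List<? extends T> source) {
        if (source == null) {
            throw new NullPointerException("token list is null");
        }
        if (sentNum <= 0) {
            throw new IllegalArgumentException("sentence number must be positive");
        }
        if (source.isEmpty()) {
            throw new IllegalArgumentException("token list is empty");
        }
        // Copy element by element so later changes to 'source' are not seen here.
        List<T> copy = new ArrayList<T>(source.size());
        for (T token : source) {
            if (token == null) {
                throw new NullPointerException("token list contains null");
            }
            copy.add(token);
        }
        this.sentNum = sentNum;
        this.tokens = copy;
    }

    int getSentenceNumber() {
        return sentNum;
    }

    @Override
    public T get(int index) {
        return tokens.get(index);
    }

    @Override
    public int size() {
        return tokens.size();
    }
}
```

Because only get and size are overridden, the mutating methods inherited from AbstractList throw UnsupportedOperationException, so the sketch behaves as a read-only view of its private copy.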
Method Detail

getID

public String getID()
Description copied from interface: IHasID
Returns an ID string that uniquely identifies this object or object type. Should never return null.

Specified by:
getID in interface IHasID
Returns:
the non-null id String

getContextID

public edu.mit.jsemcor.element.IContextID getContextID()
Description copied from interface: IConcordanceSentence
Returns the context id from which this sentence was drawn. May not return null.

Specified by:
getContextID in interface IConcordanceSentence
Returns:
the non-null context id from which this sentence was drawn.

getSentenceNumber

public int getSentenceNumber()
Description copied from interface: IConcordanceSentence
Returns the sentence number of this sentence in the specified Semcor context.

Specified by:
getSentenceNumber in interface IConcordanceSentence
Returns:
the sentence number of this sentence in the specified Semcor context.

get

public IConcordanceToken get(int index)
Specified by:
get in interface List<IConcordanceToken>
Specified by:
get in class AbstractList<IConcordanceToken>

size

public int size()
Specified by:
size in interface Collection<IConcordanceToken>
Specified by:
size in interface List<IConcordanceToken>
Specified by:
size in class AbstractCollection<IConcordanceToken>

toString

public String toString()
Overrides:
toString in class AbstractCollection<IConcordanceToken>

parse

public static ConcordanceSentence parse(String toString)
Parses a string of the form
 concordanceName/contextID/sentNumber [tok_tag_stems_num_part]+
 
into a ConcordanceSentence instance.

Parameters:
toString - the string representing the tagged semcor sentence.
Returns:
a ConcordanceSentence instance
Throws:
NullPointerException - if the specified string is null
IllegalArgumentException - if the specified string does not conform to the expected format
Since:
jMWE 1.0.0

makeID

public static String makeID(edu.mit.jsemcor.element.IContextID cid,
                            int sentNum)
Returns a string ID constructed from the given IContextID and sentence number. For the Semcor corpus, this ID has the form:

brown1/br-a01/1

Parameters:
cid - the context id
sentNum - the sentence number
Returns:
a string ID constructed from the given context id and sentence number
Since:
jMWE 1.0.0
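
The ID layout shown above can be sketched without JSemcor on the classpath. The helper below is hypothetical: it takes the concordance name and context name as plain strings instead of an IContextID, and it assumes the three parts are simply joined with forward slashes, as the brown1/br-a01/1 example suggests.

```java
// Hypothetical helper; the real makeID takes an IContextID rather than strings.
class SentenceIdDemo {

    /** Joins the three parts in the documented concordance/context/number layout. */
    static String makeId(String concordanceName, String contextName, int sentNum) {
        if (concordanceName == null || contextName == null) {
            throw new NullPointerException("names may not be null");
        }
        return concordanceName + "/" + contextName + "/" + sentNum;
    }

    public static void main(String[] args) {
        System.out.println(makeId("brown1", "br-a01", 1));
    }
}
```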


Copyright © 2011 Massachusetts Institute of Technology. All Rights Reserved.