edu.mit.jmwe.data
Class Token

java.lang.Object
  extended by edu.mit.jmwe.data.Token
All Implemented Interfaces:
IHasForm, IToken
Direct Known Subclasses:
ConcordanceToken

public class Token
extends Object
implements IToken

Default implementation of the IToken interface.

Since:
jMWE 1.0.0
Version:
$Id: Token.java 583 2011-05-05 19:58:06Z markaf $
Author:
Nidhi Kulkarni, M.A. Finlayson

Constructor Summary
Token(String text, String tag)
          Constructs a new token object with the specified text and tag, with no stems yet assigned.
Token(String text, String tag, String... stems)
          Constructs a new token object with the specified text, tag, and stems.
 
Method Summary
static List<String> checkStems(String[] stems)
          Checks the specified array of strings to ensure each one is non-null, and, once trimmed, is not empty and does not contain whitespace.
static String checkString(String text)
          Checks the specified string to see that, once trimmed, it is not empty and does not contain whitespace.
 String getForm()
          Returns the object's surface form text, exactly as it appears in its original context, with capitalization intact.
 List<String> getStems()
          Returns an unmodifiable list of stems, all in lowercase.
 String getTag()
          Returns the part of speech tag for this token, or null if the token is not tagged.
 String toString()
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Constructor Detail

Token

public Token(String text,
             String tag)
Constructs a new token object with the specified text and tag, with no stems yet assigned.

Parameters:
text - the surface form of the token as it appears in the sentence, capitalization intact
tag - the tag of the token, if assigned, otherwise null
Throws:
NullPointerException - if the text is null
IllegalArgumentException - if the trimmed text is empty or contains whitespace
Since:
jMWE 1.0.0

Token

public Token(String text,
             String tag,
             String... stems)
Constructs a new token object with the specified text, tag, and stems.

Parameters:
text - the surface form of the token as it appears in the sentence, capitalization intact
tag - the tag of the token, if assigned, otherwise null
stems - the array of stems, possibly empty or null, but not containing null. If null, this means that no stemming has yet been attempted. If empty, this means the token is not stemmable.
Throws:
NullPointerException - if the text is null, or any of the stems are null
IllegalArgumentException - if the trimmed text is empty or contains whitespace
Since:
jMWE 1.0.0
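
The difference between the two constructors, and between null and empty stems, can be illustrated with a small stand-in class. This is only a sketch under stated assumptions: SketchToken is a hypothetical class written for this example, not the jMWE Token source, and it omits the whitespace and null-element validation described above to stay short.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in that mirrors the documented constructor contract.
// It is NOT the jMWE implementation; validation of whitespace and of null
// stems is deliberately left out.
final class SketchToken {

    private final String form;
    private final String tag;
    private final List<String> stems; // null means stemming not yet attempted

    // Two-argument form: no stems assigned yet.
    SketchToken(String text, String tag) {
        this(text, tag, (String[]) null);
    }

    // Varargs form: a null array means "not stemmed", an empty array means
    // "not stemmable".
    SketchToken(String text, String tag, String... stems) {
        if (text == null) {
            throw new NullPointerException("text is null");
        }
        this.form = text.trim();
        this.tag = tag;
        this.stems = (stems == null)
                ? null
                : Collections.unmodifiableList(Arrays.asList(stems.clone()));
    }

    String getForm() { return form; }

    String getTag() { return tag; }

    List<String> getStems() { return stems; }

    public static void main(String[] args) {
        SketchToken unstemmed = new SketchToken("Dogs", "NNS");
        SketchToken stemmed = new SketchToken("Dogs", "NNS", "dog");
        SketchToken unstemmable = new SketchToken("the", "DT", new String[0]);
        System.out.println(unstemmed.getStems());
        System.out.println(stemmed.getStems());
        System.out.println(unstemmable.getStems());
    }
}
```

With this sketch, a call with two arguments resolves to the two-argument constructor, so the stems stay null until stemming is attempted.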
Method Detail

getForm

public String getForm()
Description copied from interface: IHasForm
Returns the object's surface form text, exactly as it appears in its original context, with capitalization intact. May be a single word or punctuation. The surface form may not contain whitespace or underscores. This method will never return null.

Specified by:
getForm in interface IHasForm
Returns:
the original text, never null.

getTag

public String getTag()
Description copied from interface: IToken
Returns the part of speech tag for this token, or null if the token is not tagged. If the part of speech is null, no part of speech has yet been assigned.

Specified by:
getTag in interface IToken
Returns:
the part of speech tag for this token, or null if the token is not tagged.

getStems

public List<String> getStems()
Description copied from interface: IToken
Returns an unmodifiable list of stems, all in lowercase. The order of the stems depends on the implementation. No stem should be repeated in the list. If the method returns an empty list, this means that the token is not stemmable. If the method returns null, this means no stemming has yet been attempted.

Specified by:
getStems in interface IToken
Returns:
a possibly null, possibly empty list of lowercase stems
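
Because the return value has three distinct states (null, empty, non-empty), callers usually need to branch on all three. The following is a minimal sketch of such a branch; StemStateSketch and describe are hypothetical names introduced for this example and are not part of jMWE.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper showing how a caller might interpret the list
// returned by getStems(), following the contract described above.
final class StemStateSketch {

    static String describe(List<String> stems) {
        if (stems == null) {
            return "not yet stemmed";   // no stemming attempted
        }
        if (stems.isEmpty()) {
            return "not stemmable";     // stemming attempted, no stems exist
        }
        return "stems: " + stems;       // one or more lowercase stems
    }

    public static void main(String[] args) {
        System.out.println(describe(null));
        System.out.println(describe(Collections.<String>emptyList()));
        System.out.println(describe(Arrays.asList("run")));
    }
}
```

In real code, the argument would be the result of getStems() on a token; checking for null first avoids a NullPointerException on tokens that have not been stemmed.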

toString

public String toString()
Overrides:
toString in class Object

checkString

public static String checkString(String text)
Checks the specified string to see that, once trimmed, it is not empty and does not contain whitespace. If the string passes the check, the trimmed string is returned. Otherwise, the method throws an exception.

Parameters:
text - the text to be checked
Returns:
the trimmed String
Throws:
NullPointerException - if the specified String is null
IllegalArgumentException - if, after being trimmed, the specified String is empty or contains whitespace
Since:
jMWE 1.0.0
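
The documented contract can be sketched as follows. This is an illustration of the described behavior only, not the jMWE source: CheckStringSketch is a hypothetical class, and the use of Character.isWhitespace to detect whitespace is an assumption.

```java
// Hypothetical re-statement of the documented checkString contract.
final class CheckStringSketch {

    static String checkString(String text) {
        if (text == null) {
            throw new NullPointerException("text is null");
        }
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("text is empty after trimming");
        }
        // Assumption: any character for which Character.isWhitespace is true
        // counts as whitespace.
        for (int i = 0; i < trimmed.length(); i++) {
            if (Character.isWhitespace(trimmed.charAt(i))) {
                throw new IllegalArgumentException(
                        "text contains whitespace: " + trimmed);
            }
        }
        return trimmed;
    }

    public static void main(String[] args) {
        // Leading and trailing whitespace is trimmed away.
        System.out.println(checkString("  Dogs "));
        try {
            checkString("hot dog"); // interior whitespace is rejected
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```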

checkStems

public static List<String> checkStems(String[] stems)
Checks the specified array of strings to ensure each one is non-null, and, once trimmed, is not empty and does not contain whitespace. If all strings check out, an unmodifiable list of the trimmed, lowercase strings is returned. Otherwise, the method throws an exception.

Parameters:
stems - the array of stems to check; may be null or empty, but may not contain null
Returns:
an unmodifiable list of trimmed, lowercase strings
Throws:
NullPointerException - if any string in the array is null
IllegalArgumentException - if, after being trimmed, any string in the array is empty or contains whitespace
Since:
jMWE 1.0.0
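
A minimal sketch of the behavior described for this method is given below. CheckStemsSketch is a hypothetical class, not the jMWE source. Two points are assumptions that the documentation above does not state: that a null array yields a null result, and that lowercasing uses Locale.ROOT.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Hypothetical re-statement of the documented checkStems contract.
final class CheckStemsSketch {

    static List<String> checkStems(String[] stems) {
        if (stems == null) {
            return null; // assumption: null in, null out ("not yet stemmed")
        }
        List<String> result = new ArrayList<String>(stems.length);
        for (String stem : stems) {
            if (stem == null) {
                throw new NullPointerException("stem is null");
            }
            String trimmed = stem.trim();
            if (trimmed.isEmpty()) {
                throw new IllegalArgumentException("stem is empty after trimming");
            }
            for (int i = 0; i < trimmed.length(); i++) {
                if (Character.isWhitespace(trimmed.charAt(i))) {
                    throw new IllegalArgumentException(
                            "stem contains whitespace: " + trimmed);
                }
            }
            // Assumption: locale-independent lowercasing.
            result.add(trimmed.toLowerCase(Locale.ROOT));
        }
        return Collections.unmodifiableList(result);
    }

    public static void main(String[] args) {
        System.out.println(checkStems(new String[] {" Run ", "RAN"}));
    }
}
```

The returned list is wrapped with Collections.unmodifiableList, so attempts to add or remove stems throw UnsupportedOperationException. The sketch does not remove duplicate stems.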


Copyright © 2011 Massachusetts Institute of Technology. All Rights Reserved.