edu.mit.jmwe.data
Interface IMWEDesc

All Superinterfaces:
Comparable<IMWEDesc>, IHasForm, IHasMWEPOS
All Known Subinterfaces:
IInfMWEDesc, IndexBuilder.IMutableMWEDesc, IRootMWEDesc
All Known Implementing Classes:
AbstractMWEDesc, IndexBuilder.MutableInfMWEDesc, IndexBuilder.MutableRootMWEDesc, InfMWEDesc, RootMWEDesc

public interface IMWEDesc
extends IHasForm, IHasMWEPOS, Comparable<IMWEDesc>

An MWE description consisting of an IMWEDescID, list of parts, and counts relating to the MWE's appearance in a reference concordance.

Since:
jMWE 1.0.0
Version:
$Id: IMWEDesc.java 571 2011-05-05 19:40:04Z markaf $
Author:
M.A. Finlayson, N. Kulkarni

Nested Class Summary
static interface IMWEDesc.IPart
          A part of a multi-word expression.
 
Field Summary
static Pattern boundaryUnderscores
          The pattern consisting of one or more underscores that occur at the beginning or end of the input.
static Pattern comma
          The pattern consisting of a single comma.
static Pattern underscore
          The pattern consisting of a single underscore.
static Pattern underscores
          The pattern consisting of one or more underscores.
 
Method Summary
 int[] getCounts()
          Returns an array containing the marked split, marked continuous, unmarked exact, and unmarked pattern occurrences of this MWE in the reference concordance.
 IMWEDescID getID()
          Returns the IMWEDescID associated with this description.
 int getMarkedContinuous()
          Returns the number of times this MWE was marked on a continuous run of tokens in the reference concordance.
 int getMarkedSplit()
          Returns the number of times this MWE was marked on a non-continuous run of tokens in the reference concordance.
 List<? extends IMWEDesc.IPart> getParts()
          Returns an unmodifiable list of parts that comprise the MWE.
 int getUnmarkedExact()
          Returns the number of times the exact surface form of this MWE description occurs in the reference concordance without being marked as an occurrence of the MWE.
 int getUnmarkedPattern()
          Returns the number of times this MWE description occurs in the reference concordance without being marked as an occurrence of the MWE, and whose form matches a known inflection pattern.
 
Methods inherited from interface edu.mit.jmwe.data.IHasForm
getForm
 
Methods inherited from interface edu.mit.jmwe.data.IHasMWEPOS
getPOS
 
Methods inherited from interface java.lang.Comparable
compareTo
 

Field Detail

underscore

static final Pattern underscore
The pattern consisting of a single underscore.

Since:
jMWE 1.0.0

comma

static final Pattern comma
The pattern consisting of a single comma.

Since:
jMWE 1.0.0

underscores

static final Pattern underscores
The pattern consisting of one or more underscores.

Since:
jMWE 1.0.0

boundaryUnderscores

static final Pattern boundaryUnderscores
The pattern consisting of one or more underscores that occur at the beginning or end of the input.

Since:
jMWE 1.0.0
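
The four patterns above are described only in prose, so the following is a minimal sketch of how equivalent patterns could be built and used with java.util.regex. The regular expressions, the class name UnderscorePatternsSketch, and the helper methods are assumptions made for illustration; the expressions actually compiled by IMWEDesc are not shown in this documentation.

```java
import java.util.regex.Pattern;

final class UnderscorePatternsSketch {

    // Assumed regexes, written to match the prose descriptions of the fields.
    static final Pattern UNDERSCORE = Pattern.compile("_");
    static final Pattern COMMA = Pattern.compile(",");
    static final Pattern UNDERSCORES = Pattern.compile("_+");
    static final Pattern BOUNDARY_UNDERSCORES = Pattern.compile("^_+|_+$");

    // Removes runs of underscores at the beginning or end of the input.
    static String trimBoundary(String s) {
        return BOUNDARY_UNDERSCORES.matcher(s).replaceAll("");
    }

    // Collapses each run of one or more underscores into a single underscore.
    static String collapse(String s) {
        return UNDERSCORES.matcher(s).replaceAll("_");
    }

    // Splits an underscore-delimited form into its parts.
    static String[] splitParts(String s) {
        return UNDERSCORE.split(s);
    }

    // Splits a comma-delimited string into its fields.
    static String[] splitFields(String s) {
        return COMMA.split(s);
    }

    public static void main(String[] args) {
        String cleaned = collapse(trimBoundary("__look__up_"));
        System.out.println(cleaned);                                // look_up
        System.out.println(String.join(" ", splitParts(cleaned))); // look up
    }
}
```

Using precompiled Pattern constants, as the interface does, avoids recompiling the same expression each time a form is normalized or split.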
Method Detail

getID

IMWEDescID getID()
Returns the IMWEDescID associated with this description.

Returns:
the IMWEDescID associated with this description. Never null.
Since:
jMWE 1.0.0

getMarkedContinuous

int getMarkedContinuous()
Returns the number of times this MWE was marked on a continuous run of tokens in the reference concordance. Will always be zero or a positive number.

Returns:
the number of times this MWE was marked on an unbroken run of tokens in the reference concordance.
Since:
jMWE 1.0.0

getMarkedSplit

int getMarkedSplit()
Returns the number of times this MWE was marked on a non-continuous run of tokens in the reference concordance. Will always be zero or a positive number.

Returns:
the number of times this MWE was marked on a non-continuous run of tokens in the reference concordance.
Since:
jMWE 1.0.0

getUnmarkedExact

int getUnmarkedExact()
Returns the number of times the exact surface form of this MWE description occurs in the reference concordance without being marked as an occurrence of the MWE. To be counted as an exact unmarked occurrence, there must be a continuous run of tokens whose forms match, in order, the forms of the parts (ignoring case) of this MWE description. Will always be zero or a positive number.

Returns:
the number of exact unmarked occurrences of this MWE in the reference concordance.
Since:
jMWE 1.0.0
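
The matching rule described above (a continuous run of tokens whose forms match the part forms in order, ignoring case) can be sketched as a simple sliding-window comparison. This is an illustration only: the class ExactRunSketch and the method countExactRuns are hypothetical, tokens and parts are represented as plain strings, and the sketch does not check whether a run was already marked as an MWE, which the real count excludes.

```java
import java.util.Arrays;
import java.util.List;

final class ExactRunSketch {

    // Counts continuous runs of tokens that match the part forms in order,
    // comparing case-insensitively. Marking is ignored in this sketch.
    static int countExactRuns(List<String> tokens, List<String> parts) {
        if (parts.isEmpty()) {
            return 0;
        }
        int count = 0;
        for (int start = 0; start + parts.size() <= tokens.size(); start++) {
            boolean match = true;
            for (int i = 0; i < parts.size(); i++) {
                if (!tokens.get(start + i).equalsIgnoreCase(parts.get(i))) {
                    match = false;
                    break;
                }
            }
            if (match) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "He", "Looked", "up", "the", "word", "and", "looked", "it", "up");
        List<String> parts = Arrays.asList("looked", "up");
        // Only "Looked up" is a continuous run; "looked it up" is split.
        System.out.println(countExactRuns(tokens, parts)); // 1
    }
}
```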

getUnmarkedPattern

int getUnmarkedPattern()
Returns the number of times this MWE description occurs in the reference concordance without being marked as an occurrence of the MWE, and whose form matches a known inflection pattern. To be counted as a pattern-inflected occurrence, there must be a continuous run of tokens whose forms or stems match, in order, the forms of the parts (ignoring case) of this MWE description, and whose inflection pattern matches one of the reference inflection patterns. Will always be zero or a positive number.

Returns:
the number of inflected unmarked occurrences of this MWE in the reference concordance.
Since:
jMWE 1.0.0

getParts

List<? extends IMWEDesc.IPart> getParts()
Returns an unmodifiable list of parts that comprise the MWE.

Returns:
an unmodifiable list of parts that comprise the MWE.
Since:
jMWE 1.0.0

getCounts

int[] getCounts()
Returns an array containing the marked split, marked continuous, unmarked exact, and unmarked pattern occurrences of this MWE in the reference concordance.

Returns:
an array containing the counts relating to the MWE's appearance in the reference concordance.
Since:
jMWE 1.0.0
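
As a rough illustration of working with the returned array, the sketch below unpacks four counts and derives the fraction of occurrences that were marked. The index order (marked split, marked continuous, unmarked exact, unmarked pattern) is assumed from the order in which the description lists the counts and should be checked against the implementation; the class CountsSketch and its methods are hypothetical and operate on a plain int[] rather than on an IMWEDesc instance.

```java
final class CountsSketch {

    // Index positions assumed from the order given in the method description.
    static final int MARKED_SPLIT = 0;
    static final int MARKED_CONTINUOUS = 1;
    static final int UNMARKED_EXACT = 2;
    static final int UNMARKED_PATTERN = 3;

    // Total number of occurrences that were marked as the MWE.
    static int totalMarked(int[] counts) {
        return counts[MARKED_SPLIT] + counts[MARKED_CONTINUOUS];
    }

    // Total number of occurrences that were not marked as the MWE.
    static int totalUnmarked(int[] counts) {
        return counts[UNMARKED_EXACT] + counts[UNMARKED_PATTERN];
    }

    // Fraction of all occurrences that were marked; 0.0 if there are none.
    static double markedFraction(int[] counts) {
        int total = totalMarked(counts) + totalUnmarked(counts);
        return total == 0 ? 0.0 : (double) totalMarked(counts) / total;
    }

    public static void main(String[] args) {
        // Stand-in for the result of desc.getCounts().
        int[] counts = {2, 6, 1, 1};
        System.out.println(totalMarked(counts));    // 8
        System.out.println(markedFraction(counts)); // 0.8
    }
}
```

When only one of the values is needed, the individual accessors (getMarkedSplit(), getMarkedContinuous(), getUnmarkedExact(), getUnmarkedPattern()) avoid any dependence on array order.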


Copyright © 2011 Massachusetts Institute of Technology. All Rights Reserved.