jMWE 1.0.2
(MIT jMWE Library)

jMWE is a Java library for constructing and testing Multi-Word Expression detectors.

Packages 
Package Description
edu.mit.jmwe.data
Provides the basic data structures used by the library and their default implementations.
edu.mit.jmwe.data.concordance
Provides interfaces and classes for accessing data taken from Semcor-formatted concordances, useful for benchmarking detectors.
edu.mit.jmwe.detect
Provides the MWE detector API, a baseline detector, and numerous other detector implementations.
edu.mit.jmwe.detect.score
Provides various scoring mechanisms that can be used by subclasses of the FilterByScore and ResolveByScore detectors.
edu.mit.jmwe.harness
Provides the test harness infrastructure.
edu.mit.jmwe.harness.result
Provides objects that encapsulate the results of a test harness run.
edu.mit.jmwe.harness.result.error
Provides error detectors to evaluate the results of a test harness run.
edu.mit.jmwe.index
Provides the MWE index interfaces and default implementations, which allow one to look up an MWE given one of its parts.
edu.mit.jmwe.util
Provides utility classes used by many classes across the library.
jMWE is a Java library for constructing and testing Multi-Word Expression detectors. A Multi-Word Expression (MWE) is a group of words that (1) occurs together more often than would be expected by pure chance, and (2) is arbitrarily restricted in its syntactic or semantic flexibility. Examples of common MWEs are compound nouns such as world record and verb-particle constructions such as look up, both of which appear in the sentence:
  1. She looked up the world record.
The library has three main facilities: (1) a detector API, (2) an MWE index facility, and (3) a test harness. The detector API defines a detector interface that provides a single method for detecting MWE tokens in a list of individual tokens; anyone who wants to use jMWE's testing infrastructure or write their own MWE token detection algorithm need only implement this interface. jMWE provides several baseline MWE token detection strategies. Also provided are detector filters and resolvers, which apply a specific constraint to, or resolve conflicts in, the output of another detector. The MWE index provides classes for constructing, storing, and accessing indices of valid MWE types. An MWE index allows an algorithm to retrieve a list of MWE types given a single word token and part of speech. The index also lists how frequently, in a particular concordance, a set of tokens appears as a particular MWE type rather than as independent words. The test harness allows one to run an MWE detector over a given corpus and measure its precision and recall. The library has no GUI elements.
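The following is a minimal, self-contained sketch of how the three facilities fit together. It does not use the jMWE API: every name in it (MweSketch, Detector, Index, ConsecutiveDetector, precisionRecall) is hypothetical and chosen only for illustration. It also assumes that tokens are already lemmatized, that an MWE type is written with underscores between its parts, and it ignores part of speech and frequency information.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class MweSketch {

    /** Hypothetical detector contract: a single method, tokens in, MWEs out. */
    interface Detector {
        List<String> detect(List<String> tokens);
    }

    /** Hypothetical index: maps a single word to the MWE types that contain it. */
    static class Index {
        private final Map<String, Set<String>> byPart = new HashMap<>();

        void add(String mwe) {
            for (String part : mwe.split("_")) {
                // TreeSet keeps lookups in a stable, sorted order.
                byPart.computeIfAbsent(part, k -> new TreeSet<>()).add(mwe);
            }
        }

        Set<String> lookup(String word) {
            return byPart.getOrDefault(word, Set.of());
        }
    }

    /** Simple baseline: report an MWE when all of its parts appear consecutively. */
    static class ConsecutiveDetector implements Detector {
        private final Index index;

        ConsecutiveDetector(Index index) {
            this.index = index;
        }

        @Override
        public List<String> detect(List<String> tokens) {
            List<String> found = new ArrayList<>();
            for (int i = 0; i < tokens.size(); i++) {
                for (String mwe : index.lookup(tokens.get(i))) {
                    List<String> parts = List.of(mwe.split("_"));
                    int end = i + parts.size();
                    // Only matches when token i is the first part of the MWE.
                    if (end <= tokens.size() && tokens.subList(i, end).equals(parts)) {
                        found.add(mwe);
                    }
                }
            }
            return found;
        }
    }

    /** Test-harness step: returns {precision, recall} of found against gold. */
    static double[] precisionRecall(List<String> found, List<String> gold) {
        Set<String> f = new HashSet<>(found);
        Set<String> g = new HashSet<>(gold);
        Set<String> correct = new HashSet<>(f);
        correct.retainAll(g);
        double precision = f.isEmpty() ? 0.0 : correct.size() / (double) f.size();
        double recall = g.isEmpty() ? 0.0 : correct.size() / (double) g.size();
        return new double[] {precision, recall};
    }

    public static void main(String[] args) {
        Index index = new Index();
        index.add("look_up");
        index.add("world_record");

        // Lemmatized form of "She looked up the world record."
        List<String> tokens = List.of("she", "look", "up", "the", "world", "record");
        List<String> found = new ConsecutiveDetector(index).detect(tokens);

        double[] pr = precisionRecall(found, List.of("look_up", "world_record"));
        System.out.println(found + " precision=" + pr[0] + " recall=" + pr[1]);
    }
}
```

Because the detector is just an implementation of a one-method interface, a filter or resolver in this sketch would be another Detector that wraps an existing one and post-processes its output.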

jMWE is free to use for all purposes, as long as proper acknowledgment is made. Details can be found in the license, which is included with the distribution.

Copyright © 2011 Massachusetts Institute of Technology. All Rights Reserved.