edu.mit.jwi.morph
Class WordnetStemmer
java.lang.Object
edu.mit.jwi.morph.SimpleStemmer
edu.mit.jwi.morph.WordnetStemmer
- All Implemented Interfaces:
- IStemmer
public class WordnetStemmer
- extends SimpleStemmer
This stemmer adds functionality to the simple pattern-based stemmer
SimpleStemmer by checking whether candidate stems are actually contained
in Wordnet. If any candidate stems are found in Wordnet, only those stems
are returned. If none are found, the word is considered unknown, and the
result is the same as that returned by the SimpleStemmer class.
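The filter-then-fall-back behavior described above can be pictured with a small, self-contained sketch. It does not use JWI and is not the library's implementation: the class StemFilterSketch, the method filterStems, and the plain Set<String> standing in for the Wordnet dictionary are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not part of JWI.
public class StemFilterSketch {

    /**
     * Keeps only the candidate stems that occur in the lexicon. If none
     * occur, the word is treated as unknown and every candidate is
     * returned unchanged, mirroring the fallback described above.
     */
    static List<String> filterStems(List<String> candidates, Set<String> lexicon) {
        List<String> known = new ArrayList<>();
        for (String candidate : candidates) {
            if (lexicon.contains(candidate)) {
                known.add(candidate);
            }
        }
        return known.isEmpty() ? new ArrayList<>(candidates) : known;
    }

    public static void main(String[] args) {
        // Assumption: a tiny word set stands in for the Wordnet dictionary.
        Set<String> lexicon = new HashSet<>(Arrays.asList("box", "dog"));

        // One candidate is in the lexicon, so only it survives.
        System.out.println(filterStems(Arrays.asList("boxe", "box"), lexicon)); // prints [box]

        // No candidate is in the lexicon, so all candidates are returned.
        System.out.println(filterStems(Arrays.asList("blorg", "blorge"), lexicon)); // prints [blorg, blorge]
    }
}
```

In this sketch the candidate list plays the role of the pattern-based output of SimpleStemmer, and the lexicon lookup plays the role of the Wordnet check.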
- Since:
- JWI 1.0
- Version:
- 2.4.0
- Author:
- Mark A. Finlayson
Fields inherited from class edu.mit.jwi.morph.SimpleStemmer
ENDING_ch, ENDING_e, ENDING_man, ENDING_null, ENDING_s, ENDING_sh, ENDING_x, ENDING_y, ENDING_z, ruleMap, SUFFIX_ches, SUFFIX_ed, SUFFIX_er, SUFFIX_es, SUFFIX_est, SUFFIX_ful, SUFFIX_ies, SUFFIX_ing, SUFFIX_men, SUFFIX_s, SUFFIX_ses, SUFFIX_shes, SUFFIX_ss, SUFFIX_xes, SUFFIX_zes, underscore
Constructor Summary
WordnetStemmer(IDictionary dict)
Constructs a WordnetStemmer that, naturally, requires a Wordnet
dictionary.
Method Summary
java.util.List<java.lang.String> findStems(java.lang.String word, POS pos)
Takes the surface form of a word, as it appears in the text, and the
assigned Wordnet part of speech.
IDictionary getDictionary()
Returns the dictionary in use by the stemmer; will not return null.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
WordnetStemmer
public WordnetStemmer(IDictionary dict)
- Constructs a WordnetStemmer that, naturally, requires a Wordnet
dictionary.
- Parameters:
dict
- the dictionary to use; may not be null
- Throws:
java.lang.NullPointerException
- if the specified dictionary is null
- Since:
- JWI 1.0
getDictionary
public IDictionary getDictionary()
- Returns the dictionary in use by the stemmer; will not return null.
- Returns:
- the dictionary in use by this stemmer
- Since:
- JWI 2.2.0
findStems
public java.util.List<java.lang.String> findStems(java.lang.String word,
POS pos)
- Description copied from interface: IStemmer
- Takes the surface form of a word, as it appears in the text, and the
assigned Wordnet part of speech. The surface form may or may not contain
whitespace or underscores, and may be in mixed case. The part of speech
may be null, which means that all parts of speech should be
considered. Returns a list of stems, in preferred order. No stem should
be repeated in the list. If no stems are found, this call returns an
empty list. It will never return null.
- Specified by:
findStems in interface IStemmer
- Overrides:
findStems in class SimpleStemmer
- Parameters:
word
- the surface form for which to find the stem
pos
- the part of speech to find stems for; if null, find stems for all
parts of speech
- Returns:
- the list of stems found for the surface form and part of speech
combination; never null
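The contract stated for findStems (mixed-case input that may contain whitespace or underscores, a null part of speech meaning all parts of speech, an ordered result with no repeats, and an empty list rather than null) can be illustrated with a self-contained sketch. Everything here is an assumption made for illustration: the Pos enum, normalize, toyRule, and this findStems are hypothetical stand-ins, and the normalization shown (trim, lowercase, whitespace runs to underscores) is only one plausible reading of the contract, not JWI's code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.BiFunction;

// Hypothetical illustration only; not part of JWI.
public class FindStemsContractSketch {

    // Stand-in for the Wordnet part-of-speech type.
    enum Pos { NOUN, VERB, ADJECTIVE, ADVERB }

    // Assumed normalization: trim, lowercase, whitespace runs become "_".
    static String normalize(String word) {
        return word.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", "_");
    }

    // Toy rule: strip a trailing "s" for nouns and verbs, otherwise no stems.
    static List<String> toyRule(String word, Pos pos) {
        boolean inflects = (pos == Pos.NOUN || pos == Pos.VERB);
        if (inflects && word.length() > 1 && word.endsWith("s")) {
            return Collections.singletonList(word.substring(0, word.length() - 1));
        }
        return Collections.emptyList();
    }

    /**
     * Collects stems for one part of speech, or for all of them when pos
     * is null. A LinkedHashSet keeps first-seen order and drops repeats;
     * the result is an empty list, never null, when nothing is found.
     */
    static List<String> findStems(String word, Pos pos,
            BiFunction<String, Pos, List<String>> rule) {
        String normalized = normalize(word);
        List<Pos> targets = (pos == null)
                ? Arrays.asList(Pos.values())
                : Collections.singletonList(pos);
        Set<String> unique = new LinkedHashSet<>();
        for (Pos target : targets) {
            unique.addAll(rule.apply(normalized, target));
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        // Null part of speech: the noun and verb rules both yield "hot_dog",
        // which appears only once in the result.
        System.out.println(findStems("Hot  Dogs", null,
                FindStemsContractSketch::toyRule)); // prints [hot_dog]

        // No stems found: an empty list, not null.
        System.out.println(findStems("quickly", Pos.NOUN,
                FindStemsContractSketch::toyRule)); // prints []
    }
}
```

A caller written against this contract can iterate over the returned list directly, with no null check, and can treat the first element as the preferred stem when the list is non-empty.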
Copyright © 2007-2013 Massachusetts Institute of Technology. All Rights Reserved.