nltk.stem package

Submodules

nltk.stem.api module
- StemmerI
  - StemmerI.stem()
nltk.stem.arlstem module
- ARLSTem
nltk.stem.arlstem2 module
- ARLSTem2
nltk.stem.cistem module
- Cistem
nltk.stem.isri module
- ISRIStemmer
nltk.stem.lancaster module
- LancasterStemmer
nltk.stem.porter module
- PorterStemmer
- demo()
nltk.stem.regexp module
- RegexpStemmer
  - RegexpStemmer.__init__()
  - RegexpStemmer.stem()
nltk.stem.rslp module
- RSLPStemmer
nltk.stem.snowball module
nltk.stem.util module
- prefix_replace()
- suffix_replace()
nltk.stem.wordnet module
- WordNetLemmatizer
  - WordNetLemmatizer.lemmatize()
  - WordNetLemmatizer.morphy()

Module contents

NLTK Stemmers

Interfaces used to remove morphological affixes from words, leaving only the word stem. Stemming algorithms aim to remove the affixes that mark, e.g., grammatical role, tense, or derivational morphology, so that only the stem of the word remains. This is a difficult problem because of irregular words (e.g., common verbs in English), complicated morphological rules, and part-of-speech and sense ambiguities (e.g., ceil- is not the stem of ceiling).
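To make these difficulties concrete, here is a minimal sketch of a naive suffix stripper in plain Python. It is not one of NLTK's algorithms: the function name `naive_stem`, the suffix list, and the `min_stem_len` parameter are all hypothetical and chosen only for illustration.

```python
# Hypothetical suffix list, chosen only for illustration (not taken from NLTK).
SUFFIXES = ("ing", "ed", "s")


def naive_stem(word, min_stem_len=3):
    """Strip the first matching suffix, keeping at least min_stem_len characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


# Regular inflections collapse to one stem, but "ceiling" is wrongly cut
# to "ceil" and the irregular verb form "went" is left untouched.
for word in ("walking", "walked", "walks", "ceiling", "went"):
    print(word, "->", naive_stem(word))
```

The regular forms of "walk" reduce to a common stem, while "ceiling" and "went" show the sense-ambiguity and irregular-verb cases that blind suffix removal cannot handle.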

StemmerI defines a standard interface for stemmers.
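
As a rough illustration of what a shared interface buys, the sketch below subclasses StemmerI and implements its single `stem()` method. It assumes that `stem()` takes one token and returns a string. `PluralStemmer` and `stem_all` are hypothetical names, and the fallback class is only a stand-in so the example still runs where NLTK is not installed.

```python
from abc import ABC, abstractmethod

try:
    # The real interface, if NLTK is available.
    from nltk.stem.api import StemmerI
except ImportError:
    # Stand-in with the same single-method shape (an assumption for this sketch).
    class StemmerI(ABC):
        @abstractmethod
        def stem(self, token):
            """Return the stem of token."""


class PluralStemmer(StemmerI):
    """Hypothetical toy stemmer: drops a final 's' from simple plurals."""

    def stem(self, token):
        if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
            return token[:-1]
        return token


def stem_all(stemmer, tokens):
    """Works with any object that provides stem(), whatever the algorithm."""
    return [stemmer.stem(token) for token in tokens]


print(stem_all(PluralStemmer(), ["cats", "glass", "dog"]))
```

Because `stem_all` relies only on the `stem()` method, any other stemmer that follows the same interface could be passed in without changing the calling code.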