nltk.text.ContextIndex

class nltk.text.ContextIndex[source]

Bases: object

A bidirectional index between words and their ‘contexts’ in a text. The context of a word is usually defined as the words that occur in a fixed window around it, but other definitions can be used by providing a custom context function (the context_func argument).
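The idea of a bidirectional index can be illustrated with a minimal stdlib-only sketch. This is not NLTK's implementation: the names window_context and build_index, and the "*START*"/"*END*" boundary markers, are assumptions made for illustration. The context function here uses a window of one word on each side.

```python
from collections import defaultdict


def window_context(tokens, i):
    """Hypothetical context function: the lowercased neighbours of tokens[i]."""
    left = tokens[i - 1].lower() if i > 0 else "*START*"
    right = tokens[i + 1].lower() if i < len(tokens) - 1 else "*END*"
    return (left, right)


def build_index(tokens, context_func=window_context):
    """Build both directions of the index: word -> contexts, context -> words."""
    word_to_contexts = defaultdict(list)
    context_to_words = defaultdict(list)
    for i, word in enumerate(tokens):
        context = context_func(tokens, i)
        word_to_contexts[word].append(context)
        context_to_words[context].append(word)
    return word_to_contexts, context_to_words


tokens = "the cat sat on the mat".split()
word_to_contexts, context_to_words = build_index(tokens)
print(word_to_contexts["the"])
print(context_to_words[("the", "sat")])
```

Passing a different context_func (for example, one that returns only the preceding word) changes what counts as a context without touching the indexing loop.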

__init__(tokens, context_func=None, filter=None, key=<function ContextIndex.<lambda>>)[source]

tokens()[source]

Return type

list(str)

Returns

The document that this context index was created from.

word_similarity_dict(word)[source]

Return a dictionary mapping each word in the index to a ‘similarity score’ that indicates how often that word and the given word occur in the same contexts.
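One plausible way to compute such scores is a set-overlap measure over the contexts of each pair of words. The sketch below is an assumption for illustration, not a statement of how this method scores words: the helper names are hypothetical, contexts are one-word windows, and the score is the harmonic mean of precision and recall between two context sets.

```python
from collections import defaultdict


def contexts_by_word(tokens):
    """Map each word to the set of (left, right) neighbour pairs it occurs in."""
    padded = ["*START*"] + list(tokens) + ["*END*"]
    contexts = defaultdict(set)
    for i in range(1, len(padded) - 1):
        contexts[padded[i]].add((padded[i - 1], padded[i + 1]))
    return contexts


def similarity_scores(tokens, word):
    """Score every indexed word by the overlap of its contexts with word's."""
    contexts = contexts_by_word(tokens)
    target = contexts.get(word, set())
    scores = {}
    for other, other_contexts in contexts.items():
        shared = len(target & other_contexts)
        # Harmonic mean of precision and recall over the two context sets.
        scores[other] = 2 * shared / (len(target) + len(other_contexts))
    return scores


tokens = "the cat sat . the dog sat . a cat ran".split()
scores = similarity_scores(tokens, "cat")
print(sorted(scores.items(), key=lambda item: -item[1])[:3])
```

Here "dog" scores above "sat" because it shares the context ("the", "sat") with "cat", while "sat" shares none.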

similar_words(word, n=20)[source]

common_contexts(words, fail_on_unknown=False)[source]

Find the contexts in which all of the specified words appear, and return a frequency distribution mapping each such context to the number of times it was used.

Parameters
  • words (list(str)) – The words used to seed the similarity search

  • fail_on_unknown – If true, raise a ValueError if any of the given words does not occur in the index.
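The behaviour described above can be sketched in plain Python. This is an illustrative stand-in, not the library's code: common_contexts_sketch is a hypothetical name, contexts are assumed to be one-word windows, and collections.Counter stands in for the frequency distribution.

```python
from collections import Counter, defaultdict


def common_contexts_sketch(tokens, words, fail_on_unknown=False):
    """Count the contexts that every word in words has been seen in."""
    padded = ["*START*"] + list(tokens) + ["*END*"]
    word_to_contexts = defaultdict(list)
    for i in range(1, len(padded) - 1):
        word_to_contexts[padded[i]].append((padded[i - 1], padded[i + 1]))

    unknown = [w for w in words if w not in word_to_contexts]
    if unknown and fail_on_unknown:
        raise ValueError("words not found in the index: " + " ".join(unknown))

    # Contexts shared by all words; an unknown word contributes an empty set.
    context_sets = [set(word_to_contexts.get(w, ())) for w in words]
    shared = set.intersection(*context_sets) if context_sets else set()

    # Count every occurrence of a shared context across the given words.
    return Counter(
        c for w in words for c in word_to_contexts.get(w, ()) if c in shared
    )


tokens = "the cat sat . the dog sat . a cat ran".split()
print(common_contexts_sketch(tokens, ["cat", "dog"]))
```

With fail_on_unknown left at False, a word that never occurs simply yields an empty result; setting it to True turns that case into an error.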