nltk.model package

Submodules

nltk.model.api module

class nltk.model.api.ModelI

Bases: builtins.object

A processing interface for assigning a probability to the next word.

choose_random_word(context): Randomly select a word that is likely to appear in this context.

entropy(text): Evaluate the total entropy of a message with respect to the model. This is the sum of the negative log probabilities of the words in the message.

generate(n): Generate n words of text from the language model.

logprob(word, context): Evaluate the (negative) log probability of this word in this context.

prob(word, context): Evaluate the probability of this word in this context.
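To make the interface concrete, here is a minimal standalone sketch of a toy unigram model that exposes the same five method names. The class `UnigramModel` and its `seed` argument are hypothetical and introduced only for illustration; the sketch does not import `nltk.model`, and base-2 logarithms are an assumption.

```python
import math
import random
from collections import Counter


class UnigramModel:
    """Toy model with the same method names as ModelI (hypothetical class)."""

    def __init__(self, train, seed=0):
        self._counts = Counter(train)
        self._total = sum(self._counts.values())
        self._rng = random.Random(seed)

    def prob(self, word, context=()):
        # A unigram model ignores the context entirely.
        return self._counts[word] / self._total

    def logprob(self, word, context=()):
        # Negative base-2 log probability; raises ValueError for unseen words.
        return -math.log2(self.prob(word, context))

    def choose_random_word(self, context=()):
        # Sample a word in proportion to its training count.
        words = list(self._counts)
        weights = [self._counts[w] for w in words]
        return self._rng.choices(words, weights=weights, k=1)[0]

    def generate(self, n):
        return [self.choose_random_word() for _ in range(n)]

    def entropy(self, text):
        # Total entropy: sum of the negative log probabilities.
        return sum(self.logprob(w) for w in text)


model = UnigramModel("a b a c a b".split())
print(model.prob("a"))            # 0.5
print(model.logprob("a"))         # 1.0
print(model.entropy(["a", "a"]))  # 2.0
print(model.generate(3))
```

A real model would condition on `context` and smooth its estimates so that unseen words do not get zero probability.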

nltk.model.ngram module

class nltk.model.ngram.NgramModel(n, train, pad_left=True, pad_right=False, estimator=None, *estimator_args, **estimator_kwargs)

Bases: nltk.model.api.ModelI

A processing interface for assigning a probability to the next word.

backoff

choose_random_word(context)

Randomly select a word that is likely to appear in this context.

Parameters:	context (list(str)) – the context the word is in

entropy(text)

Calculate the approximate cross-entropy of the n-gram model for a given evaluation text. This is the average negative log probability of each word in the text.

Parameters:	text (list(str)) – words to use for evaluation
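The cross-entropy calculation can be sketched as a standalone function. This is an illustration under stated assumptions, not the NgramModel code: `cross_entropy` and the `prob(word, context)` callable are hypothetical names, logarithms are base 2, and the first n - 1 words serve only as context.

```python
import math


def cross_entropy(text, prob, n=2):
    """Average negative log2 probability per predicted word.

    `prob(word, context)` is a hypothetical callable that returns a
    non-zero probability; `n` is the n-gram order.
    """
    total = 0.0
    predicted = 0
    for i in range(n - 1, len(text)):
        # The context is the n - 1 words preceding position i.
        context = tuple(text[i - n + 1:i])
        total += -math.log2(prob(text[i], context))
        predicted += 1
    if predicted == 0:
        raise ValueError("text is shorter than the n-gram order")
    return total / predicted


def uniform(word, context):
    # Hypothetical model: four equally likely words.
    return 0.25


print(cross_entropy(["a", "b", "c", "d", "a"], uniform))  # 2.0
```

A uniform model over four words costs exactly two bits per word, which is the value printed.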

generate(num_words, context=())

Generate random text based on the language model.

Parameters:	num_words (int) – number of words to generate
	context (list(str)) – initial words in generated string
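The generation loop can be sketched roughly as follows: start from the initial context, then repeatedly pick a next word from the last n - 1 words. The names `generate`, `choose_word` and `followers` are hypothetical, and the toy follower table stands in for a trained model.

```python
import random


def generate(num_words, choose_word, n=2, context=()):
    """Return the initial context followed by num_words sampled words."""
    text = list(context)
    for _ in range(num_words):
        # Only the last n - 1 words are visible to an n-gram model.
        history = tuple(text[-(n - 1):]) if n > 1 else ()
        text.append(choose_word(history))
    return text


rng = random.Random(0)
# Hypothetical bigram table: each word maps to its possible successors.
followers = {"the": ["cat", "dog"], "cat": ["sat"], "dog": ["sat"], "sat": ["the"]}


def choose_word(history):
    last = history[-1] if history else "the"
    return rng.choice(followers[last])


print(generate(4, choose_word, context=("the",)))
```

In this sketch the returned list includes the initial context, so its length is `len(context) + num_words`.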

logprob(word, context)

Evaluate the (negative) log probability of this word in this context.

Parameters:	word (str) – the word to get the probability of
	context (list(str)) – the context the word is in

ngrams

perplexity(text)

Calculate the perplexity of the given text. This is 2 ** cross-entropy for the text.

Parameters:	text (list(str)) – words to calculate perplexity of
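As a worked example of the relationship between perplexity and cross-entropy, the sketch below computes both from a list of per-word probabilities. The function name `perplexity` and its `word_probs` argument are hypothetical, and base-2 logarithms are assumed to match the `2 **` in the definition.

```python
import math


def perplexity(word_probs):
    """Perplexity from the probability a model assigned to each word."""
    # Cross-entropy in bits: average negative log2 probability.
    entropy = -sum(math.log2(p) for p in word_probs) / len(word_probs)
    return 2.0 ** entropy


print(perplexity([0.25, 0.25, 0.25, 0.25]))  # 4.0
print(perplexity([0.5, 0.125]))              # 4.0
```

Both inputs average two bits per word, so both give a perplexity of 4: the model is as uncertain as if it were choosing uniformly among four words.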

prob(word, context)

Evaluate the probability of this word in this context using Katz backoff.

Parameters:	word (str) – the word to get the probability of
	context (list(str)) – the context the word is in
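The idea behind backoff can be illustrated with a simplified bigram-to-unigram sketch. It is not the NgramModel implementation: the class `BackoffBigram` is hypothetical, and a fixed absolute discount is swapped in for the Good-Turing discounting that Katz backoff normally uses. Seen bigrams give up a little probability mass, which is redistributed over unseen words in proportion to their unigram probabilities.

```python
from collections import Counter, defaultdict


class BackoffBigram:
    """Bigram model that backs off to unigrams (hypothetical sketch)."""

    def __init__(self, train, discount=0.5):
        self._discount = discount
        self._unigrams = Counter(train)
        self._total = len(train)
        self._bigrams = defaultdict(Counter)
        for prev, word in zip(train, train[1:]):
            self._bigrams[prev][word] += 1

    def _unigram_prob(self, word):
        return self._unigrams[word] / self._total

    def prob(self, word, context):
        prev = context[-1] if context else None
        followers = self._bigrams.get(prev)
        if not followers:
            # Unseen context: use the lower-order model directly.
            return self._unigram_prob(word)
        context_count = sum(followers.values())
        if word in followers:
            # Seen bigram: discounted relative frequency.
            return (followers[word] - self._discount) / context_count
        # Mass freed by discounting, shared among unseen words.
        alpha = self._discount * len(followers) / context_count
        unseen_mass = 1.0 - sum(self._unigram_prob(w) for w in followers)
        if unseen_mass <= 0.0:
            return 0.0
        return alpha * self._unigram_prob(word) / unseen_mass


model = BackoffBigram("a b a c a b".split())
print(model.prob("b", ["a"]))  # 0.5
print(model.prob("a", ["a"]))  # unseen bigram, estimated via backoff
```

With this training text the probabilities of "a", "b" and "c" after any seen context sum to 1, which is the property the backoff weight `alpha` is there to preserve.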

probdist

unicode_repr()

nltk.model.ngram.teardown_module(module=None)
