nltk.classify.api module

Interfaces for labeling tokens with category labels (or “class labels”).

ClassifierI is a standard interface for “single-category classification”, in which the set of categories is known, the number of categories is finite, and each text belongs to exactly one category.

MultiClassifierI is a standard interface for “multi-category classification”, which is like single-category classification except that each text belongs to zero or more categories.

class nltk.classify.api.ClassifierI

Bases: object

A processing interface for labeling tokens with a single category label (or “class”). Labels are typically strs or ints, but can be any immutable type. The set of labels that the classifier chooses from must be fixed and finite.

Subclasses must define:
  • labels()
  • either classify() or classify_many() (or both)

Subclasses may define:
  • either prob_classify() or prob_classify_many() (or both)
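As a minimal sketch of the contract above, the hypothetical LengthClassifier below defines only labels() and classify() and relies on the inherited classify_many(). The class name, the "length" feature, and the threshold are illustrative assumptions, not part of NLTK. The fallback base class exists only so the sketch still runs when NLTK is not installed; it mirrors the documented default of classify_many() and nothing else.

```python
try:
    from nltk.classify.api import ClassifierI
except ImportError:
    # Stand-in used only when NLTK is missing; it mirrors the documented
    # default behaviour of classify_many().
    class ClassifierI:
        def classify_many(self, featuresets):
            return [self.classify(fs) for fs in featuresets]


class LengthClassifier(ClassifierI):
    """Hypothetical classifier: labels a word 'long' or 'short'."""

    def labels(self):
        # The label set is fixed and finite, as the interface requires.
        return ["long", "short"]

    def classify(self, featureset):
        # A featureset is assumed here to be a dict of feature name -> value.
        return "long" if featureset.get("length", 0) > 5 else "short"


clf = LengthClassifier()
print(clf.classify({"length": 8}))                         # long
print(clf.classify_many([{"length": 2}, {"length": 9}]))   # ['short', 'long']
```

Only classify() is written by hand; the batch method comes from the base class.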

classify(featureset)

Returns: the most appropriate label for the given featureset.

Return type: label

classify_many(featuresets)

Apply self.classify() to each element of featuresets. I.e.:

    return [self.classify(fs) for fs in featuresets]

Return type: list(label)
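Because a subclass may define classify_many() instead of classify(), a classifier that is naturally batch-oriented can implement only the batch method. The sketch below assumes that the NLTK base class derives classify() from an overridden classify_many(); verify that against your installed version. BatchSignClassifier and the "score" feature are hypothetical, and the fallback base class is a stand-in for running without NLTK.

```python
try:
    from nltk.classify.api import ClassifierI
except ImportError:
    # Stand-in used only when NLTK is missing: classify() delegates to
    # classify_many(), which the subclass below provides.
    class ClassifierI:
        def classify(self, featureset):
            return self.classify_many([featureset])[0]


class BatchSignClassifier(ClassifierI):
    """Hypothetical classifier that labels a whole batch in one pass."""

    def labels(self):
        return ["neg", "pos"]

    def classify_many(self, featuresets):
        # A real model might make a single vectorised call here.
        scores = [fs.get("score", 0.0) for fs in featuresets]
        return ["pos" if s >= 0 else "neg" for s in scores]


clf = BatchSignClassifier()
batch = [{"score": 1.5}, {"score": -0.2}]
print(clf.classify_many(batch))        # ['pos', 'neg']
print(clf.classify({"score": -3.0}))   # neg
```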

labels()

Returns: the list of category labels used by this classifier.

Return type: list of labels (any immutable type)

prob_classify(featureset)

Returns: a probability distribution over labels for the given featureset.

Return type: ProbDistI

prob_classify_many(featuresets)

Apply self.prob_classify() to each element of featuresets. I.e.:

    return [self.prob_classify(fs) for fs in featuresets]

Return type: list(ProbDistI)
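The probabilistic methods can be sketched the same way. The example below assumes nltk.probability.DictionaryProbDist as the concrete ProbDistI and derives classify() from the mode of the distribution; that is one possible design, not something the interface prescribes. CoinClassifier, the "biased" feature, and the probabilities are hypothetical, and the fallback classes are stand-ins that cover only the calls used here.

```python
try:
    from nltk.classify.api import ClassifierI
    from nltk.probability import DictionaryProbDist
except ImportError:
    # Stand-ins so the sketch runs without NLTK; they cover only what is used.
    class ClassifierI:
        def prob_classify_many(self, featuresets):
            return [self.prob_classify(fs) for fs in featuresets]

    class DictionaryProbDist:
        def __init__(self, prob_dict):
            self._probs = dict(prob_dict)

        def prob(self, sample):
            return self._probs.get(sample, 0.0)

        def max(self):
            return max(self._probs, key=self._probs.get)


class CoinClassifier(ClassifierI):
    """Hypothetical classifier with hand-written probabilities."""

    def labels(self):
        return ["heads", "tails"]

    def prob_classify(self, featureset):
        p_heads = 0.75 if featureset.get("biased") else 0.5
        return DictionaryProbDist({"heads": p_heads, "tails": 1.0 - p_heads})

    def classify(self, featureset):
        # The single best label is the mode of the distribution.
        return self.prob_classify(featureset).max()


clf = CoinClassifier()
dist = clf.prob_classify({"biased": True})
print(dist.prob("heads"), dist.max())   # 0.75 heads
tails = [d.prob("tails") for d in clf.prob_classify_many([{"biased": True}, {}])]
print(tails)                            # [0.25, 0.5]
```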

class nltk.classify.api.MultiClassifierI

Bases: object

A processing interface for labeling tokens with zero or more category labels. Labels are typically strs or ints, but can be any immutable type. The set of labels that the multi-classifier chooses from must be fixed and finite.

Subclasses must define:
  • labels()
  • either classify() or classify_many() (or both)

Subclasses may define:
  • either prob_classify() or prob_classify_many() (or both)

classify(featureset)

Returns: the most appropriate set of labels for the given featureset.

Return type: set(label)

classify_many(featuresets)

Apply self.classify() to each element of featuresets. I.e.:

    return [self.classify(fs) for fs in featuresets]

Return type: list(set(label))
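A multi-classifier differs from a single-category classifier mainly in its return type: classify() returns a set that may be empty. The following sketch uses a hypothetical TopicTagger whose keyword-to-label table is invented for illustration; the fallback base class is a stand-in for running without NLTK and mirrors only the documented default of classify_many().

```python
try:
    from nltk.classify.api import MultiClassifierI
except ImportError:
    # Stand-in used only when NLTK is missing.
    class MultiClassifierI:
        def classify_many(self, featuresets):
            return [self.classify(fs) for fs in featuresets]


class TopicTagger(MultiClassifierI):
    """Hypothetical multi-classifier: one label per keyword feature present."""

    _KEYWORD_TO_LABEL = {"goal": "sports", "vote": "politics", "stock": "finance"}

    def labels(self):
        return sorted(set(self._KEYWORD_TO_LABEL.values()))

    def classify(self, featureset):
        # Zero, one, or several labels may apply to the same featureset.
        return {
            label
            for keyword, label in self._KEYWORD_TO_LABEL.items()
            if featureset.get(keyword)
        }


tagger = TopicTagger()
print(tagger.classify({"goal": True, "stock": True}))   # the set {'sports', 'finance'}
print(tagger.classify_many([{"vote": True}, {}]))       # [{'politics'}, set()]
```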

labels()

Returns: the list of category labels used by this multi-classifier.

Return type: list of labels (any immutable type)

prob_classify(featureset)

Returns: a probability distribution over sets of labels for the given featureset.

Return type: ProbDistI

prob_classify_many(featuresets)

Apply self.prob_classify() to each element of featuresets. I.e.:

    return [self.prob_classify(fs) for fs in featuresets]

Return type: list(ProbDistI)
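For a multi-classifier, each sample of the returned distribution is an entire set of labels rather than a single label. The sketch below shows what such a distribution might look like, assuming nltk.probability.DictionaryProbDist as the ProbDistI implementation and frozenset for the samples (dictionary keys must be hashable). The labels, the probabilities, and the helper label_marginal() are hypothetical; the fallback class is a stand-in for running without NLTK.

```python
try:
    from nltk.probability import DictionaryProbDist
except ImportError:
    # Stand-in covering only prob() and max(), used when NLTK is missing.
    class DictionaryProbDist:
        def __init__(self, prob_dict):
            self._probs = dict(prob_dict)

        def prob(self, sample):
            return self._probs.get(sample, 0.0)

        def max(self):
            return max(self._probs, key=self._probs.get)


# Each sample is a whole set of labels, including the empty set.
label_sets = [
    frozenset(),
    frozenset({"sports"}),
    frozenset({"sports", "finance"}),
]
dist = DictionaryProbDist(dict(zip(label_sets, [0.25, 0.5, 0.25])))


def label_marginal(dist, label_sets, label):
    """Probability that `label` appears in the predicted set (hypothetical helper)."""
    return sum(dist.prob(s) for s in label_sets if label in s)


print(dist.max())                                   # frozenset({'sports'})
print(label_marginal(dist, label_sets, "sports"))   # 0.75
print(label_marginal(dist, label_sets, "finance"))  # 0.25
```

Summing over the sets that contain a label recovers a per-label probability, which is often easier to threshold than the distribution over sets itself.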