nltk.classify.api module

Interfaces for labeling tokens with category labels (or “class labels”).

ClassifierI is a standard interface for “single-category classification”, in which the set of categories is known, the number of categories is finite, and each text belongs to exactly one category.

MultiClassifierI is a standard interface for “multi-category classification”, which is like single-category classification except that each text belongs to zero or more categories.

class nltk.classify.api.ClassifierI

Bases: object

A processing interface for labeling tokens with a single category label (or “class”). Labels are typically str or int values, but can be of any immutable type. The set of labels that the classifier chooses from must be fixed and finite.

Subclasses must define:
  • labels()
  • either classify() or classify_many() (or both)

Subclasses may define:
  • either prob_classify() or prob_classify_many() (or both)
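
The subclass contract above can be sketched as follows. This is a minimal illustration, not NLTK code: KeywordClassifier, its constructor arguments, and the "contains(...)" feature names are hypothetical, and the class is written as a plain Python class so that the snippet runs without NLTK installed. In practice it would subclass ClassifierI.

```python
class KeywordClassifier:
    """Hypothetical single-category classifier mirroring the ClassifierI contract."""

    def __init__(self, keyword_to_label, default_label):
        self._keyword_to_label = dict(keyword_to_label)
        self._default_label = default_label

    def labels(self):
        # The label set is fixed and finite: every mapped label plus the default.
        return sorted(set(self._keyword_to_label.values()) | {self._default_label})

    def classify(self, featureset):
        # Assumption: a featureset is a dict mapping feature names to values.
        for keyword, label in self._keyword_to_label.items():
            if featureset.get(f"contains({keyword})"):
                return label
        return self._default_label

    def classify_many(self, featuresets):
        # Same behavior as the documented batch method.
        return [self.classify(fs) for fs in featuresets]


clf = KeywordClassifier({"goal": "sports", "vote": "politics"}, default_label="other")
single = clf.classify({"contains(goal)": True})
fallback = clf.classify({})
```

Every featureset receives exactly one label here, including the empty one, which falls back to the default. That is the property that distinguishes single-category from multi-category classification.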

classify(featureset)
Returns

the most appropriate label for the given featureset.

Return type

label

classify_many(featuresets)

Apply self.classify() to each element of featuresets. That is:

    return [self.classify(fs) for fs in featuresets]

Return type

list(label)
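
Because a subclass may define either classify() or classify_many(), each method can be derived from the other. The sketch below shows the reverse direction, where only the batch method carries the logic. LengthClassifier and its "length" feature are hypothetical, and the class is a plain Python stand-in rather than an actual ClassifierI subclass.

```python
class LengthClassifier:
    """Hypothetical classifier whose logic lives in the batch method."""

    def labels(self):
        return ["long", "short"]

    def classify_many(self, featuresets):
        # Batch implementation: one label per featureset, in input order.
        return ["long" if fs["length"] > 5 else "short" for fs in featuresets]

    def classify(self, featureset):
        # Single-item method derived from the batch method.
        return self.classify_many([featureset])[0]


clf = LengthClassifier()
inputs = [{"length": 3}, {"length": 9}]
batch = clf.classify_many(inputs)
one_by_one = [clf.classify(fs) for fs in inputs]
```

Whichever method holds the logic, the two calls should agree: classifying a batch gives the same labels as classifying its elements one at a time.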

labels()
Returns

the list of category labels used by this classifier.

Return type

list of labels (any immutable type)

prob_classify(featureset)
Returns

a probability distribution over labels for the given featureset.

Return type

ProbDistI
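
A probabilistic classifier returns a distribution object instead of a bare label. The sketch below is illustrative only: SimpleDist is a hypothetical stand-in that mirrors just the prob(), max(), and samples() methods declared by ProbDistI, and CoinClassifier with its "bias" feature is invented for the example. Real code would return an NLTK ProbDistI implementation.

```python
class SimpleDist:
    """Hypothetical stand-in for a ProbDistI: a normalized discrete distribution."""

    def __init__(self, weights):
        total = sum(weights.values())
        self._probs = {sample: w / total for sample, w in weights.items()}

    def prob(self, sample):
        # Probability of one sample; unseen samples have probability 0.
        return self._probs.get(sample, 0.0)

    def max(self):
        # The sample with the highest probability.
        return max(self._probs, key=self._probs.get)

    def samples(self):
        return list(self._probs)


class CoinClassifier:
    """Hypothetical classifier that defines prob_classify() and derives classify()."""

    def labels(self):
        return ["heads", "tails"]

    def prob_classify(self, featureset):
        bias = featureset.get("bias", 0.5)
        return SimpleDist({"heads": bias, "tails": 1.0 - bias})

    def classify(self, featureset):
        # The most appropriate label is the mode of the distribution.
        return self.prob_classify(featureset).max()


clf = CoinClassifier()
dist = clf.prob_classify({"bias": 0.75})
best = clf.classify({"bias": 0.75})
```

Deriving classify() from prob_classify() keeps the two methods consistent: the label returned is always the one the distribution ranks highest.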

prob_classify_many(featuresets)

Apply self.prob_classify() to each element of featuresets. That is:

    return [self.prob_classify(fs) for fs in featuresets]

Return type

list(ProbDistI)

class nltk.classify.api.MultiClassifierI

Bases: object

A processing interface for labeling tokens with zero or more category labels. Labels are typically str or int values, but can be of any immutable type. The set of labels that the multi-classifier chooses from must be fixed and finite.

Subclasses must define:
  • labels()
  • either classify() or classify_many() (or both)

Subclasses may define:
  • either prob_classify() or prob_classify_many() (or both)
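
The multi-category contract differs mainly in the return value of classify(), which is a set that may be empty. A minimal sketch, assuming featuresets are dicts: TagClassifier, its rule table, and the feature names are hypothetical, and the class is a plain Python stand-in rather than an actual MultiClassifierI subclass.

```python
class TagClassifier:
    """Hypothetical multi-category classifier mirroring the MultiClassifierI contract."""

    def __init__(self, label_to_feature):
        # Maps each label to the boolean feature that triggers it.
        self._label_to_feature = dict(label_to_feature)

    def labels(self):
        # Fixed, finite label set.
        return sorted(self._label_to_feature)

    def classify(self, featureset):
        # Zero or more labels: every label whose trigger feature is truthy.
        return {
            label
            for label, feature in self._label_to_feature.items()
            if featureset.get(feature)
        }

    def classify_many(self, featuresets):
        return [self.classify(fs) for fs in featuresets]


clf = TagClassifier({"programming": "has_code", "math": "has_equations"})
both = clf.classify({"has_code": True, "has_equations": True})
none = clf.classify({})
```

An empty set is a valid answer here, whereas a single-category classifier must always commit to exactly one label.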

classify(featureset)
Returns

the most appropriate set of labels for the given featureset.

Return type

set(label)

classify_many(featuresets)

Apply self.classify() to each element of featuresets. That is:

    return [self.classify(fs) for fs in featuresets]

Return type

list(set(label))

labels()
Returns

the list of category labels used by this multi-classifier.

Return type

list of labels (any immutable type)

prob_classify(featureset)
Returns

a probability distribution over sets of labels for the given featureset.

Return type

ProbDistI
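
In a distribution over sets of labels, each sample is itself a set, so it must be hashable. One way to picture this is a mapping keyed by frozenset. The sketch below is illustrative and makes a strong assumption that the interface does not require: labels are treated as independent, each with its own probability. The function label_set_distribution is hypothetical, and it returns a plain dict rather than a ProbDistI.

```python
from itertools import combinations


def label_set_distribution(label_probs):
    """Map every subset of labels to its probability.

    Assumption: labels are independent, so the probability of a subset is the
    product of p(label) for included labels and 1 - p(label) for excluded ones.
    """
    labels = sorted(label_probs)
    dist = {}
    for size in range(len(labels) + 1):
        for subset in combinations(labels, size):
            p = 1.0
            for label in labels:
                p *= label_probs[label] if label in subset else 1.0 - label_probs[label]
            # frozenset is hashable, so it can serve as a sample (dict key).
            dist[frozenset(subset)] = p
    return dist


dist = label_set_distribution({"math": 0.5, "code": 0.25})
most_likely = max(dist.values())
```

With n labels there are 2**n possible label sets, so enumerating them all is practical only for small label inventories.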

prob_classify_many(featuresets)

Apply self.prob_classify() to each element of featuresets. That is:

    return [self.prob_classify(fs) for fs in featuresets]

Return type

list(ProbDistI)