# nltk.classify.naivebayes module

A classifier based on the Naive Bayes algorithm. In order to find the probability for a label, this algorithm first uses the Bayes rule to express P(label|features) in terms of P(label) and P(features|label):

                          P(label) * P(features|label)
    P(label|features) = ------------------------------
                                  P(features)

The algorithm then makes the ‘naive’ assumption that all features are independent, given the label:

                          P(label) * P(f1|label) * ... * P(fn|label)
    P(label|features) = --------------------------------------------
                                        P(features)

Rather than computing P(features) explicitly, the algorithm just calculates the numerator for each label, and normalizes them so they sum to one:

                          P(label) * P(f1|label) * ... * P(fn|label)
    P(label|features) = --------------------------------------------
                           SUM[l]( P(l) * P(f1|l) * ... * P(fn|l) )

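The numerator-then-normalize step above can be sketched in plain Python. This is a toy model with made-up probabilities to illustrate the arithmetic, not NLTK's implementation:

```python
# Hypothetical toy model: two labels, and P(f|label) for two features.
p_label = {"pos": 0.5, "neg": 0.5}          # P(label)
p_feat = {                                  # P(f|label)
    ("pos", "f1"): 0.9, ("pos", "f2"): 0.2,
    ("neg", "f1"): 0.3, ("neg", "f2"): 0.7,
}

def posterior(features, p_label, p_feat):
    """Return P(label|features): compute each label's numerator under the
    naive independence assumption, then divide by their sum."""
    numer = {}
    for label, p in p_label.items():
        score = p
        for f in features:
            score *= p_feat[(label, f)]     # naive: features independent given label
        numer[label] = score
    total = sum(numer.values())             # plays the role of P(features)
    return {label: score / total for label, score in numer.items()}

post = posterior(["f1", "f2"], p_label, p_feat)
```

Note that P(features) never has to be computed directly: the sum of the per-label numerators is exactly the denominator SUM[l](...) in the formula above.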
class nltk.classify.naivebayes.NaiveBayesClassifier[source]

Bases: `ClassifierI`

A Naive Bayes classifier. Naive Bayes classifiers are parameterized by two probability distributions:

• P(label) gives the probability that an input will receive each label, given no information about the input’s features.

• P(fname=fval|label) gives the probability that a given feature (fname) will receive a given value (fval), given the label (label).

If the classifier encounters an input with a feature that has never been seen with any label, then rather than assigning a probability of 0 to all labels, it will ignore that feature.

The feature value ‘None’ is reserved for unseen feature values; you generally should not use ‘None’ as a feature value for one of your own features.
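The two behaviors described above, skipping feature values never seen with any label and looking up probabilities per `(label, fname)` pair, can be sketched in plain Python (hypothetical toy distributions, not the library's code):

```python
import math

# Hypothetical model: P(label) and P(fname=fval|label), the latter keyed
# by (label, fname) pairs as in NLTK's feature_probdist.
p_label = {"pos": 0.6, "neg": 0.4}
p_feat = {
    ("pos", "word"): {"great": 0.7, "bad": 0.3},
    ("neg", "word"): {"great": 0.2, "bad": 0.8},
}

def classify(featureset):
    # A feature value never seen with *any* label is ignored rather than
    # zeroing out every label's score.
    usable = {}
    for fname, fval in featureset.items():
        if any(fval in p_feat.get((label, fname), {}) for label in p_label):
            usable[fname] = fval
    # Score each label in log space and pick the best.
    scores = {}
    for label, p in p_label.items():
        logp = math.log(p)
        for fname, fval in usable.items():
            logp += math.log(p_feat[(label, fname)][fval])
        scores[label] = logp
    return max(scores, key=scores.get)
```

With these numbers, `classify({"word": "bad"})` favors "neg", while a completely unseen value like `{"word": "zzz"}` is ignored, so the label prior alone decides and "pos" wins.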

__init__(label_probdist, feature_probdist)[source]
Parameters
• label_probdist – P(label), the probability distribution over labels. It is expressed as a `ProbDistI` whose samples are labels. I.e., P(label) = `label_probdist.prob(label)`.

• feature_probdist – P(fname=fval|label), the probability distribution for feature values, given labels. It is expressed as a dictionary whose keys are `(label, fname)` pairs and whose values are `ProbDistI` objects over feature values. I.e., P(fname=fval|label) = `feature_probdist[label,fname].prob(fval)`. If a given `(label,fname)` is not a key in `feature_probdist`, then it is assumed that the corresponding P(fname=fval|label) is 0 for all values of `fval`.

classify(featureset)[source]
Returns

the most appropriate label for the given featureset.

Return type

label

labels()[source]
Returns

the list of category labels used by this classifier.

Return type

list of (immutable)

most_informative_features(n=100)[source]

Return a list of the ‘most informative’ features used by this classifier. For the purpose of this function, the informativeness of a feature `(fname,fval)` is equal to the highest value of P(fname=fval|label), for any label, divided by the lowest value of P(fname=fval|label), for any label:

    max[ P(fname=fval|label1) / P(fname=fval|label2) ]

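The informativeness ratio can be illustrated in plain Python with a hypothetical two-label model (this is a sketch of the measure, not NLTK's internals):

```python
# Hypothetical P(fname=fval|label) table, keyed by (label, fname).
p_feat = {
    ("pos", "word"): {"great": 0.7, "bad": 0.3},
    ("neg", "word"): {"great": 0.2, "bad": 0.8},
}
labels = ["pos", "neg"]

def informativeness(fname, fval):
    """max over labels of P(fname=fval|label), divided by the min."""
    probs = [p_feat[(label, fname)][fval] for label in labels]
    return max(probs) / min(probs)

# Rank (fname, fval) pairs by informativeness, most informative first.
ranked = sorted(
    [(("word", v), informativeness("word", v)) for v in ("great", "bad")],
    key=lambda item: item[1],
    reverse=True,
)
```

Here "great" (0.7 / 0.2 = 3.5) outranks "bad" (0.8 / 0.3 ≈ 2.67): the more lopsided the conditional probabilities across labels, the more informative the feature.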
prob_classify(featureset)[source]
Returns

a probability distribution over labels for the given featureset.

Return type

ProbDistI

show_most_informative_features(n=10)[source]

Print a table of the `n` most informative features.
classmethod train(labeled_featuresets, estimator=<class 'nltk.probability.ELEProbDist'>)[source]
Parameters

• labeled_featuresets – A list of classified featuresets, i.e., a list of tuples `(featureset, label)`.

• estimator – A function or class that builds a probability distribution from a frequency distribution; it is used to estimate both P(label) and P(fname=fval|label). Defaults to `ELEProbDist` (expected likelihood estimation).
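What `train` does can be sketched in plain Python: count label and feature-value frequencies, then smooth the counts into probabilities. Here the expected-likelihood estimate (add 0.5 to every count, which is what `ELEProbDist` computes) is written out by hand; `train_sketch` is a hypothetical helper, not the library function:

```python
from collections import Counter, defaultdict

def train_sketch(labeled_featuresets):
    """Estimate P(label) and P(fname=fval|label) from (featureset, label) pairs."""
    # P(label): relative frequency of each label.
    label_counts = Counter(label for _, label in labeled_featuresets)
    total = sum(label_counts.values())
    label_probdist = {label: c / total for label, c in label_counts.items()}

    # Count feature values per (label, fname), tracking each fname's value set.
    fval_counts = defaultdict(Counter)
    values = defaultdict(set)
    for featureset, label in labeled_featuresets:
        for fname, fval in featureset.items():
            fval_counts[(label, fname)][fval] += 1
            values[fname].add(fval)

    # ELE-style smoothing: add 0.5 to every count so no value gets prob 0.
    feature_probdist = {}
    for (label, fname), counts in fval_counts.items():
        n = sum(counts.values())
        bins = len(values[fname])
        feature_probdist[(label, fname)] = {
            fval: (counts[fval] + 0.5) / (n + 0.5 * bins)
            for fval in values[fname]
        }
    return label_probdist, feature_probdist

data = [({"word": "great"}, "pos"), ({"word": "bad"}, "neg"),
        ({"word": "great"}, "pos")]
lp, fp = train_sketch(data)
```

The smoothing keeps every seen-with-some-label value at a nonzero probability for every label, which is why the classifier only has to special-case values never seen with any label at all.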

nltk.classify.naivebayes.demo()[source]