A set of functions used to interface with the external megam maxent
optimization package. Before megam can be used, you should tell NLTK where it
can find the megam binary, using the config_megam() function. Typical usage:
>>> from nltk.classify import megam
>>> megam.config_megam() # pass path to megam if not found in PATH
[Found megam: ...]
Use with MaxentClassifier. Example below; see the MaxentClassifier documentation for details.
nltk.classify.MaxentClassifier.train(corpus, 'megam')
Configure NLTK’s interface to the megam maxent optimization package.
bin (str) – The full path to the megam binary. If not specified, then NLTK will search the system for a megam binary; and if one is not found, it will raise a LookupError exception.
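The lookup behaviour described above can be sketched as follows. This is a minimal illustration, not NLTK's own code: find_on_path and try_config_megam are hypothetical helper names, and the sketch assumes that the exception raised when no binary is found is LookupError.

```python
import shutil


def find_on_path(name="megam"):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


def try_config_megam(path=None):
    """Hypothetical wrapper around config_megam().

    Returns True if NLTK located a megam binary and False otherwise.
    Assumes the failure is signalled with LookupError.
    """
    from nltk.classify import megam  # requires NLTK to be installed

    try:
        megam.config_megam(path)  # path=None lets NLTK search the system
    except LookupError:
        return False
    return True


# Check PATH first, without touching NLTK at all.
megam_path = find_on_path()
```

If megam_path is None, passing an explicit full path to try_config_megam() is the remaining option.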
- nltk.classify.megam.write_megam_file(train_toks, encoding, stream, bernoulli=True, explicit=True)
Generate an input file for megam based on the given corpus of classified tokens.
train_toks (list(tuple(dict, str))) – Training data, represented as a list of pairs, the first member of which is a feature dictionary, and the second of which is a classification label.
encoding (MaxentFeatureEncodingI) – A feature encoding, used to convert featuresets into feature vectors. May optionally implement a cost() method in order to assign different costs to different class predictions.
stream (stream) – The stream to which the megam input file should be written.
bernoulli – If true, then use the ‘bernoulli’ format, i.e., all joint features have binary values and are listed iff they are true. Otherwise, list feature values explicitly. If bernoulli=False, then you must call megam with the -fvals option.
explicit – If true, then use the ‘explicit’ format, i.e., for each token, list the features that would fire for any of the possible labels. If explicit=True, then you must call megam with the -explicit option.
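To make the default combination (bernoulli=True, explicit=True) concrete, here is a small standard-library sketch of the line layout. It is not NLTK's implementation: write_explicit_bernoulli is a hypothetical name, the joint-feature ids are made up, and the use of "#" to separate the per-label feature lists is an assumption about megam's explicit format.

```python
import io


def write_explicit_bernoulli(instances, labels, stream):
    """Write one line per instance.

    Each line starts with the index of the correct label. Then, for every
    possible label, it writes ' #' followed by the ids of the joint
    features that are true for that (featureset, label) pair.
    """
    for fired_by_label, gold in instances:
        stream.write(str(labels.index(gold)))
        for label in labels:
            stream.write(" #")
            for fid in fired_by_label[label]:
                stream.write(f" {fid}")
        stream.write("\n")


labels = ["neg", "pos"]
# Illustrative data: for each token, the joint-feature ids that would
# fire under each candidate label, plus the token's actual label.
instances = [
    ({"neg": [0, 4], "pos": [1, 5]}, "pos"),
    ({"neg": [2, 4], "pos": [3, 5]}, "neg"),
]

buf = io.StringIO()
write_explicit_bernoulli(instances, labels, buf)
megam_text = buf.getvalue()
```

In the real function the feature ids come from encoding.encode(featureset, label) rather than from a hand-written dictionary.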
- nltk.classify.megam.parse_megam_weights(s, features_count, explicit=True)
Given the stdout output generated by megam when training a model, return a numpy array containing the corresponding weight vector. This function does not currently handle bias features.
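The parsing step can be sketched without numpy as below. This is a simplified illustration under one assumption: that megam's output in explicit mode has one "feature_id weight" pair per line. parse_weight_lines is a hypothetical name, and it returns a plain list where the real function returns a numpy array.

```python
def parse_weight_lines(text, features_count):
    """Turn 'feature_id weight' lines into a dense weight vector.

    Features that are not mentioned in the output keep a weight of 0.0.
    """
    weights = [0.0] * features_count
    for line in text.strip().split("\n"):
        if line.strip():  # skip blank lines
            fid, weight = line.split()
            weights[int(fid)] = float(weight)
    return weights


# Made-up sample output for a model with four joint features.
sample_stdout = "0 0.5\n2 -1.25\n"
weights = parse_weight_lines(sample_stdout, 4)
```

features_count must be at least one more than the largest feature id in the output, otherwise the assignment raises IndexError.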
Call the megam binary with the given arguments.
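A rough sketch of what calling an external binary with a list of arguments involves is shown below. It is not NLTK's code: run_binary is a hypothetical helper, the error handling is an assumption, and the demo runs the Python interpreter in place of megam so that it works without megam installed.

```python
import subprocess
import sys


def run_binary(binary, args):
    """Run `binary` with `args` (a list of strings) and return its stdout."""
    if isinstance(args, str):
        # A single string would be split into characters by list().
        raise TypeError("args should be a list of strings")
    result = subprocess.run([binary] + list(args), stdout=subprocess.PIPE)
    if result.returncode != 0:
        raise OSError(f"{binary} exited with status {result.returncode}")
    return result.stdout.decode("utf-8")


# Stand-in demo: the Python interpreter plays the role of the binary.
demo_output = run_binary(sys.executable, ["-c", "print('ok')"])
```

With megam configured, the returned text is what parse_megam_weights() expects as its first argument.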