nltk.parse.malt module

class nltk.parse.malt.MaltParser[source]

Bases: ParserI

A class for dependency parsing with MaltParser. All inputs are optional:

  • the path to a MaltParser directory

  • the path to a pre-trained MaltParser .mco model file

  • the tagger to use for POS tagging before parsing

  • additional Java arguments

Example:
>>> from nltk.parse import malt
>>> # With MALT_PARSER and MALT_MODEL environment set.
>>> mp = malt.MaltParser(model_filename='engmalt.linear-1.7.mco') 
>>> mp.parse_one('I shot an elephant in my pajamas .'.split()).tree() 
(shot I (elephant an) (in (pajamas my)) .)
>>> # Without MALT_PARSER and MALT_MODEL environment.
>>> mp = malt.MaltParser('/home/user/maltparser-1.9.2/', '/home/user/engmalt.linear-1.7.mco') 
>>> mp.parse_one('I shot an elephant in my pajamas .'.split()).tree() 
(shot I (elephant an) (in (pajamas my)) .)

__init__(parser_dirname='', model_filename=None, tagger=None, additional_java_args=None)[source]

An interface for parsing with the Malt Parser.

Parameters
  • parser_dirname (str) – The path to the maltparser directory that contains the maltparser-1.x.jar

  • model_filename (str) – The name of the pre-trained model, with the .mco file extension. If provided, training is not required. (See http://www.maltparser.org/mco/mco.html and http://www.patful.com/chalk/node/185)

  • tagger (function) – The tagger used to POS-tag the input before it is converted to CoNLL format. It should behave like nltk.pos_tag.

  • additional_java_args (list) – Additional Java arguments to pass when calling MaltParser, usually heap-size limits, e.g. additional_java_args=['-Xmx1024m'] (see https://goo.gl/mpDBvQ)
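
As a rough illustration of the tagger parameter, the sketch below defines a tagger that takes a list of tokens and returns a list of (word, tag) tuples, the same shape that nltk.pos_tag returns. The names simple_tagger and build_parser and the tiny rule set are hypothetical. build_parser assumes Java, a local MaltParser installation, and a .mco model, so it is defined but not called here.

```python
def simple_tagger(tokens):
    """Toy tagger (hypothetical): maps a token list to (word, tag) pairs."""
    determiners = {"a", "an", "the"}
    tagged = []
    for token in tokens:
        if token.lower() in determiners:
            tagged.append((token, "DT"))
        elif token == ".":
            tagged.append((token, "."))
        else:
            tagged.append((token, "NN"))  # crude fallback tag
    return tagged


def build_parser(parser_dirname, model_filename):
    """Not run here: needs Java, MaltParser, and a .mco model on disk."""
    # Imported lazily so simple_tagger can be tried without NLTK installed.
    from nltk.parse import malt

    return malt.MaltParser(
        parser_dirname,
        model_filename,
        tagger=simple_tagger,
        additional_java_args=["-Xmx1024m"],
    )
```

A tagger this crude would give poor parses with a model trained on Penn Treebank tags; it only shows the call shape the parser expects.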

generate_malt_command(inputfilename, outputfilename=None, mode=None)[source]

Generates the MaltParser command to run at the terminal.

Parameters
  • inputfilename (str) – path to the input file

  • outputfilename (str) – path to the output file
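
To make the shape of such a command concrete, here is a minimal standalone sketch that assembles a java argument list using MaltParser's command-line flags (-c for the model name, -i for input, -o for output, -m for mode). The function sketch_malt_command and its parameters are hypothetical; this is not NLTK's implementation, and the jar and file names are placeholders.

```python
import os
import sys


def sketch_malt_command(jars, model, inputfilename, outputfilename=None,
                        mode="parse", java_args=()):
    """Build an argument list of the kind passed to java for MaltParser."""
    # The classpath separator differs between Windows and POSIX systems.
    separator = ";" if sys.platform.startswith("win") else ":"
    cmd = ["java", *java_args]
    cmd += ["-cp", separator.join(jars)]
    cmd += ["org.maltparser.Malt"]
    cmd += ["-c", os.path.basename(model)]  # model (configuration) name
    cmd += ["-i", inputfilename]
    if mode == "parse":
        cmd += ["-o", outputfilename]  # an output file is only needed when parsing
    cmd += ["-m", mode]
    return cmd


cmd = sketch_malt_command(
    ["maltparser-1.9.2.jar"],
    "engmalt.linear-1.7.mco",
    "in.conll",
    "out.conll",
    mode="parse",
    java_args=["-Xmx1024m"],
)
```

Returning a list rather than a single string lets the command be handed to subprocess.run without shell quoting.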

parse_sents(sentences, verbose=False, top_relation_label='null')[source]

Use MaltParser to parse multiple sentences. Takes a list of sentences, where each sentence is a list of words. Each sentence will be automatically tagged with this MaltParser instance’s tagger.

Parameters

sentences – Input sentences to parse

Returns

iter(DependencyGraph)

parse_tagged_sents(sentences, verbose=False, top_relation_label='null')[source]

Use MaltParser to parse multiple POS tagged sentences. Takes multiple sentences where each sentence is a list of (word, tag) tuples. The sentences must have already been tokenized and tagged.

Parameters

sentences – Input sentences to parse

Returns

iter(iter(DependencyGraph)) the dependency graph representation of each sentence
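
Because the return value is an iterator of iterators, each sentence's result has to be unwrapped before use. The sketch below shows one way to collect the first graph per sentence. first_parses and StubParser are hypothetical names; the stub only mimics the documented return shape so the snippet runs without Java or MaltParser.

```python
def first_parses(parser, tagged_sentences):
    """Return the first graph produced for each tagged sentence."""
    return [next(graphs) for graphs in parser.parse_tagged_sents(tagged_sentences)]


class StubParser:
    """Hypothetical stand-in that mimics the iter(iter(...)) return shape."""

    def parse_tagged_sents(self, sentences, verbose=False, top_relation_label="null"):
        for sentence in sentences:
            # A real MaltParser would yield DependencyGraph objects here;
            # the stub yields the plain word list instead.
            yield iter([[word for word, _tag in sentence]])


parses = first_parses(StubParser(), [[("I", "PRP"), ("ran", "VBD")]])
```

With a real MaltParser instance in place of the stub, each element of the result would be a DependencyGraph, on which .tree() can be called as in the class example.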

train(depgraphs, verbose=False)[source]

Train MaltParser from a list of DependencyGraph objects.

Parameters

depgraphs (list(DependencyGraph)) – DependencyGraph objects to use as training data

train_from_file(conll_file, verbose=False)[source]

Train MaltParser from a file.

Parameters

conll_file (str) – filename of the training input data
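
As an illustration of what a training file might contain, the sketch below writes sentences in the 10-column CoNLL-X layout (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL), with a blank line between sentences. The helper write_conll, the example sentence, and its relation labels are hypothetical, and the final call assumes an already constructed MaltParser instance named mp, so it is left commented out.

```python
import os
import tempfile


def write_conll(rows_per_sentence, path):
    """Write sentences as 10-column CoNLL-X lines, one blank line after each."""
    with open(path, "w", encoding="utf-8") as out:
        for rows in rows_per_sentence:
            for index, (word, tag, head, rel) in enumerate(rows, start=1):
                # Unused columns (lemma, feats, phead, pdeprel) are "_".
                columns = [str(index), word, "_", tag, tag, "_", str(head), rel, "_", "_"]
                out.write("\t".join(columns) + "\n")
            out.write("\n")


# Each row is (word, tag, head index, relation); head 0 marks the root.
sentence = [("John", "NNP", 2, "SUBJ"), ("sleeps", "VBZ", 0, "ROOT"), (".", ".", 2, "P")]
conll_path = os.path.join(tempfile.mkdtemp(), "train.conll")
write_conll([sentence], conll_path)

# mp.train_from_file(conll_path)  # assumes an existing MaltParser instance `mp`
```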

nltk.parse.malt.find_malt_model(model_filename)[source]

A function to find a pre-trained MaltParser model.

nltk.parse.malt.find_maltparser(parser_dirname)[source]

A function to find the MaltParser .jar file and its dependencies.

nltk.parse.malt.malt_regex_tagger()[source]