nltk.parse.malt module

class nltk.parse.malt.MaltParser

Bases: ParserI

A class for dependency parsing with MaltParser. The constructor takes the following inputs, all optional:

- the path to a maltparser directory
- the path to a pre-trained MaltParser .mco model file
- the tagger to use for POS tagging before parsing
- additional Java arguments

Example:

>>> from nltk.parse import malt
>>> # With the MALT_PARSER and MALT_MODEL environment variables set.
>>> mp = malt.MaltParser(model_filename='engmalt.linear-1.7.mco')
>>> mp.parse_one('I shot an elephant in my pajamas .'.split()).tree()
(shot I (elephant an) (in (pajamas my)) .)
>>> # Without the MALT_PARSER and MALT_MODEL environment variables.
>>> mp = malt.MaltParser('/home/user/maltparser-1.9.2/', '/home/user/engmalt.linear-1.7.mco')
>>> mp.parse_one('I shot an elephant in my pajamas .'.split()).tree()
(shot I (elephant an) (in (pajamas my)) .)

__init__(parser_dirname='', model_filename=None, tagger=None, additional_java_args=None)

An interface for parsing with the Malt Parser.

Parameters:

parser_dirname (str) – The path to the maltparser directory that contains the maltparser-1.x.jar
model_filename (str) – The name of the pre-trained model with .mco file extension. If provided, training will not be required. (see http://www.maltparser.org/mco/mco.html and http://www.patful.com/chalk/node/185)
tagger (function) – The tagger used to POS-tag the raw string before formatting to CoNLL format. It should behave like nltk.pos_tag.
additional_java_args (list) – Additional Java arguments to pass when calling MaltParser, usually heap size limits, e.g. additional_java_args=['-Xmx1024m'] (see https://javarevisited.blogspot.com/2011/05/java-heap-space-memory-size-jvm.html)
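
The tagger argument only has to follow the calling convention of nltk.pos_tag: it receives a list of tokens and returns a list of (word, tag) pairs. The following is a minimal sketch of such a callable using only the standard library. The function name simple_regex_tagger and the pattern list are hypothetical, and the patterns are far too few for real use.

```python
import re

# Hypothetical (pattern, tag) pairs, tried in order.
# The final pattern is a catch-all, so every token receives a tag.
_PATTERNS = [
    (re.compile(r"^[.,;:!?]$"), "."),                    # punctuation
    (re.compile(r"^\d+(\.\d+)?$"), "CD"),                # cardinal numbers
    (re.compile(r"^(a|an|the)$", re.IGNORECASE), "DT"),  # articles
    (re.compile(r".*ing$"), "VBG"),                      # gerunds
    (re.compile(r".*ed$"), "VBD"),                       # past tense verbs
    (re.compile(r".*"), "NN"),                           # default: noun
]


def simple_regex_tagger(tokens):
    """Return a list of (word, tag) tuples, like nltk.pos_tag does."""
    tagged = []
    for token in tokens:
        for pattern, tag in _PATTERNS:
            if pattern.match(token):
                tagged.append((token, tag))
                break
    return tagged
```

A function with this shape could then be supplied as the tagger argument when constructing a MaltParser.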

generate_malt_command(inputfilename, outputfilename=None, mode=None)

This function generates the maltparser command used at the terminal.

Parameters:

inputfilename (str) – path to the input file
outputfilename (str) – path to the output file
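
As a rough illustration of what such a command can look like, the sketch below assembles a java invocation as an argument list. It is not NLTK's implementation: the helper name build_malt_command and its parameters are hypothetical, and it assumes MaltParser's command-line flags -c (configuration, i.e. the model), -i (input), -o (output) and -m (mode), together with the org.maltparser.Malt main class.

```python
import os


def build_malt_command(jars, model, inputfilename, outputfilename=None,
                       mode="parse", java_args=None):
    """Assemble a MaltParser command line as a list of arguments (sketch)."""
    cmd = ["java"]
    cmd += list(java_args or [])            # e.g. ["-Xmx1024m"]
    # os.pathsep is ";" on Windows and ":" on Linux/macOS.
    cmd += ["-cp", os.pathsep.join(jars)]
    cmd += ["org.maltparser.Malt"]          # MaltParser's main class
    cmd += ["-c", model, "-i", inputfilename]
    if outputfilename is not None:
        cmd += ["-o", outputfilename]
    cmd += ["-m", mode]                     # assumed modes: "parse" or "learn"
    return cmd
```

Returning a list rather than a single string means the result can be handed to subprocess.run without shell quoting concerns.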

parse_sents(sentences, verbose=False, top_relation_label='null')

Use MaltParser to parse multiple sentences. Takes a list of sentences, where each sentence is a list of words. Each sentence will be automatically tagged with this MaltParser instance’s tagger.

Parameters: sentences – Input sentences to parse
Returns: iter(DependencyGraph)

parse_tagged_sents(sentences, verbose=False, top_relation_label='null')

Use MaltParser to parse multiple POS tagged sentences. Takes multiple sentences where each sentence is a list of (word, tag) tuples. The sentences must have already been tokenized and tagged.

Parameters: sentences – Input sentences to parse
Returns: iter(iter(DependencyGraph)), the dependency graph representation of each sentence
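
Tagged sentences have to be serialized to CoNLL format before MaltParser can read them. The sketch below shows one way to turn a list of (word, tag) tuples into 10-column CoNLL-X lines (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL). The function name tagged_sent_to_conll and the placeholder values used for the columns that are unknown before parsing are assumptions, not necessarily what NLTK writes.

```python
def tagged_sent_to_conll(tagged_sentence):
    """Serialize one tagged sentence to CoNLL-X text (illustrative sketch)."""
    lines = []
    for index, (word, tag) in enumerate(tagged_sentence, start=1):
        columns = [
            str(index),  # ID: 1-based token position
            word,        # FORM
            "_",         # LEMMA: unknown
            tag,         # CPOSTAG: coarse tag, reusing the POS tag here
            tag,         # POSTAG
            "_",         # FEATS: unknown
            "0",         # HEAD: placeholder (assumption)
            "_",         # DEPREL: placeholder (assumption)
            "_",         # PHEAD
            "_",         # PDEPREL
        ]
        lines.append("\t".join(columns))
    # A blank line terminates each sentence in CoNLL files.
    return "\n".join(lines) + "\n\n"
```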

train(depgraphs, verbose=False)

Train MaltParser from a list of DependencyGraph objects.

Parameters: depgraphs (list(DependencyGraph)) – list of DependencyGraph objects for training input data

train_from_file(conll_file, verbose=False)

Train MaltParser from a file.

Parameters: conll_file (str) – the filename of the training input data

nltk.parse.malt.find_malt_model(model_filename)

A function to find a pre-trained MaltParser model.

nltk.parse.malt.find_maltparser(parser_dirname)

A function to find the MaltParser .jar file and its dependencies.

nltk.parse.malt.malt_regex_tagger()
