nltk.parse.earleychart module

Data classes and parser implementations for incremental chart parsers, which use dynamic programming to efficiently parse a text. A “chart parser” derives parse trees for a text by iteratively adding “edges” to a “chart”. Each “edge” represents a hypothesis about the tree structure for a subsequence of the text. The “chart” is a “blackboard” for composing and combining these hypotheses.

A parser is “incremental”, if it guarantees that for all i, j where i < j, all edges ending at i are built before any edges ending at j. This is appealing for, say, speech recognizer hypothesis filtering.

The main parser class is EarleyChartParser, which is a top-down algorithm, originally formulated by Jay Earley (1970).

class nltk.parse.earleychart.CompleteFundamentalRule[source]

Bases: SingleEdgeFundamentalRule

class nltk.parse.earleychart.CompleterRule[source]

Bases: CompleteFundamentalRule

apply(chart, grammar, edge)[source]

Return a generator that will add edges licensed by this rule and the given edges to the chart, one at a time. Each time the generator is resumed, it will either add a new edge and yield that edge; or return.

Parameters:

edges (list(EdgeI)) – A set of existing edges. The number of edges that should be passed to apply() is specified by the NUM_EDGES class variable.

Return type:

iter(EdgeI)

class nltk.parse.earleychart.EarleyChartParser[source]

Bases: IncrementalChartParser

__init__(grammar, **parser_args)[source]

Create a new Earley chart parser, that uses grammar to parse texts.

Parameters:
  • grammar (CFG) – The grammar used to parse texts.

  • trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; and higher numbers will produce more verbose tracing output.

  • trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.

  • chart_class – The class that should be used to create the charts used by this parser.

class nltk.parse.earleychart.FeatureCompleteFundamentalRule[source]

Bases: FeatureSingleEdgeFundamentalRule

class nltk.parse.earleychart.FeatureCompleterRule[source]

Bases: CompleterRule

class nltk.parse.earleychart.FeatureEarleyChartParser[source]

Bases: FeatureIncrementalChartParser

__init__(grammar, **parser_args)[source]

Create a new Earley chart parser, that uses grammar to parse texts.

Parameters:
  • grammar (CFG) – The grammar used to parse texts.

  • trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; and higher numbers will produce more verbose tracing output.

  • trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.

  • chart_class – The class that should be used to create the charts used by this parser.

class nltk.parse.earleychart.FeatureIncrementalBottomUpChartParser[source]

Bases: FeatureIncrementalChartParser

__init__(grammar, **parser_args)[source]

Create a new Earley chart parser, that uses grammar to parse texts.

Parameters:
  • grammar (CFG) – The grammar used to parse texts.

  • trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; and higher numbers will produce more verbose tracing output.

  • trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.

  • chart_class – The class that should be used to create the charts used by this parser.

class nltk.parse.earleychart.FeatureIncrementalBottomUpLeftCornerChartParser[source]

Bases: FeatureIncrementalChartParser

__init__(grammar, **parser_args)[source]

Create a new Earley chart parser, that uses grammar to parse texts.

Parameters:
  • grammar (CFG) – The grammar used to parse texts.

  • trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; and higher numbers will produce more verbose tracing output.

  • trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.

  • chart_class – The class that should be used to create the charts used by this parser.

class nltk.parse.earleychart.FeatureIncrementalChart[source]

Bases: IncrementalChart, FeatureChart

select(end, **restrictions)[source]

Returns an iterator over the edges in this chart. See Chart.select for more information about the restrictions on the edges.

class nltk.parse.earleychart.FeatureIncrementalChartParser[source]

Bases: IncrementalChartParser, FeatureChartParser

__init__(grammar, strategy=[<nltk.parse.chart.LeafInitRule object>, <nltk.parse.featurechart.FeatureEmptyPredictRule object>, <nltk.parse.featurechart.FeatureBottomUpPredictCombineRule object>, <nltk.parse.earleychart.FeatureCompleteFundamentalRule object>], trace_chart_width=20, chart_class=<class 'nltk.parse.earleychart.FeatureIncrementalChart'>, **parser_args)[source]

Create a new Earley chart parser, that uses grammar to parse texts.

Parameters:
  • grammar (CFG) – The grammar used to parse texts.

  • trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; and higher numbers will produce more verbose tracing output.

  • trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.

  • chart_class – The class that should be used to create the charts used by this parser.

class nltk.parse.earleychart.FeatureIncrementalTopDownChartParser[source]

Bases: FeatureIncrementalChartParser

__init__(grammar, **parser_args)[source]

Create a new Earley chart parser, that uses grammar to parse texts.

Parameters:
  • grammar (CFG) – The grammar used to parse texts.

  • trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; and higher numbers will produce more verbose tracing output.

  • trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.

  • chart_class – The class that should be used to create the charts used by this parser.

class nltk.parse.earleychart.FeaturePredictorRule[source]

Bases: FeatureTopDownPredictRule

class nltk.parse.earleychart.FeatureScannerRule[source]

Bases: ScannerRule

class nltk.parse.earleychart.FilteredCompleteFundamentalRule[source]

Bases: FilteredSingleEdgeFundamentalRule

apply(chart, grammar, edge)[source]

Return a generator that will add edges licensed by this rule and the given edges to the chart, one at a time. Each time the generator is resumed, it will either add a new edge and yield that edge; or return.

Parameters:

edges (list(EdgeI)) – A set of existing edges. The number of edges that should be passed to apply() is specified by the NUM_EDGES class variable.

Return type:

iter(EdgeI)

class nltk.parse.earleychart.IncrementalBottomUpChartParser[source]

Bases: IncrementalChartParser

__init__(grammar, **parser_args)[source]

Create a new Earley chart parser, that uses grammar to parse texts.

Parameters:
  • grammar (CFG) – The grammar used to parse texts.

  • trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; and higher numbers will produce more verbose tracing output.

  • trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.

  • chart_class – The class that should be used to create the charts used by this parser.

class nltk.parse.earleychart.IncrementalBottomUpLeftCornerChartParser[source]

Bases: IncrementalChartParser

__init__(grammar, **parser_args)[source]

Create a new Earley chart parser, that uses grammar to parse texts.

Parameters:
  • grammar (CFG) – The grammar used to parse texts.

  • trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; and higher numbers will produce more verbose tracing output.

  • trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.

  • chart_class – The class that should be used to create the charts used by this parser.

class nltk.parse.earleychart.IncrementalChart[source]

Bases: Chart

edges()[source]

Return a list of all edges in this chart. New edges that are added to the chart after the call to edges() will not be contained in this list.

Return type:

list(EdgeI)

See:

iteredges, select

initialize()[source]

Clear the chart.

iteredges()[source]

Return an iterator over the edges in this chart. It is not guaranteed that new edges which are added to the chart before the iterator is exhausted will also be generated.

Return type:

iter(EdgeI)

See:

edges, select

select(end, **restrictions)[source]

Return an iterator over the edges in this chart. Any new edges that are added to the chart before the iterator is exahusted will also be generated. restrictions can be used to restrict the set of edges that will be generated.

Parameters:
  • span – Only generate edges e where e.span()==span

  • start – Only generate edges e where e.start()==start

  • end – Only generate edges e where e.end()==end

  • length – Only generate edges e where e.length()==length

  • lhs – Only generate edges e where e.lhs()==lhs

  • rhs – Only generate edges e where e.rhs()==rhs

  • nextsym – Only generate edges e where e.nextsym()==nextsym

  • dot – Only generate edges e where e.dot()==dot

  • is_complete – Only generate edges e where e.is_complete()==is_complete

  • is_incomplete – Only generate edges e where e.is_incomplete()==is_incomplete

Return type:

iter(EdgeI)

class nltk.parse.earleychart.IncrementalChartParser[source]

Bases: ChartParser

An incremental chart parser implementing Jay Earley’s parsing algorithm:

For each index end in [0, 1, …, N]:
For each edge such that edge.end = end:
If edge is incomplete and edge.next is not a part of speech:
Apply PredictorRule to edge
If edge is incomplete and edge.next is a part of speech:
Apply ScannerRule to edge
If edge is complete:
Apply CompleterRule to edge
Return any complete parses in the chart
__init__(grammar, strategy=[<nltk.parse.chart.LeafInitRule object>, <nltk.parse.chart.EmptyPredictRule object>, <nltk.parse.chart.BottomUpPredictCombineRule object>, <nltk.parse.earleychart.CompleteFundamentalRule object>], trace=0, trace_chart_width=50, chart_class=<class 'nltk.parse.earleychart.IncrementalChart'>)[source]

Create a new Earley chart parser, that uses grammar to parse texts.

Parameters:
  • grammar (CFG) – The grammar used to parse texts.

  • trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; and higher numbers will produce more verbose tracing output.

  • trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.

  • chart_class – The class that should be used to create the charts used by this parser.

chart_parse(tokens, trace=None)[source]

Return the final parse Chart from which all possible parse trees can be extracted.

Parameters:

tokens (list(str)) – The sentence to be parsed

Return type:

Chart

class nltk.parse.earleychart.IncrementalLeftCornerChartParser[source]

Bases: IncrementalChartParser

__init__(grammar, **parser_args)[source]

Create a new Earley chart parser, that uses grammar to parse texts.

Parameters:
  • grammar (CFG) – The grammar used to parse texts.

  • trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; and higher numbers will produce more verbose tracing output.

  • trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.

  • chart_class – The class that should be used to create the charts used by this parser.

class nltk.parse.earleychart.IncrementalTopDownChartParser[source]

Bases: IncrementalChartParser

__init__(grammar, **parser_args)[source]

Create a new Earley chart parser, that uses grammar to parse texts.

Parameters:
  • grammar (CFG) – The grammar used to parse texts.

  • trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; and higher numbers will produce more verbose tracing output.

  • trace_chart_width (int) – The default total width reserved for the chart in trace output. The remainder of each line will be used to display edges.

  • chart_class – The class that should be used to create the charts used by this parser.

class nltk.parse.earleychart.PredictorRule[source]

Bases: CachedTopDownPredictRule

class nltk.parse.earleychart.ScannerRule[source]

Bases: CompleteFundamentalRule

apply(chart, grammar, edge)[source]

Return a generator that will add edges licensed by this rule and the given edges to the chart, one at a time. Each time the generator is resumed, it will either add a new edge and yield that edge; or return.

Parameters:

edges (list(EdgeI)) – A set of existing edges. The number of edges that should be passed to apply() is specified by the NUM_EDGES class variable.

Return type:

iter(EdgeI)

nltk.parse.earleychart.demo(print_times=True, print_grammar=False, print_trees=True, trace=2, sent='I saw John with a dog with my cookie', numparses=5)[source]

A demonstration of the Earley parsers.