nltk.chunk.RegexpChunkParser
- class nltk.chunk.RegexpChunkParser
Bases: ChunkParserI
A regular expression based chunk parser. RegexpChunkParser uses a sequence of “rules” to find chunks of a single type within a text. The chunking of the text is encoded using a ChunkString, and each rule acts by modifying the chunking in the ChunkString. The rules are all implemented using regular expression matching and substitution.
The RegexpChunkRule class and its subclasses (ChunkRule, StripRule, UnChunkRule, MergeRule, and SplitRule) define the rules that are used by RegexpChunkParser. Each rule defines an apply() method, which modifies the chunking encoded by a given ChunkString.
- Variables
_rules – The list of rules that should be applied to a text.
_trace – The default level of tracing.
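The way rules are applied in sequence can be illustrated with a minimal sketch. The tag patterns, rule descriptions, and sample sentence below are illustrative assumptions, not part of the library; the rule classes are imported from nltk.chunk.regexp.

```python
from nltk.chunk.regexp import ChunkRule, MergeRule, RegexpChunkParser
from nltk.tree import Tree

# Rule 1 (illustrative): put every determiner, adjective, or noun tag
# into its own single-token chunk.
chunk_each = ChunkRule(r"<DT|JJ|NN.*>", "Chunk each DT, JJ, or NN* tag")

# Rule 2 (illustrative): merge a DT/JJ chunk with the JJ/NN* chunk
# that immediately follows it.
merge_adjacent = MergeRule(r"<DT|JJ>", r"<JJ|NN.*>", "Merge adjacent chunks")

# The rules are applied in list order to the same ChunkString.
parser = RegexpChunkParser([chunk_each, merge_adjacent], chunk_label="NP")

# A hypothetical tagged sentence, wrapped in a flat Tree.
sentence = Tree(
    "S", [("the", "DT"), ("little", "JJ"), ("cat", "NN"), ("sat", "VBD")]
)
result = parser.parse(sentence)
print(result)
```

Here the first rule creates three one-token chunks and the second rule joins them, so the order of the list matters: swapping the two rules would leave the merge rule with nothing to merge.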
- __init__(rules, chunk_label='NP', root_label='S', trace=0)
Construct a new RegexpChunkParser.
- Parameters
rules (list(RegexpChunkRule)) – The sequence of rules that should be used to generate the chunking for a tagged text.
chunk_label (str) – The node value that should be used for chunk subtrees. This is typically a short string describing the type of information contained by the chunk, such as "NP" for base noun phrases.
root_label (str) – The node value that should be used for the top node of the chunk structure.
trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; 1 will generate normal tracing output; and 2 or higher will generate verbose tracing output.
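As a rough illustration of the constructor parameters, the sketch below builds a parser whose chunks are labelled "VP" instead of the default "NP". The tag pattern and the sample sentence are assumptions chosen for the example.

```python
from nltk.chunk.regexp import ChunkRule, RegexpChunkParser
from nltk.tree import Tree

# Illustrative rule: chunk any run of one or more verb tags.
verb_rule = ChunkRule(r"<VB.*>+", "Chunk sequences of verb tags")

# chunk_label sets the label of each chunk subtree; trace=0 keeps
# parsing silent.
vp_parser = RegexpChunkParser([verb_rule], chunk_label="VP", trace=0)

# A hypothetical tagged sentence.
tagged = Tree(
    "S", [("she", "PRP"), ("has", "VBZ"), ("gone", "VBN"), ("home", "NN")]
)
vp_result = vp_parser.parse(tagged)

# Collect the chunk subtrees produced by the parser.
chunks = [child for child in vp_result if isinstance(child, Tree)]
print(chunks)
```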
- parse(chunk_struct, trace=None)
- Parameters
chunk_struct (Tree) – the chunk structure to be (further) chunked
trace (int) – The level of tracing that should be used when parsing a text. 0 will generate no tracing output; 1 will generate normal tracing output; and 2 or higher will generate verbose tracing output. This value overrides the trace level value that was given to the constructor.
- Return type
Tree
- Returns
a chunk structure that encodes the chunks in a given tagged sentence. A chunk is a non-overlapping linguistic group, such as a noun phrase. The set of chunks identified in the chunk structure depends on the rules used to define this RegexpChunkParser.
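The per-call trace override might be used as in the following sketch, where a parser constructed with trace=0 is asked for tracing output on a single call. The rule and sentence are illustrative assumptions; the tracing text itself is printed to standard output by the library and is not reproduced here.

```python
from nltk.chunk.regexp import ChunkRule, RegexpChunkParser
from nltk.tree import Tree

# Illustrative rule: optional determiner followed by a noun.
np_rule = ChunkRule(r"<DT>?<NN.*>", "Chunk optional determiner plus noun")

# The constructor default is silent parsing.
quiet_parser = RegexpChunkParser([np_rule], chunk_label="NP", trace=0)

# A hypothetical tagged sentence.
tagged_sent = Tree("S", [("a", "DT"), ("dog", "NN"), ("barked", "VBD")])

# No tracing output: the constructor's trace level (0) is used.
silent_result = quiet_parser.parse(tagged_sent)

# Tracing output for this call only: trace=1 overrides the default.
traced_result = quiet_parser.parse(tagged_sent, trace=1)
```

Both calls return the same chunk structure; only the amount of diagnostic output differs.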
- rules()
- Returns
the sequence of rules used by RegexpChunkParser.
- Return type
list(RegexpChunkRule)
- accuracy(gold)
Score the accuracy of the chunker against the gold standard. Remove the chunking from the gold standard text, rechunk it using the chunker, and return a ChunkScore object reflecting the performance of this chunk parser.
- Parameters
gold (list(Tree)) – The list of chunked sentences to score the chunker on.
- Return type
ChunkScore
- evaluate(**kwargs)
Deprecated: Use accuracy(gold) instead.
- grammar()
- Returns
The grammar used by this parser.
- parse_sents(sents, *args, **kwargs)
Apply self.parse() to each element of sents.
- Return type
iter(iter(Tree))
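Batch parsing could look like the sketch below. The sentences and rule are illustrative assumptions. Since parse() on this class returns a single Tree, each item produced here is expected to be a Tree; the results are collected into a list because the return value is a lazy iterable.

```python
from nltk.chunk.regexp import ChunkRule, RegexpChunkParser
from nltk.tree import Tree

# Illustrative rule: optional determiner followed by a noun.
np_rule = ChunkRule(r"<DT>?<NN.*>", "Chunk optional determiner plus noun")
batch_parser = RegexpChunkParser([np_rule], chunk_label="NP")

# Two hypothetical tagged sentences.
sents = [
    Tree("S", [("the", "DT"), ("cat", "NN"), ("sat", "VBD")]),
    Tree("S", [("dogs", "NNS"), ("bark", "VBP")]),
]

# parse_sents() applies parse() to each sentence in turn.
results = list(batch_parser.parse_sents(sents))
for tree in results:
    print(tree)
```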