nltk.grammar.CFG
- class nltk.grammar.CFG
Bases: object
A context-free grammar. A grammar consists of a start state and a set of productions. The sets of terminals and nonterminals are implicitly specified by the productions.
If you need efficient key-based access to productions, you can use a subclass to implement it.
- __init__(start, productions, calculate_leftcorners=True)
Create a new context-free grammar from the given start state and set of Production instances.
- Parameters
start (Nonterminal) – The start symbol
productions (list(Production)) – The list of productions that defines the grammar
calculate_leftcorners (bool) – Set to False to skip calculating the leftcorner relation. In that case, some optimized chart parsers won’t work.
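A minimal sketch of building a grammar directly from Production objects. The toy symbols and words here are made up for illustration; only the constructor arguments come from the signature above.

```python
from nltk.grammar import CFG, Nonterminal, Production

# Hypothetical toy grammar: S -> NP VP, NP -> 'dogs', VP -> 'bark'
S, NP, VP = Nonterminal("S"), Nonterminal("NP"), Nonterminal("VP")
rules = [
    Production(S, [NP, VP]),
    Production(NP, ["dogs"]),
    Production(VP, ["bark"]),
]

# Default construction also computes the leftcorner relation.
grammar = CFG(S, rules)

# Skip the leftcorner computation when it is not needed.
light_grammar = CFG(S, rules, calculate_leftcorners=False)
```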
- classmethod fromstring(input, encoding=None)
Return the grammar instance corresponding to the input string(s).
- Parameters
input – a grammar, either in the form of a string or as a list of strings.
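For illustration, the same toy grammar can be given either as one multi-line string or as a list of lines. The grammar itself is an invented example, not part of the library.

```python
from nltk.grammar import CFG

# One multi-line string; alternatives may be joined with "|".
from_text = CFG.fromstring("""
S -> NP VP
NP -> 'dogs' | 'cats'
VP -> 'bark'
""")

# The same rules as a list of strings, one rule per item.
from_lines = CFG.fromstring([
    "S -> NP VP",
    "NP -> 'dogs' | 'cats'",
    "VP -> 'bark'",
])
```

The left-hand side of the first rule becomes the start symbol.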
- productions(lhs=None, rhs=None, empty=False)
Return the grammar productions, filtered by the left-hand side or the first item in the right-hand side.
- Parameters
lhs – Only return productions with the given left-hand side.
rhs – Only return productions with the given first item in the right-hand side.
empty – Only return productions with an empty right-hand side.
- Returns
A list of productions matching the given constraints.
- Return type
list(Production)
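A short sketch of the lhs and rhs filters, using an invented toy grammar. Note that rhs matches only the first symbol of the right-hand side, as described above.

```python
from nltk.grammar import CFG, Nonterminal

grammar = CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the'
N -> 'dog' | 'cat'
V -> 'chased'
""")

# All productions whose left-hand side is N.
n_rules = grammar.productions(lhs=Nonterminal("N"))

# Productions whose right-hand side starts with the terminal 'dog'.
dog_rules = grammar.productions(rhs="dog")

# Productions whose right-hand side starts with the nonterminal Det.
det_first = grammar.productions(rhs=Nonterminal("Det"))
```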
- leftcorners(cat)
Return the set of all nonterminals that the given nonterminal can start with, including itself.
This is the reflexive, transitive closure of the immediate leftcorner relation: (A > B) iff (A -> B beta), where beta is any (possibly empty) sequence of symbols.
- Parameters
cat (Nonterminal) – the parent of the leftcorners
- Returns
the set of all leftcorners
- Return type
set(Nonterminal)
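As a sketch of what the closure contains, consider an invented toy grammar in which S starts with NP and NP starts with Det. The leftcorners of S are then S itself, NP, and Det.

```python
from nltk.grammar import CFG, Nonterminal

grammar = CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the'
N -> 'dog' | 'cat'
V -> 'chased'
""")

S, NP, VP = Nonterminal("S"), Nonterminal("NP"), Nonterminal("VP")
Det, V = Nonterminal("Det"), Nonterminal("V")

# S > NP (from S -> NP VP) and NP > Det (from NP -> Det N), plus S itself.
s_corners = grammar.leftcorners(S)

# VP > V (from VP -> V NP), plus VP itself.
vp_corners = grammar.leftcorners(VP)
```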
- is_leftcorner(cat, left)
Return True if left is a leftcorner of cat, where left can be a terminal or a nonterminal.
- Parameters
cat (Nonterminal) – the parent of the leftcorner
left (Terminal or Nonterminal) – the suggested leftcorner
- Return type
bool
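A small illustration with an invented grammar, showing that the check accepts both nonterminals and terminal words.

```python
from nltk.grammar import CFG, Nonterminal

grammar = CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the'
N -> 'dog'
V -> 'chased'
""")

S, NP, VP = Nonterminal("S"), Nonterminal("NP"), Nonterminal("VP")

nonterminal_corner = grammar.is_leftcorner(S, NP)     # S can start with NP
word_corner = grammar.is_leftcorner(S, "the")         # S can start with 'the'
not_a_corner = grammar.is_leftcorner(S, "chased")     # 'chased' only starts VP
vp_word_corner = grammar.is_leftcorner(VP, "chased")
```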
- leftcorner_parents(cat)
Return the set of all nonterminals for which the given category is a left corner. This is the inverse of the leftcorner relation.
- Parameters
cat (Nonterminal) – the suggested leftcorner
- Returns
the set of all parents to the leftcorner
- Return type
set(Nonterminal)
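A sketch of the inverse lookup on an invented grammar. Because the leftcorner relation is reflexive, each category appears among its own parents.

```python
from nltk.grammar import CFG, Nonterminal

grammar = CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the'
N -> 'dog'
V -> 'chased'
""")

S, NP, VP = Nonterminal("S"), Nonterminal("NP"), Nonterminal("VP")
Det, V = Nonterminal("Det"), Nonterminal("V")

# Det starts NP, and NP starts S, so both are parents of Det.
det_parents = grammar.leftcorner_parents(Det)

# V only starts VP.
v_parents = grammar.leftcorner_parents(V)
```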
- check_coverage(tokens)
Check whether the grammar rules cover the given list of tokens. If any token is not covered, raise a ValueError.
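A minimal sketch of checking a token list before parsing, on an invented grammar. The word 'bird' is deliberately absent from the lexical rules, and the example assumes the exception raised is a ValueError.

```python
from nltk.grammar import CFG

grammar = CFG.fromstring("""
S -> NP VP
NP -> Det N
VP -> V NP
Det -> 'the'
N -> 'dog' | 'cat'
V -> 'chased'
""")

# Every token appears in some lexical rule, so nothing is raised.
grammar.check_coverage(["the", "dog", "chased", "the", "cat"])

# 'bird' is unknown to the grammar, so the check fails.
message = None
try:
    grammar.check_coverage(["the", "bird"])
except ValueError as err:
    message = str(err)
```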
- is_nonlexical()
Return True if all lexical rules are “preterminals”, that is, unary rules which can be separated in a preprocessing step.
This means that all productions are of the forms A -> B1 … Bn (n>=0), or A -> “s”.
Note: is_lexical() and is_nonlexical() are not opposites. There are grammars which are neither, and grammars which are both.
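To make the note concrete, here is a sketch with two invented grammars: one that satisfies both predicates and one that satisfies neither.

```python
from nltk.grammar import CFG

# Only a unary rule with a terminal: every rule contains a terminal,
# and there is no longer rule that could mix in a terminal.
both = CFG.fromstring("S -> 'a'")

# "S -> A 'b'" mixes a terminal into a binary rule (so not nonlexical),
# and "A -> B C" contains no terminal at all (so not lexical).
neither = CFG.fromstring("""
S -> A 'b'
A -> B C
B -> 'b'
C -> 'c'
""")

both_flags = (both.is_lexical(), both.is_nonlexical())
neither_flags = (neither.is_lexical(), neither.is_nonlexical())
```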
- is_binarised()
Return True if all productions are at most binary. Note that there can still be empty and unary productions.
- is_flexible_chomsky_normal_form()
Return True if all productions are of the forms A -> B C, A -> B, or A -> “s”.
- is_chomsky_normal_form()
Return True if the grammar is in Chomsky Normal Form, i.e. all productions are of the form A -> B C, or A -> “s”.
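The three form checks can be compared on small invented grammars: one with a ternary rule, one with a unary nonterminal rule, and one with only binary and lexical rules.

```python
from nltk.grammar import CFG

# Ternary rule S -> NP V NP: not binarised.
ternary = CFG.fromstring("""
S -> NP V NP
NP -> 'dogs'
V -> 'see'
""")

# Unary rule S -> VP: flexible CNF, but not strict CNF.
with_unary = CFG.fromstring("""
S -> VP
VP -> V NP
V -> 'see'
NP -> 'dogs'
""")

# Only A -> B C and A -> "s" rules: strict CNF.
strict = CFG.fromstring("""
VP -> V NP
V -> 'see'
NP -> 'dogs'
""")

results = {
    "ternary": (ternary.is_binarised(),
                ternary.is_flexible_chomsky_normal_form(),
                ternary.is_chomsky_normal_form()),
    "with_unary": (with_unary.is_binarised(),
                   with_unary.is_flexible_chomsky_normal_form(),
                   with_unary.is_chomsky_normal_form()),
    "strict": (strict.is_binarised(),
               strict.is_flexible_chomsky_normal_form(),
               strict.is_chomsky_normal_form()),
}
```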
- chomsky_normal_form(new_token_padding='@$@', flexible=False)
Return a new grammar that is in Chomsky normal form.
- Parameters
new_token_padding – Customise new rule formation during binarisation
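A hedged sketch of converting an invented grammar with a ternary rule. The names of the nonterminals introduced during binarisation depend on new_token_padding and are not relied on here; the example only checks the form of the result.

```python
from nltk.grammar import CFG

grammar = CFG.fromstring("""
S -> NP V NP
NP -> Det N
Det -> 'the'
N -> 'dog' | 'cat'
V -> 'chased'
""")

# The ternary rule S -> NP V NP keeps the original out of CNF.
was_cnf = grammar.is_chomsky_normal_form()

# The converted grammar is a new object; the original is left unchanged.
cnf = grammar.chomsky_normal_form()
now_cnf = cnf.is_chomsky_normal_form()
```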
- classmethod remove_unitary_rules(grammar)
Remove nonlexical unitary rules and convert them to lexical rules.
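As an illustration on an invented two-rule grammar, the unary rule S -> A is replaced by a lexical rule that lets S rewrite directly to the word that A produces.

```python
from nltk.grammar import CFG, Nonterminal, Production

grammar = CFG.fromstring("""
S -> A
A -> 'a'
""")

# S -> A is a nonlexical unitary rule; it is folded into S -> 'a'.
reduced = CFG.remove_unitary_rules(grammar)

# Collect any unary rules whose right-hand side is still a nonterminal.
remaining_unary = [
    p for p in reduced.productions()
    if len(p) == 1 and p.is_nonlexical()
]
```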