nltk.featstruct.FeatStructReader
- class nltk.featstruct.FeatStructReader
Bases: object
- __init__(features=(*slash*, *type*), fdict_class=<class 'nltk.featstruct.FeatStruct'>, flist_class=<class 'nltk.featstruct.FeatList'>, logic_parser=None)
- fromstring(s, fstruct=None)
Convert a string representation of a feature structure (as displayed by repr) into a FeatStruct. This process imposes the following restrictions on the string representation:
Feature names cannot contain any of the following: whitespace, parentheses, quote marks, equals signs, dashes, commas, and square brackets. Feature names may not begin with plus signs or minus signs.
Only the following basic feature values are supported: strings, integers, variables, None, and unquoted alphanumeric strings.
For reentrant values, the first mention must specify a reentrance identifier and a value, and any subsequent mentions must use arrows ('->') to reference the reentrance identifier.
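The restrictions above can be seen in a short session. This is a minimal sketch, assuming NLTK is installed and the reader is built with its default arguments; the feature names (`number`, `person`, `case`, `subj`, `obj`, `agr`) are made up for illustration.

```python
from nltk.featstruct import FeatStruct, FeatStructReader

reader = FeatStructReader()

# Basic values: a quoted string, an integer, and an unquoted symbol.
fs = reader.fromstring("[number='sing', person=3, case=nom]")
print(fs["number"], fs["person"], fs["case"])

# Reentrance: (1) tags the first mention together with its value,
# and '->(1)' refers back to that same value.
shared = reader.fromstring("[subj=(1)[agr='pl'], obj->(1)]")
print(shared["subj"] is shared["obj"])

# A string that cannot be parsed is rejected with ValueError.
try:
    reader.fromstring("[number='sing'")  # missing close bracket
except ValueError as err:
    print("rejected:", type(err).__name__)
```

Because both mentions of the reentrant value resolve to one object, a change made through `subj` would also be visible through `obj`.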
- read_partial(s, position=0, reentrances=None, fstruct=None)
Helper function that reads in a feature structure.
- Parameters
s – The string to read.
position – The position in the string to start parsing.
reentrances – A dictionary from reentrance ids to values. Defaults to an empty dictionary.
- Returns
A tuple (val, pos) of the feature structure created by parsing and the position where the parsed feature structure ends.
- Return type
tuple
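Unlike fromstring, read_partial does not require the feature structure to span the whole string, which makes it usable for pulling a structure out of surrounding text. The following is a minimal sketch, assuming NLTK is installed; the input string and its `head:` prefix are invented for the example.

```python
from nltk.featstruct import FeatStructReader

reader = FeatStructReader()
text = "head: [cat='NP', num='pl'] rest of line"
start = text.index("[")  # begin parsing at the open bracket

# The call returns the parsed value and the position where parsing stopped.
value, end = reader.read_partial(text, start)
print(value["cat"], value["num"])
print(repr(text[start:end]))  # the span that was consumed
print(repr(text[end:]))       # whatever is left for the caller
```

The returned position can be passed back in as `position` on a later call to continue reading from where the previous structure ended.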
- VALUE_HANDLERS = [
    ('read_fstruct_value', re.compile('\\s*(?:\\((\\d+)\\)\\s*)?(\\??[\\w-]+)?(\\[)')),
    ('read_var_value', re.compile('\\?[a-zA-Z_][a-zA-Z0-9_]*')),
    ('read_str_value', re.compile('[uU]?[rR]?([\'"])')),
    ('read_int_value', re.compile('-?\\d+')),
    ('read_sym_value', re.compile('[a-zA-Z_][a-zA-Z0-9_]*')),
    ('read_app_value', re.compile('<(app)\\((\\?[a-z][a-z]*)\\s*,\\s*(\\?[a-z][a-z]*)\\)>')),
    ('read_logic_value', re.compile('<(.*?)(?<!-)>')),
    ('read_set_value', re.compile('{')),
    ('read_tuple_value', re.compile('\\('))]
A table indicating how feature values should be processed. Each entry in the table is a pair (handler, regexp). The first entry with a matching regexp will have its handler called. Handlers should have the following signature:
def handler(s, position, reentrances, match): ...
Each handler should return a tuple (value, position), where position is the string position where the value ended. Order matters: because the first matching entry wins, more specific patterns must come before more general ones.
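The first-match dispatch described here can be sketched without NLTK. The code below is an illustration only, not NLTK's implementation: `read_int`, `read_word`, `HANDLERS`, and `read_value` are hypothetical names, and the table stores functions directly, whereas the entries in VALUE_HANDLERS are method names given as strings.

```python
import re

# Hypothetical handlers that follow the documented signature and
# return a (value, position) tuple.
def read_int(s, position, reentrances, match):
    return int(match.group()), match.end()

def read_word(s, position, reentrances, match):
    return match.group(), match.end()

# (handler, regexp) pairs, tried from top to bottom.
HANDLERS = [
    (read_int, re.compile(r"-?\d+")),
    (read_word, re.compile(r"\w+")),
]

def read_value(s, position=0, reentrances=None, handlers=HANDLERS):
    """Call the handler of the first entry whose regexp matches at position."""
    if reentrances is None:
        reentrances = {}
    for handler, regexp in handlers:
        match = regexp.match(s, position)
        if match:
            return handler(s, position, reentrances, match)
    raise ValueError(f"no handler matches at position {position}")

print(read_value("42, rest"))                          # (42, 2)
print(read_value("sing]"))                             # ('sing', 4)
# With the order reversed, the broader \w+ pattern claims the digits first.
print(read_value("42, rest", handlers=HANDLERS[::-1]))  # ('42', 2)
```

The last call shows why ordering matters: the same input yields the string '42' instead of the integer 42 once the more general pattern is tried first.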