nltk.tbl.demo module


Run a demo with defaults. See source comments for details, or docstrings of any of the more specific demo_* functions.


Writes a file with context for each erroneous word after tagging testing data


Template.expand and Feature.expand are class methods that make it easy to generate large numbers of templates. See their documentation for details.

Note: training with 500 templates can easily fill all available memory, even on relatively small corpora.
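
As a rough sketch of how the two class methods combine, the snippet below expands a handful of word and tag features and then builds templates from them. The start positions, window lengths, and the combinations value are arbitrary choices for illustration, not the demo's own settings.

```python
from nltk.tag.brill import Pos, Word
from nltk.tbl import Template

# Word features over windows of length 1 and 2 drawn from positions -1..1.
word_features = Word.expand([-1, 0, 1], [1, 2], excludezero=False)

# Tag features over positions -2..1, dropping any window that contains position 0.
tag_features = Pos.expand([-2, -1, 0, 1], [1, 2], excludezero=True)

# Template.expand yields templates lazily, so wrap it in list().
# combinations=(1, 2) keeps templates that draw on one or two of the feature lists.
templates = list(Template.expand([word_features, tag_features], combinations=(1, 2)))

print(f"{len(word_features)} word features, {len(tag_features)} tag features")
print(f"Generated {len(templates)} templates")
```

Because the template count grows quickly with the number of features and combinations, it is worth printing the count before training on it.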


Discard rules with low accuracy. This may hurt performance a bit, but will often produce rules that are more interesting for a human to read.


Plot a learning curve – the contribution of the individual rules to tagging accuracy. Note: requires matplotlib.


Templates can have more than a single feature.


The feature(s) of a template take a list of positions relative to the current word where the feature should be looked for, conceptually joined by logical OR. For instance, Pos([-1, 1]), given a value V, will hold whenever V is found one step to the left and/or one step to the right.

For contiguous ranges, a 2-arg form giving inclusive end points can also be used: Pos(-3, -1) is the same as Pos([-3, -2, -1]).
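
The equivalence of the two forms can be checked directly. This is a minimal sketch that compares the positions attribute of each feature; the variable names are illustrative only.

```python
from nltk.tag.brill import Pos

# List form: the tag one step to the left OR one step to the right.
either_side = Pos([-1, 1])

# A contiguous range, written as an explicit list and in the 2-arg form
# with inclusive end points.
explicit = Pos([-3, -2, -1])
ranged = Pos(-3, -1)

print(either_side.positions)
print(explicit.positions == ranged.positions)
```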


Exemplify repr(Rule) (see also str(Rule) and Rule.format("verbose"))


Serializes the learned tagger to a file in pickle format; reloads it and validates the process.
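
The round trip can be sketched without any corpus download by training on a tiny hand-made data set. The sentences and the single template below are assumptions made for illustration, and the sketch pickles to an in-memory byte string rather than to a file as the demo does.

```python
import pickle

from nltk.tag import BrillTaggerTrainer, UnigramTagger
from nltk.tag.brill import Pos
from nltk.tbl import Template

# Tiny hand-made training set (illustrative only, not the demo's data).
train_sents = [
    [("the", "DT"), ("dog", "NN"), ("barks", "VBZ")],
    [("the", "DT"), ("cat", "NN"), ("sleeps", "VBZ")],
]

# Baseline tagger plus a single template that looks at the previous tag.
baseline = UnigramTagger(train_sents)
trainer = BrillTaggerTrainer(baseline, [Template(Pos([-1]))], trace=0)
tagger = trainer.train(train_sents, max_rules=10, min_score=1)

# Serialize to bytes, reload, and check that both taggers agree on a sentence.
reloaded = pickle.loads(pickle.dumps(tagger))
words = ["the", "dog", "sleeps"]
round_trip_ok = reloaded.tag(words) == tagger.tag(words)
print("Reloaded tagger agrees with original:", round_trip_ok)
```

Comparing the output of the two taggers on held-out text is a simple way to validate the reload; comparing on more than one sentence would make the check stronger.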


Exemplify str(Rule) (see also repr(Rule) and Rule.format("verbose"))


Show aggregate statistics per template. Little-used templates are candidates for deletion; much-used templates may possibly be refined.

Deleting unused templates is mostly about saving time and/or space: training is basically O(T) in the number of templates T (also in terms of memory usage, which often will be the limiting factor).


Exemplify Rule.format("verbose")

nltk.tbl.demo.postag(templates=None, tagged_data=None, num_sents=1000, max_rules=300, min_score=3, min_acc=None, train=0.8, trace=3, randomize=False, ruleformat='str', incremental_stats=False, template_stats=False, error_output=None, serialize_output=None, learning_curve_output=None, learning_curve_take=300, baseline_backoff_tagger=None, separate_baseline_data=False, cache_baseline_tagger=None)

Brill Tagger Demonstration

  • templates (list of Template) – the templates to use in rule learning

  • tagged_data (list of tagged sentences) – the tagged corpus data from which training and testing sentences are taken

  • num_sents (int) – how many sentences of training and testing data to use

  • max_rules (int) – maximum number of rule instances to create

  • min_score (int) – the minimum score for a rule in order for it to be considered

  • min_acc (float) – the minimum accuracy for a rule in order for it to be considered

  • train (float) – the fraction of the corpus to be used for training (1=all)

  • trace (int) – the level of diagnostic tracing output to produce (0-4)

  • randomize (bool) – whether the training data should be a random subset of the corpus

  • ruleformat (str) – rule output format, one of "str", "repr", "verbose"

  • incremental_stats (bool) – if true, will tag incrementally and collect stats for each rule (rather slow)

  • template_stats (bool) – if true, will print per-template statistics collected in training and (optionally) testing

  • error_output (str) – the file where errors will be saved

  • serialize_output (str) – the file where the learned tbl tagger will be saved

  • learning_curve_output (str) – filename of plot of learning curve(s) (train and also test, if available)

  • learning_curve_take (int) – how many rules to plot

  • baseline_backoff_tagger (tagger) – the tagger that the baseline unigram tagger backs off to

  • separate_baseline_data (bool) – use a fraction of the training data exclusively for training baseline

  • cache_baseline_tagger (str) – cache baseline tagger to this file (only interesting as a temporary workaround to get deterministic output from the baseline unigram tagger between Python versions)

Note on separate_baseline_data: if False (the default), the training data is reused for both the baseline and the rule learner. This is fast and fine for a demo, but is likely to generalize worse on unseen data. It also cannot sensibly be used for learning curves on training data (the baseline will be artificially high).
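
A call using several of the parameters above might look like the sketch below. The specific values are arbitrary, and run_small_demo is a hypothetical helper, not part of the module. Running it loads tagged corpus data (assumed here to be the NLTK treebank sample, which must already be installed), so the helper is defined but not called.

```python
from nltk.tag.brill import Pos, Word
from nltk.tbl import Template
from nltk.tbl.demo import postag

# Illustrative settings only; every key is a postag() keyword argument.
settings = dict(
    templates=[Template(Word([0]), Pos([-2, -1]))],  # one two-feature template
    num_sents=200,         # sentences used for training plus testing
    max_rules=20,          # stop after learning this many rules
    min_score=3,           # ignore rules scoring below this
    min_acc=0.8,           # ignore rules with accuracy below this
    train=0.8,             # 80% of the sentences go to training
    trace=1,               # modest diagnostic output
    ruleformat="verbose",  # one of "str", "repr", "verbose"
    template_stats=True,   # print per-template statistics
)


def run_small_demo():
    """Run the demo with the settings above (needs tagged corpus data)."""
    postag(**settings)
```

Keeping num_sents and max_rules small makes a first run quick; the defaults in the signature are considerably larger.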