Pretty-printing of discontinuous trees. Adapted from the disco-dop project, by Andreas van Cranenburgh. https://github.com/andreasvc/disco-dop
Interesting reference (not used for this code): T. Eschbach et al., Orth. Hypergraph Drawing, Journal of Graph Algorithms and Applications, 10(2) 141–157 (2006). http://jgaa.info/accepted/2006/EschbachGuentherBecker2006.10.2.pdf
- class nltk.treeprettyprinter.TreePrettyPrinter
Pretty-print a tree in text format, either as ASCII or Unicode. The tree can be a normal tree, or discontinuous.
TreePrettyPrinter(tree, sentence=None, highlight=()) creates an object from which different visualizations can be created.
tree – a Tree object.
sentence – a list of words (strings). If sentence is given, tree must contain integers as leaves, which are taken as indices in sentence. Using this you can display a discontinuous tree.
highlight – Optionally, a sequence of Tree objects in tree which should be highlighted. Has the effect of only applying colors to nodes in this sequence (nodes should be given as Tree objects, terminals as indices).
>>> from nltk.tree import Tree
>>> from nltk.treeprettyprinter import TreePrettyPrinter
>>> tree = Tree.fromstring('(S (NP Mary) (VP walks))')
>>> print(TreePrettyPrinter(tree).text())
...
      S
  ____|____
 NP        VP
 |         |
Mary     walks
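The sentence parameter described above is easiest to see with a small discontinuous tree. The following is a minimal sketch, assuming the documented import path nltk.treeprettyprinter; the three-word sentence and its bracketing are made up for illustration. The leaves are integer indices into sentence, and the VP covers words 0 and 2 while the NP sits between them.

```python
from nltk.tree import Tree
from nltk.treeprettyprinter import TreePrettyPrinter

# Illustrative data: leaves are indices into `sentence`, not words.
sentence = ["wake", "him", "up"]

# read_leaf=int turns the leaf strings "0", "1", "2" into integers.
# The VP dominates indices 0 and 2, so it is discontinuous.
tree = Tree.fromstring("(S (VP (VB 0) (PRT 2)) (NP (PRP 1)))", read_leaf=int)

printer = TreePrettyPrinter(tree, sentence)
drawing = printer.text()
print(drawing)
```

The words are drawn in sentence order along the bottom of the picture, and the branches of the VP are routed around the intervening NP.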
- __init__(tree, sentence=None, highlight=())
- static nodecoords(tree, sentence, highlight)
Produce coordinates of nodes on a grid.
Objective:
- Produce coordinates for a non-overlapping placement of nodes and horizontal lines.
- Order edges so that crossing edges cross a minimal number of previous horizontal lines (never vertical lines).
Approach:
- bottom up level order traversal (start at terminals)
- at each level, identify nodes which cannot be on the same row
- identify nodes which cannot be in the same column
- place nodes into a grid at (row, column)
- order child-parent edges with crossing edges last
Coordinates are (row, column); the origin (0, 0) is at the top left; the root node is on row 0. Coordinates do not consider the size of a node (which depends on font, etc.), so the width of a column of the grid should be automatically determined by the element with the greatest width in that column. Alternatively, the integer coordinates could be converted to coordinates in which the distances between adjacent nodes are non-uniform.
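The column-width rule above can be sketched in a few lines of plain Python. This is not the code used by text(); the helper render_grid and the hand-written labels and coords dictionaries are hypothetical, chosen only to show how integer (row, column) coordinates turn into text once each column is sized by its widest label.

```python
def render_grid(labels, coords):
    """Place labels on a text grid; each column is as wide as its widest label.

    labels: hypothetical mapping of node id -> label string
    coords: hypothetical mapping of node id -> (row, column), origin top left
    """
    nrows = max(row for row, _ in coords.values()) + 1
    ncols = max(col for _, col in coords.values()) + 1

    # Column width = length of the longest label found in that column.
    widths = [0] * ncols
    for node_id, (_, col) in coords.items():
        widths[col] = max(widths[col], len(labels[node_id]))

    # Start from blank cells, then drop each label into its cell, centered.
    rows = [[" " * width for width in widths] for _ in range(nrows)]
    for node_id, (row, col) in coords.items():
        rows[row][col] = labels[node_id].center(widths[col])

    return "\n".join(" ".join(cells).rstrip() for cells in rows)


# Hand-written example data (not produced by nodecoords).
labels = {0: "S", 1: "NP", 2: "VP", 3: "Mary", 4: "walks"}
coords = {0: (0, 1), 1: (1, 0), 2: (1, 2), 3: (2, 0), 4: (2, 2)}

grid_text = render_grid(labels, coords)
print(grid_text)
```

Branch lines are left out on purpose; the sketch only covers the step from grid coordinates to character positions.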
Produces a tuple (nodes, coords, edges, highlighted) where:
nodes[id]: Tree object for the node with this integer id
coords[id]: (n, m) coordinate where to draw node with id in the grid
edges[id]: parent id of node with this id (ordered dictionary)
highlighted: set of ids that should be highlighted
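To get a feel for the returned tuple, the static method can be called directly and its four parts inspected. The sketch below assumes the signature documented above; the tree and sentence are illustrative, and nodecoords is given integer leaves together with the sentence they index.

```python
from nltk.tree import Tree
from nltk.treeprettyprinter import TreePrettyPrinter

# Illustrative input: integer leaves indexing into `sentence`.
sentence = ["wake", "him", "up"]
tree = Tree.fromstring("(S (VP (VB 0) (PRT 2)) (NP (PRP 1)))", read_leaf=int)

nodes, coords, edges, highlighted = TreePrettyPrinter.nodecoords(tree, sentence, ())

# edges maps each child id to its parent id, so the root is the only
# node id that never appears as a key.
root_id = next(node_id for node_id in nodes if node_id not in edges)

for node_id, (row, col) in sorted(coords.items(), key=lambda item: item[1]):
    node = nodes[node_id]
    # Phrasal nodes are Tree objects; terminals are shown as plain words.
    label = node.label() if isinstance(node, Tree) else node
    parent = edges.get(node_id)
    print(f"id={node_id} label={label!r} row={row} col={col} parent={parent}")
```

Since the origin is at the top left, sorting by coordinate lists the root first and the terminals last.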
- text(nodedist=1, unicodelines=False, html=False, ansi=False, nodecolor='blue', leafcolor='red', funccolor='green', abbreviate=None, maxwidth=16)
ASCII art for a discontinuous tree.
unicodelines – whether to use Unicode line drawing characters instead of plain (7-bit) ASCII.
html – whether to wrap output in html code (default plain text).
ansi – whether to produce colors with ANSI escape sequences (only effective when html==False).
nodecolor, leafcolor – colors of phrasal nodes and leaves, respectively; effective when either html or ansi is True.
abbreviate – if True, abbreviate labels longer than 5 characters. If an integer, abbreviate labels longer than that many characters.
maxwidth – maximum number of characters before a label starts to wrap; pass None to disable.
- svg(nodecolor='blue', leafcolor='red', funccolor='green')
SVG representation of a tree.
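A minimal sketch of saving the SVG output to disk, assuming svg() returns the markup as a string. The file name tree.svg and the temporary directory are arbitrary choices for the example.

```python
import os
import tempfile

from nltk.tree import Tree
from nltk.treeprettyprinter import TreePrettyPrinter

tree = Tree.fromstring("(S (NP Mary) (VP walks))")

# Render with custom colors for phrasal nodes and leaves.
svg_markup = TreePrettyPrinter(tree).svg(nodecolor="black", leafcolor="gray")

# Write the markup to a file that a browser or image viewer can open.
out_path = os.path.join(tempfile.mkdtemp(), "tree.svg")
with open(out_path, "w", encoding="utf-8") as handle:
    handle.write(svg_markup)

print("wrote", out_path)
```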