nltk.corpus.reader.string_category module

Read tuples from a corpus consisting of categorized strings. For example, from the question classification corpus:

NUM:dist How far is it from Denver to Aspen ?
LOC:city What county is Modesto , California in ?
HUM:desc Who was Galileo ?
DESC:def What is an atom ?
NUM:date When did Hawaii become a state ?
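To make the resulting tuples concrete, here is a minimal pure-Python sketch of how such lines map to (category, string) pairs. It assumes one item per line and that the category label is everything before the first delimiter. The helper name `parse_categorized_lines` is hypothetical and not part of NLTK.

```python
SAMPLE = """\
NUM:dist How far is it from Denver to Aspen ?
LOC:city What county is Modesto , California in ?
HUM:desc Who was Galileo ?
"""


def parse_categorized_lines(text, delimiter=" "):
    """Hypothetical helper: split each non-empty line once on the delimiter."""
    pairs = []
    for line in text.splitlines():
        line = line.strip()
        if line:
            # maxsplit=1 keeps the rest of the string intact, spaces included.
            pairs.append(tuple(line.split(delimiter, 1)))
    return pairs


pairs = parse_categorized_lines(SAMPLE)
for category, question in pairs:
    print(category, "->", question)
```

Splitting only once matters here because the default delimiter is a space, which also occurs inside the string itself.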

class nltk.corpus.reader.string_category.StringCategoryCorpusReader

Bases: nltk.corpus.reader.api.CorpusReader

__init__(root, fileids, delimiter=' ', encoding='utf8')
  • root – The root directory for this corpus.

  • fileids – A list or regexp specifying the fileids in this corpus.

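  • delimiter – The field delimiter that separates the category label from the string (default: a single space).

  • encoding – The encoding used to read the corpus files (default: 'utf8').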
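A short usage sketch of the constructor follows. It assumes NLTK is installed and that the reader exposes its data through a `tuples()` method. The temporary directory and the file name `questions.txt` are made up for illustration.

```python
import os
import tempfile

from nltk.corpus.reader import StringCategoryCorpusReader

# Build a throwaway corpus directory containing a single file.
with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "questions.txt")  # hypothetical file name
    with open(path, "w", encoding="utf8") as f:
        f.write("HUM:desc Who was Galileo ?\n")
        f.write("DESC:def What is an atom ?\n")

    reader = StringCategoryCorpusReader(
        root, ["questions.txt"], delimiter=" ", encoding="utf8"
    )
    # Materialize the view while the temporary files still exist.
    rows = list(reader.tuples())

print(rows)
```

For a corpus that separates the category with a tab or another character, pass it as `delimiter`.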