nltk.corpus.reader.indian module

Indian Language POS-Tagged Corpus. Collected by A Kumaran, Microsoft Research, India. Distributed with permission.

Contents:

class nltk.corpus.reader.indian.IndianCorpusReader

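Reader for the POS-tagged Indian-language corpus. Each line of a corpus file holds one sentence of whitespace-separated word_tag tokens; lines that begin with < are skipped.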

class nltk.corpus.reader.indian.IndianCorpusView

__init__(corpus_file, encoding, tagged, group_by_sent, tag_mapping_function=None)

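Create a new corpus view based on the file corpus_file. This description and the parameter list that follows are inherited from the base class StreamBackedCorpusView, so they use that class's parameter names (fileid, block_reader, startpos) rather than the ones in this signature; fileid corresponds to corpus_file. See the class documentation for more information.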

Parameters:

fileid – The path to the file that is read by this corpus view. fileid can either be a string or a PathPointer.
startpos – The file position at which the view will start reading. This can be used to skip over preface sections.
encoding – The unicode encoding that should be used to read the file’s contents. If no encoding is specified, then the file’s contents will be read as a non-unicode string (i.e., a str).

read_block(stream)

Read a block from the input stream.
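
As a rough illustration of what a block reader for this kind of file might do, here is a minimal standard-library sketch. It is not NLTK's implementation. It assumes each line holds one sentence of whitespace-separated word_tag tokens and that lines starting with < are markup to skip. The names split_tagged_token and read_sentence_block are hypothetical, and the tagged and group_by_sent flags only mirror the constructor arguments listed above.

```python
import io


def split_tagged_token(token, sep="_"):
    """Split 'word_TAG' at the last separator (hypothetical helper).

    Returns (word, tag); the tag is None when the separator is absent.
    """
    word, found, tag = token.rpartition(sep)
    if not found:
        return token, None
    return word, tag


def read_sentence_block(stream, tagged=True, group_by_sent=False):
    """Read one line from the stream and return its tokens (assumed format)."""
    line = stream.readline()
    # Assumption: lines beginning with '<' are markup, not sentence data.
    if line.startswith("<"):
        return []
    sent = [split_tagged_token(tok) for tok in line.split()]
    if not tagged:
        # Drop the tags and keep only the words.
        sent = [word for word, _tag in sent]
    # Either wrap the sentence in a list or return its tokens flat.
    return [sent] if group_by_sent else sent


if __name__ == "__main__":
    # Placeholder tokens, not real corpus data.
    sample = io.StringIO("<doc>\nghar_NN jaa_VB\n")
    print(read_sentence_block(sample))  # the markup line yields no tokens
    print(read_sentence_block(sample))  # the sentence line yields (word, tag) pairs
```

Calling the function repeatedly on the same stream walks through the file one line at a time, which is the general pattern a stream-backed view relies on when it asks for successive blocks.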
