nltk.chunk.ChunkScore
- class nltk.chunk.ChunkScore
Bases: object
A utility class for scoring chunk parsers.
ChunkScore can evaluate a chunk parser’s output, based on a number of statistics (precision, recall, f-measure, missed chunks, incorrect chunks). It can also combine the scores from the parsing of multiple texts; this makes it significantly easier to evaluate a chunk parser that operates one sentence at a time.
Texts are evaluated with the score method. The results of evaluation can be accessed via a number of accessor methods, such as precision and f_measure. A typical use of the ChunkScore class is:
>>> from nltk.chunk import ChunkScore
>>> chunkscore = ChunkScore()
>>> for correct in correct_sentences:
...     guess = chunkparser.parse(correct.leaves())
...     chunkscore.score(correct, guess)
>>> print('F Measure:', chunkscore.f_measure())
F Measure: 0.823
- Variables
kwargs –
Keyword arguments:
max_tp_examples: The maximum number of actual examples of true positives to record. This affects the correct member function: correct will not return more than this number of true positive examples. This does not affect any of the numerical metrics (precision, recall, or f-measure).
max_fp_examples: The maximum number of actual examples of false positives to record. This affects the incorrect member function and the guessed member function: incorrect will not return more than this number of examples, and guessed will not return more than this number of false positive examples. This does not affect any of the numerical metrics (precision, recall, or f-measure).
max_fn_examples: The maximum number of actual examples of false negatives to record. This affects the missed member function and the correct member function: missed will not return more than this number of examples, and correct will not return more than this number of false negative examples. This does not affect any of the numerical metrics (precision, recall, or f-measure).
chunk_label: A regular expression indicating which chunks should be compared. Defaults to '.*' (i.e., all chunks).
_tp – List of true positives
_fp – List of false positives
_fn – List of false negatives
_tp_num – Number of true positives
_fp_num – Number of false positives
_fn_num – Number of false negatives
- score(correct, guessed)
Given a correctly chunked sentence, score another chunked version of the same sentence.
- Parameters
correct (chunk structure) – The known-correct (“gold standard”) chunked sentence.
guessed (chunk structure) – The chunked sentence to be scored.
- accuracy()
Return the overall tag-based accuracy for all texts that have been scored by this ChunkScore, using the IOB (conll2000) tag encoding.
- Return type
float
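To illustrate what a tag-based accuracy measures, the sketch below compares two IOB tag sequences position by position. The helper `iob_accuracy` and the sample tags are hypothetical and are not part of NLTK; treating empty input as fully accurate is an assumption of this sketch.

```python
def iob_accuracy(correct_tags, guessed_tags):
    """Return the fraction of positions where the guessed IOB tag matches."""
    if len(correct_tags) != len(guessed_tags):
        raise ValueError("tag sequences must have the same length")
    if not correct_tags:
        # Assumption: with nothing to compare, report perfect accuracy.
        return 1.0
    matches = sum(1 for c, g in zip(correct_tags, guessed_tags) if c == g)
    return matches / len(correct_tags)


# Hypothetical tags for a five-token sentence.
correct = ["B-NP", "I-NP", "O", "B-NP", "O"]
guessed = ["B-NP", "I-NP", "O", "O", "O"]
accuracy = iob_accuracy(correct, guessed)
print(accuracy)
```

Because every token carries a tag, this figure counts the correctly tagged "O" tokens as well, so it can look high even when whole chunks are missed.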
- precision()
Return the overall precision for all texts that have been scored by this ChunkScore.
- Return type
float
- recall()
Return the overall recall for all texts that have been scored by this ChunkScore.
- Return type
float
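Precision and recall can be read off the true positive, false positive, and false negative counts listed under Variables. The sketch below uses the standard definitions; the function names are hypothetical, and returning 0.0 for an empty denominator is an assumption of this sketch.

```python
def precision_from_counts(tp_num, fp_num):
    """Share of guessed chunks that are correct: tp / (tp + fp)."""
    total_guessed = tp_num + fp_num
    return tp_num / total_guessed if total_guessed else 0.0


def recall_from_counts(tp_num, fn_num):
    """Share of correct chunks that were found: tp / (tp + fn)."""
    total_correct = tp_num + fn_num
    return tp_num / total_correct if total_correct else 0.0


# Hypothetical counts accumulated over several scored sentences.
print(precision_from_counts(8, 2))
print(recall_from_counts(8, 8))
```

Since both values depend only on the accumulated counts, scoring sentences one at a time gives the same result as scoring the whole text at once.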
- f_measure(alpha=0.5)
Return the overall F measure for all texts that have been scored by this ChunkScore.
- Parameters
alpha (float) – the relative weighting of precision and recall. Larger alpha biases the score towards the precision value, while smaller alpha biases the score towards the recall value. alpha should have a value in the range [0,1].
- Return type
float
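One way to realize the weighting described for alpha is a weighted harmonic mean of precision and recall, F = 1 / (alpha / P + (1 - alpha) / R). The sketch below is an illustration under that assumption; the function name `weighted_f_measure` is hypothetical, and returning 0.0 when either input is zero is a choice made here to avoid division by zero.

```python
def weighted_f_measure(precision, recall, alpha=0.5):
    """Weighted harmonic mean of precision and recall.

    alpha = 1 gives the precision, alpha = 0 gives the recall, and
    alpha = 0.5 gives the balanced F1 score 2PR / (P + R).
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha should be in the range [0, 1]")
    if precision == 0 or recall == 0:
        # Assumption: define the score as 0 when either input is 0.
        return 0.0
    return 1.0 / (alpha / precision + (1.0 - alpha) / recall)


print(weighted_f_measure(0.5, 0.5))             # balanced case
print(weighted_f_measure(0.8, 0.4, alpha=1.0))  # weighted fully towards precision
print(weighted_f_measure(0.8, 0.4, alpha=0.0))  # weighted fully towards recall
```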
- missed()
Return the chunks which were included in the correct chunk structures, but not in the guessed chunk structures, listed in input order.
- Return type
list of chunks
- incorrect()
Return the chunks which were included in the guessed chunk structures, but not in the correct chunk structures, listed in input order.
- Return type
list of chunks
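The relationship between missed and incorrect chunks can be pictured with set differences. In the sketch below each chunk is a hypothetical tuple of (sentence index, start position, label, words); this representation is chosen for illustration and is not claimed to be how ChunkScore stores chunks internally.

```python
# Hypothetical chunk sets for the sentence "the cat sat on the mat".
correct_chunks = {
    (0, 0, "NP", ("the", "cat")),
    (0, 4, "NP", ("the", "mat")),
}
guessed_chunks = {
    (0, 0, "NP", ("the", "cat")),
    (0, 3, "NP", ("on", "the", "mat")),
}

# Chunks present in both structures are true positives.
true_positives = correct_chunks & guessed_chunks

# In the correct structure but not the guessed one: false negatives.
missed_chunks = sorted(correct_chunks - guessed_chunks)

# In the guessed structure but not the correct one: false positives.
incorrect_chunks = sorted(guessed_chunks - correct_chunks)

print(missed_chunks)
print(incorrect_chunks)
```

Sorting by sentence index and start position is one simple way to keep the results in input order. Note that a chunk with the wrong boundaries counts twice: once as missed and once as incorrect.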