nltk.translate.meteor_score module

nltk.translate.meteor_score.exact_match(hypothesis: Iterable[str], reference: Iterable[str]) → Tuple[List[Tuple[int, int]], List[Tuple[int, str]], List[Tuple[int, str]]]

Matches exact words in the hypothesis and reference, and returns a word mapping based on the enumerated word ids of the hypothesis and reference.

Parameters
  • hypothesis (Iterable[str]) – pre-tokenized hypothesis

  • reference (Iterable[str]) – pre-tokenized reference

Returns

enumerated matched tuples, enumerated unmatched hypothesis tuples, enumerated unmatched reference tuples

Return type

Tuple[List[Tuple[int, int]], List[Tuple[int, str]], List[Tuple[int, str]]]
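
The shape of the returned triple can be illustrated with a small pure-Python sketch. This is not NLTK's implementation: the name `exact_match_sketch` is hypothetical, and the greedy left-to-right matching order is an assumption made for the example.

```python
from typing import Iterable, List, Tuple

Enumerated = List[Tuple[int, str]]


def exact_match_sketch(
    hypothesis: Iterable[str], reference: Iterable[str]
) -> Tuple[List[Tuple[int, int]], Enumerated, Enumerated]:
    """Greedy exact matching; an illustrative sketch, not NLTK's code."""
    hyp = list(enumerate(hypothesis))
    ref = list(enumerate(reference))
    matches: List[Tuple[int, int]] = []
    unmatched_hyp: Enumerated = []
    for h_idx, h_word in hyp:
        for pos, (r_idx, r_word) in enumerate(ref):
            if h_word == r_word:
                matches.append((h_idx, r_idx))
                del ref[pos]  # each reference word is used at most once
                break
        else:
            # no reference word equals this hypothesis word
            unmatched_hyp.append((h_idx, h_word))
    # whatever is left in ref was never matched
    return matches, unmatched_hyp, ref


matches, left_hyp, left_ref = exact_match_sketch(
    ["the", "cat", "sat"], ["the", "dog", "sat"]
)
print(matches)   # [(0, 0), (2, 2)]
print(left_hyp)  # [(1, 'cat')]
print(left_ref)  # [(1, 'dog')]
```

Each matched pair holds (hypothesis index, reference index), while the unmatched lists keep the original index next to the word.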

nltk.translate.meteor_score.stem_match(hypothesis: Iterable[str], reference: Iterable[str], stemmer: nltk.stem.api.StemmerI = <PorterStemmer>) → Tuple[List[Tuple[int, int]], List[Tuple[int, str]], List[Tuple[int, str]]]

Stems each word in the hypothesis and reference, matches the stems, and returns a word mapping between hypothesis and reference.

Parameters
  • hypothesis (Iterable[str]) – pre-tokenized hypothesis

  • reference (Iterable[str]) – pre-tokenized reference

  • stemmer (nltk.stem.api.StemmerI) – nltk.stem.api.StemmerI object (default PorterStemmer())

Returns

enumerated matched tuples, enumerated unmatched hypothesis tuples, enumerated unmatched reference tuples

Return type

Tuple[List[Tuple[int, int]], List[Tuple[int, str]], List[Tuple[int, str]]]
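
To show how stemming changes what counts as a match, here is a hedged pure-Python sketch. The `toy_stem` function is a hypothetical stand-in for a real stemmer such as PorterStemmer, and `stem_match_sketch` is an illustrative name, not NLTK's implementation.

```python
from typing import Callable, Iterable, List, Tuple


def toy_stem(word: str) -> str:
    """Hypothetical suffix stripper, far cruder than a Porter stemmer."""
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def stem_match_sketch(
    hypothesis: Iterable[str],
    reference: Iterable[str],
    stem: Callable[[str], str] = toy_stem,
) -> Tuple[List[Tuple[int, int]], List[Tuple[int, str]], List[Tuple[int, str]]]:
    """Match words whose stems are equal; a sketch under stated assumptions."""
    hyp = [(i, stem(w)) for i, w in enumerate(hypothesis)]
    ref = [(i, stem(w)) for i, w in enumerate(reference)]
    matches: List[Tuple[int, int]] = []
    unmatched_hyp: List[Tuple[int, str]] = []
    for h_idx, h_stem in hyp:
        # position of the first unused reference stem equal to this one
        hit = next(
            (pos for pos, (_, r_stem) in enumerate(ref) if r_stem == h_stem),
            None,
        )
        if hit is None:
            unmatched_hyp.append((h_idx, h_stem))
        else:
            matches.append((h_idx, ref.pop(hit)[0]))
    return matches, unmatched_hyp, ref


print(stem_match_sketch(["she", "walks", "dogs"], ["she", "walked", "cats"]))
# ([(0, 0), (1, 1)], [(2, 'dog')], [(2, 'cat')])
```

In this sketch the unmatched lists carry the stemmed forms, because matching operates on stems rather than on surface words.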

nltk.translate.meteor_score.wordnetsyn_match(hypothesis: Iterable[str], reference: Iterable[str], wordnet: nltk.corpus.reader.wordnet.WordNetCorpusReader = <WordNetCorpusReader in '/Users/sbird1/nltk_data/corpora/wordnet'>) → Tuple[List[Tuple[int, int]], List[Tuple[int, str]], List[Tuple[int, str]]]

Matches each word in the reference to a word in the hypothesis if any synonym of the hypothesis word is an exact match for the reference word.

Parameters
  • hypothesis (Iterable[str]) – pre-tokenized hypothesis

  • reference (Iterable[str]) – pre-tokenized reference

  • wordnet (nltk.corpus.reader.wordnet.WordNetCorpusReader) – a wordnet corpus reader object (default nltk.corpus.wordnet)

Returns

enumerated matched tuples, enumerated unmatched hypothesis tuples, enumerated unmatched reference tuples

Return type

Tuple[List[Tuple[int, int]], List[Tuple[int, str]], List[Tuple[int, str]]]
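
The synonym rule described above can be sketched without WordNet by substituting a small in-memory synonym table. Both the `SYNONYMS` table and the name `synonym_match_sketch` are hypothetical. A real call would look up synonyms through the wordnet corpus reader instead.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Hypothetical synonym table standing in for WordNet lookups.
SYNONYMS: Dict[str, Set[str]] = {
    "car": {"automobile", "auto"},
    "quick": {"fast", "speedy"},
}


def synonym_match_sketch(
    hypothesis: Iterable[str],
    reference: Iterable[str],
    synonyms: Dict[str, Set[str]],
) -> Tuple[List[Tuple[int, int]], List[Tuple[int, str]], List[Tuple[int, str]]]:
    """Match a hypothesis word to a reference word found among its synonyms."""
    hyp = list(enumerate(hypothesis))
    ref = list(enumerate(reference))
    matches: List[Tuple[int, int]] = []
    unmatched_hyp: List[Tuple[int, str]] = []
    for h_idx, h_word in hyp:
        # the word itself always counts as one of its own synonyms
        candidates = synonyms.get(h_word, set()) | {h_word}
        hit = next(
            (pos for pos, (_, r_word) in enumerate(ref) if r_word in candidates),
            None,
        )
        if hit is None:
            unmatched_hyp.append((h_idx, h_word))
        else:
            matches.append((h_idx, ref.pop(hit)[0]))
    return matches, unmatched_hyp, ref


print(
    synonym_match_sketch(
        ["a", "quick", "car", "stops"],
        ["a", "fast", "automobile", "halts"],
        SYNONYMS,
    )
)
# ([(0, 0), (1, 1), (2, 2)], [(3, 'stops')], [(3, 'halts')])
```

Note the direction of the lookup: synonyms are generated for the hypothesis word and compared against the reference word, as the description states.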

nltk.translate.meteor_score.align_words(hypothesis: Iterable[str], reference: Iterable[str], stemmer: nltk.stem.api.StemmerI = <PorterStemmer>, wordnet: nltk.corpus.reader.wordnet.WordNetCorpusReader = <WordNetCorpusReader in '/Users/sbird1/nltk_data/corpora/wordnet'>) → Tuple[List[Tuple[int, int]], List[Tuple[int, str]], List[Tuple[int, str]]]

Aligns words in the hypothesis to words in the reference by sequentially applying exact match, stemmed match and WordNet-based synonym match. If there are multiple possible matches, the one with the fewest crossings is chosen.

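Parameters
  • hypothesis (Iterable[str]) – pre-tokenized hypothesis

  • reference (Iterable[str]) – pre-tokenized reference

  • stemmer (nltk.stem.api.StemmerI) – nltk.stem.api.StemmerI object (default PorterStemmer())

  • wordnet (nltk.corpus.reader.wordnet.WordNetCorpusReader) – a wordnet corpus reader object (default nltk.corpus.wordnet)

Returns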

sorted list of matched tuples, unmatched hypothesis list, unmatched reference list

Return type

Tuple[List[Tuple[int, int]], List[Tuple[int, str]], List[Tuple[int, str]]]
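
One way to read "crossing" is that two matched pairs cross when their order in the hypothesis is the reverse of their order in the reference. The sketch below counts crossings under that assumed definition and picks the candidate alignment with the fewest. The function names are hypothetical, and this is an illustration of the criterion rather than NLTK's code.

```python
from itertools import combinations
from typing import List, Sequence, Tuple

Alignment = List[Tuple[int, int]]


def count_crossings(matches: Alignment) -> int:
    """Count pairs of matches whose hypothesis and reference orders disagree."""
    return sum(
        1
        for (h1, r1), (h2, r2) in combinations(matches, 2)
        if (h1 - h2) * (r1 - r2) < 0
    )


def least_crossing(candidates: Sequence[Alignment]) -> Alignment:
    """Pick the candidate alignment with the fewest crossings."""
    return min(candidates, key=count_crossings)


monotone = [(0, 0), (1, 1), (2, 2)]
swapped = [(0, 1), (1, 0), (2, 2)]
print(count_crossings(monotone))  # 0
print(count_crossings(swapped))   # 1
print(least_crossing([swapped, monotone]))  # [(0, 0), (1, 1), (2, 2)]
```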

nltk.translate.meteor_score.single_meteor_score(reference: Iterable[str], hypothesis: Iterable[str], preprocess: Callable[[str], str] = <method 'lower' of 'str' objects>, stemmer: nltk.stem.api.StemmerI = <PorterStemmer>, wordnet: nltk.corpus.reader.wordnet.WordNetCorpusReader = <WordNetCorpusReader in '/Users/sbird1/nltk_data/corpora/wordnet'>, alpha: float = 0.9, beta: float = 3.0, gamma: float = 0.5) → float

Calculates the METEOR score for a single hypothesis and reference, as per “Meteor: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments” by Alon Lavie and Abhaya Agarwal, in Proceedings of ACL. https://www.cs.cmu.edu/~alavie/METEOR/pdf/Lavie-Agarwal-2007-METEOR.pdf

>>> hypothesis1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'which', 'ensures', 'that', 'the', 'military', 'always', 'obeys', 'the', 'commands', 'of', 'the', 'party']
>>> reference1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'that', 'ensures', 'that', 'the', 'military', 'will', 'forever', 'heed', 'Party', 'commands']
>>> round(single_meteor_score(reference1, hypothesis1),4)
0.7398

If no words match during alignment, the method returns a score of 0. Returning zero instead of raising a division-by-zero error is safe, because no match usually implies a bad translation.

>>> round(single_meteor_score(['this', 'is', 'a', 'cat'], ['non', 'matching', 'hypothesis']),4)
0.0

Parameters
  • reference (Iterable[str]) – pre-tokenized reference

  • hypothesis (Iterable[str]) – pre-tokenized hypothesis

  • preprocess (Callable[[str], str]) – preprocessing function (default str.lower)

  • stemmer (nltk.stem.api.StemmerI) – nltk.stem.api.StemmerI object (default PorterStemmer())

  • wordnet (nltk.corpus.reader.wordnet.WordNetCorpusReader) – a wordnet corpus reader object (default nltk.corpus.wordnet)

  • alpha (float) – parameter for controlling relative weights of precision and recall.

  • beta (float) – parameter for controlling the shape of the penalty as a function of fragmentation.

  • gamma (float) – relative weight assigned to fragmentation penalty.

Returns

The sentence-level METEOR score.

Return type

float
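
The roles of alpha, beta and gamma can be sketched from the formulation in the Lavie and Agarwal paper cited above: a weighted harmonic mean of precision and recall, reduced by a fragmentation penalty. The names `count_chunks` and `meteor_from_matches` are illustrative, not part of NLTK, and the sketch assumes preprocessing and alignment have already produced a list of (hypothesis index, reference index) matches.

```python
from typing import List, Tuple


def count_chunks(matches: List[Tuple[int, int]]) -> int:
    """Count runs of matches that are adjacent in both sentences."""
    if not matches:
        return 0
    ordered = sorted(matches)
    chunks = 1
    for (h_prev, r_prev), (h_next, r_next) in zip(ordered, ordered[1:]):
        # a new chunk starts whenever adjacency breaks on either side
        if h_next != h_prev + 1 or r_next != r_prev + 1:
            chunks += 1
    return chunks


def meteor_from_matches(
    matches: List[Tuple[int, int]],
    hyp_len: int,
    ref_len: int,
    alpha: float = 0.9,
    beta: float = 3.0,
    gamma: float = 0.5,
) -> float:
    """Sentence-level score from an existing alignment (illustrative sketch)."""
    m = len(matches)
    if m == 0:
        return 0.0  # no matches: score 0 rather than dividing by zero
    precision = m / hyp_len
    recall = m / ref_len
    # alpha weights precision against recall in the harmonic mean
    fmean = precision * recall / (alpha * precision + (1 - alpha) * recall)
    # fragmentation: fewer, longer chunks mean a better-ordered match
    frag = count_chunks(matches) / m
    # gamma scales the penalty, beta shapes how fast it grows
    penalty = gamma * frag ** beta
    return (1 - penalty) * fmean


print(meteor_from_matches([(0, 0), (1, 1), (2, 2), (3, 3)], 4, 4))  # 0.9921875
print(meteor_from_matches([], 3, 4))  # 0.0
```

Even a perfect four-word match scores slightly below 1 here, because a single chunk over four matches still gives a fragmentation of 0.25 and therefore a small penalty.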

nltk.translate.meteor_score.meteor_score(references: Iterable[Iterable[str]], hypothesis: Iterable[str], preprocess: Callable[[str], str] = <method 'lower' of 'str' objects>, stemmer: nltk.stem.api.StemmerI = <PorterStemmer>, wordnet: nltk.corpus.reader.wordnet.WordNetCorpusReader = <WordNetCorpusReader in '/Users/sbird1/nltk_data/corpora/wordnet'>, alpha: float = 0.9, beta: float = 3.0, gamma: float = 0.5) → float

Calculates the METEOR score for a hypothesis with multiple references, as described in “Meteor: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments” by Alon Lavie and Abhaya Agarwal, in Proceedings of ACL. https://www.cs.cmu.edu/~alavie/METEOR/pdf/Lavie-Agarwal-2007-METEOR.pdf

When multiple references are given, the best score is chosen. This method calls single_meteor_score for each reference and returns the highest score obtained for the given hypothesis.

>>> hypothesis1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'which', 'ensures', 'that', 'the', 'military', 'always', 'obeys', 'the', 'commands', 'of', 'the', 'party']
>>> hypothesis2 = ['It', 'is', 'to', 'insure', 'the', 'troops', 'forever', 'hearing', 'the', 'activity', 'guidebook', 'that', 'party', 'direct']
>>> reference1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'that', 'ensures', 'that', 'the', 'military', 'will', 'forever', 'heed', 'Party', 'commands']
>>> reference2 = ['It', 'is', 'the', 'guiding', 'principle', 'which', 'guarantees', 'the', 'military', 'forces', 'always', 'being', 'under', 'the', 'command', 'of', 'the', 'Party']
>>> reference3 = ['It', 'is', 'the', 'practical', 'guide', 'for', 'the', 'army', 'always', 'to', 'heed', 'the', 'directions', 'of', 'the', 'party']
>>> round(meteor_score([reference1, reference2, reference3], hypothesis1),4)
0.7398

If no words match during alignment, the method returns a score of 0. Returning zero instead of raising a division-by-zero error is safe, because no match usually implies a bad translation.

>>> round(meteor_score([['this', 'is', 'a', 'cat']], ['non', 'matching', 'hypothesis']),4)
0.0

Parameters
  • references (Iterable[Iterable[str]]) – pre-tokenized reference sentences

  • hypothesis (Iterable[str]) – a pre-tokenized hypothesis sentence

  • preprocess (Callable[[str], str]) – preprocessing function (default str.lower)

  • stemmer (nltk.stem.api.StemmerI) – nltk.stem.api.StemmerI object (default PorterStemmer())

  • wordnet (nltk.corpus.reader.wordnet.WordNetCorpusReader) – a wordnet corpus reader object (default nltk.corpus.wordnet)

  • alpha (float) – parameter for controlling relative weights of precision and recall.

  • beta (float) – parameter for controlling the shape of the penalty as a function of fragmentation.

  • gamma (float) – relative weight assigned to fragmentation penalty.

Returns

The sentence-level METEOR score.

Return type

float
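
The best-of-references behaviour described for meteor_score amounts to taking a maximum over per-reference scores. The sketch below shows that pattern with a hypothetical `toy_overlap` scorer standing in for single_meteor_score, so it runs without NLTK or WordNet data. `best_reference_score` is likewise an illustrative name.

```python
from typing import Callable, Iterable, List


def toy_overlap(reference: Iterable[str], hypothesis: Iterable[str]) -> float:
    """Hypothetical scorer: fraction of hypothesis words found in the reference."""
    hyp: List[str] = list(hypothesis)
    if not hyp:
        return 0.0
    ref_words = set(reference)
    return sum(word in ref_words for word in hyp) / len(hyp)


def best_reference_score(
    references: Iterable[Iterable[str]],
    hypothesis: Iterable[str],
    score_fn: Callable[[Iterable[str], Iterable[str]], float],
) -> float:
    """Score the hypothesis against every reference and keep the best."""
    hyp = list(hypothesis)  # materialize once so it can be reused per reference
    return max(score_fn(reference, hyp) for reference in references)


refs = [["a", "cat"], ["the", "cat", "sat"]]
print(best_reference_score(refs, ["the", "cat"], toy_overlap))  # 1.0
```

In this sketch an empty list of references makes `max` raise a ValueError, so callers would need to supply at least one reference.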