nltk.sentiment.vader module

If you use the VADER sentiment analysis tools, please cite:

Hutto, C.J. & Gilbert, E.E. (2014). VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text. Eighth International Conference on Weblogs and Social Media (ICWSM-14). Ann Arbor, MI, June 2014.

class nltk.sentiment.vader.SentiText[source]

Bases: object

Identify sentiment-relevant string-level properties of input text.

__init__(text, punc_list, regex_remove_punctuation)[source]
allcap_differential(words)[source]

Check whether just some words in the input are ALL CAPS

Parameters

words (list) – The words to inspect

Returns

True if some but not all items in words are ALL CAPS
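
For example, a minimal sketch that builds a SentiText with the punctuation list and regex supplied by VaderConstants (which is what the constructor above expects):

>>> from nltk.sentiment.vader import SentiText, VaderConstants
>>> const = VaderConstants()
>>> st = SentiText("VADER is SMART, handsome, and FUNNY.",
...                const.PUNC_LIST, const.REGEX_REMOVE_PUNCTUATION)
>>> st.allcap_differential("VADER is SMART handsome and FUNNY".split())
True
>>> st.allcap_differential("GREAT JOB".split())  # every word is ALL CAPS, so no differential
False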

class nltk.sentiment.vader.SentimentIntensityAnalyzer[source]

Bases: object

Give a sentiment intensity score to sentences.

__init__(lexicon_file='sentiment/vader_lexicon.zip/vader_lexicon/vader_lexicon.txt')[source]
make_lex_dict()[source]

Convert lexicon file to a dictionary
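
For illustration, the resulting mapping can be inspected directly; this sketch assumes the dictionary is stored on the analyzer's lexicon attribute, as in recent NLTK releases, and the ratings mentioned are only approximate:

>>> from nltk.sentiment.vader import SentimentIntensityAnalyzer
>>> sia = SentimentIntensityAnalyzer()   # make_lex_dict() is called by __init__
>>> sia.lexicon["good"]      # token -> mean valence rating, roughly 1.9
>>> sia.lexicon["terrible"]  # negative tokens carry negative ratings, roughly -2.1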

polarity_scores(text)[source]

Return sentiment scores for the input text as a dict with the keys neg, neu, pos, and compound. Positive compound values indicate positive valence; negative values indicate negative valence.

Note

Hashtags are not taken into consideration (e.g. #BAD is neutral). If you want the hashtag text to be processed as well, we recommend preprocessing your data to remove the #, after which the hashtag text can be matched as if it were a normal word in the sentence.
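
A minimal usage sketch, including the hashtag preprocessing suggested above; the scores shown are indicative and may vary slightly with the lexicon version:

>>> # requires the vader_lexicon resource: nltk.download('vader_lexicon')
>>> from nltk.sentiment.vader import SentimentIntensityAnalyzer
>>> sia = SentimentIntensityAnalyzer()
>>> sia.polarity_scores("VADER is smart, handsome, and funny.")
{'neg': 0.0, 'neu': 0.254, 'pos': 0.746, 'compound': 0.8316}
>>> text = "That movie was #awful"
>>> sia.polarity_scores(text)["compound"]                        # '#awful' stays neutral
0.0
>>> sia.polarity_scores(text.replace("#", ""))["compound"] < 0   # 'awful' now matches the lexicon
True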

score_valence(sentiments, text)[source]
sentiment_valence(valence, sentitext, item, i, sentiments)[source]
class nltk.sentiment.vader.VaderConstants[source]

Bases: object

A class to keep the Vader lists and constants.

BOOSTER_DICT = {'absolutely': 0.293, 'almost': -0.293, 'amazingly': 0.293, 'awfully': 0.293, 'barely': -0.293, 'completely': 0.293, 'considerably': 0.293, 'decidedly': 0.293, 'deeply': 0.293, 'effing': 0.293, 'enormously': 0.293, 'entirely': 0.293, 'especially': 0.293, 'exceptionally': 0.293, 'extremely': 0.293, 'fabulously': 0.293, 'flippin': 0.293, 'flipping': 0.293, 'frickin': 0.293, 'fricking': 0.293, 'friggin': 0.293, 'frigging': 0.293, 'fucking': 0.293, 'fully': 0.293, 'greatly': 0.293, 'hardly': -0.293, 'hella': 0.293, 'highly': 0.293, 'hugely': 0.293, 'incredibly': 0.293, 'intensely': 0.293, 'just enough': -0.293, 'kind of': -0.293, 'kind-of': -0.293, 'kinda': -0.293, 'kindof': -0.293, 'less': -0.293, 'little': -0.293, 'majorly': 0.293, 'marginally': -0.293, 'more': 0.293, 'most': 0.293, 'occasionally': -0.293, 'particularly': 0.293, 'partly': -0.293, 'purely': 0.293, 'quite': 0.293, 'really': 0.293, 'remarkably': 0.293, 'scarcely': -0.293, 'slightly': -0.293, 'so': 0.293, 'somewhat': -0.293, 'sort of': -0.293, 'sort-of': -0.293, 'sorta': -0.293, 'sortof': -0.293, 'substantially': 0.293, 'thoroughly': 0.293, 'totally': 0.293, 'tremendously': 0.293, 'uber': 0.293, 'unbelievably': 0.293, 'unusually': 0.293, 'utterly': 0.293, 'very': 0.293}
B_DECR = -0.293
B_INCR = 0.293
C_INCR = 0.733
NEGATE = {"ain't", 'aint', "aren't", 'arent', "can't", 'cannot', 'cant', "couldn't", 'couldnt', "daren't", 'darent', 'despite', "didn't", 'didnt', "doesn't", 'doesnt', "don't", 'dont', "hadn't", 'hadnt', "hasn't", 'hasnt', "haven't", 'havent', "isn't", 'isnt', "mightn't", 'mightnt', "mustn't", 'mustnt', "needn't", 'neednt', 'neither', 'never', 'none', 'nope', 'nor', 'not', 'nothing', 'nowhere', "oughtn't", 'oughtnt', 'rarely', 'seldom', "shan't", 'shant', "shouldn't", 'shouldnt', 'uh-uh', 'uhuh', "wasn't", 'wasnt', "weren't", 'werent', 'without', "won't", 'wont', "wouldn't", 'wouldnt'}
N_SCALAR = -0.74
PUNC_LIST = ['.', '!', '?', ',', ';', ':', '-', "'", '"', '!!', '!!!', '??', '???', '?!?', '!?!', '?!?!', '!?!?']
REGEX_REMOVE_PUNCTUATION = re.compile('[!"\\#\\$%\\&\'\\(\\)\\*\\+,\\-\\./:;<=>\\?@\\[\\\\\\]\\^_`\\{\\|\\}\\~]')
SPECIAL_CASE_IDIOMS = {'bad ass': 1.5, 'cut the mustard': 2, 'hand to mouth': -2, 'kiss of death': -1.5, 'the bomb': 3, 'the shit': 3, 'yeah right': -2}
__init__()[source]
negated(input_words, include_nt=True)[source]

Determine if input contains negation words
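
For example, a quick sketch using the NEGATE set listed above:

>>> from nltk.sentiment.vader import VaderConstants
>>> const = VaderConstants()
>>> const.negated("the plot was not good".split())
True
>>> const.negated("the plot was good".split())
False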

normalize(score, alpha=15)[source]

Normalize the score to be between -1 and 1 using an alpha that approximates the max expected value
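
A quick sketch of the behaviour, assuming the standard VADER rescaling score / sqrt(score**2 + alpha) with the default alpha of 15:

>>> import math
>>> from nltk.sentiment.vader import VaderConstants
>>> const = VaderConstants()
>>> round(const.normalize(2.5), 4)
0.5423
>>> round(2.5 / math.sqrt(2.5 ** 2 + 15), 4)   # same value computed by hand
0.5423
>>> round(const.normalize(20.0), 4)            # large raw sums saturate toward 1
0.9818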

scalar_inc_dec(word, valence, is_cap_diff)[source]

Check if the preceding words increase, decrease, or negate/nullify the valence
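
A short sketch of the effect, using the B_INCR, B_DECR, and C_INCR constants listed above:

>>> from nltk.sentiment.vader import VaderConstants
>>> const = VaderConstants()
>>> const.scalar_inc_dec("very", 2.0, False)      # booster word: +B_INCR
0.293
>>> const.scalar_inc_dec("very", -2.0, False)     # sign follows the valence
-0.293
>>> const.scalar_inc_dec("slightly", 2.0, False)  # dampener word: B_DECR
-0.293
>>> round(const.scalar_inc_dec("VERY", 2.0, True), 3)  # ALL CAPS in mixed-case text adds C_INCR
1.026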