Sentiment analysis is a part of Natural Language Processing (NLP), which in turn is a part of Artificial Intelligence (AI). It is a growing area in today's world. Sentiment analysis automatically extracts opinions, emotions and sentiments from text, which lets us track attitudes and feelings on the web. People write blog posts, comments, reviews and tweets about all sorts of topics. We can track products, brands and people, for example, and determine whether they are viewed positively, negatively or neutrally on the web.
We can analyse facts and opinions:
Fact: The painting was more expensive than the Monet.
Opinion: Don't like Monet, Pollock is the better artist.
But it is not always easy to differentiate between facts and opinions.
Sentiment analysis allows businesses to track negative rants, the perception of new products, and the reputation of a product or brand name in society. It also allows individuals to get an opinion on something, for example from reviews.
The problem has several dimensions:
- How does a machine define subjectivity and sentiment?
- How does a machine analyse polarity (negative or positive)?
- How does a machine deal with subjective word senses?
- How does a machine assign an opinion rating?
- How does a machine know about sentiment intensity?
- community
- other person
- user/author
- document (e.g. what makes people happy)
- sentence or clause
- aspect
- Machine learning - supervised and unsupervised learning
- Lexicon-based - dictionary and corpus
- Discourse analysis
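As a rough illustration of the lexicon-based approach, here is a minimal sketch that scores a sentence against a tiny hand-made dictionary. The word lists, the weights and the function names (`score_sentence`, `polarity`) are assumptions invented for this example and do not come from any published lexicon or library.

```python
import re

# Toy opinion lexicon: word -> polarity weight (made-up values for illustration).
LEXICON = {"good": 1.0, "great": 2.0, "like": 1.0, "better": 1.0,
           "bad": -1.0, "terrible": -2.0, "hate": -2.0}
# Intensifiers scale the next opinion word; negators flip its sign.
INTENSIFIERS = {"very": 1.5, "really": 1.5, "slightly": 0.5}
NEGATORS = {"not", "never", "no", "don't"}


def score_sentence(text):
    """Return a signed score: > 0 positive, < 0 negative, 0 neutral."""
    tokens = re.findall(r"[a-z']+", text.lower())
    total = 0.0
    scale = 1.0  # pending intensifier multiplier
    sign = 1.0   # pending negation
    for tok in tokens:
        if tok in NEGATORS:
            sign = -1.0
        elif tok in INTENSIFIERS:
            scale *= INTENSIFIERS[tok]
        elif tok in LEXICON:
            total += sign * scale * LEXICON[tok]
            scale, sign = 1.0, 1.0  # modifiers apply to one opinion word only
    return total


def polarity(text):
    """Map the signed score to a coarse polarity label."""
    score = score_sentence(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

With this toy lexicon, `polarity("The film was not good")` comes out negative, while "Don't like Monet, Pollock is the better artist." scores as neutral because the negative opinion about Monet and the positive one about Pollock cancel out. That is one reason why analysis at the aspect level, rather than the sentence level, can matter.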
Features:
▫words (bag-of-words)
▫n-grams
▫parts-of-speech (e.g. adjectives and adjective-adverb combinations)
▫opinion words (lexicon-based: dictionary or corpus)
▫valence intensifiers and shifters (for negation); modal verbs
▫syntactic dependency
•Feature selection based on
▫frequency
▫information gain
▫odds ratio (for binary-class models)
▫mutual information
•Feature weighting
▫term presence or term frequency
▫inverse document frequency
▫term position: e.g. title, first and last sentence(s)
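To make the feature and weighting options above a little more concrete, here is a minimal sketch of bag-of-words and n-gram extraction with term presence, term frequency and inverse document frequency. It uses plain whitespace tokenization and one common TF-IDF variant, tf * log(N / df); the function names and the three sample documents are assumptions made up for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Very naive tokenizer: lowercase and split on whitespace."""
    return text.lower().split()


def ngrams(tokens, n):
    """Return the n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(text, max_n=2, presence=False):
    """Bag of 1..max_n-grams with term frequencies (or 0/1 term presence)."""
    tokens = tokenize(text)
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(ngrams(tokens, n))
    if presence:
        return Counter(dict.fromkeys(counts, 1))
    return counts


def tfidf(docs, max_n=2):
    """Weight each feature by term frequency * inverse document frequency."""
    bags = [extract_features(d, max_n) for d in docs]
    df = Counter()
    for bag in bags:
        df.update(bag.keys())  # document frequency: one count per document
    n_docs = len(docs)
    return [{term: tf * math.log(n_docs / df[term]) for term, tf in bag.items()}
            for bag in bags]


# Hypothetical mini-corpus, for illustration only.
docs = ["great phone great screen", "terrible phone", "great battery"]
weights = tfidf(docs)
```

In this sketch a term that occurs in every document gets a weight of zero, and rarer features such as the bigram "great phone" get a higher weight than the more widespread unigram "phone".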
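Of the feature selection criteria listed above, information gain is perhaps the easiest to show in a few lines. The sketch below ranks unigrams by how much knowing whether a word is present reduces the entropy of the class labels. The function names and the four labelled sample documents are assumptions for illustration only.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(docs, labels, term):
    """Entropy reduction from knowing whether `term` occurs in a document."""
    with_term = [y for d, y in zip(docs, labels) if term in d.lower().split()]
    without = [y for d, y in zip(docs, labels) if term not in d.lower().split()]
    gain = entropy(labels)
    for subset in (with_term, without):
        if subset:  # skip empty partitions
            gain -= len(subset) / len(labels) * entropy(subset)
    return gain


def select_features(docs, labels, k):
    """Return the k unigrams with the highest information gain."""
    vocab = {tok for d in docs for tok in d.lower().split()}
    # Sort by descending gain, breaking ties alphabetically.
    ranked = sorted(vocab, key=lambda t: (-information_gain(docs, labels, t), t))
    return ranked[:k]


# Hypothetical labelled mini-corpus, for illustration only.
docs = ["great phone", "great screen", "terrible phone", "terrible battery"]
labels = ["pos", "pos", "neg", "neg"]
top = select_features(docs, labels, 2)
```

On this toy data "great" and "terrible" separate the two classes perfectly and are selected, whereas "phone" appears equally often in both classes and carries no information about the label.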
Some tools:
- LingPipe: linguistic processing of text, including entity extraction, clustering, classification, etc. http://alias-i.com/lingpipe/
- OpenNLP: the most common NLP tasks, such as POS tagging, named entity extraction, chunking and coreference resolution. http://opennlp.apache.org/
- Stanford parser and POS tagger: http://nlp.stanford.edu/software/tagger.shtml
- NLTK: a toolkit for teaching and researching classification, clustering and parsing. http://www.nltk.org/
- OpinionFinder: identifies subjective sentences, the source of the subjectivity, and the words included in phrases expressing positive or negative sentiment. http://code.google.com/p/opinionfinder/
- Basic sentiment tokenizer plus some tools, by Christopher Potts: http://sentiment.christopherpotts.net
- Twitter NLP and POS tagging: http://www.ark.cs.cmu.edu/TweetNLP/