Chapter 11: Opinion Mining
Bing Liu
Department of Computer Science
University of Illinois at Chicago
[email protected]

Bing Liu, UIC, Web Data Mining

Introduction – facts and opinions
- Two main types of textual information on the Web: facts and opinions.
- Current search engines search for facts (and assume they are true).
  - Facts can be expressed with topic keywords.
- Search engines do not search for opinions.
  - Opinions are hard to express with a few keywords.
    - What do people think of Motorola cell phones?
  - The current search ranking strategy is not appropriate for opinion retrieval/search.

Introduction – user generated content
- Word-of-mouth on the Web
  - One can express personal experiences and opinions on almost anything, at review sites, forums, discussion groups, blogs ... (called user-generated content).
  - They contain valuable information.
  - Web/global scale: no longer limited to one’s circle of friends.
- Our interest: to mine opinions expressed in user-generated content.
  - An intellectually very challenging problem.
  - Practically very useful.

Introduction – Applications
- Businesses and organizations: product and service benchmarking, market intelligence.
  - Businesses spend a huge amount of money to find consumer sentiments and opinions.
    - Consultants, surveys, focus groups, etc.
- Individuals: interested in others’ opinions when
  - purchasing a product or using a service,
  - finding opinions on political topics.
- Ad placement: placing ads in user-generated content.
  - Place an ad when one praises a product.
  - Place an ad from a competitor if one criticizes a product.
- Opinion retrieval/search: providing general search for opinions.

Two types of evaluation
- Direct opinions: sentiment expressions on some objects, e.g., products, events, topics, persons.
  - E.g., “the picture quality of this camera is great”
  - Subjective
- Comparisons: relations expressing similarities or differences of more than one object, usually expressing an ordering.
  - E.g., “car x is cheaper than car y.”
  - Objective or subjective.

Opinion search (Liu, Web Data Mining book, 2007)
- Can you search for opinions as conveniently as general Web search?
- Whenever you need to make a decision, you may want some opinions from others.
  - Wouldn’t it be nice if you could find them on a search system instantly, by issuing queries such as
    - Opinions: “Motorola cell phones”
    - Comparisons: “Motorola vs. Nokia”
- Cannot be done yet! (but could be soon …)

Typical opinion search queries
- Find the opinion of a person or organization (opinion holder) on a particular object or a feature of the object.
  - E.g., what is Bill Clinton’s opinion on abortion?
- Find positive and/or negative opinions on a particular object (or some features of the object), e.g.,
  - customer opinions on a digital camera,
  - public opinions on a political topic.
- Find how opinions on an object change over time.
- How does object A compare with object B?
  - Gmail vs. Hotmail

Find the opinion of a person on X
- In some cases, a general search engine can handle it, i.e., using suitable keywords.
  - Bill Clinton’s opinion on abortion
- Reason:
  - One person or organization usually has only one opinion on a particular topic.
  - The opinion is likely contained in a single document.
  - Thus, a good keyword query may be sufficient.

Find opinions on an object
We use product reviews as an example:
- Searching for opinions in product reviews is different from general Web search.
  - E.g., search for opinions on “Motorola RAZR V3”
- General Web search (for a fact): rank pages according to some authority and relevance scores.
  - The user views the first page (if the search is perfect).
  - One fact = Multiple facts
- Opinion search: ranking is desirable; however, reading only the top-ranked review is not appropriate because it is only the opinion of one person.
  - One opinion ≠ Multiple opinions

Search opinions (contd)
- Ranking:
  - Produce two rankings
    - Positive opinions and negative opinions
    - Some kind of summary of both, e.g., # of each
  - Or, one ranking, but
    - the top (say 30) reviews should reflect the natural distribution of all reviews (assume that there is no spam), i.e., with the right balance of positive and negative reviews.
- Questions:
  - Should the user read all the top reviews? OR
  - Should the system prepare a summary of the reviews?

Reviews are similar to surveys
- Reviews can be regarded as traditional surveys.
  - In a traditional survey, returned survey forms are treated as raw data.
  - Analysis is performed to summarize the survey results.
    - E.g., % against or for a particular issue, etc.
- In opinion search,
  - Can a summary be produced?
  - What should the summary be?

Roadmap
- Opinion mining – the abstraction
- Document level sentiment classification
- Sentence level sentiment analysis
- Feature-based opinion mining and summarization
- Comparative sentence and relation extraction
- Summary

Opinion mining – the abstraction (Hu and Liu, KDD-04; Liu, Web Data Mining book 2007)
- Basic components of an opinion
  - Opinion holder: the person or organization that holds a specific opinion on a particular object.
  - Object: that on which an opinion is expressed.
  - Opinion: a view, attitude, or appraisal on an object from an opinion holder.
- Objectives of opinion mining: many ...
- Let us abstract the problem
  - put existing research into a common framework
- We use consumer reviews of products to
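The "one ranking" option on the "Search opinions (contd)" slide and the survey-style summary on the "Reviews are similar to surveys" slide can be illustrated with a small sketch. This is not a method from the slides, only one possible reading of them. It assumes each review already carries a sentiment label (how labels are obtained is not covered here), and the function names `summarize` and `balanced_top_k`, the `(review_id, label)` input shape, and the largest-remainder rounding rule are all hypothetical choices made for the example.

```python
from collections import Counter


def summarize(labels):
    """Survey-style summary: count and percentage of each sentiment label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: (n, 100.0 * n / total) for label, n in counts.items()}


def balanced_top_k(ranked_reviews, k):
    """Pick k reviews so each label's share matches its share overall.

    ranked_reviews: list of (review_id, label) pairs, best-ranked first.
    The labels are assumed to be given; this function does not classify.
    """
    total = len(ranked_reviews)
    if total == 0 or k <= 0:
        return []
    k = min(k, total)
    counts = Counter(label for _, label in ranked_reviews)

    # Largest-remainder rounding (an assumption of this sketch): floor each
    # label's exact quota, then give leftover slots to the largest fractions.
    exact = {label: k * n / total for label, n in counts.items()}
    quota = {label: int(x) for label, x in exact.items()}
    leftover = k - sum(quota.values())
    by_fraction = sorted(exact, key=lambda lab: exact[lab] - quota[lab], reverse=True)
    for label in by_fraction[:leftover]:
        quota[label] += 1

    # Walk the ranking in order, keeping a review while its label has quota.
    picked = []
    for review_id, label in ranked_reviews:
        if quota[label] > 0:
            picked.append((review_id, label))
            quota[label] -= 1
    return picked


if __name__ == "__main__":
    # Toy data: positive reviews dominate the top of the ranking.
    ranked = [
        ("r1", "positive"), ("r2", "positive"), ("r3", "positive"),
        ("r4", "positive"), ("r5", "positive"), ("r6", "negative"),
        ("r7", "positive"), ("r8", "negative"), ("r9", "negative"),
        ("r10", "negative"),
    ]
    print(summarize([label for _, label in ranked]))
    print(balanced_top_k(ranked, 5))
```

In the toy data a plain top-5 cut would show only positive reviews, while the balanced selection keeps the 60/40 split of the full set. Whether such a list or the percentage summary serves the user better is exactly the question the slides leave open.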