SMU CSE 8331 - Web Usage Mining and Pattern Discovery


Web Usage Mining and Pattern Discovery: A Survey Paper
By Naresh Barsagade
CSE 8331
December 8, 2003

1. Introduction

Web technology is not evolving in comfortable, incremental steps; it is turbulent, erratic, and often rather uncomfortable. It is estimated that the Internet, arguably the most important part of the new technological environment, has expanded by about 2000% and is doubling in size every six to ten months. In recent years, advances in computer and web technologies and the decrease in their cost have expanded the means available to collect and store data. As a consequence, the amount of information (meaningful data) stored has been increasing at a very fast pace. Traditional information analysis techniques are useful for creating informative reports from data and for confirming predefined hypotheses about the data. However, the huge volumes of data being collected create new challenges for such techniques as organizations look for ways to use the stored information to gain an edge over competitors. It is reasonable to believe that data collected over an extended period contains hidden knowledge about the business, or patterns characterizing customer profiles and behavior. With the rapid growth of the World Wide Web, the study of knowledge discovery on the web, including modeling and predicting users' access to a web site, has become very important [GO2003].

From the administration, business and application points of view, knowledge obtained from Web usage patterns can be directly applied to efficiently manage activities related to e-Business, e-CRM, e-Services, e-Education, e-Newspapers, e-Government, Digital Libraries, and so on [AR2003]. The Web is becoming a necessity for businesses and organizations because of client demand.
Since web technology feeds largely on ideas and knowledge rather than depending on fixed assets, it gave birth to new companies such as Yahoo, Google, Netscape, e-Bay, e-Trade, Expedia, Amazon and so on. With the large number of companies using the Internet to distribute and collect information, knowledge discovery on the web has become an important research area [JTP2002]. With the explosive growth of information sources available on the World Wide Web, it has become necessary for organizations to discover usage patterns and analyze them to gain an edge over competitors.

Jespersen et al [JTB2002] proposed a hybrid approach for analyzing visitor click stream sequences. A combination of a hypertext probabilistic grammar and a click fact table is used to mine Web logs, and could also be used for general sequence mining tasks. Mobasher et al [MCS1999] proposed a web personalization system consisting of offline tasks related to the mining of usage data and an online process of automatic Web page customization based on the knowledge discovered. LOGSOM, proposed by Smith et al [SN2003], uses Kohonen's self-organizing map (SOM) to organize web pages into a two-dimensional map based solely on users' navigation behavior, rather than on the content of the web pages. LumberJack, proposed by Chi et al [CRHL2002], builds up user profiles by combining clustering of user sessions and traditional statistical traffic analysis, using the k-means algorithm. Joshi et al [JJYK1999] used a relational online analytical processing approach to create a Web log warehouse from access logs and mined logs.
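To make the idea of mining click stream sequences concrete, the following is a minimal sketch and not the hybrid method of Jespersen et al. It assumes a first-order model in which the probability of the next page depends only on the current page, which is a much simpler stand-in for a hypertext probabilistic grammar. The function name, the session format (a list of page names per visit) and the page names themselves are hypothetical.

```python
from collections import Counter, defaultdict


def transition_probabilities(sessions):
    """Estimate P(next page | current page) from click sequences."""
    counts = defaultdict(Counter)
    for pages in sessions:
        # Count each consecutive (current, next) pair within one session.
        for current, nxt in zip(pages, pages[1:]):
            counts[current][nxt] += 1

    probs = {}
    for current, followers in counts.items():
        total = sum(followers.values())
        # Normalize the counts so the outgoing probabilities sum to 1.
        probs[current] = {page: n / total for page, n in followers.items()}
    return probs


# Hypothetical sessions; the page names are made up for illustration.
sessions = [
    ["home", "products", "cart"],
    ["home", "products", "home"],
    ["home", "about"],
]
probs = transition_probabilities(sessions)
```

Under these assumptions, `probs["home"]` gives the estimated chance of each page being requested right after the home page. A real system would first have to reconstruct sessions from raw access logs, which this sketch takes as given.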
A comprehensive overview of web usage mining research is found in [SCDT2000, CMS97, CMS1999, RWC2000].

Web mining can be divided into three areas, namely web content mining, web structure mining and web usage mining [SCDT2000]. Web content mining focuses on the discovery of information stored on the Internet. Web structure mining focuses on improving the structural design of a website. Web usage mining, the main topic of this paper, focuses on knowledge discovery from the usage of individual web sites.

Global Internet Usage, Average Usage [NN2003], shows the current usage around the globe and in the United States.

Month of September 2003, Panel Type: Home

                                      September      August         % Change
Number of Sessions per Month          22             22             1.65
Number of Unique Domains Visited      55             54             0.89
Page Views per Month                  901            899            0.3
Page Views per Surfing Session        41             41             0
Time Spent per Month                  11:59:20       11:50:30       1.24
Time Spent During Surfing Session     0:32:29        0:32:37        -0.4
Duration of a Page Viewed             0:00:48        0:00:47        0.94
Active Internet Universe              252,672,070    253,054,814    -0.15
Current Internet Universe Estimate    419,054,724    416,339,888    0.65

United States: Average Web Usage
Month of October 2003, Panel Type: Home

Sessions/Visits Per Person                 71
Domains Visited Per Person                 103
PC Time Per Person                         80:46:37
Duration of a Web Page Viewed              0:01:00
Active Digital Media Universe              47,003,165
Current Digital Media Universe Estimate    51,012,930

The remainder of the paper is organized as follows: Section 2 covers applications of web usage mining; Section 3 covers the basic components of web mining terminology, the taxonomy of web mining, the architecture of web usage mining, and an explanation of the individual components of that architecture; Section 4 summarizes the paper and identifies several future research directions; and Section 5 contains the bibliography.

2. Applications of Web Usage Mining

Each of the applications can benefit from patterns that are ranked by subjective interestingness. Web usage mining is used in the following areas:

• Web usage mining offers users the ability to analyze massive volumes of clickstream (click flow) data, integrate the data seamlessly with transaction and demographic data from offline sources, and apply sophisticated analytics for web personalization, e-CRM and other interactive marketing programs.
• Personalization for a user can be achieved by keeping track of previously

