UT INF 385Q - In Search of A New Generation of Knowledge Management Applications

FOCUS AREA ARTICLES

In Search of A New Generation of Knowledge Management Applications

Edy S. Liongosari, Kelly L. Dempski, Kishore S. Swaminathan
Center for Strategic Technology Research
Andersen Consulting
3773 Willow Drive, Northbrook, IL 60062, USA
E-mail: [email protected]

ABSTRACT

Today's typical Knowledge Management Systems are not much different from document management systems. In both cases, the retrieval process involves entering a set of keywords and then browsing through a list of documents related to those keywords found by the systems. If Knowledge Management is to live up to its promises, a new generation of Knowledge Management-enabled applications has to be developed. The information has to be presented beyond just a list of documents. Applying data mining techniques to these systems is one of the few promising avenues that may yield a new set of applications. This paper describes our on-going research effort to extract and mine information from one of the largest private Knowledge Management systems in the world.

INTRODUCTION

With well over 3,000 databases containing millions of documents accessed by over 50,000 users across 78 countries, Andersen Consulting has the largest Lotus Notes installation in the world. This system, called the Knowledge Xchange™ or simply KX, is one of the strategic assets of Andersen Consulting. It is designed to support communities of practice that share and reuse knowledge. It contains discussion databases, library databases, and various directories. The library databases contain a wide variety of documents such as project proposals, project deliverables, case studies, credentials, résumés, newsletters and prototypes.

The KX is used primarily for two purposes: finding documents and finding subject matter experts. Imagine a typical scenario where we need to quickly write a proposal for a client. We need to gather our credentials on the subject matter, previous proposals related to that subject, case studies, estimating guidelines, and the names of the subject matter experts. With the KX, you can find all of that information without leaving your desk.

THE EXPLORATION

The KX has been in place for over five years and it stores over 200 GB of information accumulated by tens of thousands of Andersen Consulting's employees over that period of time. If you view the KX as a sea of raw data, performing data mining over the KX may indeed provide much insightful information such as how Andersen Consulting operates as an organization, how it manages its relationships with its clients, and how it responds to market demands. These insights in turn can be used for various planning and decision making processes, as well as tracking the progress of certain strategic objectives.

With this in mind, we started an exploration to find out what type of new information we can discover from the KX as it is today. Obviously this is very much a bottom-up approach.

In this paper, we will describe the scope of our inquiry and the initial challenges we encountered. We will then cover some of the quick wins, followed by a more in-depth examination of one facet of the exploration: the construction of The Old Boys Network. We conclude this paper with a brief discussion on other parts of this investigation and the business benefits the exploration has delivered so far.

EXPLORATION SCOPE: DATA SELECTION

In order to scope this exploration effort, we have selected the twenty largest and most widely used databases in the KX. Ten of them are document libraries, five discussion databases, and five directories. Together they comprise about 1.5 GB of data excluding the attachments. With these twenty databases, we have a manageable size of data to mine and at the same time it is large enough to represent the rest of the KX.
The document libraries contain information such as the authors of the documents, creation dates, abstracts, and related keywords. The discussion databases contain the topics of the discussions, the participants, and the dates of discussions. The directories include information about Andersen Consulting's employees such as their phone numbers, e-mail addresses, locations, and the groups to which they belong.

INITIAL CHALLENGES

Mining directly from Lotus Notes databases is arduous. Unlike DBMSs, the underlying storage of Lotus Notes is structurally weak. Even though Lotus Notes has things called databases, they are not really databases in the traditional sense. They are tables. Each row represents a document and each column represents an attribute associated with the document.

60 SIGGROUP Bulletin August 1999/Vol 20, No.2

There is no notion of relationships across tables. This makes consistency maintenance across tables difficult. Many of our databases, especially the older ones, do not have any consistency maintenance mechanisms in place. This creates many problems. For example, the names of document authors are not validated. David S. Smith can be entered as Dave Smith. If David S. Smith has authored two documents, but one of the documents has Dave Smith as its author and the other has David S. Smith, there is no trivial way to reconcile the two documents and conclude that they are, in fact, written by the same person. This inconsistency may cause incorrect results in our mining process.

While the solution we came up with does not solve all of the inconsistencies, it is good enough to enable us to move forward to the next step. Our solution can be found in [1]. The process we went through to cleanse and integrate the data is very similar to the KDD process as described in [3,4]. Suffice it to say that the central idea of our solution is based on a data model similar to the one shown in Figure 1. This model serves as an index to the underlying information. The information of a person, for example, is derived from the list of document authors in the library databases, the project member listings, the employee telephone listing, and so on. Similarly, the relationships between the entities are also derived. The Has skills in
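
The author-name inconsistency described under INITIAL CHALLENGES can be illustrated with a minimal Python sketch. It is not the solution referenced in [1], whose details are not given here. It only shows one simple way candidate matches such as "David S. Smith" and "Dave Smith" might be grouped. The function names, the document ids, and the tiny nickname table are all hypothetical, introduced purely for illustration.

```python
from collections import defaultdict

# Hypothetical nickname table (an assumption for this sketch);
# a usable one would need to be far larger.
NICKNAMES = {"dave": "david", "bill": "william", "bob": "robert"}


def name_key(raw: str) -> tuple[str, str]:
    """Reduce an unvalidated author string to (first name, last name).

    Middle names and initials are dropped and known nicknames are
    mapped to a canonical first name.
    """
    parts = [p.strip(".").lower() for p in raw.split()]
    if not parts:
        raise ValueError("empty author name")
    first, last = parts[0], parts[-1]
    return NICKNAMES.get(first, first), last


def group_authors(docs: dict[str, str]) -> dict[tuple[str, str], list[str]]:
    """Group document ids whose author fields reduce to the same key."""
    groups = defaultdict(list)
    for doc_id, author in docs.items():
        groups[name_key(author)].append(doc_id)
    return dict(groups)


# Made-up sample records mirroring the example in the text.
docs = {
    "doc-1": "David S. Smith",
    "doc-2": "Dave Smith",
    "doc-3": "Dana Smith",
}
groups = group_authors(docs)
```

Here `doc-1` and `doc-2` fall into the same group while `doc-3` stays separate. Because the key discards middle initials, two different people (say, David A. Smith and David S. Smith) would also be merged, so groups produced this way are best treated as candidates for review rather than confirmed identities.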

