The Digital Library of India Project: Process, Policies and Architecture


Vamshi Ambati (1), N. Balakrishnan (2), Raj Reddy (1), Lakshmi Pratha (3), and C.V. Jawahar (3)

(1) Carnegie Mellon University, PA, USA
(2) Indian Institute of Science, Bangalore, India
(3) International Institute of Information Technology, Hyderabad, India

Abstract

In this paper we share the experience gained from establishing a process and a supporting architecture for the Digital Library of India (DLI) project. The DLI project was started with a vision of digitizing books and making them available online in a searchable and browsable form. The digitization of the books takes place at geographically distributed locations, which raises many issues related to policy and collaboration. We discuss these problems in detail and present the process and workflow established to solve them. We also share the architecture of the project, which supports the smooth implementation of the process. The architecture of the DLI project was arrived at after considering factors such as high performance, scalability, availability and economy.

Keywords: digital library, digital library architecture, Digital library project of India, Universal digital library, DLI process

1. Introduction

Digital libraries have received wide attention in recent years, allowing access to digital information from anywhere in the world. They have become widely accepted and even preferred information sources in education, science and other areas (Bainbridge, Thompson, & Witten 2003). The rapid growth of the Internet and the increasing interest in the development of digital library technologies and collections (McCray & Gallagher 2001) (Marchionini & Maurer 1995) have helped accelerate the digitization of printed documents in the past few years.
With a vision of digitizing a million books by 2008, the Digital Library of India (DLI) project aims to digitally preserve all the significant literary, artistic and scientific works of people and make them freely available to anyone, anytime, from any corner of the world, for education, research and appreciation by future generations. Since its inception in November 2002 at three centers, the project has been successfully digitizing books, which are a dominant store of knowledge and culture. We now host close to one tenth of a million books online, with about 33 million pages scanned at almost 30 centers across the country. The scanning centers include academic institutions of high repute as well as religious and government institutions. In such a highly distributed environment, establishing a notion of collaborative effort and distributing discrete chunks of work while maintaining uniform standards becomes a high-priority task.

Traditionally, digital libraries work in a closed environment and keep both the process information and the content in a local repository. Although this eases server management and administration and simplifies the resolution of process-oriented issues, such an isolated setup does not scale easily or promote collaboration across geographically distributed points of operation. This adds unacceptable delays to the implementation of the project. In projects like the DLI, with such ambitious missions, a distributed environment becomes a requisite. There are discrete phases and chunks of work that need not be collocated. For example, books can be scanned at one place while the image processing and web-enabling of the same books occur at a different place. With some planning, both tasks can also proceed concurrently at different places on different consignments of books, in turn yielding a higher throughput.
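The throughput gain from running scanning and processing concurrently on different consignments can be illustrated with a small scheduling model. This is only a sketch under stated assumptions: the function names, the two-site setup, and the durations in days are hypothetical and are not figures or code from the DLI project.

```python
# Hypothetical model: each consignment is scanned at one site and then
# processed (image processing + web-enabling) at another site.
# All names and durations are illustrative assumptions.


def sequential_makespan(n_consignments, scan_days, process_days):
    """Total days when one consignment is fully finished before the next starts."""
    return n_consignments * (scan_days + process_days)


def pipelined_makespan(n_consignments, scan_days, process_days):
    """Total days when the scanning and processing sites work concurrently.

    The processing site starts a consignment as soon as it has been scanned
    and the previous consignment has finished processing.
    """
    scan_done = 0     # day on which the latest consignment finished scanning
    process_done = 0  # day on which the latest consignment finished processing
    for _ in range(n_consignments):
        scan_done += scan_days
        process_done = max(scan_done, process_done) + process_days
    return process_done


if __name__ == "__main__":
    n, scan, proc = 10, 3, 2
    print("sequential:", sequential_makespan(n, scan, proc), "days")
    print("pipelined: ", pipelined_makespan(n, scan, proc), "days")
```

In this model the pipelined schedule is bounded by the slower of the two stages, so the benefit depends on how evenly the work is split between sites.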
In the DLI project, we have established a flexible, yet cohesive, process that automates the entire workflow in a distributed environment. In doing so we have confronted several problems and issues (Sankar et al. 2006): selecting books for digitization, establishing and operating a protocol that avoids duplication of effort, producing digital output of good quality, and preserving the digitized book objects for access in a user-friendly, reliable and highly available manner. In this paper we describe the experience gained from establishing a process involving an efficient workflow and effective policies, and from deploying a scalable, distributed architecture for the Digital Library of India project. We believe that the same approach can be applied successfully to other similar digital libraries and digitization systems. We also attempt to throw light on some unforeseen problems in the digitization process and on issues of collaboration in a distributed environment.

The rest of the paper is organized as follows. In section 2, we give an overview of the project and its organization. In section 3, we discuss the problems and challenges that we experienced in the course of the digitization and web-enablement of books. In section 4, we discuss the process established in the project, which has helped us address some of these problems. Section 5 discusses the architecture that addresses the issue of reliable web access to digitized content and supports the DLI process, and we conclude in section 6.

2. Overview of the Project

The Digital Library of India project was initiated in 2002, motivated by the Universal Digital Library project. The project currently digitizes and preserves books, though one of the future avenues is to preserve existing digital media in other formats such as video and audio.
The scanning operations and the preservation of digital data take place at different centers across India, each called a Regional Mega Scanning Center (RMSC). The RMSCs themselves function as individual organizations, with scanning units established at several locations in the region. Responsibilities of an RMSC include regulating the processes of

