Unformatted text preview:

Database Metatheory:Asking the Big QueriesChristos H. PapadimitriouUniversity of California San Diegochristos@cs. ucsd.eduIs “database theory” an oxymoron? Or isata platitude?What is the fitness measure that decides the surviva! ofideas (and areas) in mathematics, in applted science,and in computer science? Which ideas from databasetheory during the past twenty-five years have influencedresearch in other fields of computer science? How manywere encapsulated in actual products? Was the rela-tional model the on[y true paradigm sh@ m computerscience ? Is applicability the only and ultimate justifica-tion of theoretical research in an applied science? Areapplicability pressures rea!ly exogenous and unwelcome?Are negattve results appropriate goals of theoretical re-search in an appiied science —or are they the on[y pos-sibie such research goals? If scientific theories must berefutab!e, what are the “hard facts” that provide the pos-sibility of refutation in the case of database theory?1 IntroductionThe phrase theoretical computer science often elicits apuzzled smile from a well-intended layman. I suspectthat database theory has the same effect.If it is thefield to which you have dedicated your life’s work, thismakes you stop and think. This paper is my one-timeattempt to organize and register these thoughts.2 Theory and its FunctionThe historical justification of theoretical computer sci-ence is well-known: After all, theoreticians have foundedcomputer science (Turing, Godel, von Neumann), andcontributed crucially to its most celebrated triumphs,(understanding how to write compilers, how to lay outchips, and how to design databases, to name three).However, historical arguments of this sort have pitfalls,because situations change and yesterday’s truths are noPermission to copy without fee all or part of this material isgranted provided that the copies are not made or distributed fordirect commercial advantage, the ACM copyright notice and thetitle of the publication and its date appear, and notice is giventhat copying is by permission of the Association of ComputingMachinery.To copy otherwise, or to republish, requiresa fee and/or specific permission.PODS ’95 San Jose CA USA@ 1995 ACM 0-89791 -730-8/95/0005 ..$3.50more. Here I will concentrate on a less inductive argu-ment.In the context of an applied science, theory zn thebroad sense is the use of significant abstraction inscientific research, the suppression of low-level detailsof the object or artifact being studied or designed. Thisabstraction usually requires insight and ingenuity; forexample, in economics it is challenging to capture theabstract nature of markets. In contrast, as HerbertSimon observed ([Si] p. 22), the high-level behaviorof the computer is the only aspect that is directlyobservable, and in fact it is so in a most deliberateway. Computer theory, in the broad sense, is morethan justified, possible, or desirable: It is inevitable.Virtually all computer research, including experimentalresearch, is ‘(theoretical” in this weak sense. This is evenmore true in database research, because abstraction isthe essence and raison d ‘ilre of databases.However, here we are concerned with theory zn thenarrow sense, the sense usually employed in computerscience: the use of sophisticated mathematical tech-niques for developing, using, and analyzing mathemat-ical models of the (possibly already significantly ab-stracted) artifact under consideration. Theory is neededto cure the complexity that plagues the artifact beingstudied or, more typically, designed. Simon [Si] observesthat this complexity is not innate, but it is the result ofthe adaptation of the artifact to the complexities of theenvironment. For example, the complexity of an algo-rithm is an adaptation to the complexity of the problemposed to it, and the complexity of an operating system isthe result of its adaptation to its usage pattern and thecomponents’ performance and cost. Of course, nowhereis this adaptation to the environment more prevalentand complexity-inducing than in databases, whose pur-pose is to represent parts of the environment, as well asto interact with other parts.Theoreticians attack the complexity of the artifact inseveral distinct ways.(a) They develop mathematical modeis of the artifact.Turing machines, formal languages, and the relational1model come to mind as particularly successful (at leastin the sense explored in Section 4) examples of suchmodels from computer science.(b) Since ours is a science of the artificial, abstractmodels can become reality:Theoreticians proposecomplexity-reducing solutions (typically, algorithms andrepresentational schemes) that are derived from themathematical models. This function of theory is whatwe usually mean by “synthesis” or “positive results.”Such results must be actually verified by ezpernnenis.The Berkeley-IBM experiment in the middle 1970s [As,SWKH], which established the feasibility of relationaldatabases, was implicitly suggested in database theory’smost celebrated positive result, the formulation of therelational model [Col].(c) Theoreticians analyze the mathematical models topredict the outcome of the experiments (and calibratethe models). For example, the need and importanceof normalization in relational databases, and the roleplayed by dependencies in it, were amply predicted; thedifficulty of query optimization, on the other hand, cameas a surprise, and necessitated new model development,synthesis, analysis, and experiments.(d) Finally, theoreticians explore. They develop andstudy extensions and alternative applications of themodel, and they fathom its ultimate limitations. Theyintroduce and apply more and more sophisticatedmathematical techniques.They build a theoreticalbody of knowledge and an edifice of mathematicalmethodology that transcend the motivating artifact andmodel (presumably in anticipation of higher complexity,stemming from more complex environments yet tocome). Exploration is usually guided by (individualor collective) aesthetics,taste, and sense of what is“important” and “relevant”.Model building, synthesis,and analysis are obviouslyand uncontroversially necessary parts of the researchand discovery process in any science of the artificial.But exploration is what theoreticians do most often(and most gladly); predictably, it is also the aspect thatis criticized most viciously. As it will become clear inthe next sections, I believe that there


View Full Document
Download Database Metatheory
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Database Metatheory and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Database Metatheory 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?