MIT 6 830 - Lecture Notes


6.830/6.814 - Notes for Lecture 1: Introduction to Database Systems
Carlo A. Curino
September 10, 2010

2 Introduction

READING MATERIAL: Ramakrishnan and Gehrke Chapter 1

What is a database?

A database is a collection of structured data. A database captures an abstract representation of the domain of an application.

• Typically organized as “records” (traditionally, large numbers, on disk)
• and relationships between records

This class is about database management systems (DBMS): systems for creating, manipulating, and accessing a database. A DBMS is a (usually complex) piece of software that sits in front of a collection of data and mediates applications’ accesses to the data, guaranteeing many properties about the data and the accesses.

Figure 1: What is a database management system? (Diagram labels: DB, “a collection of structured data”; DBMS, “a system to create, manipulate, access databases (mediate access to the data)”; APP1; APP2.)

Why should you care?

There are lots of applications that we don’t offer classes on at MIT. Why are databases any different?

• Ubiquity (anywhere from your smartphone to Wikipedia)
• Real-world impact: the software market (roughly the same size as the OS market, roughly $20B/y). Web sites, big companies, and scientific projects all use databases to manage both day-to-day operations and business intelligence + data mining.
• You need to know about databases if you want to be happy!

The goal of a DBMS is to simplify the storing and accessing of data. To this end, DBMSs provide facilities that serve the most common operations performed on data. The database community has devoted significant effort to formalizing a few key concepts that most applications exploit to manipulate data. This provides a formal ground for us to discuss the application requirements on data storage and access, and to compare ways for the DBMS to meet such requirements.
This will provide you with powerful conceptual tools that go beyond the specific topics we tackle in this class, and are of general use for any application that needs to deal with data.

We now proceed with an example that shows how hard it is to do things without a DB; later we will introduce formal DB concepts and show how much easier things are with a DB.

3 Mafia Example

Today we cover the user perspective, trying to detail the many reasons we want to use a DBMS rather than organizing and accessing data directly, for example as files.

Let us assume I am a Mafia Boss (Note: despite the accent this is not the case, but only hypothetical!) and I want to organize my group of “picciotti” (Sicilian for the criminals/bad guys working for the boss, a.k.a. the soldiers, see Figure 2) to achieve more efficiency in all our operations. I will also need a lot of book-keeping, security/privacy, etc. Note that my organization is very large, so there are quite a lot of things going on at any moment (i.e., many people accessing the database to record or read information).

Figure 2: Mafia hierarchy. (Diagram labels: Boss, Underboss, Consigliere, Caporegime, Soldiers, Associates. Image by MIT OpenCourseWare.)

I need to store information about:

• people that work for me (soldiers, caporegime, etc.)
• organizations I do business with (police, ’Ndrangheta, politicians)
• completed and open operations:
  – protection rackets
  – arms trafficking
  – drug trafficking
  – loan sharking
  – control of contracting/politics
  – I need to make sure that none of my men is involved in burglary, mugging, kidnapping (too much police attention)
  – cover-up operations/businesses
  – money laundering and funds tracking
• assignment of soldiers to operations, etc.

I will need to share some of this information with external organizations I work with, while protecting some of the information. Therefore I need the following:

• the boss, underboss and consigliere should be able to access all the data and do any kind of operation (assign soldiers to operations, create or shut down operations, pay cops, check the total state of money movements, etc.)
• the accountants (20 of them) need access to perform money book-keeping (track money laundering operations, move money from bank to bank, report bribing expenses)
• the soldiers (5000) need to report daily misdeeds in a daily log, and report money expenses and collections
• a semi-public interface accessible by other bosses I collaborate with (search for cops on our books, check areas we already cover, etc.)

Figure 3: What data to store in my Mafia database. (Diagram labels: entities person, organization, log, operation, accounts; attribute and relationship labels name, nickname, phone, log_id, author, title, summary, name, desc, $$, coverup-name, involve, collaboration_with, name, boss, rank, account-number, false-identity, balance.)

3.1 An offer you cannot refuse

I make you an offer you cannot refuse: “you are hired to create my Mafia Information System, if you get it right you will have money, sexy cars, and a great life. If you get it wrong... well you don’t want to get it wrong”.

As a first attempt, you think about just using a file system:

1. What to represent: what are the key entities in the real world I need to represent? How many details?
2. How to store data: maybe we can use just files: people.txt, organizations.txt, operations.txt, money.txt, daily-log.txt. Each file contains a textual representation of the information with one item per line.
3. Control access credentials at a fine granularity: accountants should know about money movement, but not the names and addresses of our soldiers. Soldiers should know about operations, but not access money information.
4. How to access data: we could write a separate procedural program opening one or more files, scanning through them and reading/writing information in them.
5. Access patterns and performance: how do we find the shop we haven’t collected money from for the longest time (and at least 1 month)? Scan the huge operation file, sort by time, pick the oldest, measure the time? (We need to be timely or they will stop paying, and this makes the boss mad... you surely don’t want that. We also need to make sure no one else is accessing the file right now.) “Tony Schifezza” is a mole: we need to find all the operations and people he was involved with or knew about and shut them down... quick... like REAL quick!!!
6. Atomicity: when an
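To make the file-based attempt in items 2, 4 and 5 concrete, here is a minimal Python sketch of the kind of procedural program one would have to write by hand. The line format (shop|date|amount), the sample data and the function names are assumptions made for illustration; the notes only say that each file holds one textual item per line.

```python
from datetime import date, timedelta

# Hypothetical line format (not from the notes): shop|YYYY-MM-DD|amount
SAMPLE_LINES = [
    "Bakery|2010-06-01|500",
    "Barber|2010-08-25|300",
    "Bakery|2010-07-15|500",
    "Florist|2010-05-20|200",
    "Florist|2010-09-01|200",
]


def parse_line(line):
    """Split one text line into (shop, collection date, amount)."""
    shop, day, amount = line.strip().split("|")
    return shop, date.fromisoformat(day), int(amount)


def most_overdue_shop(lines, today, min_age=timedelta(days=30)):
    """Return (shop, last collection date) for the shop whose most recent
    collection is the oldest and at least min_age ago, or None."""
    last_collected = {}
    for line in lines:  # full scan: a flat file has no index to help us
        if not line.strip():
            continue  # skip blank lines
        shop, day, _amount = parse_line(line)
        if shop not in last_collected or day > last_collected[shop]:
            last_collected[shop] = day
    overdue = [(day, shop) for shop, day in last_collected.items()
               if today - day >= min_age]
    if not overdue:
        return None
    day, shop = min(overdue)  # earliest last-collection date wins
    return shop, day


if __name__ == "__main__":
    print(most_overdue_shop(SAMPLE_LINES, today=date(2010, 9, 10)))
```

Because the function accepts any iterable of lines, an open file object could be passed in place of the sample list. Even this small query needs a hand-written parser and a full scan of the file, and nothing here stops another program from rewriting the file while the scan is running.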

