DOC PREVIEW
USC CSCI 599 - I1_pederson_p40

This preview shows page 1-2 out of 7 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 7 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

0018-9162/01/$17.00 © 2001 IEEE40 ComputerMultidimensionalDatabase TechnologyThe relational data model, which was intro-duced by E.F. Codd in 1970 and earned himthe Turing award a decade later, served as thefoundation of today’s multibillion-dollardatabase industry. During the past decade,the multidimensional data model emerged for usewhen the objective is to analyze data rather than toperform online transactions. Multidimensional data-base technology is a key factor in the interactive analy-sis of large amounts of data for decision-makingpurposes. In contrast to previous technologies, thesedatabases view data as multidimensional cubes thatare particularly well suited for data analysis. Multidimensional models categorize data either asfacts with associated numerical measures or as textualdimensions that characterize the facts. In the case of aretail business, a purchase would be a fact and the pur-chase amount and price would be measures; the typeof product being bought and the purchase time andlocation would be dimensions. Queries aggregate mea-sure values over a range of dimension values to provideresults such as total sales per month of a given prod-uct. Multidimensional data models have three impor-tant application areas within data analysis. • Data warehouses are large repositories that inte-grate data from several sources in an enterprisefor analysis. • Online analytical processing (OLAP) systems pro-vide fast answers for queries that aggregate largeamounts of detail data to find overall trends.• Data mining applications seek to discover knowl-edge by searching semiautomatically for previ-ously unknown patterns and relationships in mul-tidimensional databases. Academic researchers have proposed formal math-ematical models of multidimensional databases, whileindustry has implicitly specified proposals via the con-crete software tools that implement them.1,2The“Multidimensional Database History” sidebar de-scribes the evolution of the multidimensional datamodel and how it has benefited from the use of seman-tic as well as scientific and statistical data models. SPREADSHEETS AND RELATIONSA spreadsheet such as that shown in Table 1 is a use-ful tool for analyzing sales data such as product sold,number of purchases, and city of sale. A pivot table isa two-dimensional spreadsheet with associated subto-tals and totals that supports viewing more complexdata by nesting several dimensions on the x- or y-axisand displaying data on multiple pages. Pivot tablesgenerally support interactively selecting data subsetsand changing the displayed level of detail. Spreadsheets are an inadequate tool for managingand storing multidimensional data because they tie datastorage too tightly to the presentation—they do notseparate the structural information from the desiredviews of the information. Thus, adding a third dimen-sion such as time or grouping the data into higher-levelproduct types requires a considerably more complexsetup. The obvious solution is to use a separate spread-sheet for each dimension, but this will work only to alimited extent because analyzing the additional valuesof the extra dimension quickly becomes unwieldy. Multidimensional databases model data as either facts, dimensions, ornumerical measures for use in the interactive analysis of large amounts of data for decision-making purposes. Torben BachPedersen Christian S.JensenAalborg UniversityCOVER FEATUREUsing a Structured Query Language database man-agement system offers considerable flexibility in struc-turing data. However, formulating many desirablecomputations such as cumulative aggregates (sales inyear to date), combining totals and subtotals, or deter-mining rankings such as the top 10 selling products isdifficult if not impossible in standard SQL. Also, trans-posing rows and columns requires manually specify-ing and combining multiple views. Although SQLextensions such as the data cube operator3and querywindows4will remedy some of these problems, theSQL-based relational model does not handle hierar-chical dimensions satisfactorily.Spreadsheets and relational databases provide ade-quate support for a small volume of data that has onlya few nonhierarchical dimensions, but they do notfully support the requirements for advanced dataanalysis. The only robust solution is to use databasetechnology that offers inherent support for the fullrange of multidimensional data modeling.CUBESMultidimensional databases view data as cubes thatgeneralize spreadsheets to any number of dimensions.In addition, cubes support hierarchies in dimensionsand formulas without duplicating their definitions. Acollection of related cubes comprises a multidimen-sional database or data warehouse.Because dimensions in a cube are first-class, built-in concepts with associated domains, cubes can easilymanage the addition of new dimension values.Although the term implies three dimensions, a cubecan theoretically have any number of dimensions; infact, most real-world cubes have four to 12 dimen-sions.5,6Current tools often experience performanceproblems when a so-called hypercube contains morethan 10 to 15 dimensions.Combinations of dimension values define a cube’scells. Depending on the specific application, the cellsin a cube range from sparse to dense. Cubes tend tobecome sparser as dimensionality increases and as thedimension values’ granularities become finer. Figure 1 shows a cube capturing the sales for thetwo Danish cities in Table 1 with the additionaldimension of time. The corresponding cells store thenumber of sales. The example has a fact—a nonemptycell that contains a number of associated numericalmeasures—for each combination of time, product,and city where at least one sale was made. The cellsstore numerical values associated with a fact—in thiscase, the number of sales is the only measure.Generally, a cube supports viewing only two or threedimensions simultaneously, but it can show up to fourlow-cardinality dimensions by nesting one dimensionwithin another on the axes. Thus, cube dimensional-ity is reduced at query time by projecting it down toDecember 2001 41Table 1. Sample sales spreadsheet.Product Number of purchases by cityAalborg Copenhagen Los Angeles New York CityMilk 123 555 145 5,001 Bread 102 250 54 2,010 Jeans 20 89 32 345Light bulbs 22 213 32 9,4502000 2001123 12757 455667211CopenhagenAalborgBreadMilkFigure 1. Sample cube capturing sales data. Data cubes support viewing of up to fourlow-cardinality


View Full Document

USC CSCI 599 - I1_pederson_p40

Documents in this Course
Week8_1

Week8_1

22 pages

Week2_b

Week2_b

10 pages

LECT6BW

LECT6BW

20 pages

LECT6BW

LECT6BW

20 pages

5

5

44 pages

12

12

15 pages

16

16

20 pages

Nima

Nima

8 pages

Week1

Week1

38 pages

Week11_c

Week11_c

30 pages

afsin

afsin

5 pages

October5b

October5b

43 pages

Week11_2

Week11_2

20 pages

final

final

2 pages

c-4

c-4

12 pages

0420

0420

3 pages

Week9_b

Week9_b

20 pages

S7Kriegel

S7Kriegel

21 pages

Week4_2

Week4_2

16 pages

sandpres

sandpres

21 pages

Week6_1

Week6_1

20 pages

4

4

33 pages

Week10_c

Week10_c

13 pages

fft

fft

18 pages

LECT7BW

LECT7BW

19 pages

24

24

15 pages

14

14

35 pages

Week9_c

Week9_c

24 pages

Week11_67

Week11_67

22 pages

Week1

Week1

37 pages

LECT3BW

LECT3BW

28 pages

Week8_c2

Week8_c2

19 pages

Week5_1

Week5_1

19 pages

LECT5BW

LECT5BW

24 pages

Week10_b

Week10_b

16 pages

Week11_1

Week11_1

43 pages

Week7_2

Week7_2

15 pages

Week5_b

Week5_b

19 pages

Week11_a

Week11_a

29 pages

LECT14BW

LECT14BW

24 pages

T7kriegel

T7kriegel

21 pages

0413

0413

2 pages

3

3

23 pages

C2-TSE

C2-TSE

16 pages

10_19_99

10_19_99

12 pages

s1and2-v2

s1and2-v2

37 pages

Week10_3

Week10_3

23 pages

jalal

jalal

6 pages

1

1

25 pages

T3Querys

T3Querys

47 pages

CS17

CS17

15 pages

porkaew

porkaew

20 pages

LECT4BW

LECT4BW

21 pages

Week10_1

Week10_1

25 pages

wavelet

wavelet

17 pages

October5a

October5a

22 pages

p289-korn

p289-korn

12 pages

2

2

33 pages

rose

rose

36 pages

9_7_99

9_7_99

18 pages

Week10_2

Week10_2

28 pages

Week7_3

Week7_3

37 pages

Load more
Download I1_pederson_p40
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view I1_pederson_p40 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view I1_pederson_p40 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?