DOC PREVIEW
USC CSCI 585 - Session17

This preview shows page 1-2-3-20-21-22-41-42-43 out of 43 pages.

Save
View full document
View full document
Premium Document
Do you want full access? Go Premium and unlock all 43 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 43 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 43 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 43 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 43 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 43 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 43 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 43 pages.
Access to all documents
Download any document
Ad free experience
View full document
Premium Document
Do you want full access? Go Premium and unlock all 43 pages.
Access to all documents
Download any document
Ad free experience
Premium Document
Do you want full access? Go Premium and unlock all 43 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

CSCI585CSCI585C. ShahabiMultidimensional DatabasesCyrus ShahabiComputer Science DepartmentUniversity of Southern [email protected]. ShahabiOutlineDefinitions (from A.R. 20 and 21)Focus Application: OLAPPrefix-Sum (from A.R. 16)Dynamic Data Cube (from A.R. 17)Iterative Data Cube (from A.R. 18)Wavelet-based approachesCompact Data Cube (from A.R. 19)ProPolyne (from A.R. 22 and 23)CSCI585CSCI585C. ShahabiDefinitionsDuring the past decade, the multidimensional data model emerged for use when the objective is to analyze data rather than to perform online transactions. In contrast to previous technologies, these databases view data as multidimensional cubes that are particularly well suited for data analysis.Multidimensional data models have three important application areas within data analysis:Data warehouses are large repositories that integrate data from several sources in an enterprise for analysis.Online analytical processing (OLAP) systems provide fast answers for queries that aggregate large amounts of detail data to find overall trends.Data mining applications seek to discover knowledge by searching semi-automatically for previously unknown patterns and relationships in multidimensional databases.CSCI585CSCI585C. ShahabiDefinitions…Multidimensional databases view data as cubes that generalize spreadsheets to any number of dimensions.In addition, cubes support hierarchies in dimensions and formulas without duplicating their definitions. A collection of related cubes comprises a multidimensional database or data warehouse.Dimensions are used for selecting and aggregating data at the desired level of detail. A dimension is organized into a containment-like hierarchy composed of numerous levels, each representing a level of detail required by the desired analyses. Each instance of the dimension, or dimension value, belongs to a particular level.CSCI585CSCI585C. ShahabiDefinitions …Facts represent the subject—the interesting pattern or event in the enterprise that must be analyzed In most multidimensional data models, facts are implicitly defined by their combination of dimension values; a fact exists only if there is a nonempty cell for a particular combination of values.A measure consists of two components:a fact’s numerical property, such as the sales price or profita formula, usually a simple aggregation function such as sum , that can combine several measure values into one.In a multidimensional database, measures generally represent the properties of the fact that the user wants to optimize. Measures then take on different values for various dimension combinations. The property and formula are chosen to provide a meaningful value for all combinations of aggregation levels.CSCI585CSCI585C. ShahabiFocus Application: On-Line Analytical Processing (OLAP)Multidimensional data sets: Dimension attributes (e.g., Store, Product, Data)  Measure attributes (e.g., Sale, Price)Range-sum queries Average sale of shoes in CA in 2001 Number of jackets sold in Seattle in Sep. 2001Tougher queries: Covariance of sale and price of jackets in CA in 2001 (correlation)  Variance of price of jackets in 2001 in SeattleStoreLocationProductDate SaleLA Shoes Jan. 01 $21,500 $85.99NY Jacket June 01 $28,700 $45.99Price...............Market-RelationMarket-Relationσ(p=shoe)σ(s <in> CA)σ(d <in> 2001)Avg (sale)Too Slow!Too Slow!CSCI585CSCI585C. ShahabiRange Queries in OLAP Data CubesChing-Tien Ho Rakesh AgrawalNimord Megiddo Rammakrishnan SrikantIBM Almaden Research CenterCSCI585CSCI585C. ShahabiOutlineIntroductionThe Basic Range-Sum AlgorithmThe Blocked Range-Sum AlgorithmThe Batch-Update Algorithm for Range-Sum QueriesChoosing Dimension,Cuboids and Block SizesConclusionCSCI585CSCI585C. ShahabiIntroductionExample for MDDBAssume the data cube has four functional attributes (dimensions): age,year,state and insurance type. The data cube will have age*year * state* type cells, with each cell containing the total revenue (the measure attribute) for the corresponding combination of these four attributesCSCI585CSCI585C. ShahabiMeasure AttributeIntroduction(example)Customer_ageDateDimensional Attribute$390$590$560$950$800$500$37010/30$190$360$250$500$440$390$17010/29$170$230$250$600$600$230$25010/28$250$500$470$800$790$550$80010/27$300$500$500$600$200$170$11010/26$300$100$300$240$100$130$12010/25$150$600$250$500$700$400$30010/2423222120191817IndexCSCI585CSCI585C. ShahabiIntroductionRange QueriesThat apply a given aggregation operation over selected cells where the selection is specified as continuous ranges in the domains of some of the attributes.Example: Sum, Max, Count and AverageCSCI585CSCI585C. ShahabiIntroduction The Basic Range-Sum AlgorithmThe Blocked Range-Sum AlgorithmThe Batch-Update Algorithm for Range-Sum QueriesConclusionCSCI585CSCI585C. ShahabiThe Basic Range-Sum Algorithm Pre-computed Prefix-Sum array PLet P be a d-dimensional array of sizeN = n1 * n2 * ……* nd∑∑= ===xiyjjiAyxSumyxP0 0],[):0,:0(],[CSCI585CSCI585C. ShahabiThe Basic Range-Sum Algorithm533242228623713221530543210IndexArray A (original array)635340292412244392921181011613119830543210IndexArray P (Prefix-Sum array)ExampleP(1,1)=A(0,0)+A(0,1)+A(1,0)+A(1,1)6318CSCI585CSCI585C. ShahabiThe Basic Range-Sum AlgorithmTheorem 1:Proves how any range-sum of A can be computed from up to 2dappropriate elements of P.CSCI585CSCI585C. ShahabiThe Basic Range-Sum AlgorithmQuestion:How can we get the sum of this area?533242228623713221530543210IndexArray A (original array)CSCI585CSCI585C. ShahabiThe Basic Range-Sum AlgorithmWhen d=2, the range-sum Sum(l1:h1,l2:h2) can be obtained in three computation steps asSum(l1:h1,l2:h2) = P [h1,h2]-P [h1,l2-1]-P [l1-1,h2]+P [l1-1,l2-1]22-1=3 stepsCSCI585CSCI585C. ShahabiThe Basic Range-Sum AlgorithmRegion A in Original ArrayRegion C=Region B in Array P+Region ERegion DCSCI585CSCI585C. ShahabiThe Basic Range-Sum AlgorithmFor instance:The range-sum Sum(2:3,1:2)=P [3,2]-P [3,0]-P [1,2]+P [1,0]=40-11-24+8=13635340292412244392921181011613119830543210IndexArray P4029242921181198533242228623713221530543210IndexArray A (original array)CSCI585CSCI585C. ShahabiIntroductionThe Basic Range-Sum Algorithm The Blocked Range-Sum AlgorithmThe Batch-Update Algorithm for Range-Sum


View Full Document

USC CSCI 585 - Session17

Download Session17
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Session17 and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Session17 2 2 and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?