UMD CMSC 424 - Misc Topics 2


Misc Topics 2
Amol Deshpande
CMSC424

Topics
- OLAP
- Data Warehouses
- Data Mining
- Information Retrieval

OLAP
- On-line Analytical Processing
- Why?
  - Exploratory analysis
    - Interactive
    - Different queries than typical SPJ SQL queries
- Data CUBE
  - A summary structure used for this purpose
    - E.g. give me total sales by zipcode; now show me total sales by customer employment category
  - Much much faster than using SQL queries against the raw data
    - The tables are huge
- Applications:
  - Sales reporting, Marketing, Forecasting etc etc

Data Warehouses
- A repository of integrated information for querying and analysis purposes
- Tend to be very very large
- Typically not kept up-to-date with the real data
- Specialized query processing and indexing techniques are used
- Very widely used

Data Mining
- Searching for patterns in data
  - Typically done in data warehouses
- Association Rules:
  - When a customer buys X, she also typically buys Y
    - Use? Move X and Y together in supermarkets
  - A customer buys a lot of shirts
    - Send him a catalogue of shirts
  - Patterns are not always obvious
    - Classic example: It was observed that men tend to buy beer and diapers together (may be an urban legend)
- Other types of mining
  - Classification
  - Decision Trees

Information Retrieval
- Relational DB == Structured data
- Information Retrieval == Unstructured data
- Evolved independently of each other
  - Still very little interaction between the two
- Goal: Searching within documents
- Queries are different; typically a list of words, not SQL
- E.g. Web searching
  - If you just look for documents containing the words, millions of them
    - Mostly useless
- Ranking:
  - This is the key in IR
  - Many different ways to do it
  - E.g. something that takes into account term frequencies
  - Pagerank (from Google) seems to work best for
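
The Data CUBE idea from the OLAP slide (total sales by zipcode, then total sales by customer employment category) can be illustrated with a small sketch. This is only one possible reading of "summary structure": the rows in SALES, the function names build_cube and total_by, and the use of None as an "ALL" marker are assumptions made for illustration, not something taken from the slides.

```python
from collections import defaultdict

# Hypothetical raw sales rows: (zipcode, employment_category, amount).
SALES = [
    ("20742", "student", 20.0),
    ("20742", "faculty", 35.0),
    ("20740", "student", 15.0),
    ("20740", "student", 10.0),
]

def build_cube(rows):
    """Precompute totals for every subset of the two dimensions."""
    cube = defaultdict(float)
    for zipcode, employment, amount in rows:
        # None plays the role of "ALL" for a rolled-up dimension (an assumption).
        for z in (zipcode, None):
            for e in (employment, None):
                cube[(z, e)] += amount
    return cube

def total_by(cube, dimension):
    """Read one group-by off the cube without rescanning the raw rows."""
    if dimension == "zipcode":
        return {z: v for (z, e), v in cube.items() if z is not None and e is None}
    if dimension == "employment":
        return {e: v for (z, e), v in cube.items() if z is None and e is not None}
    raise ValueError(f"unknown dimension: {dimension}")

cube = build_cube(SALES)
by_zipcode = total_by(cube, "zipcode")
by_employment = total_by(cube, "employment")
grand_total = cube[(None, None)]
```

Once the cube is built, switching from one grouping to the other is a lookup over precomputed cells, which is the intuition behind the claim that it is much faster than querying the huge raw tables each time.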
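
The association rule "when a customer buys X, she also typically buys Y" from the Data Mining slide can be made concrete with a short sketch. The slides do not name any measures; support and confidence are the standard ones and are used here as an assumption. The baskets and the helper names support and confidence are hypothetical.

```python
# Hypothetical market baskets; item names are illustrative only.
BASKETS = [
    {"beer", "diapers", "chips"},
    {"beer", "diapers"},
    {"diapers", "milk"},
    {"beer", "chips"},
]

def support(baskets, items):
    """Fraction of baskets that contain every item in `items`."""
    items = set(items)
    return sum(1 for b in baskets if items <= b) / len(baskets)

def confidence(baskets, lhs, rhs):
    """Estimate of P(rhs in basket | lhs in basket) for the rule lhs -> rhs."""
    lhs_support = support(baskets, lhs)
    if lhs_support == 0:
        return 0.0
    return support(baskets, set(lhs) | set(rhs)) / lhs_support

# Rule under test: beer -> diapers.
rule_support = support(BASKETS, {"beer", "diapers"})
rule_confidence = confidence(BASKETS, {"beer"}, {"diapers"})
```

A rule would typically be reported only when both numbers exceed chosen thresholds; the thresholds themselves are a modelling choice that the slides do not specify.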
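
The Ranking slide mentions "something that takes into account term frequencies". A minimal sketch of that idea follows, assuming a score equal to the sum of each query word's relative frequency in the document. The corpus DOCS and the functions tf_score and rank are hypothetical, and this is not PageRank or any specific system's ranking function.

```python
from collections import Counter

# Hypothetical mini-corpus; document ids and text are illustrative only.
DOCS = {
    "d1": "data warehouse query processing and indexing",
    "d2": "data mining finds patterns in data",
    "d3": "web search ranks documents",
}

def tf_score(query, text):
    """Sum of the relative frequencies of the query words in `text`."""
    words = text.lower().split()
    if not words:
        return 0.0
    counts = Counter(words)
    return sum(counts[term] / len(words) for term in query.lower().split())

def rank(query, docs):
    """Return (doc_id, score) pairs with a nonzero score, best first."""
    scored = [(doc_id, tf_score(query, text)) for doc_id, text in docs.items()]
    return sorted((p for p in scored if p[1] > 0), key=lambda p: p[1], reverse=True)

results = rank("data mining", DOCS)
```

Documents that merely contain a query word still appear, but those that use the query words more densely are listed first, which addresses the "millions of them, mostly useless" problem in a very simple way.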

