Storing and Processing Multi-dimensional Scientific Datasets




Alan Sussman
UMIACS and Department of Computer Science
http://www.cs.umd.edu/~als
3/5/08

Data Exploration and Analysis

- Large data collections emerge as important resources
  - Data collected from sensors and large-scale simulations
  - Multi-resolution, multi-scale, multi-dimensional
    - data elements often correspond to points in a multi-dimensional attribute space
    - medical images, satellite data, hydrodynamics data, etc.
  - Terabytes to petabytes today
- Low-cost, high-performance, high-capacity commodity hardware
  - 5 PCs, 5 terabytes of disk storage for $10,000

Large Data Collections

- Scientific data exploration and analysis
  - To identify trends or interesting phenomena
  - Only requires a portion of the data, accessed through a spatial index
    - e.g., quad-tree, R-tree
  - Spatial range query often used to specify an iterator
    - computation on data obtained from the spatial query
    - computation aggregates data (MapReduce): the resulting data product is significantly smaller than the results of the range query

Typical Query

- Output grid onto which a projection is carried out
- Specify portion of raw sensor data corresponding to some search criterion

Target example applications

- Processing Remotely Sensed Data (NOAA TIROS-N with AVHRR sensor)
- Satellite Data Processing
- Pathology
- Water Contamination Study
- Multi-perspective volume reconstruction

AVHRR Level 1 Data

- As the TIROS-N satellite orbits, the Advanced Very High Resolution Radiometer (AVHRR) sensor scans perpendicular to the satellite's track.
- At regular intervals along a scan line, measurements are gathered to form an instantaneous field of view (IFOV).
- Scan lines are aggregated into Level 1 data sets.
- A single file of Global Area Coverage (GAC) data represents:
  - one full earth orbit
  - 110 minutes
  - 40 megabytes
  - 15,000 scan lines
- One scan line is 409 IFOVs.

Outline

- Active Data Repository
  - Overall architecture
  - Query planning
  - Query execution
  - Experimental Results
- DataCutter
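
The query pattern described under Large Data Collections and Typical Query (a spatial range query selects part of the raw data, then an aggregating computation projects it onto a much smaller output grid) can be illustrated with a minimal Python sketch. This is not the Active Data Repository or DataCutter implementation. The function names, the sample points, and the choice of averaging as the aggregation are all hypothetical, and a brute-force scan stands in for the quad-tree or R-tree index the slides mention.

```python
from collections import defaultdict


def range_query(points, lo, hi):
    """Return the (x, y, value) points inside the box [lo, hi], inclusive.

    Assumption: a linear scan replaces the spatial index (quad-tree, R-tree)
    that a real system would use to avoid reading the whole dataset.
    """
    return [
        p for p in points
        if lo[0] <= p[0] <= hi[0] and lo[1] <= p[1] <= hi[1]
    ]


def project_to_grid(points, lo, hi, nx, ny):
    """Project points onto an nx-by-ny output grid and average per cell.

    Assumption: averaging is only one possible aggregation; the slides do
    not say which function is applied.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for x, y, value in points:
        # "Map" step: find the output grid cell this input element falls in.
        i = min(int((x - lo[0]) / (hi[0] - lo[0]) * nx), nx - 1)
        j = min(int((y - lo[1]) / (hi[1] - lo[1]) * ny), ny - 1)
        # "Reduce" step: accumulate into that cell.
        sums[(i, j)] += value
        counts[(i, j)] += 1
    return {cell: sums[cell] / counts[cell] for cell in sums}


# Hypothetical sensor readings: (x, y, measured value).
samples = [
    (0.5, 0.5, 10.0),
    (0.7, 0.2, 20.0),
    (1.5, 0.5, 30.0),
    (5.0, 5.0, 99.0),  # outside the query box below
]

query_lo, query_hi = (0.0, 0.0), (2.0, 2.0)
selected = range_query(samples, query_lo, query_hi)
grid = project_to_grid(selected, query_lo, query_hi, nx=2, ny=2)
print(grid)
```

Three of the four samples fall inside the query box, and they collapse into two output cells, which mirrors the point that the data product is smaller than the range-query result.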


