HARVARD CS 263 - An evaluation of multi-resolution search


An evaluation of multi-resolution search and storage in resource-constrained sensor networks

Deepak Ganesan, Ben Greenstein, Denis Perelyubskiy, Deborah Estrin, John Heidemann

Abstract

Wireless sensor networks enable dense sensing of the environment, offering unprecedented opportunities for observing the physical world. Centralized data collection and analysis adversely impact sensor node lifetime. Previous sensor network research has, therefore, focused on in-network aggregation and query processing, but has done so for applications where the features of interest are known a priori. When features are not known a priori, as is the case with many scientific applications in dense sensor arrays, efficient support for multi-resolution storage and iterative, drill-down queries is essential.

Our system demonstrates the use of in-network wavelet-based summarization and progressive aging of summaries in support of long-term querying in storage and communication-constrained networks. We evaluate the performance of our Linux implementation and show that it achieves: (a) low communication overhead for multi-resolution summarization, (b) highly efficient drill-down search over such summaries, and (c) efficient use of network storage capacity through load-balancing and progressive aging of summaries.

1 Introduction

Research in sensor networks has been targeted at numerous scientific applications, including micro-climate and habitat monitoring [1, 2, 3] and earthquake and building health monitoring ([4]). Such networks are primarily intended for long-term deployment, to obtain data about previously unobservable phenomena for detailed analysis by experts in the field. Data analysis in such applications often involves complex signal manipulation, including modeling, searching for new patterns or trends, looking for correlation structures, etc.
For instance, researchers interested in building health monitoring seek to correlate changing vibration patterns of buildings to data about small earthquakes. Conventional approaches to such monitoring have involved wired and sparsely deployed networks that transfer all data from sensors to a central data repository for persistent storage.

The goal of providing a non-invasive, in situ, dense, long-term deployment of sensing infrastructure has necessitated that sensor nodes be cheap, wireless, and consume very little power. An unfortunate consequence of the limited resources of such nodes is that they are highly communication constrained, severely limiting deployment lifetime if all raw data must be transmitted to a central location (Table 1, [5]). The long-term storage requirements of such systems add an additional dimension to optimize, since the storage capacity of low-end sensor nodes (motes [6], smartdust [7]) will be limited by cost and form factor. Table 1 shows that the storage on current mote sensor nodes ([6]) is highly insufficient for high data rate sensor network applications (building health, habitat monitoring), and lasts at most a month for even low data rate applications such as micro-climate monitoring. The problem of insufficient storage is exacerbated by the fact that such systems are long-lived and operate unattended for many years. Non-volatile storage prices and form factor will no doubt decline, yet, for long-lived sensor nodes, the disparity between the size of the sensor data produced and the on-board memory will mandate storage optimization.

Existing techniques support applications where the features of interest and aggregation operators are well-defined (Figure 1). For instance, a Diffusion query suggested in [8] tracks the movement of a bird with a known signature. Such an event detection scheme can be augmented with in-network Data-Centric Storage ([9]) to store related detections at predefined locations in the network.
TAG [10] provides SQL-like semantics to define aggregates on a data collection tree, so that operators at junctions can construct streaming data aggregates such as histograms. While all three techniques are important to current and future sensing systems, they are not sufficient for all applications. Diffusion and DCS require that queries and associated processing be defined a priori. Aggregation operators in TAG are pre-defined, and intentionally selective enough that little data is communicated out of the network. Furthermore, TAG's standing queries are not designed to search through stored data. For instance, identifying an anomalous seismic event might require statistics about previous events. For such queries, it is more efficient to store data within the network and pre-process it to handle multiple searches efficiently.

The key idea behind our system is spatio-temporal summarization: we construct multi-resolution summaries of sensor data and store them in the network in a spatially and hierarchically decomposed distributed storage structure optimized for efficient querying.
A promising approach was introduced in [11], where multi-resolution summarization using wavelets and drill-down querying were proposed. Summaries are generated

[Figure 1: Feature Extraction in Sensor Networks. Diagram with axes Features (well defined to ill-defined) and Storage (unlimited to limited).]

[Figure 2: Constructing a DIMENSIONS Hierarchy: Temporal and Spatial Summarization. Diagram showing finer views at lower levels and coarser views at higher levels; with aging, some summaries are stored for more time and others for less time.]

Table 1: Data Requirement estimates for Scientific Applications

| Application | Sensors | Expected Data Rates | Data Requirements/Year | Expected Lifetime using Centralized Storage (approx): Mote (a) | Expected Lifetime using Centralized Storage (approx): MK-2 [13] (b) | Expected Time to Storage Limit if all Raw data were stored (approx): Mote | Expected Time to Storage Limit if all Raw data were stored (approx): MK-2 |
|---|---|---|---|---|---|---|---|
| Seismic Monitoring [4] | Accelerometer | 30 minutes seismic events per day per sensor | 8Gb/year | few weeks | few months | few days | 1 year |
| Micro-climate Monitoring | Temperature, Light, Precipitation, Pressure, Humidity | 1 sample/minute/sensor | 40Mb/year | few months | 1 year | 1 month | 25 years |
| Habitat Monitoring | Acoustic, Video | 10 minutes of audio and 5 mins of video per day | 1Gb/year | few weeks | few months | few days | 8 years |

(a) Mote: Peak Current 25mA, Radio Baud Rate 10Kbps effective, 4Mb storage, 2 AA batteries
(b) MK2: Peak Current 50mA, Radio Baud Rate 10Kbps effective, 1Gb storage, 8 AA batteries
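
The multi-resolution summarization, progressive aging, and drill-down querying described in the abstract and introduction can be illustrated with a small sketch. This is not the authors' implementation: it assumes a Haar-style pairwise averaging over a single node's one-dimensional time series, keeps only the coarse averages (a real wavelet summary would also retain selected detail coefficients), and uses hypothetical names (`build_summaries`, `age_summaries`, `drill_down`) and made-up retention periods.

```python
from typing import Dict, List, Sequence


def haar_approximation(signal: Sequence[float]) -> List[float]:
    """One Haar-style step: replace each adjacent pair by its average."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


def build_summaries(signal: Sequence[float], levels: int) -> List[List[float]]:
    """Return [level 0 (raw), level 1, ..., level `levels`], each half as long."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be a multiple of 2**levels")
    summaries = [list(signal)]
    for _ in range(levels):
        summaries.append(haar_approximation(summaries[-1]))
    return summaries


def age_summaries(summaries: List[List[float]], age: int,
                  retention: Sequence[int]) -> Dict[int, List[float]]:
    """Progressive aging (assumed policy): level k is kept while age <= retention[k].

    Coarser levels are smaller, so they can be given longer retention periods.
    """
    return {k: s for k, s in enumerate(summaries) if age <= retention[k]}


def drill_down(summaries: List[List[float]]) -> int:
    """Greedy drill-down: start at the coarsest level and repeatedly descend
    into the child with the larger value. Returns an index into level 0.

    This is a heuristic; it is not guaranteed to find the global maximum.
    """
    top = summaries[-1]
    index = max(range(len(top)), key=top.__getitem__)
    for level in range(len(summaries) - 2, -1, -1):
        left, right = 2 * index, 2 * index + 1
        finer = summaries[level]
        index = left if finer[left] >= finer[right] else right
    return index


if __name__ == "__main__":
    readings = [1, 3, 2, 2, 8, 10, 0, 0]          # made-up sensor samples
    pyramid = build_summaries(readings, levels=3)
    print(pyramid[1:])                             # [[2.0, 2.0, 9.0, 0.0], [2.0, 4.5], [3.25]]
    print(sorted(age_summaries(pyramid, age=3, retention=[1, 2, 4, 8])))  # [2, 3]
    print(drill_down(pyramid))                     # 5
```

In this toy run, the raw data and the finest summary have aged out by time 3 while the two coarsest levels remain, which mirrors the idea that coarse summaries are kept longer than fine ones. The drill-down follows the larger average at each level and ends at the sample with value 10.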

