Location-based Spatial Queries Jun Zhang§ Manli Zhu§ Dimitris Papadias§ Yufei Tao† Dik Lun Lee§ §Department of Computer Science Hong Kong University of Science and Technology Clear Water Bay, Hong Kong { zhangjun, cszhuml, dimitris, dlee}@cs.ust.hk †Department of Computer Science Carnegie Mellon University Pittsburgh, USA [email protected] Abstract In this paper we propose an approach that enables mobile clients to determine the validity of previous queries based on their current locations. In order to make this possible, the server returns in addition to the query result, a validity region around the client's location within which the result remains the same. We focus on two of the most common spatial query types, namely nearest neighbor and window queries, define the validity region in each case and propose the corresponding query processing algorithms. In addition, we provide analytical models for estimating the expected size of the validity region. Our techniques can significantly reduce the number of queries issued to the server, while introducing minimal computational and network overhead compared to traditional spatial queries. 1. INTRODUCTION Spatial databases have been extensively studied during the last two decades and several spatial access methods (SAMs) have been proposed [GG98]. Among the most popular ones is the R-tree and its variations, notably the R*-tree [BKSS90]. R-trees can be viewed as multi-dimensional extensions of B-trees. Figure 1 shows an exemplary R-tree for a set of points {a, b, c,…} assuming a capacity of three entries per node. Points that are close in space (e.g., a, b, c) are clustered in the same leaf node (E4) represented as a minimum bounding rectangle (MBR). Nodes are then recursively grouped together following the same principle until the top level, which consists of a single root. abcdefhgE1E2E3E4E5E6E7E8RootE9iE1E2E4E5E8E2046810246810x axisy axisbEfomitted1E2edcagE3E5E6E4E78contentsE9iquerywindowq Fig. 1: An R-tree example R-trees (like most SAMs) were motivated by the need to efficiently process window queries, which retrieve all the objects that intersect a window (shaded area in Figure 1). The R-tree answers a window query q as follows. The root is first retrieved and the entries inside it are compared with q. The ones (e.g., E1, E2) that intersect q may contain qualifying points and are recursively searched until the leaf (data point) level. Entries not intersecting the query window (e.g., E3) are not visited. Another important type of spatial information processing is the nearest neighbor query, which retrieves the data point that is closest to a query point. Roussopoulos et al., [RKV95] propose a branch-and-bound algorithm that searches the R-tree in a depth-first manner. Specifically, starting from the root, all entries are sorted according to their minimum distance (mindist) from the query point, and the entry with the smallest value is visited first. The process is repeated recursively until the leaf level where the first potential nearest neighbor is found. During backtracking to the upper levels, the algorithm only visits entries whose mindist is smaller than the distance of the nearest neighbor already found. In the example of Figure 2, the algorithm first visits the root entry E1 (since it has the minimum mindist), and then E4, where the first candidate object (a) is retrieved. When backtracking to the previous level, entries E5 and E6 are excluded, since their mindist is greater than (or equal to) the distance of a. Then E2 and E8 are accessed, where the actual nearest neighbor (point h) is found. Samet and Hjaltason [HS99] develop an improved algorithm, which only visits nodes that may contain the actual nearest neighbor (i.e., in the previous example the algorithm would find point h, without first retrieving a). Both algorithms can be easily applied for the retrieval of k > 1 nearest neighbors. E2046810246810x axisy axisbEf1E2edcagE3E5E6E4E78E9iquery pointmindist( )E2h= mindist( )E8mindist( )E4= mindist( )amindist( )E1 Fig. 2: Nearest neighbor example The traditional scenario in spatial databases assumes that (i) queries are static and (ii) each query returns a single output and terminates. Consider now an alternative situation where a user with a location-aware mobile device poses a continuous query with respect to his/her current position (e.g., he wants to know the closest restaurant as he moves along). The query is sent to the server, where it is processed and the result is returned to the client via the underlying wireless networks. Due to the mobility of the user, the result may be invalidated immediately as the user's position changes. The conventional approach to attain up-to-date information is to pose a new query to the server after a position update, which could lead to high network overhead and extra processing effort. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. ACM SIGMOD'2003, June 9-12, San Diego, California, USA. Copyright 2003 ACM 1-58113-634-X/03/06...$5.00. 443443Assume in Figure 3 that the user (at position q) issues a nearest neighbor query returning point o. The motivation of this work is that while the user remains in a certain area around the initial position, called the validity region (shaded area), the result remains the same. In addition to the query result, the server has to return the validity region of the query, according to which the mobile client is able to determine whether a new query should be issued by verifying whether it is still inside the validity region. Given that the mobile clients have limited storage and processing capabilities compared to the server, it pays off to perform additional processing on the server side (for the initial query) in order to reduce the number of subsequent queries. Furthermore, the representation of the validity region should be compact in order to reduce the network cost, the storage requirements and the computation that must be performed at the client side. validityregionqo Fig. 3: Validity region In this paper, we study the problem of determining
View Full Document