GT CS 8803 - CS 8803 – Advanced Internet Architecture
School name Georgia Tech
Pages 7


CS 8803 – Advanced Internet Architecture Spring 2006

What Where Wi: An analysis of information leaked by millions of wireless access points

Kipp Jones
[email protected]

Abstract

Wireless networks have sprung up by the millions throughout the world as consumers and businesses move towards a mobile, networked world. These access points (APs) are installed and managed by individuals and businesses and are unregulated, allowing anybody to install and operate one of these radio devices. This has allowed literally millions of these APs to become available and ‘visible’ to any interested party who happens to be within range of the radio waves emitted from the device. Much has been written about the security risks [1][2][4][5][8] associated with wireless access points, many of which operate in an ‘unsecured’ mode that does not require authentication to use. Research to date has focused on two primary topics: protecting access to these wireless networks and maintaining the privacy of their users. But there is another potential for information leakage, one that is accessible even to those not interested in directly accessing the network. These wireless access points emit a certain amount of information that can be combined with the physical location and potentially used for other purposes. This proposed research will explore the possibilities for using this information to discover patterns, analyze behavior, and explore the naming and location information that can be discerned by gathering information about these access points over time. The goal is to explore the “What, Where, and Why” of WiFi access points and their information.

Motivation

We have obtained the rights to analyze information regarding over 3.25 million 802.11 wireless access points. This data has been gathered over approximately one year and corresponds to systematic scanning in some 75 cities throughout the United States, as shown in Figure 1.
This scanning process has correlated each access point's information with its GPS location and signal strength.

Figure 1. Skyhook Wireless coverage areas. Cities in red are in progress while blue cities have been completed.

The dataset (as detailed in Table 1) consists of GPS and WiFi access point logs obtained by Skyhook Wireless. Skyhook has granted us permission to use this data for research, subject to basic protections for their data. We have access to both the raw data and the processed data indicating the calculated location of these access points. In addition, the data indicates the amount of motion that these APs experience over time, whether due to calculation error or to physical movement of the access points. Information related to each access point includes the Service Set Identifier (SSID), the MAC address, the geographic location by longitude and latitude, and the dates the AP was first and last scanned.

Table Name        Description                                                                Number of Records
CentralAP         Contains unique access points and their calculated geographical location          3,252,883
ChangeAP          Contains location adjustments to APs over time                                    2,957,034
RawScanningLogs   Contains the AP scan records from drivers                                       817,838,373
ScannerGpsLogs    Contains original GPS logs from drivers                                         760,369,932

Table 1. Description of data available for direct analysis.
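To make the per-AP information concrete, the following is a minimal Python sketch of one access point record and of one way the "amount of motion" might be quantified: summing the great-circle distance between successive calculated positions. The field names, the `AccessPoint` class, the haversine formula, and the helper names are all assumptions made for illustration; the source does not describe Skyhook's actual schema or how motion is computed.

```python
import math
from dataclasses import dataclass
from datetime import date

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres


@dataclass(frozen=True)
class AccessPoint:
    """Hypothetical record mirroring the fields listed in the text."""
    ssid: str
    mac: str
    latitude: float   # degrees
    longitude: float  # degrees
    first_scanned: date
    last_scanned: date


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(min(1.0, a)))


def total_movement_m(positions):
    """Sum of distances between consecutive (lat, lon) positions of one AP."""
    return sum(haversine_m(*p, *q) for p, q in zip(positions, positions[1:]))
```

A sequence of location adjustments for a single AP (for example, rows drawn from a table like ChangeAP) could be passed to `total_movement_m` as a list of `(latitude, longitude)` tuples. A small total would be consistent with calculation error, a large one with physical relocation, although any threshold separating the two would have to be chosen empirically.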
Beyond the fact that this data is available for analysis, there are several other motivating factors:

• The company that generated the data, Skyhook Wireless, is interested in learning more about the value of the data and ways to use the information to improve their service;
• There are potentially interesting privacy and/or security implications, especially in the naming of access points, that should be explored;
• The fact that these access points are associated with location information could be used to infer additional third-party details by means of data fusion;
• Companies such as FON [3] rely on the installation of their software on these access points and on having sufficient coverage to provide their service; analyzing the manufacturer per region may yield useful marketing and engineering requirements.

In short, the increase in the number of access points is not going to stop. Nor is the gathering of information about these access points. This project intends to explore what value (good or bad) the information leaked by these devices can provide.

Approach and Project Plan

The general approach will be to identify items of interest and then create programs to analyze, calculate, and potentially map the results using an interactive process. In essence, this project will create a ‘Mashup’ [7] of wireless positioning data that will be mapped using the Google Maps API.

Project Outline

The following are the general steps that will be followed to accomplish this goal:

1. Obtain and validate data and rights to data
2. Install and load DB on a local system
3. Perform initial hand analysis to identify priority items (see below for candidates and initial order)
4. Create mapping Mashup visualization
5. Create interactive analysis interface
6. Evaluate the resulting information to discern items of value

Key elements of this outline will be detailed in the following section.
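As a rough illustration of the manufacturer-per-region analysis mentioned among the motivating factors, the sketch below groups access points by the first three octets of their MAC address (the Organizationally Unique Identifier, or OUI, which the IEEE assigns to hardware vendors) and tallies them per region. The function names, the `(region, mac)` input shape, and the region labels are hypothetical; mapping an OUI to a vendor name would additionally require the IEEE registry, which is not included here.

```python
from collections import Counter

HEX_DIGITS = set("0123456789abcdefABCDEF")


def oui_prefix(mac: str) -> str:
    """Return the first three octets of a MAC address as six uppercase hex digits."""
    # Drop separators such as ':' or '-' and keep only hex digits.
    digits = "".join(ch for ch in mac if ch in HEX_DIGITS).upper()
    if len(digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return digits[:6]


def count_ouis_by_region(records):
    """Tally OUI prefixes per region.

    `records` is an iterable of (region, mac) pairs; the region label is
    assumed to have been derived elsewhere from the AP's location.
    """
    counts = {}
    for region, mac in records:
        counts.setdefault(region, Counter())[oui_prefix(mac)] += 1
    return counts
```

Calling `count_ouis_by_region` on pairs such as `("region_a", "00:11:22:33:44:55")` returns a dictionary of `Counter` objects, so `counts["region_a"].most_common(5)` would list the five most frequent prefixes in that region.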
Project Schedule

Milestone   Description
March 1     Preparation
March 13    Pre-analysis
March 27    Interactive Interface
April 12    Initial Results
April 24    Final Project Delivery

Table 2. Key project milestones.

Preparation

Preparation consists of obtaining final rights for the data, installing and loading the database, and installing and configuring a web server. These tasks are scheduled to be completed by the end of February.

Pre-analysis

During the pre-analysis phase, the data will be examined further and candidates for analysis will be prioritized. This step will require some manual processing of the data and will serve as the prototyping of the interactive system. Some options that will be analyzed further to determine the potential value and possibilities for competition include:

Mine the data for interesting information:

• Manufacturer based on MAC address [9]
• Location information in

