U-M EECS 582 - Crowdsourcing for Large-Scale Pervasive Sensing - D3004220

School name University of Michigan

Course Eecs 582- Adv Operat Sys

Pages 2

This preview shows page 1 out of 2 pages.

Crowdsourcing for Large-Scale Pervasive Sensing

Deepak Ganesan, Mark Corner
Department of Computer Science
University of Massachusetts Amherst, MA 01003
{dganesan,mcorner}@cs.umass.edu

Crowdsourcing, or the act of outsourcing a task to the crowd, has the potential to revolutionize information collection and processing systems by enabling in-depth, large-scale, cost-effective information gathering, and more accurate techniques for information extraction from data. Crowdsourcing provides a powerful mechanism for creating data about the physical world, particularly through the use of mobile phones and their rich set of on-board sensors (GPS, audio, video, etc.). These sensors can be utilized to provide continuous and unprecedented visibility into the state of the world across many scales. Crowdsourcing is also effective when humans are better than existing automated computer algorithms, for example labeling images, transcribing speech, annotating text, transcribing scanned documents, language translation, and others. Our research seeks to leverage crowdsourcing to address several hard challenges in pervasive computing systems.

Crowdsourcing for Data Collection

Despite the ability to leverage app markets for smartphones, researchers still face a significant barrier to doing experiments at scale. The barriers to scaling are many: 1) competition with hundreds of thousands of applications on mobile app markets, 2) investment in time and energy to design a robust, scalable, and visually appealing application and backend infrastructure, 3) limited retention from users, who rarely use applications beyond a few weeks, 4) handling human subjects, privacy, incentives, data quality, and several other challenges that are intrinsic to the use of crowds, and others.
Thus, while the idea of scaling to millions of devices is appealing, the pervasive computing community still largely relies on expensive and short-term user studies with limited numbers of users.

mCrowd (http://crowd.cs.umass.edu) is a research enabler for pervasive computing at scale, inspired by crowdsourcing systems such as Amazon Mechanical Turk. mCrowd enables a researcher to rapidly create "tasks" involving data collection from users and sensors on the phone such as audio, video, images, surveys, GPS traces, wireless connectivity traces, and others. Users who download our application for mobile phones can participate in any of these data collection efforts, in return for a reward. mCrowd will also provide several APIs for incentives, privacy, data quality management, and sensor processing that will enable a user to leverage the framework for their specific data collection goals. By providing access to a large set of people who are willing to participate in research studies for low pay, we hope to spur research innovation in a variety of disciplines that can use data from people and phones.

Data collection from the masses opens up a spectrum of research problems. How can we incentivize sensor data collection to scale to millions of users? What is the relationship between reward and delay? How can we use the system for diverse end-applications such as healthcare, traffic monitoring, citizen journalism, and others? How can we design techniques to process and filter mobile sensor data to extract useful, actionable information? How can we filter redundant information and spam? How can we handle malicious users whose only goal is to maximize their rewards? How can we target information gathering to the specific users who might have the most valuable data for a task? What are the implications on privacy?
While such questions have been looked at in narrower user studies in the past, mCrowd enables researchers to explore them in the context of a real, crowdsourced marketplace at significantly larger scale than previously possible.

Crowdsourcing for Data Processing

While the ability to generate sensor data on mobile devices has increased dramatically, our ability to process and filter this data to extract useful, actionable information is still a major challenge. Sensor data from a phone presents several data quality challenges such as blurry, dark, or out-of-focus images, high background noise and clutter in audio, GPS error, time-varying sensor orientation, sensor calibration issues, improper sensor placement, missing samples, and others. In addition to these intrinsic quality issues with data from an individual device, data collection from the masses presents challenges due to the need to filter redundant submissions and spam, and to verify the authenticity of the data. Thus, addressing data quality looms as one of the biggest challenges in handling the torrent of sensor data from mobile devices.

We argue that one of the limitations of existing mobile data processing systems is that they are focused solely on the "computation" aspect, i.e., more sophisticated data processing techniques and the use of large computing resources in the cloud. In contrast to automated processing, humans are surprisingly good at filtering sensor data and identifying the most relevant information. Crowdsourced data processing is particularly effective when humans are better than existing automated computer algorithms, for example at labeling images, transcribing audio, and others. It is precisely this ability that we seek to tap into in order to design a human-in-the-loop mobile sensor data processing system.

While cloud computing and crowdsourcing have largely been used separately from one another, we believe that they are more powerful in conjunction than in isolation.
In particular, we believe that new mobile services will involve tight integration of clouds and crowds in a feedback-driven manner: clouds use state-of-the-art algorithms to process sensor data but turn to crowds when their confidence in the result is low, and the crowds provide feedback to the clouds on the quality of results to enable continuous improvement in algorithm parameters. We envision an integrated information architecture that combines clouds and crowds to enable in-depth, large-scale, and cost-effective information gathering, and more accurate techniques for information extraction from data.

Our vision is to develop and implement a comprehensive data quality assessment, filtering, cleaning, aggregation, and validation framework that combines sophisticated computational data processing with human computation, and that can be used on a wide range of mobile services such as participatory sensing, mobile multimedia search and
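
The feedback-driven integration of clouds and crowds described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not part of mCrowd: the CloudCrowdRouter class, the stub functions standing in for a cloud classifier and a crowd marketplace, the majority vote over crowd answers, and the choice of the confidence threshold as the one "algorithm parameter" that crowd feedback adjusts.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class CloudCrowdRouter:
    """Hypothetical router: use the cloud result when confident, else ask the crowd."""
    cloud_classify: Callable[[str], Tuple[str, float]]  # item -> (label, confidence)
    ask_crowd: Callable[[str], List[str]]                # item -> list of worker labels
    threshold: float = 0.8   # assumed starting confidence threshold
    step: float = 0.05       # assumed size of each feedback adjustment

    def process(self, item: str) -> Tuple[str, str]:
        """Return (label, source), where source is "cloud" or "crowd"."""
        label, confidence = self.cloud_classify(item)
        if confidence >= self.threshold:
            return label, "cloud"

        # Low confidence: fall back to the crowd and take a majority vote.
        votes = self.ask_crowd(item)
        crowd_label = Counter(votes).most_common(1)[0][0]

        # Feedback: if the crowd confirms the cloud's guess, the cloud was
        # overly cautious, so relax the threshold; otherwise tighten it.
        if crowd_label == label:
            self.threshold = max(0.5, self.threshold - self.step)
        else:
            self.threshold = min(0.99, self.threshold + self.step)
        return crowd_label, "crowd"


# Stubs standing in for a real classifier and a real crowd (made-up data).
def stub_cloud(item: str) -> Tuple[str, float]:
    return {"sharp.jpg": ("cat", 0.95),
            "blurry.jpg": ("cat", 0.40),
            "dark.jpg": ("dog", 0.55)}[item]


def stub_crowd(item: str) -> List[str]:
    return {"blurry.jpg": ["dog", "dog", "cat"],
            "dark.jpg": ["dog", "dog", "dog"]}[item]


router = CloudCrowdRouter(stub_cloud, stub_crowd)
for name in ["sharp.jpg", "blurry.jpg", "dark.jpg"]:
    print(name, router.process(name), round(router.threshold, 2))
```

In this sketch the crowd is consulted only for items the automated stage is unsure about, which keeps human effort (and reward cost) proportional to the amount of low-quality data. A real system would likely feed crowd labels back into the model itself, not only into a threshold.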