UW CSEP 590 - Lecture Notes


1. MobileASL: Making Cell Phones Accessible to the Deaf Community
Richard Ladner
University of Washington

2. American Sign Language (ASL)
• ASL is the preferred language for about 500,000 - 1,000,000 Deaf people in the U.S. and most of Canada.
• ASL is not a code for English
• Signs usually occur within the “sign-box”
• Composed of location, orientation, shape of hands and arms + facial expressions
• Usually uses 2 hands, but one-handed signing is not uncommon

3. Current Technology for Deaf People (text)
• TTY
• Sidekicks and Blackberries (text, pictures, non-real-time video)
Benefits:
• Low bandwidth
• Mobile (PDAs)
Problems:
• English, not ASL

4. Current Technology for Deaf People (video phones)
• Set-top boxes
• Web cams
Benefits:
• ASL, not English
Problems:
• Requires high bandwidth
• Not mobile

5. Challenges:
• Limited network bandwidth
• Limited processing power on cell phones
Our goal:
• ASL communication using video cell phones over current U.S. cell phone network

6. Architecture
[Diagram: sender (camera, encoder, transmitter) → cell phone network → receiver (receiver, decoder, player); cell phone user interface]

7. Cell Phone Network Constraints
• MobileASL is about fair access to the current network
– As soon as possible, no special accommodations
• Low bit rate constraint
– GPRS - Ranges from 30 kbps to 80 kbps (download)
• Low Power
– Cell phones run at much lower Hz than PCs
• New mobile broadband services
– Higher bandwidth for download, not upload.

8. What about 3G?

9. Portrait
• Special Codec from Microsoft Asia
• Low Bandwidth, Low Power, small size video (160 x 120)
• May not be suitable for sign language
Keman Yu, Jiangbo Lv, Jiang Li and Shipeng Li, 2003

10. Codec Used: x264*
• Open source implementation of H.264 standard
• Doubles compression ratio over MPEG2
• x264 offers faster encoding
• Main profile
• Off-the-shelf H.264 decoder can be used
* The code is written from scratch by Laurent Aimar, Loren Merritt, Eric Petit, Min Chen, Justin Clay, Mans Rullgard, Radek Czyz, Christian Heine, Alex Izvorski, and Alex Wright. It is released under the terms of the GPL license.
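
To put the low bit rate constraint from the Cell Phone Network Constraints slide in perspective, the following minimal sketch estimates how much compression a codec would need to fit small-format video into the quoted GPRS range of 30 to 80 kbps. The 160 x 120 frame size comes from the Portrait slide. The 12 bits per pixel (uncompressed YUV 4:2:0), the 10 fps frame rate, the 1 kbps = 1000 bits/s convention, and both function names are assumptions made for illustration only.

```python
def raw_bitrate_kbps(width, height, fps, bits_per_pixel=12):
    """Uncompressed video bit rate in kbps (1 kbps = 1000 bits/s).

    bits_per_pixel=12 assumes YUV 4:2:0 sampling; this is an
    illustrative assumption, not a figure from the slides.
    """
    return width * height * bits_per_pixel * fps / 1000


def required_compression_ratio(width, height, fps, channel_kbps,
                               bits_per_pixel=12):
    """Ratio by which raw video must shrink to fit the channel."""
    return raw_bitrate_kbps(width, height, fps, bits_per_pixel) / channel_kbps


if __name__ == "__main__":
    # GPRS download range quoted on the slide: 30 kbps to 80 kbps.
    for channel in (30, 80):
        ratio = required_compression_ratio(160, 120, 10, channel)
        print(f"{channel} kbps channel: compress about {ratio:.1f}x")
```

Under these assumptions the raw stream is roughly two orders of magnitude larger than the low end of the GPRS range, which is why the choice of codec and its settings dominates the rest of the talk.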

11. Outline
• Motivation
• Introduction
• User Studies
• Rate, distortion, complexity optimization
• x264 implementation
• User Interface
• Current and future research

12. MobileASL Focus Group
• 4 Deaf people, mid-20s to mid-40s
• Open-ended questions:
– Physical Setup
• Camera, distance, …
– Features
• Compatibility, text, …
– Privacy Concerns
• ASL is a visual language
– Scenarios
• Lighting, driving, relay services, …

13. Implications of Focus Group
• “I don’t foresee any limitations. I would use the phone anywhere: the grocery store, the bus, the car, a restaurant, … anywhere!”
• There is a need within the Deaf Community for mobile ASL conversations
• Existing video phone technology (with minor modifications) would be usable

14. Eyetracking Studies
• Participants watched ASL videos while eye movements were tracked
• Important regions of the video could be encoded differently
* Muir et al. (2005) and Agrafiotis et al. (2003)

15. Eyetracking Results
• 95% of eye movements within 2 degrees visual angle of the signer’s face (demo)
• Implications: Face region of video is most visually important
– Detailed grammar in face requires foveal vision
– Hands and arms can be viewed in peripheral vision
* Muir et al. (2005) and Agrafiotis et al. (2003)

16. Mobile Video Phone Study
• 3 Region-of-Interest (ROI) values
• 2 Frame rates, frames per second (FPS)
• 3 different Bit rates
– 15 kbps, 20 kbps, 25 kbps
• 18 participants (7 women)
– 10 Deaf, 5 hearing, 3 CODA*
– All fluent in ASL
* CODA = (Hearing) Child of a Deaf Adult

17. Example of ROI
Varied quality in fixed-sized region around the face
• (demo)
2x quality in face
4x quality in face

18. Examples of FPS
• Varied frame rate: 10 fps and 15 fps
• For a given bit rate: Fewer frames = more bits per frame
• (demo)

19. Questionnaire

20. User Preferences Results
[Chart: average participant response (scale 1 to 5) by type of encoding. Bit rate: 15 kbps, 20 kbps, 25 kbps. Frame rate: 10 fps, 15 fps. Region of interest: 0 roi, 6 roi, 12 roi.]

21. Implications of results
• A mid-range ROI was preferred
– Optimal tradeoff between clarity in face and distortion in rest of “sign-box”
• Lower frame rate preferred
– Optimal tradeoff between clarity of frames and number of frames per second
• Results independent of bit rate

22. Outline
• Motivation
• Introduction
• User studies
• Rate, distortion, complexity optimization
• x264 implementation
• User Interface
• Current and future research

23. Rate, distortion and complexity optimization
[Diagram: raw video and input parameters → H.264 encoder → compressed video]
• Objective: Achieve best possible quality for least encoding time at a given bitrate

24. Parameter Settings
[Diagram: raw video and input parameters → H.264 encoder → distortion, encoding time]
Input parameter            # of options
# of reference frames      16
motion estimation          7
partition size             10
quantization method        3
Total = 16 x 7 x 10 x 3 = 3360 tests/video clip

25. Time – Complexity Tradeoff
[Plot: 30 kbps, 10 ASL videos]

26. GBFOS Approach
• Choose input parameter that minimizes the slope on the convex hull and repeat.
• Parameter settings are not independent.
• Basic – Compute slopes once.
• Iterative – Recompute slopes after each parameter is chosen.
Chou, Lookabaugh, Gray, 1989

27. PSNR vs. Average Encoding Time
[Plot]

28. Outline
• Motivation
• Introduction
• User studies
• Rate, distortion, complexity optimization
• x264 implementation
• User Interface
• Current and future research

29. Encoding/Decoding on the Cell Phone
• Implemented a command-line version of x264 on a cell phone using Windows Mobile Edition 5.0.
• Required significant modifications to the Linux-based x264 codec.

30. Encoding performance for high/medium/low quality settings with and without code optimization
[Chart: frames/second (average), axis 0 to 27, unoptimized vs. Wireless MMX optimized, for high, med, low, and lowest settings at QVGA (320 x 240) and QCIF (176 x 144)]

31. Examples of Low Frame Rates
• Demo

32. Outline
• Motivation
• Introduction
• User studies
• Rate, distortion, complexity optimization
• x264 implementation
• User Interface
• Current and future research

33. User Interface Design: Goals
• Usable, intuitive, easy to learn
• Inspired by Deaf users
• Utilize existing knowledge (VP, Webcam, Sorenson …)
• Design stages:
– Story boards
– Paper prototype
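
Returning to the Parameter Settings and GBFOS Approach slides: the sketch below contrasts the exhaustive 3360-test sweep with a greedy, slope-driven search in the spirit of the "Iterative" variant, where slopes are recomputed after each parameter is chosen. It is a simplified illustration, not the authors' implementation and not a full GBFOS algorithm. The option counts come from the slide. Everything else is hypothetical: `toy_measure` stands in for actually encoding a clip and measuring time and distortion, and the cost tables and the names `greedy_path` and `exhaustive_test_count` are invented for this example.

```python
from math import prod

# Option counts from the "Parameter Settings" slide.
OPTION_COUNTS = {
    "reference_frames": 16,
    "motion_estimation": 7,
    "partition_size": 10,
    "quantization_method": 3,
}

# Hypothetical costs; real numbers would come from timing the encoder.
TIME_PER_LEVEL = {"reference_frames": 0.5, "motion_estimation": 2.0,
                  "partition_size": 1.0, "quantization_method": 1.5}
DISTORTION_WEIGHT = {"reference_frames": 2.0, "motion_estimation": 6.0,
                     "partition_size": 4.0, "quantization_method": 3.0}


def exhaustive_test_count(option_counts):
    """Number of encodes needed to try every combination."""
    return prod(option_counts.values())


def toy_measure(levels):
    """Hypothetical (encoding time, distortion) for a settings dict.

    Higher levels cost more time and reduce distortion with
    diminishing returns.
    """
    time = 1.0 + sum(TIME_PER_LEVEL[n] * lv for n, lv in levels.items())
    distortion = sum(DISTORTION_WEIGHT[n] / (1 + lv) for n, lv in levels.items())
    return time, distortion


def greedy_path(option_counts, measure, time_budget):
    """Start at the fastest settings; repeatedly raise the one parameter
    whose step has the smallest (most negative) distortion/time slope."""
    levels = {name: 0 for name in option_counts}
    time, distortion = measure(levels)
    path = [(dict(levels), time, distortion)]
    while True:
        best = None
        for name, count in option_counts.items():
            if levels[name] + 1 >= count:
                continue  # parameter already at its highest option
            trial = dict(levels, **{name: levels[name] + 1})
            t, d = measure(trial)
            if t > time_budget or t <= time:
                continue  # over budget, or no usable time difference
            slope = (d - distortion) / (t - time)
            if best is None or slope < best[0]:
                best = (slope, trial, t, d)
        if best is None or best[0] >= 0:
            break  # nothing affordable improves distortion
        _, levels, time, distortion = best
        path.append((dict(levels), time, distortion))
    return path


if __name__ == "__main__":
    print("exhaustive tests per clip:", exhaustive_test_count(OPTION_COUNTS))
    for settings, t, d in greedy_path(OPTION_COUNTS, toy_measure, 6.0):
        print(f"time={t:.2f} distortion={d:.2f} {settings}")
```

Each greedy step costs at most one measurement per parameter, so the search touches far fewer settings than the full grid. The tradeoff, as the slide notes, is that parameter settings are not independent, so a greedy path can miss combinations that only pay off together.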

