
cs414
empirical user studies
Karrie Karahalios, Eric Gilbert
6 April 2007
some slides courtesy of Brian Bailey and John Hart

Messages
  Conduct user study to gain more precise measure of the usability of an interface or system
  Complements low-fidelity techniques
  Requires a larger investment than low-fi prototyping
  Provide positive experience for users!

In Context of Task-Centered UI Design

Empirical User Studies
  Measure performance, error rate, learnability and retention, satisfaction, tolerable network delay…
    adapt to your particular interface and context
  Compare results to usability goals
  Identify usability issues and resolve them

Overview of Doing Empirical User Studies
  Develop materials
  Prepare for the study
  Conduct the study
  Analyze results and iterate
  Learn from the experience

Prepare for the Study
  Identify usability goals
  Develop experimental tasks and design
  Recruit users
  Instrument software/hardware

Identify Usability Goals
  Identify questions you want answered
    questions should be specific and measurable
  Examples:
    can a user perform each task in < 30s?
    after only five minutes of instruction, can a user perform each task with < 2 errors?
    are users rating the interface at least a ‘3’ for overall satisfaction on a 5-point scale?

Develop Experimental Design
  Structure of experiment
    what will users do, in what order, where, etc.
  Between groups (randomly assigned to treatment groups)
    Control group
    Experimental group
  Within groups
    Each user performs under all conditions
    Order randomized
    Cheaper because it uses fewer participants

Experimental Variables
  What gets changed and what is its effect?
  Independent variables
    the variables you manipulate
    e.g. # of menu items, lighting conditions, mouse vs. keys
  Dependent variables
    measured part
    e.g. speed of menu choice, reaction time to stimuli
  Variable type matters
    discrete
    continuous

Recruit Users
  Typically want about 8 – 12 users
    depends on desired confidence in the results
    12 is the magic number for the ANOVA test (more later)
  This could be the most challenging aspect of the study
    expect about a 0.1% to 10% response rate
    may need IRB approval, especially if you want to publish
  Give users a compelling reason to participate

Demographic Diversity
  It is important to target your user population.
    example: if you are developing for Firefox, make sure that you use people already familiar with Firefox.
  Beyond that, it is also important to gain a diversity of different types of users:
    age
    sex
    education
    occupation
    ...
  can tell you important things about your system, and help you generalize

Instrument Software/Hardware
  Log performance and errors (if possible)
  Determine media capture needs
    ensure that you have access to equipment
    manage physical layout of the testing space
  Anything else that you need?

Conduct the Study
  Give user an overview of the study
  Introduce your system, allow for practice
  Have users work through the tasks
  Collect experimental measures (e.g., performance and error data)
  Fill out questionnaire, if any
  Debrief the user
  Entire session should last less than 60 minutes

Tell the User At Least:
  Purpose of the study, but not necessarily details of what you are testing
  What they will be doing (the tasks)
  They are not being tested, the interface/system is
  They can quit at any time, and doing so will not affect their relationship with you, the university, the company, etc.
  About the equipment in the room
  Whether their face and/or actions will be recorded
  How to think aloud (if you are collecting verbal data)
  Whether you will be available to answer questions
  Their data will be viewed only in aggregate form
  How long the session will take

Make Users Feel Comfortable
  Offer breaks at boundary points
  Offer to send results in aggregate form or allow users to see the improved interface
  Develop understandable instructions
  Do not “defend” your interface
  Do not make subjective comments about users, ease or difficulty of tasks, etc.

Analyze Results and Iterate
  Analyze data using statistical methods (ANOVAs and Chi-Squared tests common)
    take a stats course, e.g., Stat 320, for more detail
  Did you meet the goals? How far from the goals are you?

t-tests and ANOVAs
  t-tests compare two random samples and determine if the samples are statistically significantly different
    e.g., are dynamic menus better than static menus?
  ANOVAs (analysis of variance) compare n random samples and determine if the samples are statistically significantly different
    e.g., which is best: dynamic, static or radial menus?
  Both assume the samples come from normal distributions and both produce p-values.

Normal Distributions
  Bell curve
    y = exp(-x^2)
  Occurs from sum of independent events
    e.g. sum of dice rolls
    Total time = t-find + t-home + t-click
    Total # of errors

Normal Distributions
  1 / (σ √(2π))

p-values
  probability value
  The probability of observing a difference at least as large as the one in your experiment if only random chance were at work
  An expression of the confidence of your result
  Typically, a difference is called statistically significant when p < 0.05.

Partial eta-squared
  Some ANOVAs produce partial eta-squared values in addition to p-values.
  They are becoming widespread in HCI literature.
  You may see them soon in a usability report.
  Partial eta-squared values offer a practical measure of significance.

Advantages of Empirical User Studies
  Measure performance (time, error rate)
  Measure user satisfaction
  Give realistic experience of the interface
    realistic system response
    move among tasks seamlessly
    designers not in control, the user is
  Focus will be on the details
    most big issues should already be resolved

Disadvantages of Empirical User Studies
  Users typically must come to the lab
    makes it more difficult to recruit them
    users may have anxiety
  Large setup effort involved
    software instrumentation, hardware setup, questionnaire design, IRB approval, etc.
  Prototype may crash

An Example of How This Gets Used in Practice
  “The Impact of Delayed Visual Feedback on Collaborative Performance” by Darren Gergle, presented at CHI 06.
  What is the relationship between delayed visual feedback and collaboration? How much network delay can be tolerated?
    e.g., architectural planning, telesurgery and remote repair

The Collaborative Puzzle Task
  The experimental task was for a helper to guide a worker through a visual puzzle over a network connection

Independent Variables
  Only one: visual delay in the helper’s view window
  Delay sampled from this distribution [60 -
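
The slides on t-tests, ANOVAs, and partial eta-squared describe comparing n samples (for example dynamic, static, and radial menus) and reporting an effect size alongside the p-value. As a rough illustration of what an ANOVA computes, here is a minimal sketch of a one-way, between-groups ANOVA in plain Python. The function name `one_way_anova` and the task times are hypothetical, made up purely for illustration; they are not data from the lecture or from any study. For a one-way between-groups design, partial eta-squared reduces to ordinary eta-squared, which is what the sketch returns.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, eta_squared) for a one-way between-groups ANOVA.

    `groups` is a list of samples, one list of measurements per condition.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)

    # Variation explained by the condition (between-groups sum of squares).
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation left over inside each condition (within-groups sum of squares).
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)

    f_stat = (ss_between / df_between) / (ss_within / df_within)
    # Share of total variation attributable to the condition.
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, eta_squared


# Hypothetical task times in seconds (invented for illustration only).
dynamic_menu = [21, 22, 23]
static_menu = [22, 23, 24]
radial_menu = [23, 24, 25]

f_stat, eta_sq = one_way_anova([dynamic_menu, static_menu, radial_menu])
print(f"F = {f_stat:.2f}, eta-squared = {eta_sq:.2f}")
# prints "F = 3.00, eta-squared = 0.50"
```

The sketch stops at the F statistic because turning F into a p-value requires the F distribution, which the Python standard library does not provide. In practice a statistics package would be used for that step, for example `scipy.stats.f_oneway`, which returns both the statistic and the p-value.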


U of I CS 414 - Quantitative
