
Evaluation-Quantitative
Ben Bederson / Saul Greenberg

Quantitative Evaluation
What is experimental design?
What is an experimental hypothesis?
How do I plan an experiment?
Why are statistics used?
What are the important statistical methods?

Question: Which size grid is better?
Evan Golub

Question: Which menu placement system is better?
Top of Window
Top of Screen
Evan Golub

Quantitative methods
1. User performance data collection
• data is collected on system use
  - frequency of request for on-line assistance
    what did people ask for help with?
  - frequency of use of different parts of the system
    why are parts of system unused?
  - number of errors and where they occurred
    why does an error occur repeatedly?
  - time it takes to complete some operation
    what tasks take longer than expected?
• collects heaps of data in the hope that something interesting shows up
• often difficult to sift through data unless specific aspects are targeted
  - as in list above

2. Controlled experiments
The traditional scientific method
• reductionist
  - clear convincing result on specific issues
• In HCI:
  - insights into cognitive process, human performance limitations, ...
  - allows comparison of systems, fine-tuning of details ...
Strives for
• lucid and testable hypothesis
• quantitative measurement
• measure of confidence in results obtained (statistics)
• repeatability of experiment
• control of variables and conditions
• removal of experimenter bias

The experimental method
a) Begin with a lucid, testable hypothesis
• Example 1:
  “there is no difference in the number of cavities in children and teenagers using Crest and No-teeth toothpaste”
• Example 2:
  “there is no difference in user performance (time, error rate, and subjective satisfaction) when selecting a single item from a pop-up or a pull-down menu, regardless of the subject’s previous expertise in using a mouse or using the different menu types”
  [Figure: two menu layouts, each with the items File, Edit, View, Insert and New, Open, Close, Save]

b) Explicitly state the independent variables that are to be altered
independent variable
  - the things you manipulate independent of how a subject behaves
  - determines a modification to the conditions the subjects undergo
  - may arise from subjects being classified into different groups
in toothpaste experiment
  - toothpaste type: uses Crest or No-teeth toothpaste
  - age: <= 11 years or > 11 years
in menu experiment
  - menu type: pop-up or pull-down
  - menu length: 3, 6, 9, 12, 15
  - subject type (expert or novice)

c) Carefully choose the dependent variables that will be measured
Dependent variables
• variables dependent on the subject’s behaviour / reaction to the independent variable
in toothpaste experiment
• number of cavities
• frequency of brushing
in menu experiment
• time to select an item
• selection errors made
• subjective satisfaction as reported in a questionnaire

d) Judiciously select and assign subjects to groups
Ways of controlling subject variability
• recognize classes and make them an independent variable
• minimize unaccounted anomalies in subject group
  - superstars versus poor performers
• use reasonable number of subjects and random assignment
[Figure: subject groups labelled Novice and Expert]

e) Control for biasing factors
• unbiased instructions + experimental protocols
  - prepare ahead of time
• double-blind experiments, ...
[Figure: speech bubble reading “Now you get to do the pop-up menus. I think you will really like them... I designed them myself!”]

f) Apply statistical methods to data analysis
• Confidence limits: the confidence that your conclusion is correct
  - “The hypothesis that mouse experience makes no difference is rejected at the .05 level”
  - “Expert mouse users can use pull-down menus 15% faster than novice mouse users, and that result is statistically significant”
  - means:
    a 95% chance that your statement is correct
    a 5% chance you are wrong

g) Interpret your results
• what you believe the results mean, and their implications

Statistical Analysis
Calculations that tell us
• mathematical attributes about our data sets
  - mean, amount of variance, ...
• how data sets relate to each other
  - whether we are “sampling” from the same or different distributions
• the probability that our claims are correct
  - “statistical significance”

Statistical significance vs Practical significance
when n is large, even a trivial difference may be large enough to produce a statistically significant result
• eg menu choice: mean selection time of menu a is 3 seconds; menu b is 3.05 seconds
Statistical significance does not imply that the difference is important!
• a matter of interpretation

Example: Differences between means
Given: two data sets measuring a condition
• eg height difference of males and females
  time to select an item from different menu styles ...
Question:
• is the difference between the means of the data statistically significant?
Null hypothesis:
• there is no difference between the two means
• statistical analysis can only reject the hypothesis at a certain level of confidence

Example: Is there a significant difference between the means?
Condition one: 3, 4, 4, 4, 5, 5, 5, 6
Condition two: 4, 4, 5, 5, 6, 6, 7, 7
[Figure: histograms of Condition 1 (mean = 4.5) and Condition 2 (mean = 5.5); x-axis 3 to 7, y-axis 0 to 3]

The problem with visual inspection of data
There is almost always variation in the collected data
Differences between data sets may be due to:
• normal variation
  - eg two sets of ten tosses with different but fair dice
    differences between data and means are accountable by expected variation
• real differences between data
  - eg two sets of ten tosses with loaded dice and fair dice
    differences between data and means are not accountable by expected variation

Choice of significance levels and two types of errors
Type
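
The question posed in the example "Is there a significant difference between the means?" can be worked through numerically. The sketch below is one possible approach, not the method the slides prescribe: it assumes a two-sample Student t-test with pooled variance (the slides do not name a test at this point), and it assumes a two-tailed critical value of 2.145 for 14 degrees of freedom at the .05 level, taken from a standard t table. The function name `pooled_t_statistic` and the constant `T_CRITICAL_05_DF14` are illustrative names introduced here.

```python
import math
from statistics import mean, variance

# Data copied from the example: two conditions, eight measurements each.
condition_one = [3, 4, 4, 4, 5, 5, 5, 6]
condition_two = [4, 4, 5, 5, 6, 6, 7, 7]


def pooled_t_statistic(a, b):
    """Return (t, degrees_of_freedom) for a two-sample t-test with pooled variance."""
    n_a, n_b = len(a), len(b)
    df = n_a + n_b - 2
    # statistics.variance is the sample variance (divides by n - 1).
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(a) - mean(b)) / standard_error, df


t_stat, df = pooled_t_statistic(condition_one, condition_two)

# Assumption: two-tailed critical value for df = 14 at the .05 level (standard t table).
T_CRITICAL_05_DF14 = 2.145
reject_null = abs(t_stat) > T_CRITICAL_05_DF14

print(f"means: {mean(condition_one)} vs {mean(condition_two)}")
print(f"t = {t_stat:.3f}, df = {df}, reject null at .05: {reject_null}")
```

Under these assumptions, t is about -1.87 with 14 degrees of freedom. Its magnitude is below 2.145, so the null hypothesis of no difference between the means is not rejected at the .05 level, even though the means (4.5 and 5.5) look different on visual inspection.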

