http://www.cs.umd.edu/class/spring2002/cmsc828g/project.htm

http://trochim.human.cornell.edu/kb/measlevl.htm

nominal measurement - numerical values just "name" the attribute uniquely. No ordering is implied. E.g., jersey numbers in basketball: a player with number 30 is not more of anything than a player with number 15, and certainly not twice whatever number 15 is.

ordinal measurement - attributes can be rank-ordered, but distances between attributes do not have any meaning. E.g., on a survey you might code Educational Attainment as 0=less than H.S.; 1=some H.S.; 2=H.S. degree; 3=some college; 4=college degree; 5=post college. In this measure, higher numbers mean more education. But is the distance from 0 to 1 the same as from 3 to 4? No. The interval between values is not interpretable in an ordinal measure.

interval measurement - distance between attributes does have meaning. E.g., when we measure temperature (in Fahrenheit), the distance from 30 to 40 is the same as the distance from 70 to 80. The interval between values is interpretable. Averages make sense, but ratios do not: 80 degrees is not twice as hot as 40 degrees.

ratio measurement - there is an absolute zero that is meaningful. This means that you can construct a meaningful fraction (or ratio) with a ratio variable. Weight is a ratio variable. In applied social research most "count" variables are ratio, for example, the number of clients in the past six months. Why? Because you can have zero clients and because it is meaningful to say that "...we had twice as many clients in the past six months as we did in the previous six months."

Hierarchy of Measurements

Consider a new order-preserving mapping from pain 1-10 to pain 1-20: 1→1, 2→2, 3→3, 4→4, 5→5, 6→12

Each object i is described by p measurements:

$$x(i) = (x_1(i), x_2(i), \ldots, x_p(i))$$

Euclidean distance:

$$d_E(i,j) = \left( \sum_{k=1}^{p} (x_k(i) - x_k(j))^2 \right)^{1/2}$$

Sample standard deviation and sample mean of the k-th variable:

$$\hat{\sigma}_k = \left( \frac{1}{n} \sum_{i=1}^{n} (x_k(i) - \bar{x}_k)^2 \right)^{1/2}$$

$$\bar{x}_k = \frac{1}{n} \sum_{i=1}^{n} x_k(i)$$

Weighted Euclidean distance:

$$d_{WE}(i,j) = \left( \sum_{k=1}^{p} w_k (x_k(i) - x_k(j))^2 \right)^{1/2}$$

[Figure: objects i and j described by height(i), diameter(i) and height(j), diameter(j), and by height2(i), height100(i), … and height2(j), height100(j), …]

Covariance and correlation:

$$\mathrm{Cov}(X,Y) = \frac{1}{n} \sum_{i=1}^{n} (x(i) - \bar{x})(y(i) - \bar{y})$$

$$\rho(X,Y) = \frac{\sum_{i=1}^{n} (x(i) - \bar{x})(y(i) - \bar{y})}{\left( \sum_{i=1}^{n} (x(i) - \bar{x})^2 \sum_{i=1}^{n} (y(i) - \bar{y})^2 \right)^{1/2}}$$

[Figure: data on characteristics of Boston suburbs; variables shown include business acreage, nitrous oxide, and percentage of large residential lots; scale +1, 0, -1]

[Figure: Y plotted against X; ρ(X,Y) = ?]
Covariance and correlation measure linear dependence. Are X and Y dependent?

Mahalanobis distance:

$$d_{MH}(i,j) = \left( (x(i) - x(j))^T \Sigma^{-1} (x(i) - x(j)) \right)^{1/2}$$

1. It automatically accounts for the scaling of the coordinate axes.
2. It corrects for correlation between the different features.

Price:
1. The covariance matrices can be hard to determine accurately.
2. The memory and time requirements grow quadratically rather than linearly with the number of features.

L_λ distance:

$$d(i,j) = \left( \sum_{k=1}^{p} |x_k(i) - x_k(j)|^{\lambda} \right)^{1/\lambda}$$

λ = 1:

$$d(i,j) = \sum_{k=1}^{p} |x_k(i) - x_k(j)|$$

λ → ∞:

$$d(i,j) = \max_k |x_k(i) - x_k(j)|$$

For binary variables, let n_{ab} be the number of variables on which object i has value a and object j has value b. Two similarity coefficients:

$$\frac{n_{11} + n_{00}}{n_{11} + n_{10} + n_{01} + n_{00}}$$

$$\frac{n_{11}}{n_{11} + n_{10} + n_{01}}$$

$$\mathrm{logit}(p) = \log \frac{p}{1 - p}$$

flatten
Problems:
- introduces statistical skew
- loses relational structure
  • incapable of detecting link-based patterns
- must fix attributes in advance

• Principles of Data Mining, Hand, Mannila, Smyth. MIT Press, 2001.
• Trochim, William M. The Research Methods Knowledge Base, 2nd Edition. (version current as of 2001).
• Pattern Recognition for HCI. Richard Duda,
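
As a rough illustration of the distance and similarity measures listed in this lecture, the following Python sketch computes the L_λ distances, a weighted Euclidean distance, the correlation coefficient, and the two binary similarity coefficients. It uses only the standard library. All function names and the sample data are assumptions made for this example and do not come from the slides. Choosing the weights as w_k = 1/σ̂_k² is one common way to standardize variables, shown here as an assumption rather than as the lecture's prescription.

```python
import math


def minkowski(x, y, lam=2.0):
    """L_lambda distance between two equal-length vectors.

    lam=1 gives the sum of absolute differences, lam=2 the Euclidean
    distance, and lam=math.inf the largest coordinate difference.
    """
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if math.isinf(lam):
        return max(diffs)
    return sum(d ** lam for d in diffs) ** (1.0 / lam)


def std_devs(rows):
    """Per-variable standard deviation, using the 1/n convention."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    return [
        math.sqrt(sum((v - m) ** 2 for v in col) / n)
        for col, m in zip(columns, means)
    ]


def weighted_euclidean(x, y, w):
    """Euclidean distance with one non-negative weight per variable."""
    return math.sqrt(sum(wk * (a - b) ** 2 for wk, a, b in zip(w, x, y)))


def correlation(xs, ys):
    """Sample correlation coefficient; assumes neither input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


def matching_and_jaccard(x, y):
    """Simple matching and Jaccard coefficients for two 0/1 vectors.

    Assumes at least one position where either vector holds a 1,
    otherwise the Jaccard denominator is zero.
    """
    pairs = list(zip(x, y))
    n11 = pairs.count((1, 1))
    n10 = pairs.count((1, 0))
    n01 = pairs.count((0, 1))
    n00 = pairs.count((0, 0))
    simple = (n11 + n00) / (n11 + n10 + n01 + n00)
    jaccard = n11 / (n11 + n10 + n01)
    return simple, jaccard


# Hypothetical data: (height, diameter) measurements for three objects.
rows = [(180.0, 10.0), (160.0, 30.0), (170.0, 20.0)]

# Assumed weighting: w_k = 1 / sigma_k^2, i.e. standardized variables.
weights = [1.0 / s ** 2 for s in std_devs(rows)]
d_std = weighted_euclidean(rows[0], rows[1], weights)

print(minkowski((0, 0), (3, 4)))         # 5.0
print(minkowski((0, 0), (3, 4), lam=1))  # 7.0
```

One way to read the question "Are X and Y dependent?" is to try `correlation([-1, 0, 1], [1, 0, 1])`: here Y is fully determined by X (Y = X²), yet the correlation is zero up to rounding, because correlation only captures linear dependence.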


UMD CMSC 828G - Lecture 2
