# MIT HST 950J - Homework 3 (3 pages)


## Homework 3


Problems/Exams

Pages: 3
School: Massachusetts Institute of Technology
Course: HST 950J - Biomedical Computing


6.872/HST950 Problem Set 3, Due LEC 15

### 1 Identification Problem

In an earlier lecture, we outlined a theory of record linkage (the full paper is linked from our class schedule page) that tells us, in principle, how to do probabilistic matching of various features of two objects in order to decide whether they are likely to be the same object. Briefly, the theory is as follows (I have interspersed questions for you to answer with the description):

1. Given two purported objects (e.g., patients), $o_1$ and $o_2$, it is either the case that $o_1 = o_2$, or that they are distinct individuals. For example, our records contain a patient file for Raul A. Jones of 123 Main Street, Boston, MA 02131; a new patient arrives, claiming to be Raul Jones, of 123 Main Street, Boston, MA 02113.

2. Among all the observations we might make of $o_1$ and $o_2$, we select a certain set of features, $f_i(o)$, that we agree will be of interest. For example, we might choose last name, first and middle names, street address, city, and ZIP code.

3. For each pair of features, $f_i(o_1)$ and $f_i(o_2)$, we can compare the probability that one would observe $f_i(o_1), f_i(o_2)$ in either of the two cases of step 1. For example, assuming that half the hospital's patient population have home addresses in Boston, then $P(f_{\text{city}}(\text{Raul}), f_{\text{city}}(\text{Jones}) \mid \neg\text{same})$ is $1/4$. By contrast, if these two records belong to the same person, then we would just expect that the probability that that person lives in Boston is $1/2$. Thus, the likelihood ratio is

   $$\frac{p(\text{Boston}, \text{Boston} \mid \text{same})}{p(\text{Boston}, \text{Boston} \mid \neg\text{same})} = \frac{1/2}{1/4} = 2.$$

   Further, if 1% of people in the city live on Main St, then

   $$\frac{p(\text{Main}, \text{Main} \mid \text{same})}{p(\text{Main}, \text{Main} \mid \neg\text{same})} = \frac{0.01}{0.01^2} = 100.$$

   We may get an additional likelihood ratio of 1000, say, for the address 123, and another factor of, say, 75 for both states being MA. These are both estimates, and answer the question "what fraction of all addresses is 123?" or "what fraction of individuals live in MA?" If our initial database contains records on 1M individuals, then we might argue that the a priori odds are

   $$\frac{p(\text{same})}{p(\neg\text{same})} \approx \frac{1}{1\text{M}} = 10^{-6}.$$

   If we assume
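The likelihood ratios quoted in step 3 can be checked numerically. The following is a minimal sketch, not the problem set's own solution. It assumes that when two records belong to different people, each value is an independent draw from the population, so agreement on a value with frequency p has probability p squared. It further assumes, as is common in record-linkage models but not stated in the excerpt above, that the features are conditionally independent, so their likelihood ratios multiply with the prior odds. The function name `agreement_likelihood_ratio` and all variable names are hypothetical, and only the four ratios quoted in the text are used (name and ZIP code features are left out).

```python
from fractions import Fraction


def agreement_likelihood_ratio(p_value: Fraction) -> Fraction:
    """Likelihood ratio for two records agreeing on one feature value.

    p_value is the fraction of the population having that value.
    p(agree | same)     = p_value        (one person, one draw)
    p(agree | not same) = p_value ** 2   (ASSUMPTION: two independent draws)
    """
    return p_value / (p_value ** 2)


# Ratios derived from population frequencies given in the text.
lr_city = agreement_likelihood_ratio(Fraction(1, 2))      # half of patients live in Boston
lr_street = agreement_likelihood_ratio(Fraction(1, 100))  # 1% of people live on Main St

# Ratios the text quotes directly as estimates.
lr_house_number = Fraction(1000)  # street number 123
lr_state = Fraction(75)           # both states are MA

# A priori odds of a match with 1M individuals in the database.
prior_odds = Fraction(1, 1_000_000)

# ASSUMPTION: features are conditionally independent, so the ratios multiply.
posterior_odds = prior_odds * lr_city * lr_street * lr_house_number * lr_state
posterior_probability = posterior_odds / (1 + posterior_odds)

print(f"city LR = {lr_city}, street LR = {lr_street}")
print(f"posterior odds = {posterior_odds}")
print(f"posterior probability = {float(posterior_probability):.4f}")
```

Under these assumptions the four agreeing features raise the odds from one in a million to 15 to 1. Exact fractions are used rather than floats so the intermediate ratios match the hand calculation in the text.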
