
Collective Entity Resolution in Relational Data




INDRAJIT BHATTACHARYA and LISE GETOOR
University of Maryland, College Park

Many databases contain uncertain and imprecise references to real-world entities. The absence of identifiers for the underlying entities often results in a database which contains multiple references to the same entity. This can lead not only to data redundancy, but also to inaccuracies in query processing and knowledge extraction. These problems can be alleviated through the use of entity resolution. Entity resolution involves discovering the underlying entities and mapping each database reference to these entities. Traditionally, entities are resolved using pairwise similarity over the attributes of references. However, there is often additional relational information in the data. Specifically, references to different entities may co-occur. In these cases, collective entity resolution, in which entities for co-occurring references are determined jointly rather than independently, can improve entity resolution accuracy. We propose a novel relational clustering algorithm that uses both attribute and relational information for determining the underlying domain entities, and we give an efficient implementation. We investigate the impact that different relational similarity measures have on entity resolution quality. We evaluate our collective entity resolution algorithm on multiple real-world databases. We show that it improves entity resolution performance over both attribute-based baselines and over algorithms that consider relational information but do not resolve entities collectively. In addition, we perform detailed experiments on synthetically generated data to identify data characteristics that favor collective relational resolution over purely attribute-based algorithms.

Categories and Subject Descriptors: H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval - Clustering; H.2.8 [Database Management]: Database Applications - Data mining

General Terms
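The idea of resolving co-occurring references jointly can be illustrated with a small sketch. It is not the authors' algorithm, whose details are not given in this preview. It assumes a greedy agglomerative clustering in which the similarity of two clusters is a weighted sum of an attribute score (here Python's `difflib.SequenceMatcher` ratio on names) and a relational score (here the Jaccard overlap of the clusters each one co-occurs with). The function names, the weight `alpha`, the `threshold`, and the example names are all hypothetical choices made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def attribute_similarity(a, b):
    """String similarity in [0, 1] between two reference names (assumed measure)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard(s, t):
    """Jaccard overlap of two sets; defined as 0 when both are empty."""
    union = s | t
    return len(s & t) / len(union) if union else 0.0


def collective_resolve(names, hyperedges, alpha=0.4, threshold=0.55):
    """Greedily merge clusters of references using attributes and co-occurrence.

    names      -- one name string per reference (index = reference id)
    hyperedges -- lists of reference ids that co-occur (e.g. one paper's authors)
    alpha      -- assumed weight of the relational score
    threshold  -- assumed minimum combined score required to merge
    """
    cluster_of = list(range(len(names)))  # reference id -> cluster label

    # For each reference, the other references it co-occurs with.
    cooccur = {i: set() for i in range(len(names))}
    for edge in hyperedges:
        for i in edge:
            cooccur[i].update(j for j in edge if j != i)

    def members(c):
        return [i for i, label in enumerate(cluster_of) if label == c]

    def neighbourhood(c):
        # Labels of the clusters that co-occur with any member of c.
        return {cluster_of[j] for i in members(c) for j in cooccur[i]} - {c}

    def similarity(c1, c2):
        attr = max(attribute_similarity(names[i], names[j])
                   for i in members(c1) for j in members(c2))
        rel = jaccard(neighbourhood(c1), neighbourhood(c2))
        return (1 - alpha) * attr + alpha * rel

    while True:
        labels = sorted(set(cluster_of))
        if len(labels) < 2:
            break
        score, c1, c2 = max((similarity(a, b), a, b)
                            for a, b in combinations(labels, 2))
        if score < threshold:
            break
        # Merge c2 into c1; neighbourhoods of other clusters change as a result.
        cluster_of = [c1 if label == c2 else label for label in cluster_of]

    clusters = {}
    for i, label in enumerate(cluster_of):
        clusters.setdefault(label, []).append(i)
    return sorted(clusters.values())


# Hypothetical author references from three papers.
names = ["W. Wang", "A. Ansari", "W. W. Wang", "A. Ansari", "W. Wong", "C. Chen"]
hyperedges = [[0, 1], [2, 3], [4, 5]]
print(collective_resolve(names, hyperedges))
```

In this toy input, merging the two "A. Ansari" references gives "W. Wang" and "W. W. Wang" a shared neighbour, which raises their combined score on the next iteration. Setting `alpha=0` reduces the sketch to a purely attribute-based baseline, which is the kind of comparison the abstract describes.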


