CS490D: Introduction to Data Mining
Prof. Chris Clifton
April 12, 2004
Multi-Relational Data Mining

What is MRDM?
• Problem: Data in multiple tables
  – Want rules/patterns/etc. across tables
• Solution: Represent as single table
  – Join the data
  – Construct a single view
  – Use standard data mining techniques
• Example: “Customer” and “Married-to”
  – Easy single-table representation
• Bad Example: Ancestor of

Basis of Solutions: Inductive Logic Programming
• ILP Rule:
  – customer(CID,Name,Age,yes) ← Age > 30 ∧ purchase(CID,PID,D,Value,PM) ∧ PM = credit card ∧ Value > 100
• Learning methods:
  – Database represented as clauses (rules)
  – Unification: Given rule (function/clause), discover values for which it holds

Example
• How do we learn the “daughter” relationship?
  – Is this classification? Association?
• Covering Algorithm: “guess” at rule explaining only positive examples
  – Remove positive examples explained by rule
  – Iterate

How to make a good “guess”
• Clause subsumption: Generalize
  – More general clause subsumes more specific one: daughter(mary,Y) subsumes daughter(mary,ann)
• Start with general hypotheses and move to more specific

Issues
• Search space – efficiency
• Noisy data
  – Positive examples labeled as negative
  – Missing data (e.g., a daughter with no parents in the database)
• What else might we want to learn?

WARMR: Multi-relational association rules

Multi-Relational Decision
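
The covering algorithm and the general-to-specific search from the “daughter” slides can be illustrated with a small Python sketch. This is only a toy under stated assumptions, not the method of any particular ILP system. The family facts, the example pairs, the fixed pool of candidate literals, and the names learn_clause and covering are all hypothetical and do not come from the slides. A real ILP learner would generate literals by refinement and bind variables by unification instead of using a hand-written pool.

```python
# Hypothetical background knowledge (not from the slides).
PARENT = {("ann", "mary"), ("ann", "tom"), ("tom", "eve"), ("tom", "ian")}
FEMALE = {"ann", "mary", "eve"}

# Hypothetical pool of body literals for a clause daughter(X, Y) <- ...
LITERALS = {
    "female(X)": lambda x, y: x in FEMALE,
    "female(Y)": lambda x, y: y in FEMALE,
    "parent(Y,X)": lambda x, y: (y, x) in PARENT,
    "parent(X,Y)": lambda x, y: (x, y) in PARENT,
}

# Hypothetical labeled examples of daughter(X, Y).
POS = [("mary", "ann"), ("eve", "tom")]
NEG = [("tom", "ann"), ("eve", "ann")]


def covers(body, example):
    """A clause covers an example if every body literal holds for it."""
    x, y = example
    return all(LITERALS[lit](x, y) for lit in body)


def learn_clause(pos, neg):
    """Start with the most general clause (empty body) and specialize
    greedily until no negative example is covered."""
    body = []
    while any(covers(body, e) for e in neg):
        candidates = [lit for lit in LITERALS if lit not in body]
        if not candidates:
            return None  # cannot exclude all negatives with this pool

        def score(lit):
            new_body = body + [lit]
            gained = sum(covers(new_body, e) for e in pos)
            lost = sum(covers(new_body, e) for e in neg)
            return gained - lost

        body.append(max(candidates, key=score))
    return body


def covering(pos, neg):
    """Learn clauses one at a time, removing the positives each explains."""
    rules, remaining = [], set(pos)
    while remaining:
        body = learn_clause(remaining, neg)
        if body is None:
            break
        covered = {e for e in remaining if covers(body, e)}
        if not covered:
            break  # clause explains nothing new; stop instead of looping
        rules.append(body)
        remaining -= covered
    return rules


for body in covering(POS, NEG):
    print("daughter(X,Y) <- " + " ∧ ".join(body))
# prints: daughter(X,Y) <- female(X) ∧ parent(Y,X)
```

On this toy data a single clause explains both positive examples, so the outer loop runs once. With noisy or missing data, as listed under Issues, the search can fail to exclude every negative or leave some positives unexplained; the sketch then simply stops early.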