Privacy
Prof. Bhavani Thuraisingham
The University of Texas at Dallas
July 2011

Slides:
- What is Privacy
- Some Privacy Concerns
- Data Mining as a Threat to Privacy
- Some Privacy Problems and Potential Solutions
- Privacy Constraint Processing
- Architecture for Privacy Constraint Processing
- Semantic Model for Privacy Control
- Privacy Preserving Data Mining
- Cryptographic Approaches for Privacy Preserving Data Mining
- Perturbation Based Approaches for Privacy Preserving Data Mining
- CPT: Confidentiality, Privacy and Trust
- Platform for Privacy Preferences (P3P): What is it?
- Platform for Privacy Preferences (P3P): Organizations
- Platform for Privacy Preferences (P3P): Specifications
- P3P and Legal Issues
- Privacy for Assured Information Sharing
- Privacy Preserving Surveillance
- Directions: Foundations of Privacy Preserving Data Mining
- Directions: Testbed Development and Application Scenarios
- Key Points
- Application Specific Privacy?
- Data Mining and Privacy: Friends or Foes?

What is Privacy
- Medical community
  - Privacy is about a patient determining what patient/medical information the doctor should release about him/her
- Financial community
  - A bank customer determines what financial information the bank should release about him/her
- Government community
  - The FBI would collect information about US citizens. However, the FBI determines what information about a US citizen it can release to, say, the CIA

Some Privacy Concerns
- Medical and Healthcare
  - Employers, marketers, or others knowing of private medical concerns
- Security
  - Allowing access to an individual's travel and spending data
  - Allowing access to web surfing behavior
- Marketing, Sales, and Finance
  - Allowing access to an individual's purchases

Data Mining as a Threat to Privacy
- Data mining gives us "facts" that are not obvious to human analysts of the data
- Can general trends across individuals be determined without revealing information about individuals?
- Possible threats:
  - Combine collections of data and infer information that is private
    - Disease information from prescription data
    - Military action from pizza deliveries to the Pentagon
- Need to protect the associations and correlations between the data that are sensitive or private

Some Privacy Problems and Potential Solutions
- Problem: Privacy violations that result from data mining
  - Potential solution: Privacy-preserving data mining
- Problem: Privacy violations that result from the inference problem
  - Inference is the process of deducing sensitive information from the legitimate responses received to user queries
  - Potential solution: Privacy constraint processing
- Problem: Privacy violations due to unencrypted data
  - Potential solution: Encryption at different levels
- Problem: Privacy violations due to poor system design
  - Potential solution: Develop a methodology for designing privacy-enhanced systems

Privacy Constraint Processing
- Privacy constraint processing
  - Based on prior research in security constraint processing
  - Simple constraint: an attribute of a document is private
  - Content-based constraint: if a document contains information about X, then it is private
  - Association-based constraint: two or more documents taken together are private; individually each document is public
  - Release constraint: after X is released, Y becomes private
- Augment a database system with a privacy controller for constraint processing

Architecture for Privacy Constraint Processing
[Diagram; components as labeled on the slide]
- User Interface Manager
- Constraint Manager
- Privacy Constraints
- Query Processor: constraints during query and release operations
- Update Processor: constraints during update operations
- Database Design Tool: constraints during database design operations
- DBMS
- Database

Semantic Model for Privacy Control
[Diagram: semantic network with the node labels Patient John, Cancer, Influenza, England, and John's address, and the link labels "has disease", "travels frequently", and "address". Dark lines/boxes contain private information.]

Privacy Preserving Data Mining
- Prevent useful results from mining
  - Introduce "cover stories" to give "false" results
  - Only make a sample of the data available so that an adversary is unable to come up with useful rules and predictive functions
- Randomization
  - Introduce random values into the data and/or results
  - The challenge is to introduce random values without significantly affecting the data mining results
  - Give a range of values for results instead of exact values
- Secure Multi-party Computation
  - Each party knows its own inputs; encryption techniques are used to compute the final results

Cryptographic Approaches for Privacy Preserving Data Mining
- Secure Multi-party Computation (SMC) for PPDM
  - Mainly used for distributed data mining
  - Provably secure under some assumptions
  - Learned models are accurate
  - Efficient/specific cryptographic solutions have been developed for many distributed data mining problems
  - Mainly semi-honest assumption (i.e., parties follow the protocols)
  - The malicious model has also been explored recently (e.g., Kantarcioglu and Kardes paper in this workshop)
  - Many SMC-based PPDM algorithms share common sub-protocols (e.g., dot product, summation)
- Drawbacks:
  - Still not efficient enough for very large datasets (e.g., petabyte-sized datasets?)
  - The semi-honest model may not be realistic
  - The malicious model is even slower
- Possible new directions
  - New models that offer a better trade-off between efficiency and security
  - Game-theoretic / incentive issues in PPDM
  - Combining anonymization and cryptographic techniques for PPDM

Perturbation Based Approaches for Privacy Preserving Data Mining
- Goal: Distort the data while still preserving some properties for data mining purposes
  - Additive based
  - Multiplicative based
  - Condensation based
  - Decomposition
  - Data swapping
- Goal: Achieve high data mining accuracy with maximum privacy protection
- Privacy is a personal choice, so it should be individually adaptable (Liu, Kantarcioglu and Thuraisingham, ICDM'06)
- The trend is to make PPDM approaches fit reality
- We investigated perturbation-based approaches with real-world data sets
- We give an applicability study of the current approaches
  - Liu, Kantarcioglu and Thuraisingham, DKE 07
- We found that:
  - Reconstructing the original distribution may not work well with real-world data sets
  - Distribution is a
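
The additive approach listed under the perturbation-based techniques can be illustrated with a minimal sketch. It is not the method from the cited papers: the function name `additive_perturb`, the noise level `sigma`, the seeds, and the synthetic `ages` data are all assumptions made for illustration. Each record receives independent zero-mean Gaussian noise, so individual values are distorted while an aggregate such as the mean stays close to the original.

```python
import random
import statistics


def additive_perturb(values, sigma, seed=None):
    """Return a copy of `values` with independent zero-mean Gaussian noise added.

    Hypothetical helper for illustration; `sigma` controls how strongly
    each individual record is distorted.
    """
    rng = random.Random(seed)
    return [v + rng.gauss(0.0, sigma) for v in values]


# Synthetic data (an assumption, not from the slides): 10,000 ages in [20, 70].
data_rng = random.Random(0)
ages = [data_rng.randint(20, 70) for _ in range(10_000)]

perturbed = additive_perturb(ages, sigma=10.0, seed=1)

# Aggregate property: the two means should be close to each other.
print("original mean: ", round(statistics.mean(ages), 2))
print("perturbed mean:", round(statistics.mean(perturbed), 2))

# Individual records: the average absolute change per record is large.
changes = [abs(p - a) for p, a in zip(perturbed, ages)]
print("mean abs. change per record:", round(statistics.mean(changes), 2))
```

A larger `sigma` hides individual values better but makes aggregate estimates noisier, which is the accuracy versus privacy trade-off stated in the goals above.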

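Summation is named in the slides as a sub-protocol shared by many SMC-based PPDM algorithms. The sketch below shows one common way to compute a secure sum, additive secret sharing, under the semi-honest assumption that all parties follow the protocol. It simulates every party inside one process, with no network or encryption, and the names `make_shares`, `secure_sum`, and `MODULUS` are hypothetical choices for illustration rather than anything taken from the slides.

```python
import secrets

# Assumed public modulus; it must exceed any possible true total.
MODULUS = 2**61 - 1


def make_shares(value, n_parties, modulus=MODULUS):
    """Split `value` into n random shares that add up to `value` mod `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % modulus
    return shares + [last]


def secure_sum(private_inputs, modulus=MODULUS):
    """Simulate a semi-honest secure-sum protocol among len(private_inputs) parties."""
    n = len(private_inputs)
    # shares[i][j] is the share that party i sends to party j.
    shares = [make_shares(x, n, modulus) for x in private_inputs]
    # Each party j adds up the shares it received and publishes only that sum.
    partial_sums = [
        sum(shares[i][j] for i in range(n)) % modulus for j in range(n)
    ]
    # The published partial sums reveal the total but not any single input.
    return sum(partial_sums) % modulus


# Hypothetical local counts held by three sites.
counts = [120, 45, 310]
total = secure_sum(counts)
print(total)  # prints 475
```

Each share on its own is a uniformly random number, so a party learns nothing about another party's input from the shares it receives. The sketch does not protect against parties that deviate from the protocol, which is the malicious model the slides describe as slower to handle.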

UTD CS 6V81 - Privacy
