Digital Forensics

Slide titles:
Outline
Email Forensics
Email Investigations
Client/Server Roles
Email Crimes and Violations
Email Servers
Email Forensics Tools
Worm Detection: Introduction
Email Worm Detection using Data Mining
Assumptions
Feature sets
Data Mining Approach
Data set
Our Implementation and Analysis
Mobile Device/System Forensics
Mobile Device Forensics Overview
Mobile Phones
Acquisition procedures
Mobile Forensics Tools
Papers to discuss: November 10, 2008
Papers to discuss: November 17, 2008

Digital Forensics
Dr. Bhavani Thuraisingham
The University of Texas at Dallas
Application Forensics
November 5, 2008

Outline
Email Forensics
UTD work on Email worm detection - revisited
Mobile System Forensics
Note: Other application/systems related forensics
- Database forensics, Network forensics (already discussed)
Papers to discuss November 10, 2008 and November 17, 2008
Reference: Chapters 12 and 13 of the textbook
Optional paper to read:
- http://www.mindswap.org/papers/Trust.pdf

Email Forensics
Email Investigations
Client/Server roles
Email crimes and violations
Email servers
Email forensics tools

Email Investigations
Types of email investigations
- Emails carrying worms and viruses (suspicious emails)
- Checking emails in a crime, for example a homicide
Types of suspicious emails
- Phishing emails: they are in HTML format and redirect to suspicious web sites
- Nigerian scam
- Spoofing emails

Client/Server Roles
Client-Server architecture
Email servers run the email server program, for example Microsoft Exchange Server
Email clients run the client program, for example Outlook
Identification/authentication is used for the client to access the server
Intranet/Internet email servers
- Intranet: local environment
- Internet: public, for example Yahoo, Hotmail etc.

Email Crimes and Violations
Goal is to determine who is behind the crime, such as who sent the email
Steps to email forensics
- Examine the email message
- Copy the email message; also forward the email
- View and examine the email header: tools are available for Outlook and other email clients
- Examine additional files such as address books
- Trace the message using various Internet tools
- Examine network logs (netflow analysis)
Note: UTD Netflow tools SCRUB are in SourceForge

Email Servers
Need to work with the network administrator on how to retrieve messages from the server
Understand how the server records and handles the messages
How are the email logs created and stored?
How are deleted email messages handled by the server? Are copies of the messages still kept?
Chapter 12 discusses email servers by UNIX, Microsoft, Novell

Email Forensics Tools
Several tools for Outlook Express, Eudora, Exchange, Lotus Notes
Tools for log analysis and for recovering deleted emails
Examples:
- AccessData FTK
- FINALeMAIL
- EDBXtract
- MailRecovery

Worm Detection: Introduction
What are worms?
- Self-replicating program; exploits a software vulnerability on a victim; remotely infects other victims
Evil worms
- Severe effect; Code Red epidemic cost $2.6 billion
Goals of worm detection
- Real-time detection
Issues
- Substantial volume of identical traffic, random probing
Methods for worm detection
- Count number of sources/destinations; count number of failed connection attempts
Worm types
- Email worms, Instant Messaging worms, Internet worms, IRC worms, File-sharing Networks worms
Automatic signature generation possible
- EarlyBird System (S. Singh, UCSD); Autograph (H.-A. Kim, CMU)

Email Worm Detection using Data Mining
[Diagram labels: Outgoing Emails; Feature extraction; Training data; Test data; Machine Learning; The Model; Classifier; Clean or Infected?]
Task: given some training instances of both “normal” and “viral” emails, induce a hypothesis to detect “viral” emails.
We used:
- Naïve Bayes
- SVM

Assumptions
Features are based on outgoing emails.
Different users have different “normal” behaviour.
Analysis should be on a per-user basis.
Two groups of features
- Per email (number of attachments, HTML in body, text/binary attachments)
- Per window (mean words in body, variance of words in subject)
Total of 24 features identified
Goal: Identify “normal” and “viral” emails based on these features

Feature sets
Per email features
- Binary-valued features: presence of HTML; script tags/attributes; embedded images; hyperlinks; presence of binary, text attachments; MIME types of file attachments
- Continuous-valued features: number of attachments; number of words/characters in the subject and body
Per window features
- Number of emails sent; number of unique email recipients; number of unique sender addresses
- Average number of words/characters per subject, body; average word length
- Variance in number of words/characters per subject, body; variance in word length
- Ratio of emails with attachments

Data Mining Approach
[Diagram labels: Test instance; Classifier; SVM; Naïve Bayes; infected?; Clean?; Clean; Clean/Infected]

Data set
Collected from UC Berkeley.
- Contains instances of both normal and viral emails.
Six worm types:
- bagle.f, bubbleboy, mydoom.m, mydoom.u, netsky.d, sobig.f
Originally six sets of data:
- training instances: normal (400) + five worms (5x200)
- testing instances: normal (1200) + the sixth worm (200)
Problem: Not balanced, no cross validation reported
Solution: re-arrange the data and apply cross-validation

Our Implementation and Analysis
Implementation
- Naïve Bayes: assume “Normal” distribution of numeric and real data; smoothing applied
- SVM: one-class SVM with the radial basis function, using “gamma” = 0.015 and “nu” = 0.1
Analysis
- NB alone performs better than other techniques
- SVM alone also performs better if parameters are set correctly
- The mydoom.m and VBS.Bubbleboy data sets are not sufficient (very low detection accuracy in all classifiers)
- The feature-based approach seems to be useful only when we have identified the relevant features, gathered enough training data, and implemented classifiers with the best parameter settings

Mobile Device/System Forensics
Mobile device forensics overview
Acquisition procedures
Summary

Mobile Device Forensics Overview
What is stored in cell phones
- Incoming/outgoing/missed calls
- Text messages
- Short messages
- Instant messaging logs
- Web pages
- Pictures
- Calendars
- Address books
- Music files
- Voice records

Mobile Phones
Multiple generations
- Analog, digital personal communications, third generation (increased bandwidth and other features)
Digital networks
- CDMA, GSM, TDMA, ...
Proprietary OSs
SIM Cards (Subscriber Identity