DOC PREVIEW
FSU COP 5611 - Automated Worm Fingerprinting

This preview shows page 1-2-3-4-5-6-41-42-43-44-45-46-83-84-85-86-87-88 out of 88 pages.

Save
View full document
Premium Document
Do you want full access? Go Premium and unlock all 88 pages.
Access to all documents
Download any document
Ad free experience

Unformatted text preview:

Automated Worm Fingerprinting Sumeet Singh Cristian Estan George Varghese and Stefan Savage Introduction Problem how to react quickly to worms CodeRed 2001 Infected 360 000 hosts within 11 hours Sapphire Slammer 376 bytes 2002 Infected 75 000 hosts within 10 minutes Existing Approaches Detection Ad hoc intrusion detection Characterization Manual signature extraction Isolates and decompiles a new worm Look for and test unique signatures Can take hours or days Existing Approaches Containment Updates to anti virus and network filtering products Earlybird Automatically detect and contain new worms Two observations Some portion of the content in existing worms is invariant Rare to see the same string recurring from many sources to many destinations Earlybird Automatically extract the signature of all known worms Also Blaster MyDoom and Kibuv B hours or days before any public signatures were distributed Few false positives Background and Related Work Almost all IPs were scanned by Slammer 10 minutes Limited only by bandwidth constraints The SQL Slammer Worm 30 Minutes After Release Infections doubled every 8 5 seconds Spread 100X faster than Code Red At peak scanned 55 million hosts per second Network Effects Of The SQL Slammer Worm At the height of infections Several ISPs noted significant bandwidth consumption at peering points Average packet loss approached 20 South Korea lost almost all Internet service for period of time Financial ATMs were affected Some airline ticketing systems overwhelmed Signature Based Methods Pretty effective if signatures can be generated quickly For CodeRed 60 minutes For Slammer 1 5 minutes Worm Detection Three classes of methods Scan detection Honeypots Behavioral techniques Scan Detection Look for unusual frequency and distribution of address scanning Limitations Not suited to worms that spread in a nonrandom fashion i e emails IM P2P apps Based on a target list Spread topologically Scan Detection More limitations Detects infected sites Does not produce a signature Honeypots Monitored idle hosts with untreated vulnerabilities Used to isolate worms Limitations Manual extraction of signatures Depend on quick infections Behavioral Detection Looks for unusual system call patterns Sending a packet from the same buffer containing a received packet Can detect slow moving worms Limitations Needs application specific knowledge Cannot infer a large scale outbreak Characterization Process of analyzing and identifying a new worm Current approaches Use a priori vulnerability signatures Automated signature extraction Vulnerability Signatures Example Slammer Worm UDP traffic on port 1434 that is longer than 100 bytes buffer overflow Can be deployed before the outbreak Can only be applied to well known vulnerabilities Some Automated Signature Extraction Techniques Allows viruses to infect decoy programs Extracts the modified regions of the decoy Uses heuristics to identify invariant code strings across infected instances Some Automated Signature Extraction Techniques Limitation Assumes the presence of a virus in a controlled environment Some Automated Signature Extraction Techniques Honeycomb Autograph Find longest common subsequences among sets of strings found in messages Uses network level data to infer worm signatures Limitations Scale and full distributed deployments Containment Mechanism used to deter the spread of an active worm Host quarantine Via IP ACLs on routers or firewalls String matching Connection throttling On all outgoing connections Host Quarantine Preventing an infected host from talking to other hosts Via IP ACLs on routers or firewalls Defining Worm Behavior Content invariance Content prevalence Portions of a worm are invariant e g the decryption routine Appears frequently on the network Address dispersion Distribution of destination addresses more uniform to spread fast Finding Worm Signatures Traffic pattern is sufficient for detecting worms Relatively straightforward Extract all possible substrings Raise an alarm when FrequencyCounter substring threshold1 SourceCounter substring threshold2 DestCounter substring threshold3 Practical Content Sifting Characteristics Small processing requirements Small memory requirements Allows arbitrary deployment strategies Estimating Content Prevalence Finding the packet payloads that appear at least x times among the N packets sent During a given interval Estimating Content Prevalence Table payload 1 GB table filled in 10 seconds Table hash payload 1 GB table filled in 4 minutes Tracking millions of ants to track a few elephants Collisions false positives Multistage Filters stream memory Array of counters Hash Pink Singh et al 2002 Multistage Filters packet memory Array of counters Hash Green Multistage Filters packet memory Array of counters Hash Green Multistage Filters packet memory Multistage Filters Collisions are OK packet memory Multistage Filters Reached threshold packet memory packet1 1 Insert Multistage Filters packet memory packet1 1 Multistage Filters packet memory packet1 1 packet2 1 Multistage Filters Stage 1 packet memory packet1 1 No false negatives guaranteed detection Stage 2 Conservative Updates Gray all prior packets Conservative Updates Redundant Redundant Conservative Updates Detecting Common Strings Cannot afford to detect all substrings Maybe can afford to detect all strings with a small fixed length Detecting Common Strings Cannot afford to detect all substrings Maybe can afford to detect all strings with a small fixed length A horse is a horse of course of course F1 c1p4 c2p3 c3p2 c4p1 c5 mod M Detecting Common Strings Cannot afford to detect all substrings Maybe can afford to detect all strings with a small fixed length F2 c2p4 c3p3 c4p2 c5p1 c6 mod M A horse is a horse of course of course F1 c1p4 c2p3 c3p2 c4p1 c5 mod M Detecting Common Strings Cannot afford to detect all substrings Maybe can afford to detect all strings with a small fixed length F2 c2p4 c3p3 c4p2 c5p1 c6 mod M c1p5 c2p4 c3p3 c4p2 c5p1 c6 c1p5 mod M pF1 c6 c1p5 mod M A horse is a horse of course of course F1 c1p4 c2p3 c3p2 c4p1 c5 mod M Detecting Common Strings Cannot afford to detect all substrings Maybe can afford to detect all strings with a small fixed length Still too expensive Estimating Address Dispersion Not sufficient to count the number of source and destination pairs e g send a mail to a mailing list Two sources mail server and the sender Many destinations Need to count the unique source and destination


View Full Document

FSU COP 5611 - Automated Worm Fingerprinting

Documents in this Course
Load more
Download Automated Worm Fingerprinting
Our administrator received your request to download this document. We will send you the file to your email shortly.
Loading Unlocking...
Login

Join to view Automated Worm Fingerprinting and access 3M+ class-specific study document.

or
We will never post anything without your permission.
Don't have an account?
Sign Up

Join to view Automated Worm Fingerprinting and access 3M+ class-specific study document.

or

By creating an account you agree to our Privacy Policy and Terms Of Use

Already a member?