1 15-744: Computer Networking L-23 Worms 2 Overview •Worm propagation • Worm signatures Threat Model Traditional •High-value targets • Insider threats Worms & Botnets •Automated attack of millions of targets • Value in aggregate, not individual systems • Threats: Software vulnerabilities; naïve users ... and it's profitable •Botnets used for •Spam (and more spam)? • Credit card theft • DDoS extortion • Flourishing Exchange market • Spam proxying: 3-10 cents/host/week • 25k botnets: $40k - $130k/year • Also for stolen account compromised machines, credit cards, identities, etc. (be worried)? 42 Why is this problem hard? •Monoculture: little “genetic diversity” in hosts •Instantaneous transmission: Almost entire network within 500ms • Slow immune response: human scales (10x-1Mx slower!)? • Poor hygiene: Out of date / misconfigured systems; naïve users • Intelligent designer ... of pathogens • Near-Anonymitity 5 Code Red I v1 •July 12th, 2001 •Exploited a known vulnerability in Microsoft’s Internet Information Server (IIS) • Buffer overflow in a rarely used URL decoding routine – published June 18th • 1st – 19th of each month: attempts to spread • Random scanning of IP address space • 99 propagation threads, 100th defaced pages on server • Static random number generator seed • Every worm copy scans the same set of addresses Linear growth Code Red I v1 •20th – 28th of each month: attacks •DDOS attack against 198.137.240.91 (www.whitehouse.gov) • Memory resident – rebooting the system removes the worm • However, could quickly be reinfected Code Red I v2 •July 19th, 2001 •Largely same codebase – same author? • Ends website defacements • Fixes random number generator seeding bug • Scanned address space grew exponentially • 359,000 hosts infected in 14 hours • Compromised almost all vulnerable IIS servers on internet3 Analysis of Code Red I v2 •Random Constant Spread model •Constants • N = total number of vulnerable machines • K = initial compromise rate, per hour • T = Time at which incident happens • Variables • a = proportion of vulnerable machines compromised • t = time in hours Analysis of Code Red I v2 N = total number of vulnerable machines K = initial compromise rate, per hour T = Time at which incident happens Variables a = proportion of vulnerable machines compromised t = time in hours “Logistic equation” Rate of growth of epidemic in finite systems when all entities have an equal likelihood of infecting any other entity Code Red I v2 – Plot • K = 1.8 • T = 11.9 Hourly probe rate data for inbound port 80 at the Chemical Abstracts Service during the initial outbreak of Code Red I on July 19th, 2001. Improvements: Localized scanning •Observation: Density of vulnerable hosts in IP address space is not uniform •Idea: Bias scanning towards local network • Used in CodeRed II • P=0.50: Choose address from local class-A network (/8) • P=0.38: Choose address from local class-B network (/16) • P=0.12: Choose random address • Allows worm to spread more quickly4 Code Red II (August 2001) •Began : August 4th, 2001 •Exploit : Microsoft IIS webservers (buffer overflow) • Named “Code Red II” because : • It contained a comment stating so. However the codebase was new. • Infected IIS on windows 2000 successfully but caused system crash on windows NT. • Installed a root backdoor on the infected machine. Improvements: Multi-vector •Idea: Use multiple propagation methods simultaneously • Example: Nimda • IIS vulnerability • Bulk e-mails • Open network shares • Defaced web pages • Code Red II backdoor Onset of Nimda Time (PDT) 18 September, 2001 HTTP connections/second seen at LBNL (only confirmed Nimda attacks) 1/2 hour Better Worms: Hit-list Scanning •Worm takes a long time to “get off the ground” •Worm author collects a list of, say, 10,00 vulnerable machines • Worm initially attempts to infect these hosts How to build Hit-List •Stealthy randomized scan over number of months •Distributed scanning via botnet • DNS searches – e.g. assemble domain list, search for IP address of mail server in MX records • Web crawling spider similar to search engines • Public surveys – e.g. Netcraft • Listening for announcements – e.g. vulnerable IIS servers during Code Red I5 Better Worms: Permutation scanning • Problem: Many addresses are scanned multiple times • Idea: Generate random permutation of all IP addresses, scan in order • Hit-list hosts start at their own position in the permutation • When an infected host is found, restart at a random point • Can be combined with divide-and-conquer approach H0 H4 H1 H3 H2 H1 (Restart) Warhol Worm •Simulation shows that employing the two previous techniques, can attack 300,000 hosts in less than 15 minutes • Conventional = 10 scans/sec • Fast Scanning = 100 scans/sec • Warhol = 100 scans/sec, • Permutation scanning and 10,000 entry hit list Flash worms •A flash worm would start with a hit list that contains most/all vulnerable hosts •Realistic scenario: • Complete scan takes 2h with an OC-12 • Internet warfare? • Problem: Size of the hit list • 9 million hosts 36 MB • Compression works: 7.5MB • Can be sent over a 256kbps DSL link in 3 seconds • Extremely fast: • Full infection in tens of seconds! Surreptitious worms •Idea: Hide worms in inconspicuous traffic to avoid detection • Leverage P2P systems? • High node degree • Lots of traffic to hide in • Proprietary protocols • Homogeneous software • Immense size (30,000,000 Kazaa downloads!)6 Example Outbreak: SQL Slammer (2003) •Single, small UDP packet exploit (376 b) •First ~1min: classic random scanning • Doubles # of infected hosts every ~8.5sec • (In comparison: Code Red doubled in 40min) • After 1min, starts to saturate access b/w • Interferes with itself, so it slows down • By this point, was sending 20M pps • Peak of 55 million IP scans/sec @ 3min • 90% of Internet
View Full Document