UW-Madison ECE 539 - Detecting Spam emails using neural networks - D2962757


School: University of Wisconsin, Madison

Course: ECE 539 - Introduction to Artificial Neural Network and Fuzzy Systems

Pages: 16


Spam? Not Any More!
Detecting Spam emails using neural networks
ECE / CS / ME 539 Project
Submitted by Sivanadyan, Thiagarajan (Last Name, First Name)

TABLE OF CONTENTS
1. INTRODUCTION
1.1 Importance of the topic
1.2 Current Technology
1.3 Project statement and outline
2. PREPROCESSING THE DATA
2.1 Analysis of the original data
2.2 Difference in mean procedure
2.3 Reduction of the data set
2.4 Preprocessing & splitting the data
3. MULTI-LAYER PERCEPTRON IMPLEMENTATION
3.1 What is it?
3.2 MLP Simulation using the complete data set
3.3 MLP Simulation using the reduced data set (Inputs - 21)
3.4 MLP Simulation using the reduced data set (Inputs - 9)
3.5 MLP Simulation using cross-validation
4. DISCUSSION AND INFERENCES
4.1 Discussion of the results
4.2 Inferences
5. CONCLUSION
6. REFERENCES AND ACKNOWLEDGEMENTS
7. MATLAB FILES USED

1. INTRODUCTION

1.1 Importance of the topic

'Spam' is the word for unsolicited or unwanted emails, including advertisements for products and web sites, make-money-fast schemes, chain letters and pornography. In short, it covers several varieties of email and newsgroup abuse, loosely characterized by the use of the Internet to deliver inappropriate messages to unwilling recipients. The amount of it delivered today is appalling. An average Internet user gets about 10-50 spam emails a day, and about 13 billion pieces of unsolicited commercial e-mail are sent each day, which represents about half of all e-mail sent [1].

The harm in spam, beyond the time spent getting rid of it, is that it uses up resources such as disk space and bandwidth, extremely precious quantities in the modern context. Also, many disreputable and illegal companies and individuals use the opportunity to perpetrate various scams and to promote illegal products and other inappropriate materials. Estimates of lost productivity are as high as $10 billion a year. Spam does not cost the sender anything - most of the expenses are paid by the carriers or the recipients rather than by the sender. For these reasons, and because of the outright infringement on personal space, some method of preventing the normal delivery of spam is desired.

The seriousness of the problem of spam can be inferred from the fact that 'The U.S. House of Representatives has approved an amended version of a bill that will allow penalties of up to $6 million and five years in jail for sending some e-mail spam as part of the CAN-SPAM Act' [1].

However, the goals of a spam blocking mechanism vary depending on the user. In most cases, it is important to avoid false hits (where non-spam gets classified as spam), at the risk of allowing some spam through. In a few cases, such as net Moms (software that prevents inappropriate messages from reaching younger viewers), some degree of laxity is allowed, especially since the parent can view the blocked mail later and sort it out. Also, the nature of the spam received differs among users, and spam content can vary with time. Therefore high adaptability is one of the prime concerns of any anti-spam software.

1.2 Current Technology

Software to combat spam e-mail was, until recently, based on simple keyword filters: if a given term was present in a message's headers and/or body, the mail was labeled as spam. This rapidly became unscalable, as each potential spam term had to be manually added to the system, resulting in little time saved for users. There are several spam filters available today. Most of them fall into one of the following categories [2]:

User defined filters: Available on most email servers. With these filters you can forward email to different mailboxes depending on headers or contents.
Header filters: These are more sophisticated. They look at the email headers to see if they are forged.
Language filters: They simply filter out any email that is not in your native tongue. They only filter out foreign-language spam, which is not a major problem today unless the foreign language in question is English.
Permission filters: They block all email that does not come from an authorized source. Typically, the first time you send an email to a person using a permission filter, you will receive an auto-response inviting you to visit a web page and enter some information.
Content filters: They scan the text of an email and use either neural networks or fuzzy logic to give a weighted opinion as to whether the email is spam. They can be highly effective, but can also occasionally filter out newsletters and other bulk email that may appear to be spam.

1.3 Project statement and outline

This project attempts to apply a neural network to the problem of spam recognition. It is based on content-based filtering, a method that is becoming popular these days. It is can
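
As a rough illustration of the contrast drawn in Section 1.2 between simple keyword filters and content filters that give a weighted opinion, the following Python sketch may help. It is not the method used in this project: the term list, the weights and the threshold are hypothetical values chosen only for the example, and the weighted sum is a stand-in for a trained neural network or fuzzy system.

```python
import re

# Hypothetical term list and weights, chosen only for illustration.
SPAM_TERMS = {"free", "winner", "viagra"}
TERM_WEIGHTS = {"free": 0.4, "winner": 0.6, "money": 0.3, "meeting": -0.5}


def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_filter(headers, body):
    """Label a message as spam if any listed term occurs in headers or body."""
    words = set(tokenize(headers)) | set(tokenize(body))
    return not words.isdisjoint(SPAM_TERMS)


def weighted_score(body):
    """Sum the weights of the distinct known terms found in the body."""
    return sum(TERM_WEIGHTS.get(word, 0.0) for word in set(tokenize(body)))


def content_filter(body, threshold=0.8):
    """Label a message as spam only if its weighted score reaches the threshold."""
    return weighted_score(body) >= threshold


if __name__ == "__main__":
    legit = "Free parking at the meeting"
    print(keyword_filter("", legit))   # keyword filter flags the word "free"
    print(content_filter(legit))       # weighted opinion stays below threshold
```

In this toy setup the legitimate message about parking is a false hit for the keyword filter, while the weighted score lets it through. Raising the hypothetical threshold makes false hits rarer at the cost of letting more spam pass, which is the trade-off described in Section 1.1.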
