IntrusionDetectionSystem-master

Category: Other
Development tool: Java
File size: 251KB
Downloads: 1
Upload date: 2018-05-31 07:35:21
Uploader: aminov
Description: Intrusion detection system in Java (aminov, Algerian)

File list:
CandidateClassifier.java (2751, 2014-12-24)
DecisionTree (0, 2014-12-24)
FeatureSubsetGA.java (7041, 2014-12-24)
FinalReport (0, 2014-12-24)
FinalReport\FinalReport.pdf (256068, 2014-12-24)
Individual.java (390, 2014-12-24)
PostProcessing (0, 2014-12-24)
PostProcessing\gene_freq1.cpp (1084, 2014-12-24)
PostProcessing\gene_freq2.cpp (1084, 2014-12-24)
PostProcessing\generate1.cpp (1126, 2014-12-24)
PostProcessing\generate2.cpp (1126, 2014-12-24)
TestKDD.java (2901, 2014-12-24)
experiments (0, 2014-12-24)
experiments\TestKDD.java (1814, 2014-12-24)
proposal.tex (4796, 2014-12-24)
proposalref.bib (516, 2014-12-24)

IntrusionDetectionSystem
========================

Evolutionary Computation class project for Michigan State University

Purpose
-------
Given a data set, this project uses a genetic algorithm (GA) to identify a good subset of features that an ensemble classifier can then use to classify network traffic as good or bad.

Data Set
--------
The data set I plan to use for this project comes from the 1999 KDD intrusion detection contest[1]. It is derived from nine weeks of raw TCP dump data for a local area network simulating a U.S. Air Force LAN. The raw training data consists of compressed binary TCP dump data from seven weeks of network traffic, processed into about five million connection records. A connection is a sequence of TCP packets starting and ending at well-defined times, during which data flows between a source IP address and a destination IP address under a well-defined protocol. Each connection is labeled either as normal or as a particular type of attack. The attacks fall into four main categories: DOS (denial of service), R2L (unauthorized remote access), U2R (unauthorized local superuser access), and probing. The data set also includes test data drawn from a different probability distribution than the training data, including attack types that do not appear in the training data: the training data contains 24 attack types, and 14 additional attack types appear only in the test data.

Ensemble classifier
-------------------
Ensemble classifiers learn a set of classifiers rather than a single one and combine their predictions. This reduces the dependence on the peculiarities of a single training set and the bias introduced by a single classifier. Commonly used ensemble methods manipulate the data distribution, the input features (which the GA is responsible for in this project), or the class labels, or they introduce randomness into the learning algorithm. Bagging and boosting are commonly used to modify the data distribution. The base classifiers should make classification errors that are as uncorrelated as possible.

Evolutionary Algorithm
----------------------
The genetic algorithm will select a subset of the features for the ensemble classifier, train and test the classifier on that subset, and calculate its fitness. The search component will be a GA and the evaluation component will be an ensemble classifier. The initial population will be randomly generated, with each individual representing a subset of the 41 features present in the training data set. Each individual will be evaluated using an ensemble classifier. Once the top individuals of a generation have been found, crossover between parents will create the offspring, and some mutation will be applied to each child to maintain diversity in the population. I also hope to investigate parameters such as the initial population size, the crossover method, the mutation scheme, and the selection criteria over the course of this project.

Library to be used
------------------
The Weka library[2] is a collection of implementations of various data mining algorithms, including classification algorithms. I plan to use this library to implement the ensemble classifier.

References
----------
  1. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
  2. http://www.cs.waikato.ac.nz/ml/weka/
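
The feature-subset search described above can be sketched roughly as follows. This is a minimal illustration, not the code in FeatureSubsetGA.java: the class name `FeatureSubsetGaSketch`, the `Fitness` interface, the single-point crossover, the bit-flip mutation, the keep-the-top-half selection, and all parameter values are assumptions made for the example. The `toyFitness` function is a stand-in so the sketch runs without Weka or the KDD data; in the project, the fitness of a mask would come from training and testing the ensemble classifier on the selected features.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

// Hypothetical sketch; not the repository's FeatureSubsetGA.java or Individual.java.
final class FeatureSubsetGaSketch {
    static final int NUM_FEATURES = 41; // feature count stated in the README

    /** Scores a feature mask; the real project would train and test an ensemble here. */
    interface Fitness {
        double score(boolean[] mask);
    }

    /** Random individual: each feature is selected with probability 0.5. */
    static boolean[] randomMask(Random rng) {
        boolean[] mask = new boolean[NUM_FEATURES];
        for (int i = 0; i < mask.length; i++) mask[i] = rng.nextBoolean();
        return mask;
    }

    /** Single-point crossover: genes before the cut come from a, the rest from b. */
    static boolean[] crossover(boolean[] a, boolean[] b, Random rng) {
        int cut = 1 + rng.nextInt(a.length - 1); // cut lies in [1, length - 1]
        boolean[] child = Arrays.copyOf(a, a.length);
        System.arraycopy(b, cut, child, cut, b.length - cut);
        return child;
    }

    /** Bit-flip mutation applied independently to each gene. */
    static void mutate(boolean[] mask, double rate, Random rng) {
        for (int i = 0; i < mask.length; i++) {
            if (rng.nextDouble() < rate) mask[i] = !mask[i];
        }
    }

    /** Evaluates every individual once and returns them ordered best first. */
    static boolean[][] rank(boolean[][] pop, Fitness fitness) {
        final double[] scores = new double[pop.length];
        Integer[] order = new Integer[pop.length];
        for (int i = 0; i < pop.length; i++) {
            scores[i] = fitness.score(pop[i]);
            order[i] = i;
        }
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -scores[i]));
        boolean[][] ranked = new boolean[pop.length][];
        for (int i = 0; i < pop.length; i++) ranked[i] = pop[order[i]];
        return ranked;
    }

    /** Runs the GA (popSize >= 2 assumed) and returns the best mask of the final population. */
    static boolean[] evolve(Fitness fitness, int popSize, int generations,
                            double mutationRate, long seed) {
        Random rng = new Random(seed);
        boolean[][] pop = new boolean[popSize][];
        for (int i = 0; i < popSize; i++) pop[i] = randomMask(rng);
        int elite = Math.max(2, popSize / 2); // top individuals kept as parents
        for (int g = 0; g < generations; g++) {
            boolean[][] ranked = rank(pop, fitness);
            boolean[][] next = new boolean[popSize][];
            for (int i = 0; i < popSize; i++) {
                if (i < elite) {
                    next[i] = ranked[i]; // parents survive unchanged
                } else {
                    boolean[] child = crossover(ranked[rng.nextInt(elite)],
                                                ranked[rng.nextInt(elite)], rng);
                    mutate(child, mutationRate, rng);
                    next[i] = child;
                }
            }
            pop = next;
        }
        return rank(pop, fitness)[0];
    }

    /** Stand-in fitness: rewards features 0-4 and slightly penalizes every other feature. */
    static double toyFitness(boolean[] mask) {
        double score = 0.0;
        for (int i = 0; i < mask.length; i++) {
            if (mask[i]) score += (i < 5) ? 1.0 : -0.1;
        }
        return score;
    }

    public static void main(String[] args) {
        boolean[] best = evolve(FeatureSubsetGaSketch::toyFitness, 40, 60, 0.02, 42L);
        StringBuilder selected = new StringBuilder();
        for (int i = 0; i < best.length; i++) {
            if (best[i]) selected.append(i).append(' ');
        }
        System.out.println("fitness=" + toyFitness(best) + " features: " + selected);
    }
}
```

Because each fitness evaluation in the real setting means training and testing a classifier, the sketch scores every individual once per generation in `rank` rather than inside the sort comparator. Keeping the parents unchanged means the best fitness never decreases from one generation to the next, provided the fitness function is deterministic.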
