svdd_tool
Category: MATLAB programming
Development tool: MATLAB
File size: 1028KB
Downloads: 87
Upload date: 2009-05-27 10:28:54
Uploader:
mycool007
Description: Support Vector Domain Description (SVDD) is a relatively recent classifier, developed on the basis of the support vector machine.
(the SVDD toolbox)
File list:
svdd_tool\@dataset\abs.m (22, 2005-03-04)
svdd_tool\@dataset\abs.p (534, 2005-03-04)
svdd_tool\@dataset\and.m (22, 2005-03-04)
svdd_tool\@dataset\and.p (1458, 2005-03-04)
svdd_tool\@dataset\classsizes.m (347, 2005-03-04)
svdd_tool\@dataset\classsizes.p (1205, 2005-03-04)
svdd_tool\@dataset\conj.m (23, 2005-03-04)
svdd_tool\@dataset\conj.p (491, 2005-03-04)
svdd_tool\@dataset\Contents.m (11782, 2005-03-04)
svdd_tool\@dataset\Contents.p (126, 2005-03-04)
svdd_tool\@dataset\corrcoef.m (310, 2005-03-04)
svdd_tool\@dataset\corrcoef.p (1843, 2005-03-04)
svdd_tool\@dataset\ctranspose.m (99, 2005-03-04)
svdd_tool\@dataset\ctranspose.p (449, 2005-03-04)
svdd_tool\@dataset\cumsum.m (25, 2005-03-04)
svdd_tool\@dataset\cumsum.p (717, 2005-03-04)
svdd_tool\@dataset\dataset.m (4569, 2005-03-04)
svdd_tool\@dataset\dataset.p (15241, 2005-03-04)
svdd_tool\@dataset\det.m (78, 2005-03-04)
svdd_tool\@dataset\det.p (438, 2005-03-04)
svdd_tool\@dataset\disp.m (72, 2005-03-04)
svdd_tool\@dataset\disp.p (547, 2005-03-04)
svdd_tool\@dataset\display.m (37, 2005-03-04)
svdd_tool\@dataset\display.p (3414, 2005-03-04)
svdd_tool\@dataset\double.m (133, 2005-03-04)
svdd_tool\@dataset\double.p (409, 2005-03-04)
svdd_tool\@dataset\eig.m (42, 2005-03-04)
svdd_tool\@dataset\eig.p (2442, 2005-03-04)
svdd_tool\@dataset\end.m (22, 2005-03-04)
svdd_tool\@dataset\end.p (1130, 2005-03-04)
svdd_tool\@dataset\eq.m (21, 2005-03-04)
svdd_tool\@dataset\eq.p (1469, 2005-03-04)
svdd_tool\@dataset\exp.m (22, 2005-03-04)
svdd_tool\@dataset\exp.p (486, 2005-03-04)
svdd_tool\@dataset\find.m (39, 2005-03-04)
svdd_tool\@dataset\find.p (923, 2005-03-04)
svdd_tool\@dataset\findident.m (387, 2005-03-04)
svdd_tool\@dataset\findident.p (2288, 2005-03-04)
svdd_tool\@dataset\findlabels.m (201, 2005-03-04)
svdd_tool\@dataset\findlabels.p (1733, 2005-03-04)
... ...
Data Description Matlab toolbox. (version 1.5.7)
This toolbox is an add-on to the PRTools toolbox. The toolbox contains
algorithms to train, investigate, visualize and evaluate one-class
classifiers (or data descriptions, novelty descriptors, outlier
detectors). Some experience with the PRTools toolbox is recommended.
This toolbox was developed as a research tool, so no guarantees can be
given.
- Requirements:
In order to make this toolbox work, you need:
0. A computer and some enthusiasm
1. Matlab with the optimization toolbox (for svdd and lpdd), the
   statistics toolbox (for randsph), and the neural network toolbox
   (for autoenc_dd)
2. PRTools 4.0.0 or higher
3. This toolbox.
- Installation:
The installation of the toolbox is almost trivial. Unzip the file, store
the contents in a directory (name it for instance DD_TOOLS) and add this
directory to your matlab path.
- Information and example code:
For the most basic information, type help DD_TOOLS (use the directory name
where the toolbox is stored). Some simple (and some not so simple!)
one-class examples are given in dd_ex1 through dd_ex9. For more background
information, please have a look at the pdf file included in the
directory. Some examples of the operation of the procedures in the
toolbox are given on the web-pages:
http://www-ict.ewi.tudelft.nl/~davidt/dd_tools.html
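As a concrete illustration of the one-class setting this toolbox addresses, here is a toy sketch in Python/NumPy (illustrative only; the toolbox itself is MATLAB code, and svdd.m solves a quadratic program rather than the naive mean-centered ball below): fit a ball around the target class so that a chosen fraction of targets is rejected, and flag everything outside it as outlier.

```python
import numpy as np

def fit_ball(X, fracrej=0.1):
    # Toy stand-in for a data description: center at the mean, radius
    # chosen so that a fraction `fracrej` of the training targets falls
    # outside.  The real svdd.m instead solves a QP for the minimum
    # enclosing (kernelized) sphere.
    center = X.mean(axis=0)
    dists = np.linalg.norm(X - center, axis=1)
    radius = np.quantile(dists, 1.0 - fracrej)
    return center, radius

def is_target(z, center, radius):
    # Accept z as a target object if it lies inside the ball.
    return bool(np.linalg.norm(z - center) <= radius)
```

All names here (fit_ball, is_target, the mean-centered ball) are hypothetical stand-ins chosen for brevity; they only mimic the accept/reject behavior of a trained data description.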
* Notes on version 1.5.7
- Finally added the removal of an object from the incremental SVDD. Also
a separate function for storing the structure W to a mapping is
introduced, and an example file (dd_ex10.m)
- Added nndist_range to easily compute the average nearest neighbor
distance in a dataset
- Extended myproxm to include the computation of the kernel with a
subsampled version of the training data (randomly selected prototypes)
- Tried to improve the help a bit (never ending story)
* Notes on version 1.5.6
- Added svddpath.m and svddpath_opt.m, which optimize the SVDD by
  moving over the whole regularization path C (or lambda). It is not
  suitable for data with outliers...
- Changed incsvdd such that it also outputs the distance to the sphere
  center
- Introduced a variable C for different objects to weigh objects in the
  SVDD
- Greatly improved the speed of the dd_error by avoiding the call to
renumlab.
* Notes on version 1.5.5
- Added the extended version of the lpball_dd
- Added the AUC linear optimizer auclpm.m.
- Added the rankboostm.m
* Notes on version 1.5.4
- Due to changes in prtools, I had to change make_outliers.m
- Added the possibility to incsvdd to add other Matlab files that define
kernel functions. Furthermore, again some strange starting conditions
have been taken care of.
- Fixed bugs in rob_gauss_dd
* Notes on version 1.5.3
- Added the equal error rate
- Updated version of mst_dd by Piotr.
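The equal error rate added in this version is the operating point where the fraction of rejected targets equals the fraction of accepted outliers. A minimal Python sketch of the idea (hypothetical helper names, not the toolbox's MATLAB code, assuming higher outputs mean more target-like):

```python
import numpy as np

def equal_error_rate(target_out, outlier_out):
    # Sweep the threshold over all observed outputs and keep the point
    # where the two error fractions are (closest to) equal.
    thresholds = np.sort(np.concatenate([target_out, outlier_out]))
    best_gap, eer = None, None
    for t in thresholds:
        fnr = np.mean(target_out < t)    # targets rejected
        fpr = np.mean(outlier_out >= t)  # outliers accepted
        gap = abs(fnr - fpr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (fnr + fpr) / 2
    return float(eer)
```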
* Notes on version 1.5.2
- Added the stump_dd, that thresholds just the first feature
- Small bugfixes
* Notes on version 1.5.1
- Fixed a tiny bug in oc_set for the case of selecting several classes
* Notes on version 1.5.0
- Changed the way in which the warnings are made: use the Matlab warning
identifiers
- Added the cost curve, a derivative curve from the ROC curve. The curve
is created using dd_costc, and can be plotted using plotcostc.m.
- The functionality of plotroc and plotcosts is extended.
- Take care for zero-variance features in ball_dd.
- Remove the C-variable from ball_dd, because it was (almost) pointless
- Added the multic, that combines one-class classifiers into one
multi-class classifier
- Removed a sneaky bug in oc_set; oc_set(x,'outlier') works now
correctly when one-class dataset x does not contain outliers
- Made the simpleroc.m a bit more beautiful and consistent.
* Notes on version 1.4.1
- Removed a *stupid* typo in the definition of an empty mapping
* Notes on version 1.4.0
- Changed the storage of the classifiers that are using normal
distributions. Instead of the covariance matrix, the inverse of the
covariance matrix is stored, making the repeated inverse computation
in the application of the mapping unnecessary
- Added the Minimum spanning tree data description by Piotr Juszczak and
a plotting function to show the tree on screen.
- Made an improved EM procedure for training the mixture of Gaussians.
It is not only possible to train with example outliers, it is now also
possible to extend the number of clusters for a trained mixture. Here
also the inverse covariance matrix is stored. It is also possible to
change the threshold for a trained mapping.
- In the function dd_roc the special case that two (or more) objects are
on exactly the same place, is covered
- New plot (without a good name, so I called it askerplot for historical
reasons)
- Fixed the wrong output threshold vector that was provided by
simpleroc.m
- Fixed a bug in dd_setfn.m, the threshold should be fitted on the target
data, not on all the data
- Small bug fixes and minor features added (for instance, output the
mean and radius from the function gendatout.m, or return the optimal
parameter in consistent_occ)
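The motivation for storing the inverse covariance matrix (mentioned above for the Gaussian-based classifiers) can be sketched in a few lines of Python (an illustration of the idea only, not the toolbox's PRTools-based MATLAB code): invert once at training time, so that evaluating a test object reduces to a cheap Mahalanobis distance with no further matrix inversion.

```python
import numpy as np

def fit_gauss(X):
    # Training: estimate the mean and invert the covariance ONCE.
    mu = X.mean(axis=0)
    icov = np.linalg.inv(np.cov(X, rowvar=False))
    return mu, icov

def mahal(z, mu, icov):
    # Evaluation: only matrix-vector products, no inversion per object.
    d = z - mu
    return float(d @ icov @ d)
```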
* Notes on version 1.3.0
- Introduced optim_auc to automatically update hyperparameters by
optimizing AUC
- Added the Naive Bayes data description
- Added the Minimax probability machine data description.
- Added a L_p Ball data description
- Changed the simpleroc, such that objects with identical output values
  do not cause the ROC to change when the dataset is randomized in
  ordering. It still results in a suspicious ROC curve, but I added a
  warning
- Made the change_R more robust against overflow (thanks to Mauro Del
Rio)
- removed a bug in som_dd, concerning the input arguments that were not
set correctly
- removed a bug from mykmeans, where the average over one object should
be avoided, grrrr
- removed a bug from consistent_occ, where the loop over k was not
stopped on time
- minor tweakings concerning speed, pointless (and not so pointless)
warnings, help and tiny bugs
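The tie handling described for simpleroc above can be illustrated with a small Python sketch (a conceptual illustration, not the MATLAB implementation): by sweeping only over the unique output values, objects with identical outputs move the curve in one joint step, so randomizing the dataset ordering cannot change the ROC.

```python
import numpy as np

def simple_roc(target_out, outlier_out):
    # One ROC point per unique threshold: (fraction of targets rejected,
    # fraction of outliers accepted).  Tied outputs share a single
    # threshold, so the curve is independent of the object ordering.
    thresholds = np.unique(np.concatenate([target_out, outlier_out]))
    return [(float(np.mean(target_out < t)),
             float(np.mean(outlier_out >= t))) for t in thresholds]
```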
* Notes on version 1.2.0
- Removed a stupid typo in ksvdd
- Made the function setthres.m actually useful (after quite some
requests), replaced by dd_setfn.m
- made an extra check in isocc
- Made a significant change and improvement in the plotroc function. It
is now possible to dynamically change the operating point of a
one-class classifier.
For this, also an example script is supplied.
- Removed the 'oldplot' option in plotroc, due to very unclear spaghetti
  code and potential disastrous bugs
- Improved the dd_crossval to take class priors into account.
- Added an ex_dd9 to show dd_crossval
- Added a variant of the ROC plot, the askerplot.
* Notes on version 1.1.2
- Optimized the simpleroc.m significantly.
- Added the normalization of the classifier output to something like
probabilities
- Removed some very very small bug from incsvdd.
- Made preparations for OCC normalisation by introducing featdom ranges.
- Some bugfixes
- Improved the output of the outmog_dd: use Bayes rule to find the
posteriors.
- Made the find_target work also on just labels.
- Fixed a very rare situation in oc_set when just outlier data is
  available
* Notes on version 1.1.1
- Changed dd_crossval to take the class information into account.
- Added the feature in incsvdd that the kernel-type and kernel-parameter
  can be exchanged (to make incsvdd compatible with consistent_occ).
- Removed some superfluous blanks in the display of oc_set.
- Tiny improvement in the incsvdd
* Notes on version 1.1.0
- Change in the revision numbering. I'm moving more toward the Linux
  kernel numbering, but we will see if I can stay consistent ;-)
- Added the gendatouts.m
- Added the gendatoutg.m
- Split nndd.m into dnndd.m and nndd.m
- Split kcenter_dd.m into dkcenter_dd.m and kcenter_dd.m, and simplified
the individual classifiers
- Added the dknndd.m (now knndd.m is actually *not* a wrapper for
dknndd.m)
- Changed the gauss_dd such that it stores the inverted cov. matrix
  instead of the original cov. matrix. Saves double work.
- scale_range.m now is back to a linear distribution over the distances
instead of logarithmic. It looks better that way.
- Changed oc_set to be able to handle several classes that become target
class.
- Added the outmog_dd.m, which makes it possible to train a Mixture of
Gaussians using outlier objects during training.
* Notes on version 1.12:
- Changed the implementation of the dd_auc such that the AUC over a
restricted domain is more interpretable when you consider a
'standard'/'traditional' ROC curve.
- Added the incremental SVDD for very large datasets, for user-defined
  kernels, or for the case that no good QP optimizer is present. I
  consider this still a bit experimental, but it actually works pretty
  well!
- Added the Mixture of Gaussians which can also model outlier data
(using both an almost uniform outlier class combined with normal
Gaussian clusters).
- Make dlpdd work on non-square distance matrices (by Ela Pekalska).
- Removed a bug in the dd_roc_old (thanks to Piotr Juszczak).
- Removed a bug in dlpdd (thanks to Elzbieta Pekalska).
* Notes on version 1.11:
- Changed some implementation of newsvdd such that it uses the
standard optimizer.
- Plotsom is now standard in Prtools, so removed from the Contents.m
- Included an index in the manual.
* Notes on version 1.10:
- Significantly rewrote and rearranged oc_set.m and target_class.m
- Changed dd_error.m and dd_roc.m to mimic testc.m
Also included the computation of the precision and recall.
- Completely rewrote the ROC computation. Large amounts of complexity
are just removed (and thus also some features, I'm sorry).
- Support the selection of hyperparameters using the consistency
criterion.
- Added the robustified Gaussian (rob_gauss_dd) and the minimum
covariance determinant Gaussian (mcd_gauss_dd).
- Removed a bad, bad, bad bug from gausspdf.m.
- Made all the Gaussian methods use mahaldist.m for their evaluation.
- Completely rewrote the SVDD. The confusing parameters fracrej and
fracerr are removed, and all the quadratic optimizers (libsvm, qld,
quadprog) are integrated.
- Added the SVDD using general kernel definitions: ksvdd.m (although
it has a very annoying feature that you have to supply the values
for K(z,z) during the evaluation of object z, when you just supply
the kernel matrix: have a look at the help)
- Rewrote the LPDD in terms of DLPDD, MYPROXM and DISSIM.
- Implemented the SOM now nicely and removed the most obvious bugs.
- Added the dd_crossval.
- Added the dd_f1, for the computation of the f1-score.
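The precision, recall and f1-score mentioned above follow the standard definitions; as a quick reminder (a hypothetical helper, not dd_f1.m itself), with tp/fp/fn counted on the target class:

```python
def f1_score(tp, fp, fn):
    # Standard F1: harmonic mean of precision and recall.
    precision = tp / (tp + fp)  # accepted objects that really are targets
    recall = tp / (tp + fn)     # target objects that are accepted
    return 2 * precision * recall / (precision + recall)
```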
* Notes on version 1.01:
- Changed the order of the mtimes: so w*a is replaced by a*w
- Removed a bug in the creation of a one-class dataset from a
more-than-two class dataset in oc_set.m
* Notes on version 1.00:
- There is a *significant* change from updating from prtools3 to
  prtools4 (prtools3.2.2 or higher). The definitions of the objects
  'dataset' and 'mapping' have been upgraded. This required the rewriting
  of almost all code! It can therefore happen that new results are not
  identical to results obtained by previous versions of the tools (but
  the differences should not be very large).
- dd_error is totally rewritten
- names of is_ocset and is_occ are renamed to isocset and isocc to be
more consistent with the rest of matlab and prtools
- som_dd is added.
* Notes on version 0.99:
- introduced dissim.m
* Notes on version 0.95:
- added a bit of help to each of the m-files.
- programmed my own very basic kmeans clustering, because I needed it
also for other things. Therefore added mykmeans.m
- added plotroc.m to plot the classical ROC curve
- made an extra check in dd_roc to see where the outputs of the target
class is stored (for my OCC's it is always in the first column, but
for general PRTools classifiers this does not have to be the case).
Now dd_roc should work for all prtools classifiers (trained on data
with 'target' and 'outlier' labels of course).
- dd_fp.m added: compute the error on the outliers (fraction false
positive) of a trained classifier for a given error on the target
class (fraction false negative).
- made my own version of proxm.m (myproxm.m) which uses the lpdistm.m.
It is used in kwhiten.m.
- removed some horrible bug in lpdd! (one bloody minus sig...)
- Removed another horrible bug from kwhiten, in the case a fixed
  dimensionality was requested... Furthermore, in case a fraction
  of retained variance was requested, the threshold is now set such
  that *at least* this fraction is retained (it could also be higher).
- corrected the nu parameter in svdd and newsvdd in cases when example
  outliers are used in training. Note that it cannot be done
  completely correctly in newsvdd, because there only a single nu
  parameter is allowed for all data.
- included in plotg.m the possibility to just plot the decision
boundary.
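The dd_fp computation described above (outlier error at a fixed target error) can be sketched as follows (an illustrative Python version with hypothetical names, assuming higher outputs mean more target-like):

```python
import numpy as np

def fp_at_fixed_fn(target_out, outlier_out, fnr=0.1):
    # Place the threshold so that a fraction `fnr` of the targets is
    # rejected (false negatives), then measure the fraction of outliers
    # that is still accepted (false positives).
    thr = np.quantile(target_out, fnr)
    return float(np.mean(outlier_out >= thr))
```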
* Notes on version 0.9:
! in the early versions of the svdd, the support vectors were
classified as outliers. Now they are forced to be target objects.
This will therefore change the classification results!
- added gendatout: generation of spherically distributed outlier objects
- changed the place in which distm(a) was computed in the original
version of svdd. In previous versions, it was done over and over
again in f_svs, but now it is moved to the main svdd.m
- removed a bug in range_svdd, where the sqrt of the D has to be taken
for the range of sigma.
- fixed a bug in dd_roc. Now it is possible to supply 1D datasets for
computing the roc curve.
- fixed an error in the help of dd_auc
- added the function relabel
- replaced all explicit references of the function name by 'mfilename'
in all one-class classifiers
- added the random_dd, which randomly assigns labels
- added lpdd.m, the linear programming data description. It works on
distances, and therefore I also had to add:
ddistm.m and lpdistm.m
- added kwhiten.m, normalization to unit variance in the kernel space.
For that also center.m was needed.