gcforest

Category: Artificial Intelligence / Neural Networks / Deep Learning
Development tool: Python
File size: 100KB
Downloads: 74
Upload date: 2017-06-01 18:23:31
Uploader: 沙霍特
Description: Code for Professor Zhi-Hua Zhou's deep forest algorithm, which is used for classification and reaches accuracy close to that of deep learning algorithms.

File list:
datasets (0, 2017-05-22)
datasets\gtzan (0, 2017-05-22)
datasets\gtzan\get_data.sh (1033, 2017-05-26)
datasets\gtzan\splits (0, 2017-05-22)
datasets\gtzan\splits\blues.train (1680, 2017-05-22)
datasets\gtzan\splits\blues.trainval (2400, 2017-05-22)
datasets\gtzan\splits\blues.val (720, 2017-05-22)
datasets\gtzan\splits\classical.train (2240, 2017-05-22)
datasets\gtzan\splits\classical.trainval (3200, 2017-05-22)
datasets\gtzan\splits\classical.val (960, 2017-05-22)
datasets\gtzan\splits\country.train (1960, 2017-05-22)
datasets\gtzan\splits\country.trainval (2800, 2017-05-22)
datasets\gtzan\splits\country.val (840, 2017-05-22)
datasets\gtzan\splits\disco.train (1680, 2017-05-22)
datasets\gtzan\splits\disco.trainval (2400, 2017-05-22)
datasets\gtzan\splits\disco.val (720, 2017-05-22)
datasets\gtzan\splits\genre.train (17360, 2017-05-22)
datasets\gtzan\splits\genre.trainval (24800, 2017-05-22)
datasets\gtzan\splits\genre.val (7440, 2017-05-22)
datasets\gtzan\splits\genres.trainval (22800, 2017-05-22)
datasets\gtzan\splits\hiphop.train (1820, 2017-05-22)
datasets\gtzan\splits\hiphop.trainval (2600, 2017-05-22)
datasets\gtzan\splits\hiphop.val (780, 2017-05-22)
datasets\gtzan\splits\jazz.train (1540, 2017-05-22)
datasets\gtzan\splits\jazz.trainval (2200, 2017-05-22)
datasets\gtzan\splits\jazz.val (660, 2017-05-22)
datasets\gtzan\splits\metal.train (1680, 2017-05-22)
datasets\gtzan\splits\metal.trainval (2400, 2017-05-22)
datasets\gtzan\splits\metal.val (720, 2017-05-22)
datasets\gtzan\splits\pop.train (1400, 2017-05-22)
datasets\gtzan\splits\pop.trainval (2000, 2017-05-22)
datasets\gtzan\splits\pop.val (600, 2017-05-22)
datasets\gtzan\splits\reggae.train (1820, 2017-05-22)
datasets\gtzan\splits\reggae.trainval (2600, 2017-05-22)
datasets\gtzan\splits\reggae.val (780, 2017-05-22)
datasets\gtzan\splits\rock.train (1540, 2017-05-22)
datasets\gtzan\splits\rock.trainval (2200, 2017-05-26)
datasets\gtzan\splits\rock.val (660, 2017-05-22)
datasets\uci_adult (0, 2017-05-22)
datasets\uci_adult\features (1156, 2017-05-22)
... ...

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Description: A Python 2.7 implementation of gcForest proposed in [1].
%              A demo implementation of the gcForest library, together with some demo
%              client scripts that demonstrate how to use the code.
%              The implementation is flexible enough to let you modify the model
%              or fit your own datasets.
%
% Reference: [1] Z.-H. Zhou and J. Feng. Deep Forest: Towards an Alternative to Deep Neural Networks.
%                In IJCAI-2017. (https://arxiv.org/abs/1702.08835v2)
%
% Requirements: This package was developed with Python 2.7. Please make sure all the
%               dependencies specified in requirements.txt are installed.
%
% ATTN: This package is free for academic usage. You can run it at your own risk.
%       For other purposes, please contact Prof. Zhi-Hua Zhou (zhouzh@lamda.nju.edu.cn).
%
% ATTN2: This package was developed by Mr. Ji Feng (fengj@lamda.nju.edu.cn).
%        The readme file and demo roughly explain how to use the code.
%        For any problem concerning the code, please feel free to contact Mr. Feng.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

Package Official Website: http://lamda.nju.edu.cn/code_gcForest.ashx

This package is provided "AS IS" and free for academic usage. You can run it at your own risk. For other purposes, please contact Prof. Zhi-Hua Zhou (zhouzh@lamda.nju.edu.cn).
Before running the demo, make sure all the dependencies are installed, for instance by running the following command:

```
pip install -r requirements.txt
```

===================================
Outline for README
===================================
* Package Overview
* Notes on Demo Scripts
* Notes on Model Specification Files
* Example and Demos
* Using Own Dataset

==================================
Package Overview
==================================
* lib/gcforest
    - code for the implementation of gcforest
* tools/train_fg.py
    - the demo script used for training Fine Grained Layers
* tools/train_cascade.py
    - the demo script used for training Cascade Layers
* models/
    - folder to save models which can be used in tools/train_fg.py and tools/train_cascade.py
    - the gcForest structure is saved in json format
* logs
    - the folder logs/gcforest is used to save the log files produced by the demo scripts

============================
Notes on Demo Scripts
============================
Below is a brief description of the args needed by the demo scripts.

%%%%%%%%%%%%%%%%%%%%
tools/train_fg.py
%%%%%%%%%%%%%%%%%%%%
* --model: str
    - The config filepath for Fine Grained models (in json format)
* --save_outputs: bool
    - If True, the output predictions produced by the Fine Grained Model will be saved in model_cache_dir, which is specified in the Model Config. This output will be used when training the Cascade Layer.
    - The default value is false

%%%%%%%%%%%%%%%%%%%%%%
tools/train_cascade.py
%%%%%%%%%%%%%%%%%%%%%%
* --model: str
    - The model config filepath for cascade training (in json format)

%%%%%%%%%%%%%%%%%%%%%%
Notes on Config Files
%%%%%%%%%%%%%%%%%%%%%%
Below is a brief introduction on how to use model specification files, namely
* the model specification for the fine grained scanning structure.
* the model specification for cascade forests.
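Both demo scripts receive their configuration through `--model`, a path to a JSON model specification. The following is only a rough sketch of how such an entry point might parse the documented arguments and load the specification; it is not the code of tools/train_fg.py. The function names `parse_args` and `load_model_spec` are hypothetical, and the throwaway JSON file only borrows the top-level names `dataset`, `train` and `net` from the key names mentioned in this README, so it is not a valid gcForest configuration.

```python
import argparse
import json
import os
import tempfile


def parse_args(argv=None):
    # Mirrors only the two arguments documented for tools/train_fg.py.
    parser = argparse.ArgumentParser(
        description="Sketch of a fine-grained training entry point")
    parser.add_argument("--model", type=str, required=True,
                        help="config filepath for the model (in json format)")
    parser.add_argument("--save_outputs", action="store_true", default=False,
                        help="save the fine-grained predictions for cascade training")
    return parser.parse_args(argv)


def load_model_spec(path):
    # A model specification is a plain JSON document.
    with open(path) as f:
        return json.load(f)


# Write a placeholder specification so the sketch runs without the package.
fd, spec_path = tempfile.mkstemp(suffix=".json")
with os.fdopen(fd, "w") as f:
    json.dump({"dataset": {}, "train": {}, "net": {}}, f)

args = parse_args(["--model", spec_path, "--save_outputs"])
spec = load_model_spec(args.model)
print(sorted(spec.keys()))
os.remove(spec_path)
```

Using `action="store_true"` matches how the example commands pass `--save_outputs` without a value; the real script may handle the flag differently.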
All the model specifications (in json files) are saved in models/. For instance, all the model specification files needed for MNIST are stored in models/mnist/gcforest.
* ca is short for cascade structure specifications
* fg is short for fine-grained structure specifications

You can define your own structure by writing similar json files.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
FineGrained model's config (dataset)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
* dataset.train, dataset.test: [dict]
    - corresponds to the particular datasets defined in lib/datasets
    - type [str]: see lib/datasets/__init__.py for a reference
    - You can use your own dataset by writing similar wrappers.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
FineGrained model's config (train)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
* train.keep_model_in_mem: [bool] default=0
    - if 0, the forest will be freed from RAM
* train.data_cache: [dict]
    - corresponds to the DataCache in lib/dataset/data_cache.py
* train.data_cache.cache_dir (str)
    - make sure to change "/mnt/raid/fengji/gcforest/cifar10/fg-tree500-depth100-3folds/datas" to your own path

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
FineGrained model's config (net)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
* net.outputs: [list]
    - List of the data names output by this model
* net.layers: [List of Layers]
    - Layer's Config, see lib/gcforest/layers for a reference

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Cascade model's config (dataset)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Similar to the FineGrained model's config (dataset)

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Cascade model's config (cascade)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
See __init__ in lib/gcforest/cascade/cascade_classifier.py for a reference

=============================
Examples and Demos
=============================
Before running the scripts, make sure to change
* train.data_cache.cache_dir in the Finegrained Model Config (eg: models/xxx/fg-xxxx.json)
* train.cascade.dataset.{train,test}.data_path in the Finegrained-Cascade Model Config (eg: models/xxx/fg-xxxx-ca.json)
* train.cascade.cascade.data_save_dir in the Finegrained Model Config (eg: models/xxx/ca-xxxx.json and models/xxx/fg-xxxx-ca.json)

To train a gcForest (with fine grained scanning), you need to run two scripts.
* Fine Grained Scanning: 'tools/train_fg.py'
* Cascade Training: 'tools/train_cascade.py'

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[UCI Letter](http://archive.ics.uci.edu/ml/datasets/Letter+Recognition)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
* Get Data: download the data yourself by running the following commands:
```Shell
cd datasets/uci_letter
sh get_data.sh
```
* Since fine-grained scanning is not needed here, we only train a Cascade Forest as follows:
    - `python tools/train_cascade.py --model models/uci_letter/gcforest/ca-tree500-n4x2-3folds.json --log_dir logs/gcforest/uci_letter/ca`
* Adult and YEAST can be trained with a similar procedure.

%%%%%%%%%%%%%%%%%%%%%
MNIST
%%%%%%%%%%%%%%%%%%%%%
* Get the data: The data will be downloaded automatically via 'lib/datasets/mnist.py'; you do not need to do it yourself.
* First, train the Fine Grained Forest:
    - Run `python tools/train_fg.py --model models/mnist/gcforest/fg-tree500-depth100-3folds.json --log_dir logs/gcforest/mnist/fg --save_outputs`
    - This means:
        1. Train a fine grained model for the MNIST dataset,
        2. using the structure defined in models/mnist/gcforest/fg-tree500-depth100-3folds.json,
        3. save the log files in logs/gcforest/mnist/fg.
        4. The output of the fine grained scanning predictions is saved in train.data_cache.cache_dir.
* Then, train the cascade forest (Note: make sure you run train_fg.py first)
    - Run `python tools/train_cascade.py --model models/mnist/gcforest/fg-tree500-depth100-3folds-ca.json`
    - This means:
        1. Train on the fine grained scanning results with the cascade structure.
        2. The cascade model specification is defined in 'models/mnist/gcforest/fg-tree500-depth100-3folds-ca.json'
* You could also train a Cascade Forest without fine-grained scanning (but the accuracy will be much lower):
    - Run `python tools/train_cascade.py --model models/mnist/gcforest/ca-tree500-n4x2-3folds.json --log_dir logs/gcforest/mnist/ca`

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[UCI sEMG](http://archive.ics.uci.edu/ml/datasets/sEMG+for+Basic+Hand+movements)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
* Get Data
```Shell
cd datasets/uci_semg
sh get_data.sh
```
* First, train the Fine Grained Forest:
    - `python tools/train_fg.py --model models/uci_semg/gcforest/fg-tree500-depth100-3folds.json --save_outputs --log_dir logs/gcforest/uci_semg/fg`
* Then, train the cascade forest (Note: make sure you run train_fg.py first)
    - `python tools/train_cascade.py --model models/uci_semg/gcforest/fg-tree500-depth100-3folds-ca.json --log_dir logs/gcforest/uci_semg/gc`
* You could also train a Cascade Forest without fine-grained scanning (but the accuracy will be much lower):
    - `python tools/train_cascade.py --model models/uci_semg/gcforest/ca-tree500-n4x2-3folds.json --log_dir logs/gcforest/uci_semg/ca`

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
[GTZAN](http://marsyasweb.appspot.com/download/data_sets/)
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
* Requirements (you need to install the following package):
    - librosa
* Get the data yourself by running the following commands:
```Shell
cd datasets/gtzan
sh get_data.sh
cd ../..
python tools/audio/cache_feature.py --dataset gtzan --feature mfcc --split genre.trainval
```
* First, train the Fine Grained Forest:
    - `python tools/train_fg.py --model models/gtzan/gcforest/fg-tree500-depth100-3folds.json --save_outputs --log_dir logs/gcforest/gtzan/fg`
* Then, train the cascade forest (Note: make sure you run train_fg.py first)
    - `python tools/train_cascade.py --model models/gtzan/gcforest/fg-tree500-depth100-3folds-ca.json --log_dir logs/gcforest/gtzan/gc`
* You could also train a Cascade Forest without fine-grained scanning (but the accuracy will be much lower):
    - `python tools/train_cascade.py --model models/gtzan/gcforest/ca-tree500-n4x2-3folds.json --log_dir logs/gcforest/gtzan/ca --save_outputs`

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
IMDB
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
* Cascade Forest:
    - `python tools/train_cascade.py --model models/imdb/gcforest/ca-tree500-n4x2-3folds.json --log_dir logs/gcforest/imdb/ca`

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
CIFAR10
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
* First, train the Fine Grained Forest:
    - `python tools/train_fg.py --model models/cifar10/gcforest/fg-tree500-depth100-3folds.json --save_outputs`
* Then, train the cascade forest (Note: make sure you run train_fg.py first)
    - `python tools/train_cascade.py --model models/cifar10/gcforest/fg-tree500-depth100-3folds-ca.json`

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
For Your Own Datasets
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
* Data Format:
    0. Please refer to lib/datasets/mnist.py as an example
    1. the dataset should have attributes X and y to represent the data and the labels
    2. y should be a 1-d array
    3. For fine-grained scanning, X should be a 4-d array (N x channel x H x W). (e.g. cifar10 should be Nx3x32x32, mnist should be Nx1x28x28, uci_semg should be Nx1x3000x1)
* Model Specifications:
    1. Save the json file in models/$dataset_name (recommended)
    2. For a detailed description, see the section 'Notes on Config Files'
* If you only need to train a cascade forest, run tools/train_cascade.py.

Happy Hacking.

Reference:
[1] Z.-H. Zhou and J. Feng. Deep Forest: Towards an Alternative to Deep Neural Networks. In IJCAI-2017. (https://arxiv.org/abs/1702.08835v2)
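To make the data-format requirements for custom datasets more concrete (attributes X and y, y as a 1-d array, X as a 4-d N x channel x H x W array for fine-grained scanning), here is a minimal sketch of a container that enforces them. The class name `ToyImageDataset` and its constructor arguments are hypothetical and not part of the gcForest package; the real wrappers in lib/datasets may require a specific base class or extra fields, so lib/datasets/mnist.py remains the reference.

```python
import numpy as np


class ToyImageDataset(object):
    """Hypothetical container exposing X (N x channel x H x W) and y (N,)."""

    def __init__(self, images, labels):
        X = np.asarray(images, dtype=np.float32)
        y = np.asarray(labels)
        # Promote N x H x W single-channel data to N x 1 x H x W.
        if X.ndim == 3:
            X = X[:, np.newaxis, :, :]
        if X.ndim != 4:
            raise ValueError("X must be 4-d (N x channel x H x W), got %d-d" % X.ndim)
        if y.ndim != 1:
            raise ValueError("y must be a 1-d array, got %d-d" % y.ndim)
        if X.shape[0] != y.shape[0]:
            raise ValueError("X and y must contain the same number of samples")
        self.X = X
        self.y = y


# Stand-in data shaped like ten MNIST images with binary labels.
dataset = ToyImageDataset(np.zeros((10, 28, 28)), np.arange(10) % 2)
print("%s %s" % (dataset.X.shape, dataset.y.shape))
```

Validating the shapes at construction time is a design choice of this sketch: it surfaces a wrongly shaped array before any forest is trained.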
