2011CCA—Coding

Category: Numerical algorithms / Artificial intelligence
Development tool: matlab
File size: 10851 KB
Downloads: 1
Upload date: 2021-01-06 16:09:52
Uploader: velaciela
Description: Multi-Label Output Codes using Canonical Correlation Analysis (the CCA algorithm)

File list:
AISTAT2011_Code (0, 2012-07-23)
AISTAT2011_Code\adaptiveTrainLGR_Liblin.m (5551, 2011-04-26)
AISTAT2011_Code\adaptiveTrainRLS_Regress_CG.m (2779, 2011-05-09)
AISTAT2011_Code\CCA.m (4752, 2009-02-22)
AISTAT2011_Code\computeF1.m (845, 2010-05-21)
AISTAT2011_Code\computeF1_2.m (778, 2010-09-29)
AISTAT2011_Code\computeF1_3.m (260, 2010-10-20)
AISTAT2011_Code\emotions_matlab.mat (704584, 2011-04-27)
AISTAT2011_Code\IOCoding_MultiLabel_TrainTest_OC_MFA.m (6985, 2012-07-23)
AISTAT2011_Code\IOCoding_MultiLabel_TrainTest_OC_MFA_EnsembleTrees.m (7639, 2012-07-23)
AISTAT2011_Code\IOC_MFADecoding.m (2952, 2010-09-16)
AISTAT2011_Code\regress_ensemble_tree_CV.m (1317, 2010-09-17)
AISTAT2011_Code\regress_tree_CV.m (981, 2010-09-16)
AISTAT2011_Code\scene-matlab.mat (10363191, 2011-04-27)
AISTAT2011_Code\solve_eigen.m (1769, 2009-02-16)
AISTAT2011_Code\test_Emotion_MultiLabel.m (246, 2012-07-23)
AISTAT2011_Code\test_Emotion_MultiLabel_EnsembleTrees.m (287, 2012-07-23)
AISTAT2011_Code\test_Scene_MultiLabel.m (240, 2012-07-23)
AISTAT2011_Code\test_Scene_MultiLabel_EnsembleTrees.m (280, 2012-07-23)
AISTAT2011_Code\train.mexa64 (58232, 2010-04-07)
AISTAT2011_Code\TrainRLS_Regress_CG.m (3411, 2010-10-06)

Thanks for using this code. This code contains experiments to test CCA-based output coding for multi-label classification.

0. Reference
Yi Zhang and Jeff Schneider. Multi-label Output Codes using Canonical Correlation Analysis, AISTATS 2011.

1. External software
Note that you need to install the Liblinear package (http://www.csie.ntu.edu.tw/~cjlin/liblinear/) before you can run this code. A Linux-compiled file for Liblinear is included, but you will probably need to compile your own version.

2. Launch the experiments
Use test_Scene_MultiLabel.m, test_Emotion_MultiLabel.m, test_Scene_MultiLabel_EnsembleTrees.m, and test_Emotion_MultiLabel_EnsembleTrees.m to launch experiments, e.g., test_Scene_MultiLabel(300, 30) for 30 random runs with 300 training samples in each run. test_Scene_MultiLabel.m and test_Emotion_MultiLabel.m launch CCA coding with linear regression and logistic regression as the base learners; test_Scene_MultiLabel_EnsembleTrees.m and test_Emotion_MultiLabel_EnsembleTrees.m launch CCA coding with tree ensembles as the base learners.

3. Test your own data
If you have your own data, prepare a data file and write a wrapper similar to test_Scene_MultiLabel.m. The data file needs to contain at least: trX (nTrain * nFeature), trY (nTrain * nLabel), tsX (nTest * nFeature), tsY (nTest * nLabel), and indicies (an nRun * 1 cell array, where each cell contains a 1 * nTrain permutation vector). Like test_Scene_MultiLabel.m, your wrapper will call IOCoding_MultiLabel_TrainTest_OC_MFA() or IOCoding_MultiLabel_TrainTest_OC_MFA_EnsembleTrees(). See the comments in the corresponding file for the parameter settings.

4. Read the results
The code writes all results into a file, e.g., file_Emotion_nSample300_CCACoding_MFADecoding.mat. In the file you will see several result arrays for subset accuracy (a.k.a. exact matching rate), micro F1, and macro F1 scores. Taking subset accuracy as an example, you will see three arrays: SubAccuracy, SubAccuracy_2, SubAccuracy_3.
These correspond to three different values of \lambda in decoding equation (12) and Algorithm 2 of the AISTATS 2011 paper: 1/4, 1, and 4. If you do not want to tune this decoding parameter, just look at SubAccuracy_2 (or microF1_2, macroF1_2), which corresponds to \lambda = 1 in decoding eq. (15). Taking the array SubAccuracy as an example, each row is one random run and each column corresponds to a specific number of label projections used by the coding. The number of projections is stored in the variable nComps (i.e., nComps has the same number of columns as each result array, and each element of nComps gives the number of label projections used in the corresponding column of the result array). If you do not want to tune this parameter (as in our paper), just look at the last column of SubAccuracy (or microF1, macroF1), which corresponds to using the maximum number of label projections. A single column of a result array thus contains the results for all random runs, one row per run.

5. Alternative method
We have also proposed maximum margin output coding (Yi Zhang and Jeff Schneider. Maximum Margin Output Coding, ICML 2012), which combines the benefits of CCA-based coding and other codings. The paper and code can be found at http://www.cs.cmu.edu/~yizhang1

6. Contact
Yi Zhang, yizhang1@cs.cmu.edu
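The data-file format required in step 3 can be sketched in MATLAB as follows. Only the variable names and shapes come from the README (including the spelling `indicies` used there); the sizes and the synthetic data are assumptions for illustration, and the wrapper call mirrors the pattern described for test_Scene_MultiLabel.m rather than its exact arguments:

```matlab
% Hypothetical sketch: building a data file in the format step 3 describes.
% Sizes below are made up; replace the random data with your own matrices.
nTrain = 300; nTest = 500; nFeature = 100; nLabel = 6; nRun = 30;

trX = randn(nTrain, nFeature);             % training features, nTrain x nFeature
trY = double(rand(nTrain, nLabel) > 0.7);  % binary training labels, nTrain x nLabel
tsX = randn(nTest, nFeature);              % test features, nTest x nFeature
tsY = double(rand(nTest, nLabel) > 0.7);   % binary test labels, nTest x nLabel

indicies = cell(nRun, 1);                  % spelling follows the README
for r = 1:nRun
    indicies{r} = randperm(nTrain);        % 1 x nTrain permutation per random run
end

save('file_MyData.mat', 'trX', 'trY', 'tsX', 'tsY', 'indicies');
```

Your wrapper would then load this file and call IOCoding_MultiLabel_TrainTest_OC_MFA() (or the EnsembleTrees variant); see the comments in those files for the actual parameter list.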
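Summarizing the results described in step 4 might look like the following MATLAB sketch. It assumes the result file and array names given above (SubAccuracy_2, nComps); check the field names against an actual output file before relying on them:

```matlab
% Hypothetical sketch: reading a result file written by the experiments.
R = load('file_Emotion_nSample300_CCACoding_MFADecoding.mat');

% Untuned setting, as recommended above: the "_2" arrays correspond to
% lambda = 1, and the last column uses the maximum number of label
% projections (given by the last element of nComps).
acc = R.SubAccuracy_2(:, end);   % one subset-accuracy value per random run
fprintf('Subset accuracy: %.3f +/- %.3f over %d runs (nComps = %d)\n', ...
        mean(acc), std(acc), numel(acc), R.nComps(end));
```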
