FDSN Regression Paper - Data
Supplementary data (source code and results spreadsheets) of the paper "Fast Deep Stacked Networks Based on Extreme Learning Machine Applied to Regression Problems", currently submitted to Neural Networks.
Steps to reproduce
To reproduce our experiments, you need a computer with Linux and MATLAB installed. We used Ubuntu 18.04 and MATLAB 9.4.
1) Download the source code zip and extract it anywhere (e.g. to a "Source" folder).
2) Open a terminal and navigate to the "Source" folder (e.g. "cd /home/user/Source").
3) Download the datasets using the following commands:
cd Datasets/MultiTargetRegression/
bash multiTargetRegression.sh
After some time, the datasets will be downloaded and stored in their respective folders.
4) Return to the "Source" folder (e.g. "cd /home/user/Source").
5) Open MATLAB and run the "testMTR.m" script. After a very long time, the results will be stored in the file "results.mat".
Warning: Our tests included three large-scale datasets, which require a large amount of time and memory to be processed by the methods (due to the large number of samples). Our computer took around three weeks to run all combinations of methods and datasets. The .mat file obtained in our experiments is saved in the "results" folder.
We used the script "statTest.m", which applies the Friedman and Nemenyi statistical tests to our results. It prints a table of the ranks and plots the critical distance graph. It takes one argument, which should be "aRRMSE" (the default), "trTime" or "bytes".
We can also generate boxplots from our results using the script "genBoxPlots.m". Its first argument should be "aRRMSE", "trTime" or "bytes". Its second argument can be "all", "small", "medium" or "large", because our boxplot analysis groups the datasets into small, medium and large ones. For example, to generate aRRMSE boxplots considering all datasets, we call genBoxPlots('aRRMSE','all') in MATLAB; to generate boxplots of the number of bytes for the small datasets, we call genBoxPlots('bytes','small'). Because our results contain some large outliers, we ask MATLAB to hide outliers so that the boxplots remain readable.
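Steps 2 to 5 above could also be run unattended from a terminal. The following is only a minimal sketch, not part of the distributed source code: the function names download_datasets and run_experiments are hypothetical, the path to the extracted "Source" folder is passed as an argument, and it assumes the matlab executable is on the PATH. The -nodisplay, -nosplash and -r options are standard MATLAB startup options on Linux.

```shell
#!/usr/bin/env bash
# Sketch of steps 2-5. Usage: bash reproduce.sh /home/user/Source
# (the file name reproduce.sh is an assumption, not part of the archive)
set -euo pipefail

download_datasets() {
    # Step 3: fetch the datasets with the script shipped in the archive.
    (cd "$1/Datasets/MultiTargetRegression" && bash multiTargetRegression.sh)
}

run_experiments() {
    # Step 5: run testMTR.m without the MATLAB desktop.
    # The try/catch makes MATLAB exit with a non-zero status on error
    # instead of waiting at its prompt.
    (cd "$1" && matlab -nodisplay -nosplash \
        -r "try, testMTR; catch e, disp(getReport(e)); exit(1); end; exit(0)")
}

if [ "$#" -ge 1 ]; then
    download_datasets "$1"
    run_experiments "$1"
else
    echo "usage: bash reproduce.sh <path to Source folder>" >&2
fi
```

Given the run time of around three weeks reported above, it may be advisable to start such a script inside a session that survives logout (for example with nohup or a terminal multiplexer).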
The boxplots used in the paper are also included in the "results" folder, but they can be regenerated using the genBoxPlots function and the provided "results.mat" file. A summary of the results is included in the "Spreadsheet" folder, showing the obtained aRRMSE metric, the training time and the number of bytes used to store each model.
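For readers unfamiliar with the metric, aRRMSE (average relative root mean squared error) is commonly defined in the multi-target regression literature as shown below. This is stated here as an assumption about the usual definition; the paper remains the authoritative source for the exact variant used. The symbols are ours: m is the number of targets, n the number of evaluated samples, y_{ij} and \hat{y}_{ij} the true and predicted values of target j for sample i, and \bar{y}_{j} the mean of target j (taken over the training set in some formulations).

```latex
\mathrm{aRRMSE} = \frac{1}{m} \sum_{j=1}^{m}
  \sqrt{ \frac{\sum_{i=1}^{n} \left( y_{ij} - \hat{y}_{ij} \right)^{2}}
              {\sum_{i=1}^{n} \left( y_{ij} - \bar{y}_{j} \right)^{2}} }
```

Under this definition, a value below 1 means the model predicts better, on average across targets, than always predicting each target's mean.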