Version 8
HEVC-SVS: Low-level HEVC features and CNN features for TVSum, SumMe, OVP and VSUMM datasets
Description
This dataset provides the proposed HEVC feature sets along with CNN features extracted using GoogleNet, AlexNet, Inception-ResNet-V2, and VGG16 for the TVSum, SumMe, OVP, and VSUMM datasets. The modified datasets are named "HEVC-SVS-TVSum", "HEVC-SVS-SumMe", "HEVC-SVS-OVP", and "HEVC-SVS-VSUMM", respectively.
Each dataset retains the original, unmodified ground-truth data it came with.
When using any of these datasets, please cite our publications in which the HEVC feature set was first proposed:
If you are using the HEVC-SVS-OVP and/or HEVC-SVS-VSUMM datasets: https://ieeexplore.ieee.org/document/9815254/
@article{issa_cnn_2022,
title = {{CNN} and {HEVC} {Video} {Coding} {Features} for {Static} {Video} {Summarization}},
volume = {10},
issn = {2169-3536},
url = {https://ieeexplore.ieee.org/document/9815254/},
doi = {10.1109/ACCESS.2022.3188638},
urldate = {2022-09-29},
journal = {IEEE Access},
author = {Issa, Obada and Shanableh, Tamer},
year = {2022},
pages = {72080--72091},
}
If you are using the HEVC-SVS-TVSum and/or HEVC-SVS-SumMe datasets: https://www.mdpi.com/2076-3417/13/10/6065
@article{issa_static_2023,
title = {Static {Video} {Summarization} {Using} {Video} {Coding} {Features} with {Frame}-{Level} {Temporal} {Subsampling} and {Deep} {Learning}},
volume = {13},
issn = {2076-3417},
url = {https://www.mdpi.com/2076-3417/13/10/6065},
doi = {10.3390/app13106065},
number = {10},
journal = {Applied Sciences},
author = {Issa, Obada and Shanableh, Tamer},
month = may,
year = {2023},
pages = {6065},
}
Make sure to also cite the original authors for each of the datasets:
TVSum (https://people.csail.mit.edu/yalesong/tvsum/)
SumMe (https://gyglim.github.io/me/vsum/index.html)
OVP and VSUMM (https://www.sites.google.com/site/vsummsite/download)
Acknowledgement:
The work in this research project is supported by the American University of Sharjah under research grant number FRG22-E-E44. This research work represents the opinions of the author(s) and does not necessarily represent the position or opinions of the American University of Sharjah.
Steps to reproduce
The extracted features are as follows:

Feature ID   Feature variable

Averaged per frame:
1        Number of CU parts
2        MVD bits per CU
3        CU bits excluding MVD bits
4        Percentage of intra CU parts
5        Percentage of skipped CU parts
6        Number of CUs with depth 0 (i.e., 64×64)
7        Number of CU parts with depth 1 (i.e., 32×32)
8        Number of CUs with depth 2 (i.e., 16×16)
9        Number of CU parts with depth 3 (i.e., 8×8)

Not averaged per frame:
10-18    Standard deviation of feature IDs 1-9 per frame
19       Max CU depth per frame
20       For CUs with depth > 0, log2(sum of MVD)
21       For CUs with depth = 0, log2(sum of MVD)

Averaged per frame:
22       Row-wise SAD of the CU prediction error
23       Column-wise SAD of the CU prediction error
24       Ratio of gradients (i.e., feature 22 divided by feature 23) per CU
25       Total distortion per CU as computed by the HEVC encoder

Not averaged per frame:
26-29    Standard deviation of feature IDs 22-25 per frame
30       Summation of the variance of the x and y components of all MVs per frame
31-47    Histogram of the x-component of all MVs per frame (16 bins)
48-64    Histogram of the y-component of all MVs per frame (16 bins)
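As an illustration, the following is a minimal Python sketch (not the authors' extraction pipeline) of how a few of the listed features could be computed. The array names `pred_error`, `mv_x`, and `mv_y` are hypothetical, and the gradient-based reading of the row-/column-wise SAD (features 22-24) is an assumption; in the actual dataset these quantities are produced inside the HEVC encoder during encoding.

```python
import numpy as np


def sad_gradient_features(pred_error: np.ndarray):
    """Illustrative take on features 22-24 for one CU:
    row-wise SAD, column-wise SAD, and their ratio, computed here
    as sums of absolute differences along each axis (assumption)."""
    row_sad = float(np.abs(np.diff(pred_error, axis=0)).sum())  # feature 22
    col_sad = float(np.abs(np.diff(pred_error, axis=1)).sum())  # feature 23
    ratio = row_sad / col_sad if col_sad else 0.0               # feature 24
    return row_sad, col_sad, ratio


def mv_frame_features(mv_x: np.ndarray, mv_y: np.ndarray, bins: int = 16):
    """Illustrative take on features 30 and the MV histograms:
    summed variance of the MV components (feature 30) plus 16-bin
    histograms of the x- and y-components of all MVs in a frame."""
    var_sum = float(mv_x.var() + mv_y.var())       # feature 30
    hist_x, _ = np.histogram(mv_x, bins=bins)      # x-component histogram
    hist_y, _ = np.histogram(mv_y, bins=bins)      # y-component histogram
    return var_sum, hist_x, hist_y
```

Per-frame averaging (features 1-9 and 22-25) would then simply mean accumulating these per-CU values over a frame and dividing by the CU count, while the "not averaged" features keep the per-frame aggregates as-is.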
Institutions
American University of Sharjah
Categories
Video Processing, Feature Extraction, Convolutional Neural Network, Video Summarization
Related Links
Licence
Creative Commons Attribution 4.0 International