A Comparative Study Of LRC, NNLS, Knn, And Sparse Representation Based Classification Methods Based On Audio Features
Volume 2 - Issue 9, September 2018 Edition
Author(s)
Nu War
Keywords
MFCC, DWT, SRC, NNLS, LRC, MSRC, and kNN.
Abstract
Video comprises two signal modes: acoustic and visual. Visual information is potentially more difficult to extract. In this work, only the acoustic mode is considered and assessed for the task of genre classification. Real-world video spans many genres, such as music, sport, information, education, and news; this paper considers only three of them: cartoon, sport, and music. The experiments provide a reference for the performance of such systems on the author's own video data set. Five classification methods (SRC, NNLS, LRC, MSRC, and kNN) are compared in order to improve accuracy and to examine their relevant aspects and processes. Feature extraction uses discrete wavelet subband features, for which the mean and variance of every subband are computed, together with MFCC features. Finally, the solution is evaluated, and the overall accuracy and classification results are reported in the experimental results.
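The wavelet subband statistics and the kNN classifier mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the Haar wavelet, the three decomposition levels, the Euclidean distance, and the function names `haar_dwt_step`, `subband_features`, and `knn_predict` are all assumptions chosen for illustration, and MFCC extraction is left out.

```python
import math
from collections import Counter
from statistics import fmean, pvariance


def haar_dwt_step(signal):
    """One level of the Haar DWT: returns (approximation, detail) coefficients."""
    signal = list(signal)
    if len(signal) % 2:
        # Assumption: pad odd-length input by repeating the last sample.
        signal.append(signal[-1])
    scale = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / scale for i in range(0, len(signal), 2)]
    return approx, detail


def subband_features(signal, levels=3):
    """Mean and variance of each detail subband, then of the final approximation."""
    features = []
    current = list(signal)
    for _ in range(levels):
        current, detail = haar_dwt_step(current)
        features += [fmean(detail), pvariance(detail)]
    features += [fmean(current), pvariance(current)]
    return features


def knn_predict(train_features, train_labels, query, k=3):
    """Majority vote among the k training vectors closest to the query (Euclidean)."""
    order = sorted(range(len(train_features)),
                   key=lambda i: math.dist(train_features[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical toy data: two synthetic "clips" per class, not real audio.
    low = [[math.sin(0.05 * n + p) for n in range(256)] for p in (0.0, 0.3)]
    high = [[math.sin(2.5 * n + p) for n in range(256)] for p in (0.0, 0.3)]
    train = [subband_features(s) for s in low + high]
    labels = ["low", "low", "high", "high"]
    query = subband_features([math.sin(0.05 * n + 0.1) for n in range(256)])
    print(knn_predict(train, labels, query, k=3))
```

With `levels=3` each clip yields an 8-dimensional vector (a mean and a variance for three detail subbands and one approximation subband). In practice such a vector would presumably be concatenated with MFCC features before classification, but how the paper combines them is not stated in the abstract.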