Introduction to Audio Analysis serves as a standalone introduction to audio analysis, providing theoretical background to many state-of-the-art techniques. It covers the essential theory necessary to develop audio engineering applications, but also uses programming techniques, notably MATLAB, to take a more applied approach to the topic. Basic theory and reproducible experiments are combined to demonstrate theoretical concepts from a practical point of view and provide a solid foundation in the field of audio analysis. Audio feature extraction, audio classification, audio segmentation, and music information retrieval are all addressed in detail, along with material on basic audio processing, frequency domain representations, and filtering. Throughout the text, reproducible MATLAB examples are accompanied by theoretical descriptions, illustrating how concepts and equations can be applied to the development of audio analysis systems and components. This blend of reproducible MATLAB code and essential theory enables the reader to delve into the world of audio signals and develop real-world audio applications in various domains.
- Practical approach to signal processing: The first book to focus on audio analysis from a signal processing perspective, demonstrating practical implementation alongside theoretical concepts.
- Bridge the gap between theory and practice: The authors demonstrate how to apply equations to real-life code examples and resources, giving you the technical skills to develop real-world applications.
- Library of MATLAB code: The book is accompanied by a well-documented library of MATLAB functions and reproducible experiments.
Index Terms
- Introduction to Audio Analysis: A MATLAB Approach