Introduction to Audio Analysis: A MATLAB Approach

Publisher: Academic Press, Inc.
6277 Sea Harbor Drive Orlando, FL
United States

ISBN: 978-0-08-099388-1

Published: 21 April 2014

Pages: 288

Abstract

Introduction to Audio Analysis serves as a standalone introduction to audio analysis, providing theoretical background to many state-of-the-art techniques. It covers the essential theory necessary to develop audio engineering applications, but also uses programming techniques, notably MATLAB, to take a more applied approach to the topic. Basic theory and reproducible experiments are combined to demonstrate theoretical concepts from a practical point of view and provide a solid foundation in the field of audio analysis. Audio feature extraction, audio classification, audio segmentation, and music information retrieval are all addressed in detail, along with material on basic audio processing and frequency domain representations and filtering. Throughout the text, reproducible MATLAB examples are accompanied by theoretical descriptions, illustrating how concepts and equations can be applied to the development of audio analysis systems and components. A blend of reproducible MATLAB code and essential theory enables the reader to delve into the world of audio signals and develop real-world audio applications in various domains.

- Practical approach to signal processing: The first book to focus on audio analysis from a signal processing perspective, demonstrating practical implementation alongside theoretical concepts
- Bridge the gap between theory and practice: The authors demonstrate how to apply equations to real-life code examples and resources, giving you the technical skills to develop real-world applications
- Library of MATLAB code: The book is accompanied by a well-documented library of MATLAB functions and reproducible experiments

Contributors

Theodoros Giannakopoulos
National Centre for Scientific Research "DEMOKRITOS"
- Publication years: 2006 - 2022
- Publication count: 36
- Citation count: 158
- Available for download: 18
- Downloads (cumulative): 2,463
- Downloads (12 months): 178
- Downloads (6 weeks): 17
- Average downloads per article: 137
- Average citations per article: 4
Aggelos Pikrakis
University of Piraeus
- Publication years: 1998 - 2023
- Publication count: 16
- Citation count: 117
- Available for download: 2
- Downloads (cumulative): 315
- Downloads (12 months): 48
- Downloads (6 weeks): 1
- Average downloads per article: 158
- Average citations per article: 7

Reviews

Reviewer: Ghita Kouadri

Audio analysis is the science of extracting information from audio signals for analysis, classification, and synthesis. Its applications range from surveillance and forensics to audio emotion detection. There are several audio analysis software packages available on the market, some of which are even free. They are quite useful for occasional users. However, MATLAB remains the de facto tool as it contains built-in functions to manipulate signals in general, including audio data. This book synthesizes the main techniques for audio capture, visualization, reading, analysis, and storage using MATLAB. It is divided into eight chapters. The introduction presents the MATLAB audio library and guidelines on how best to use the book. The rest of the chapters in Part 1 (chapters 2 through 4) provide a tutorial on audio signals, transforms, and filtering essentials. Part 2 (chapters 5 through 7) delves into more advanced topics such as audio classification and segmentation. Part 3 (chapter 8) focuses on the important topic of music information retrieval, given the relevance of its applications. These applications include automatic music transcription, track separation, and instrument recognition. The book was written to be self-contained. Every chapter contains a set of exercises to help the reader test the concepts learned. The provided MATLAB audio library constitutes the core of the book. Indeed, it simplifies many tasks when analyzing audio data and therefore can be used in further projects. When reading the theoretical part, I must admit that I had to make an effort to grasp some concepts. I believe that the authors aim at providing a relatively compact book with the necessary information on audio analysis. Therefore, some information has been condensed to fit the format.
However, combined with conventional face-to-face lectures, the current book represents the best choice as a textbook. I also must admit that the book's typography and paper quality add value to its content. It is clear that a great deal of effort has been put forth to produce a high-quality textbook to be used on a daily basis by students and professionals. Online Computing Reviews Service

Reviewer: George Michael White

Giannakopoulos and Pikrakis discuss the scope of this book at its beginning: “Before we proceed, it is important to note that, although in this book the term 'audio' does not exclude the speech signal, we are not focusing on traditional speech-related problems that have been studied by the research community for decades, e.g. speech recognition and coding. It is our intention to provide analysis methods that can be used to study various audio modalities and their relationship in mixed audio streams. ... In other words, we are not interested in providing solutions that are well tailored to specific audio types (e.g. the speech signal) but are not applicable to other modalities.” The book is divided into three parts. The first part is devoted to a selection of mathematical tools that are used to extract various features of audio streams. Chapter 2 introduces some elementary techniques and properties that will prove helpful in what follows: sampling, playback, mono and stereo, block reading and writing, and short-term processing. Chapter 3 brings in the heavy guns: the discrete Fourier transform (using the complex exponential formulation), the discrete cosine transform, the discrete-time wavelet transform, and digital filtering. Included are several MATLAB programs that implement these things. The following chapter explains how some of the elementary properties of audio files are extracted. Such a file may consist of a single stationary waveform. In real life, however, an audio file probably consists of one or more stationary or nonstationary waveforms mixed with “noise.” Various techniques can eliminate or reduce this noise. Time-domain and frequency-domain audio features centered around the distribution spectrum are defined here and more MATLAB programs are presented. When these tools and techniques are mastered, we can start using them to extract useful information from the audio streams through tasks such as audio classification, segmentation, alignment, and temporal modeling.
The second part of the book devotes its chapters to these topics. Chapter 5 begins the study of classification techniques. The features that are extracted from the files form a pyramid. The lower layers of this pyramid use short-term techniques that generate feature vectors that are passed up to higher layers that compute various statistics that, in turn, are passed up to form feature vectors. The end goal is to estimate a class label that is represented by the computed feature vector. Thus, a class label of a certain audio stream might indicate that it is part of a speech made by a certain individual, or perhaps the chirp of a black-capped chickadee or a segment of electronic music. There are approaches that can use the a priori probabilities to estimate the exact class the sound belongs to. In other cases, nothing at all is known about the sound's origins. How, then, should such a sound be classified? This is explored in Part 2. The Bayesian classifier, k-nearest-neighbor classifier, and others are introduced at this time, along with the problems of training, testing, and evaluation of the results. Chapter 5 concludes with several case studies. Chapter 6 tackles the necessity of segmentation. Usually, real-life audio streams consist of sequences of different audio types, such as speech followed by music followed by more speech and so on. The goal here is to split the audio signal into homogeneous segments that can be analyzed separately. Various types of windowing may be used and classification may or may not be desirable. In chapter 7, “Audio Alignment and Temporal Modeling,” the reader will discover dynamic time warping, hidden Markov modeling, the Viterbi algorithm, the Baum-Welch algorithm, and various training methods. Each chapter ends with a set of exercises. Some of them will require a mathematical analysis. Others will be answered by a MATLAB program. This illustrates the strengths and weaknesses of the book.
MATLAB is a very powerful programming system that is well suited for solving problems arising in this field. However, it is not as universally available as other systems such as Microsoft Visual Studio. If MATLAB is available to the reader, then go to it. MATLAB provides a suite of primitives that are eminently suitable for use in programs to solve problems in audio analysis. The MATLAB system is well worth the price for someone with a strong interest in the field. The reader should also note that a certain level of applied mathematics is required to do any serious work here. Thus, a working knowledge of complex variables and probability theory is required to really grasp the underlying concepts. At less than 300 pages, the volume is relatively slender and is written in a sparse but graceful style, skillfully edited, and well bound. It is mostly suitable for the reader seriously interested in audio analysis who likes a mathematical programming approach to the subject. Online Computing Reviews Service
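The feature "pyramid" described in this review (short-term features, then mid-term statistics, then a class label) can be illustrated with a minimal sketch. It is written in Python rather than MATLAB, it is not taken from the book's library, and every function name, window length, and signal in it is an assumption chosen only for illustration.

```python
import math
from collections import Counter

def short_term_energy(signal, win, step):
    """Mean squared amplitude of each short-term frame."""
    return [
        sum(x * x for x in signal[i:i + win]) / win
        for i in range(0, len(signal) - win + 1, step)
    ]

def mid_term_stats(values, win):
    """Summarise each run of `win` short-term values as (mean, std)."""
    stats = []
    for i in range(0, len(values) - win + 1, win):
        chunk = values[i:i + win]
        mean = sum(chunk) / win
        std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / win)
        stats.append((mean, std))
    return stats

def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to `query`."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def tone(n, fs=8000, freq=440.0):
    """Synthetic sine wave of n samples (illustrative test signal)."""
    return [math.sin(2 * math.pi * freq * t / fs) for t in range(n)]

def gated_tone(n, fs=8000, freq=440.0, gate=400):
    """The same tone switched on and off every `gate` samples."""
    return [x if (t // gate) % 2 == 0 else 0.0
            for t, x in enumerate(tone(n, fs, freq))]

def features(signal):
    """Short-term energy per 400-sample frame, then (mean, std) per 4 frames."""
    return mid_term_stats(short_term_energy(signal, win=400, step=400), win=4)

# Toy training set: two synthetic classes, one mid-term vector per 1600 samples.
train = [(v, "steady") for v in features(tone(8000))]
train += [(v, "gated") for v in features(gated_tone(8000))]

# Classify an unseen gated segment from its first mid-term feature vector.
label = knn_predict(train, features(gated_tone(3200))[0], k=3)
```

Here the steady tone yields mid-term vectors with almost no energy variation, while the gated tone yields a large standard deviation, so the k-nearest-neighbor vote separates the two. Real systems would use many more features and real recordings.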

Reviewer: Vladimir Botchev

This is a very well-written and well-presented book. It differs from some other books on signal processing, which use MATLAB as the main vehicle for conveying practical solutions, as it doesn't clutter all its pages with MATLAB code listings. MATLAB code, though an essential part of the book, and available for download, is only briefly explained, as in every good manual. The book deals less with signal processing techniques, which are covered only in the first couple of chapters after the introduction, than with pattern recognition and machine learning. Indeed, the emphasis is placed on classification and recognition algorithms. In far fewer pages than in other works dedicated to these topics, the authors clearly present the practical details of major classification and pattern search algorithms, such as k-means, dynamic programming, and hidden Markov models. The only unfortunate omission among these select algorithms is an introduction to the most popular type of neural network (NN), the backpropagation NN. Hopefully, they will include it in a future edition, since by application base, it is almost as widespread as the selected ones in the book. The book also presents some issues in music information retrieval, which, while interesting, are of lesser value. The MATLAB toolbox for that purpose, music information retrieval (MIR), has been available for many years now, with very extensive documentation. This new book on audio content analysis and the associated toolbox is highly recommended to audio signal processing practitioners. It can even serve as a first introduction to the more general area of pattern classification. Online Computing Reviews Service
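Dynamic time warping, which the reviews list among the book's alignment techniques, is a standard example of the dynamic programming approach mentioned above. The following is a minimal Python sketch of the textbook recurrence, not the book's MATLAB implementation; the function name and the toy sequences are assumptions made for illustration.

```python
def dtw_distance(a, b):
    """Cost of the cheapest monotonic alignment between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: step in a, step in b, or step in both.
            cost[i][j] = local + min(cost[i - 1][j],
                                     cost[i][j - 1],
                                     cost[i - 1][j - 1])
    return cost[n][m]

# Toy sequences: `slow` is `template` played at half speed.
template = [0, 1, 2, 3, 2, 1, 0]
slow = [0, 0, 1, 1, 2, 2, 3, 3, 2, 2, 1, 1, 0, 0]
other = [3, 3, 3, 0, 0, 0, 3]

same_shape = dtw_distance(template, slow)    # alignment absorbs the tempo change
different = dtw_distance(template, other)    # no warping makes these match
```

Because the warping path may repeat elements, the half-speed copy aligns with the template at zero cost, whereas the unrelated sequence keeps a positive cost. In audio work the scalar samples would typically be replaced by short-term feature vectors and a vector distance.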

Recommendations

An Introduction to Nonlinear Analysis: Theory

Introduction to Microprocessors

Audio Coding: Theory and Applications
