Abstract
This work evaluates different methods for estimating the amount of jitter present in speech signals. The jitter value is a measure of the irregularity of a quasiperiodic signal and is a good indicator of laryngeal pathologies such as vocal fold nodules or polyps. Given the irregular nature of the speech signal, each jitter estimation algorithm relies on its own model, making a direct comparison of the results very difficult. For this reason, the evaluation of the different jitter estimation methods was targeted at their ability to detect pathological voices. Two databases were used for this evaluation: a subset of the MEEI database and a smaller database acquired in the scope of this work. The results showed significant differences in the performance of the algorithms being evaluated. Surprisingly, in the larger database the best results were not achieved with the commonly used relative jitter, measured as a percentage of the glottal cycle, but with absolute jitter values measured in microseconds. Also, the newly proposed jitter measure, LocJitt, generally performed as well as or better than the commonly used tools MDVP and Praat.
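To make the distinction between absolute and relative jitter concrete, the following is a minimal sketch of the two textbook "local" definitions: the mean absolute difference between consecutive glottal periods (absolute, here in microseconds), and that same value divided by the mean period (relative, in percent). It assumes the glottal periods have already been extracted and are given in seconds. The function names are hypothetical, and this is not the LocJitt measure or the exact MDVP or Praat implementation, whose details are not given in the abstract.

```python
from typing import Sequence


def absolute_jitter_us(periods_s: Sequence[float]) -> float:
    """Mean absolute difference between consecutive periods, in microseconds.

    periods_s: glottal cycle durations in seconds (assumed already extracted).
    """
    if len(periods_s) < 2:
        raise ValueError("need at least two glottal periods")
    diffs = [abs(b - a) for a, b in zip(periods_s, periods_s[1:])]
    return 1e6 * sum(diffs) / len(diffs)


def relative_jitter_percent(periods_s: Sequence[float]) -> float:
    """Absolute jitter divided by the mean period, as a percentage."""
    mean_period_s = sum(periods_s) / len(periods_s)
    absolute_jitter_s = absolute_jitter_us(periods_s) / 1e6
    return 100.0 * absolute_jitter_s / mean_period_s


# Illustrative (made-up) periods around 10 ms, i.e. roughly a 100 Hz voice.
example_periods = [0.0100, 0.0101, 0.0099, 0.0100]
abs_jitter = absolute_jitter_us(example_periods)
rel_jitter = relative_jitter_percent(example_periods)
```

Because the relative measure is normalized by the mean period, two voices with the same absolute jitter but different fundamental frequencies yield different relative values, which is one reason the two measures can rank recordings differently.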
Index Terms
- Jitter estimation algorithms for detection of pathological voices