Study of harmonics-to-noise ratio and critical-band energy spectrum of speech as acoustic indicators of laryngeal and voice pathology

Published: 01 January 2007
Abstract

Acoustic analysis of speech signals is a noninvasive technique that has proven effective as objective support for screening vocal and voice diseases. The present study considers acoustic analysis of sustained vowels. A simple k-means nearest neighbor classifier is designed to test the efficacy of a harmonics-to-noise ratio (HNR) measure and the critical-band energy spectrum of the voiced speech signal as tools for detecting laryngeal pathologies. The classifier assigns a given voice sample to one of two classes, pathologic or normal. The voiced speech signal is decomposed into harmonic and noise components using an iterative signal extrapolation algorithm. The HNRs in four different frequency bands are estimated and used as features. Voiced speech is also filtered with 21 critical-band bandpass filters that mimic human auditory neurons. The normalized energies of these filter outputs are used as a second set of features. The results, obtained on previously classified voice samples, show that the HNR and the critical-band energy spectrum can be used to correlate laryngeal pathology with voice alteration. This method could serve as an additional acoustic indicator that supplements the clinical diagnostic features for voice evaluation.
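The feature pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the harmonic and noise components are taken as already available (the iterative signal extrapolation algorithm is not reproduced); band energies are computed by summing FFT power bins rather than by true bandpass filters; `BARK_EDGES` uses the standard Zwicker critical-band edges, which may differ from the paper's 21 filters; the four HNR band edges are left to the caller because the abstract does not give them; and the "k-means nearest neighbor" classifier is interpreted here as per-class k-means prototypes followed by a nearest-prototype decision. All function names are hypothetical.

```python
import numpy as np

# Assumed critical-band edges in Hz (standard Zwicker values): 22 edges give 21 bands.
BARK_EDGES = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480,
              1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700]

def band_energies(x, fs, edges):
    """Energy of x in each band [edges[i], edges[i+1]) Hz, from FFT power bins."""
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return np.array([power[(freqs >= lo) & (freqs < hi)].sum()
                     for lo, hi in zip(edges[:-1], edges[1:])])

def band_hnr_db(harmonic, noise, fs, edges):
    """Band-wise HNR in dB from separately supplied harmonic and noise components."""
    eps = 1e-12  # guards against division by zero and log of zero
    e_h = band_energies(harmonic, fs, edges)
    e_n = band_energies(noise, fs, edges)
    return 10.0 * np.log10((e_h + eps) / (e_n + eps))

def normalized_band_energies(x, fs, edges=BARK_EDGES):
    """Band energies scaled to sum to one across the given bands."""
    e = band_energies(x, fs, edges)
    return e / (e.sum() + 1e-12)

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; returns the k cluster centers."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(points), size=k, replace=False)
    centers = points[idx].astype(float)
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old center if a cluster empties
                centers[j] = points[labels == j].mean(axis=0)
    return centers

def fit_prototypes(features, labels, k=2):
    """Run k-means separately on each class to obtain k prototypes per class."""
    return {c: kmeans(features[labels == c], k) for c in np.unique(labels)}

def classify(x, prototypes):
    """Return the class whose nearest prototype is closest to feature vector x."""
    return min(prototypes,
               key=lambda c: np.linalg.norm(prototypes[c] - x, axis=1).min())
```

In use, a feature vector for one sustained vowel might be the concatenation of `band_hnr_db(...)` over four caller-chosen bands and `normalized_band_energies(...)` over the 21 critical bands, with `fit_prototypes` trained on labeled normal and pathologic samples.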

Published in: EURASIP Journal on Advances in Signal Processing, Volume 2007, Issue 1 (1 January 2007), 2434 pages

Publisher: Hindawi Limited, London, United Kingdom

Qualifiers: article
