ABSTRACT
With the proliferation of online multimedia content and the popularity of multimedia streaming systems, it is increasingly useful to be able to skim and browse multimedia quickly. A key technique that enables quick browsing of multimedia is time-compression. Prior research has described how speech can be time-compressed (shortened in duration) while preserving the pitch of the audio. However, client-server systems providing this functionality have not been available.
In this paper, we first describe the key tradeoffs faced by designers of streaming multimedia systems deploying time-compression. The implementation tradeoffs primarily impact the granularity of time-compression supported (discrete vs. continuous) and the latency (wait-time) experienced by users after adjusting the degree of time-compression. We report results of user studies showing the impact of these factors on the average compression rate achieved. We also present data on the usage patterns and benefits of time-compression. Overall, we show significant time-savings for users and that considerable flexibility is available to the designers of client-server streaming systems with time-compression.
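As a rough illustration of the granularity tradeoff, the sketch below compares the time saved when a user's preferred speed-up is honored exactly (continuous granularity) with the time saved when it is snapped to the nearest of a few fixed rates (discrete granularity). The function names, the list of supported rates, and the one-hour example are all assumptions made for illustration; none of them comes from the paper.

```python
from typing import Sequence


def compressed_duration(original_seconds: float, speedup: float) -> float:
    """Playback time when content is played at `speedup` times normal speed."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return original_seconds / speedup


def snap_to_discrete(requested: float, supported: Sequence[float]) -> float:
    """Nearest supported rate, as a discrete-granularity system might offer."""
    return min(supported, key=lambda rate: abs(rate - requested))


# Hypothetical discrete rates; chosen only for this example.
SUPPORTED_RATES = [1.0, 1.25, 1.5, 2.0]

requested = 1.4                      # speed-up the user would like
discrete = snap_to_discrete(requested, SUPPORTED_RATES)

lecture = 60 * 60.0                  # a one-hour recording, in seconds
saved_continuous = lecture - compressed_duration(lecture, requested)
saved_discrete = lecture - compressed_duration(lecture, discrete)

print(f"continuous {requested}x saves {saved_continuous / 60:.1f} min")
print(f"discrete   {discrete}x saves {saved_discrete / 60:.1f} min")
```

Under these assumed rates the request is rounded to a faster speed than the user asked for; with a different rate list it could just as easily round down, which is one way granularity can influence the compression rate users end up with.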
Index Terms
- Time-compression: systems concerns, usage, and benefits