Abstract
When developing advanced location-based systems augmented with audio ambiances, it would be cost-effective to use a few representative samples from typical environments to describe a larger number of similar locations. The aim of this experiment was to study the human ability to discriminate audio ambiances recorded in similar locations of the same urban environment. A listening experiment comprising material from three different environments and nine different locations was carried out with nineteen subjects to study the credibility of audio representations of certain environments, which would diminish the need for collecting huge audio databases. The first goal was to study to what degree humans are able to recognize whether a recording was made in the indicated location or in another, similar location when presented with the name of the place, its location on a map, and the associated audio ambiance. The second goal was to study whether the ability to discriminate audio ambiances from different locations is affected by a visual cue, presented as additional information in the form of a photograph of the suggested location. The results indicate that audio ambiances from similar urban areas of the same city differ enough that it is not acceptable to use a single recording as the ambiance representing different yet similar locations. Including an image was found to increase the perceived credibility of all the audio samples in representing a certain location. The results suggest that developers of audio-augmented location-based systems should aim to use audio samples recorded on-site at each location in order to create a credible impression.
Index Terms
- On the human ability to discriminate audio ambiances from similar locations of an urban environment