Selection of an Object Requested by Speech Based on Generic Object Recognition

Authors:
Hitoshi Nishimura, Kobe University, Kobe, Japan
Yuko Ozasa, Kobe University, Kobe, Japan
Yasuo Ariki, Kobe University, Kobe, Japan
Mikio Nakano, Honda Research Institute Japan Co., Ltd., Saitama, Japan

MMRWHRI '14: Proceedings of the 2014 Workshop on Multimodal, Multi-Party, Real-World Human-Robot Interaction, November 2014, Pages 23–24, https://doi.org/10.1145/2666499.2666505

Published: 16 November 2014

ABSTRACT

In this paper, we propose a method by which a robot selects, from among several objects, the one specified by human speech, based on generic object recognition. Although object selection methods based on specific object recognition have been proposed, generic object recognition is more useful for selection in a real environment. The proposed method selects an object by integrating speech recognition results with generic object recognition results. We investigated how the method used to narrow down candidates from the speech and image recognition results affects object selection accuracy.
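
The abstract does not state how the two sets of recognition results are combined, so the following is only a minimal sketch of one plausible scheme, not the authors' method. It assumes that each recognizer returns per-label confidence scores in (0, 1], that candidates are narrowed to the top N labels of each modality, and that the scores are fused with a weighted sum of log-confidences. The function name `select_object` and the parameters `top_n` and `weight` are hypothetical.

```python
import math


def top_candidates(scores, top_n):
    """Keep only the top_n highest-scoring labels (the narrowing-down step)."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:top_n])


def select_object(speech_scores, image_scores, top_n=3, weight=0.5):
    """Return the index of the object that best matches the spoken request.

    speech_scores: dict mapping label -> speech recognition confidence in (0, 1].
    image_scores:  list with one dict per object, mapping label -> image
                   recognition confidence in (0, 1].
    Returns None when no label survives the narrowing in both modalities.
    All names and the fusion rule are assumptions made for illustration.
    """
    speech_top = top_candidates(speech_scores, top_n)
    best_index, best_score = None, -math.inf
    for index, scores in enumerate(image_scores):
        image_top = top_candidates(scores, top_n)
        # Only labels kept by both recognizers can be scored jointly.
        for label in speech_top.keys() & image_top.keys():
            score = (weight * math.log(speech_top[label])
                     + (1.0 - weight) * math.log(image_top[label]))
            if score > best_score:
                best_index, best_score = index, score
    return best_index


# Made-up confidences: the request sounds most like "cup", less like "cap".
speech = {"cup": 0.6, "cap": 0.3, "cat": 0.1}
objects = [
    {"cap": 0.7, "hat": 0.2, "cup": 0.1},    # object 0 looks like a cap
    {"cup": 0.8, "bowl": 0.15, "cap": 0.05},  # object 1 looks like a cup
]
print(select_object(speech, objects, top_n=2))  # selects object 1
```

Varying `top_n` in such a sketch is one way to explore the question the abstract raises: a small value discards the correct label when either recognizer ranks it low, while a large value admits more confusable candidates.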

Index Terms

Computing methodologies > Artificial intelligence > Computer vision > Computer vision problems > Object recognition

Published in
MMRWHRI '14: Proceedings of the 2014 Workshop on Multimodal, Multi-Party, Real-World Human-Robot Interaction
November 2014
40 pages
ISBN: 9781450305518
DOI: 10.1145/2666499
General Chairs:
Mary Ellen Foster, Heriot-Watt University, Edinburgh, Scotland
Manuel Giuliani, University of Salzburg, Austria
Ronald Petrick, University of Edinburgh, Scotland
Copyright © 2014 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
generic object recognition
multimodality
speech recognition
Qualifiers
- abstract
Acceptance Rates
MMRWHRI '14 Paper Acceptance Rate: 3 of 5 submissions, 60%
Overall Acceptance Rate: 3 of 5 submissions, 60%