ABSTRACT
In this paper, we propose a method by which a robot selects, from among several objects, the object specified by human speech, based on generic object recognition. Although object selection methods based on specific object recognition have been proposed, generic object recognition is more useful for selection in a real environment. In the proposed method, an object is selected by integrating speech recognition results with generic object recognition results. We investigated how the method used to narrow down candidates from the speech and image recognition results affects object selection accuracy.
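As a rough illustration of the integration step described above, the following minimal sketch combines speech and image recognition confidences to pick one object. It is not the authors' implementation: the function name `select_object`, the input layout, the top-N narrowing rule, and the weighted log-linear scoring are all assumptions made for the example.

```python
import math


def select_object(speech_nbest, image_scores, top_n=3, weight=0.5):
    """Pick the object whose label best agrees with the spoken request.

    speech_nbest: dict mapping word hypothesis -> speech confidence in (0, 1]
    image_scores: dict mapping object id -> {label: recognition confidence}
    top_n:        number of candidates kept from each recognizer (assumed rule)
    weight:       relative weight of the speech confidence (assumed value)

    Returns (object_id, label, score), or None if no label is shared.
    """
    # Narrow the speech hypotheses to the top_n most confident words.
    speech_top = sorted(speech_nbest.items(), key=lambda kv: kv[1], reverse=True)[:top_n]

    best = None
    for obj_id, label_probs in image_scores.items():
        # Narrow each object's labels to its top_n most confident classes.
        ranked = sorted(label_probs.items(), key=lambda kv: kv[1], reverse=True)
        image_top = dict(ranked[:top_n])

        for label, p_speech in speech_top:
            p_image = image_top.get(label)
            if p_image is None or p_image <= 0 or p_speech <= 0:
                continue  # label not shared, or confidence unusable in a log
            # Hypothetical integration: weighted sum of log confidences.
            score = weight * math.log(p_speech) + (1 - weight) * math.log(p_image)
            if best is None or score > best[2]:
                best = (obj_id, label, score)
    return best
```

Varying `top_n` in a sketch like this is one way to explore how the narrowing of candidates interacts with selection accuracy, which is the relation the abstract says was investigated.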
Index Terms
- Selection of an Object Requested by Speech Based on Generic Object Recognition