ABSTRACT
We consider the task of driving a remote-control car at high speeds through unstructured outdoor environments. We present an approach in which supervised learning is first used to estimate depths from single monocular images. The learning algorithm can be trained either on real camera images labeled with ground-truth distances to the closest obstacles, or on a training set consisting of synthetic graphics images. The resulting algorithm is able to learn monocular vision cues that accurately estimate the relative depths of obstacles in a scene. Reinforcement learning/policy search is then applied within a simulator that renders synthetic scenes. This learns a control policy that selects a steering direction as a function of the vision system's output. We present results evaluating the predictive ability of the algorithm both on held-out test data and in actual autonomous driving experiments.
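To make the last stage of the pipeline concrete, here is a minimal sketch of a policy that maps a vision system's output to a steering direction. It is an illustration only, not the policy learned in the paper. The function name `select_steering`, the per-column depth input, the `fov_deg` and `window` parameters, and the "steer toward the deepest smoothed column" rule are all assumptions introduced for this example.

```python
from typing import Sequence


def select_steering(depths: Sequence[float],
                    fov_deg: float = 90.0,
                    window: int = 3) -> float:
    """Pick a steering angle in degrees (negative = left, positive = right).

    `depths` is a hypothetical vision output: one estimated distance to the
    nearest obstacle per image column, ordered left to right.
    """
    n = len(depths)
    if n == 0:
        raise ValueError("depths must be non-empty")
    if n == 1:
        return 0.0

    # Smooth with a moving average so one noisy estimate does not dominate.
    half = window // 2
    smoothed = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        smoothed.append(sum(depths[lo:hi]) / (hi - lo))

    # Steer toward the column with the largest smoothed depth estimate.
    best = max(range(n), key=lambda i: smoothed[i])

    # Map column index 0..n-1 linearly onto [-fov_deg/2, +fov_deg/2].
    return (best / (n - 1) - 0.5) * fov_deg
```

In a learned policy, quantities such as the smoothing width would be tuned by policy search in the simulator rather than fixed by hand as they are here.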
- High speed obstacle avoidance using monocular vision and reinforcement learning