Stability of controllers for Gaussian process dynamics

Published: 01 January 2017

Abstract

Learning control has become an appealing alternative to deriving control laws from classical control theory. However, a major shortcoming of learning control is the lack of performance guarantees, which prevents its application in many real-world scenarios. As a step towards widespread deployment of learning control, we provide stability analysis tools for controllers acting on dynamics represented by Gaussian processes (GPs). We consider differentiable Markovian control policies and system dynamics given as (i) the mean of a GP and (ii) the full GP distribution. For both cases, we analyze finite and infinite time horizons. Furthermore, we study the effect of disturbances on the stability results. Empirical evaluations on simulated benchmark problems support our theoretical results.

