Approximate Dynamic Programming: Solving the Curses of Dimensionality (Wiley Series in Probability and Statistics)
Publisher:
  • Wiley-Interscience
  • 605 Third Avenue, New York, NY
  • United States
ISBN: 978-0-470-17155-4
Published: 01 September 2007
Abstract

No abstract available.

Cited By

  1. Du L, Li Q and Yu P (2024). A Sequential Model for High-Volume Recruitment Under Random Yields, Operations Research, 72:1, (60-90), Online publication date: 1-Jan-2024.
  2. Cheng L, Luo J, Fan W, Zhang Y and Li Y A Deep Q-Network Based on Radial Basis Functions for Multi-Echelon Inventory Management Proceedings of the Winter Simulation Conference, (1581-1592)
  3. Qasem O, Gao W and Vamvoudakis K (2023). Adaptive optimal control of continuous-time nonlinear affine systems via hybrid iteration, Automatica (Journal of IFAC), 157:C, Online publication date: 1-Nov-2023.
  4. Wu W, Eamen L, Dandy G, Razavi S, Kuczera G and Maier H (2023). Beyond engineering, Environmental Modelling & Software, 167:C, Online publication date: 1-Sep-2023.
  5. Frohner N, Raidl G and Chicano F Multi-Objective Policy Evolution for a Same-Day Delivery Problem with Soft Deadlines Proceedings of the Companion Conference on Genetic and Evolutionary Computation, (1941-1949)
  6. Liu M, Cai Q, Li D, Meng W and Fu M (2023). Output feedback Q-learning for discrete-time finite-horizon zero-sum games with application to the H ∞ control, Neurocomputing, 529:C, (48-55), Online publication date: 7-Apr-2023.
  7. Zhang X, Varakantham P and Jiang H Future aware pricing and matching for sustainable on-demand ride pooling Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, (14628-14636)
  8. Shin D, Vaccari S and Zeevi A (2023). Dynamic Pricing with Online Reviews, Management Science, 69:2, (824-845), Online publication date: 1-Feb-2023.
  9. Kiumarsi B and Başar T Secure Linear Quadratic Regulator Using Sparse Model-Free Reinforcement Learning 2019 IEEE 58th Conference on Decision and Control (CDC), (3641-3647)
  10. Brunetti M, Campuzano G and Mes M Simulation of the Internal Electric Fleet Dispatching Problem at a Seaport Proceedings of the Winter Simulation Conference, (2675-2686)
  11. Wang W, Xie X and Feng C (2022). Model-free finite-horizon optimal tracking control of discrete-time linear systems, Applied Mathematics and Computation, 433:C, Online publication date: 15-Nov-2022.
  12. Talbi E (2021). Machine Learning into Metaheuristics, ACM Computing Surveys, 54:6, (1-32), Online publication date: 31-Jul-2022.
  13. Qiu H, Huang P, Asavisanu N, Liu X, Psounis K and Govindan R AutoCast Proceedings of the 20th Annual International Conference on Mobile Systems, Applications and Services, (128-141)
  14. Hao J and Varakantham P Hierarchical Value Decomposition for Effective On-demand Ride-Pooling Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (580-587)
  15. Ciocan D and Mišić V (2022). Interpretable Optimal Stopping, Management Science, 68:3, (1616-1638), Online publication date: 1-Mar-2022.
  16. Martinelli A, Gargiani M and Lygeros J (2021). Data-driven optimal control with a relaxed linear program, Automatica (Journal of IFAC), 136:C, Online publication date: 1-Feb-2022.
  17. Wang R, Parunandi K, Sharma A, Goyal R and Chakravorty S On the Search for Feedback in Reinforcement Learning 2021 60th IEEE Conference on Decision and Control (CDC), (1560-1567)
  18. Sihite E, Dangol P and Ramezani A Optimization-free Ground Contact Force Constraint Satisfaction in Quadrupedal Locomotion 2021 60th IEEE Conference on Decision and Control (CDC), (713-719)
  19. Wang J, Wilson E and Velasquez A Consensus-Based Value Iteration for Multiagent Cooperative Control 2021 60th IEEE Conference on Decision and Control (CDC), (6659-6664)
  20. Mustafa A, Mazouchi M, Nageshrao S and Modares H (2021). Assured learning‐enabled autonomy, International Journal of Adaptive Control and Signal Processing, 35:12, (2348-2371), Online publication date: 3-Dec-2021.
  21. Qin Z, Zhu H and Ye J Reinforcement Learning for Ridesharing: A Survey 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), (2447-2454)
  22. Long X, He Z, Wang Z and Na J (2021). Online Optimal Control of Robotic Systems with Single Critic NN-Based Reinforcement Learning, Complexity, 2021, Online publication date: 1-Jan-2021.
  23. Zhang G, Li H and Peng Y Sequential sampling for a ranking and selection problem with exponential sampling distributions Proceedings of the Winter Simulation Conference, (2984-2995)
  24. Yang X, Hu J, Hu J and Peng Y Asynchronous value iteration for markov decision processes with continuous state spaces Proceedings of the Winter Simulation Conference, (2856-2866)
  25. Mu S, Zhong Z and Zhao D Online Policy Learning for Opportunistic Mobile Computation Offloading GLOBECOM 2020 - 2020 IEEE Global Communications Conference, (1-6)
  26. Li J, Zhou Y, Chen H and Shi Y Age of Aggregated Information: Timely Status Update with Over-the-Air Computation GLOBECOM 2020 - 2020 IEEE Global Communications Conference, (1-6)
  27. Shifrin M, Menasché D, Cohen A, Goeckel D and Gurewitz O (2020). Optimal PHY Configuration in Wireless Networks, IEEE/ACM Transactions on Networking, 28:6, (2601-2614), Online publication date: 1-Dec-2020.
  28. Liu Y, Chong E, Pezeshki A and Zhang Z (2020). Submodular optimization problems and greedy strategies: A survey, Discrete Event Dynamic Systems, 30:3, (381-412), Online publication date: 1-Sep-2020.
  29. Yang Z, Chandramouli B, Wang C, Gehrke J, Li Y, Minhas U, Larson P, Kossmann D and Acharya R Qd-tree: Learning Data Layouts for Big Data Analytics Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, (193-208)
  30. Philpott A, Wahid F and Bonnans J (2019). MIDAS: A mixed integer dynamic approximation scheme, Mathematical Programming: Series A and B, 181:1, (19-50), Online publication date: 1-May-2020.
  31. Liang M, Wang D and Liu D (2020). Improved value iteration for neural-network-based stochastic optimal control design, Neural Networks, 124:C, (280-295), Online publication date: 1-Apr-2020.
  32. Bakir I, Boland N, Dandurand B and Erera A (2019). Sampling Scenario Set Partition Dual Bounds for Multistage Stochastic Programs, INFORMS Journal on Computing, 32:1, (145-163), Online publication date: 1-Jan-2020.
  33. Shah A, Ganesan R, Jajodia S, Samarati P and Cam H (2019). Adaptive Alert Management for Balancing Optimal Performance among Distributed CSOCs using Reinforcement Learning, IEEE Transactions on Parallel and Distributed Systems, 31:1, (16-33), Online publication date: 1-Jan-2020.
  34. Wu J and Frazier P Practical two-step look-ahead Bayesian optimization Proceedings of the 33rd International Conference on Neural Information Processing Systems, (9813-9823)
  35. Rizvi S and Lin Z (2019). Experience replay–based output feedback Q‐learning scheme for optimal output tracking control of discrete‐time linear systems, International Journal of Adaptive Control and Signal Processing, 33:12, (1825-1842), Online publication date: 2-Dec-2019.
  36. Ning J and Sobel M (2019). Easy Affine Markov Decision Processes, Operations Research, 67:6, (1719-1737), Online publication date: 1-Nov-2019.
  37. Shi C, Wei Y and Zhong Y (2019). Process Flexibility for Multiperiod Production Systems, Operations Research, 67:5, (1300-1320), Online publication date: 1-Sep-2019.
  38. Hagebring F and Lennartson B (2019). Time-optimal control of large-scale systems of systems using compositional optimization, Discrete Event Dynamic Systems, 29:3, (411-443), Online publication date: 1-Sep-2019.
  39. Raza S and Lin M Constructive Policy: Reinforcement Learning Approach for Connected Multi-Agent Systems 2019 IEEE 15th International Conference on Automation Science and Engineering (CASE), (257-262)
  40. Qin Z, Tang J and Ye J Deep Reinforcement Learning with Applications in Transportation Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (3201-3202)
  41. Bedewy A, Sun Y, Kompella S and Shroff N Age-optimal Sampling and Transmission Scheduling in Multi-Source Systems Proceedings of the Twentieth ACM International Symposium on Mobile Ad Hoc Networking and Computing, (121-130)
  42. Boyalı A, Hashimoto N, John V and Acarman T Multi-Agent Reinforcement Learning for Autonomous On Demand Vehicles 2019 IEEE Intelligent Vehicles Symposium (IV), (1461-1468)
  43. Liu Z and Wu H (2019). New insight into the simultaneous policy update algorithms related to H ∞ state feedback control, Information Sciences: an International Journal, 484:C, (84-94), Online publication date: 1-May-2019.
  44. Chen X, Wang W, Cao W and Wu M (2022). Gaussian-kernel-based adaptive critic design using two-phase value iteration, Information Sciences: an International Journal, 482:C, (139-155), Online publication date: 1-May-2019.
  45. Liang Q and Modiano E Optimal Network Control in Partially-Controllable Networks IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, (397-405)
  46. Roy A, Borkar V, Karandikar A and Chaporkar P A Structure-aware Online Learning Algorithm for Markov Decision Processes Proceedings of the 12th EAI International Conference on Performance Evaluation Methodologies and Tools, (71-78)
  47. Sauré D and Vielma J (2019). Ellipsoidal Methods for Adaptive Choice-Based Conjoint Analysis, Operations Research, 67:2, (315-338), Online publication date: 1-Mar-2019.
  48. Mazouchi M, Naghibi‐Sistani M, Hosseini Sani S, Tatari F and Modares H (2018). Observer‐based adaptive optimal output containment control problem of linear heterogeneous Multiagent systems with relative output measurements, International Journal of Adaptive Control and Signal Processing, 33:2, (262-284), Online publication date: 3-Feb-2019.
  49. Moghadam R and Lewis F (2017). Output‐feedback H∞ quadratic tracking control of linear systems using reinforcement learning, International Journal of Adaptive Control and Signal Processing, 33:2, (300-314), Online publication date: 3-Feb-2019.
  50. Yin B, Dridi M and Moudni A (2019). Recursive least-squares temporal difference learning for adaptive traffic signal control at intersection, Neural Computing and Applications, 31:2, (1013-1028), Online publication date: 1-Feb-2019.
  51. John I and Bhatnagar S Efficient Budget Allocation and Task Assignment in Crowdsourcing Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, (318-321)
  52. Barde S, Yacout S and Shin H (2019). Optimal preventive maintenance policy based on reinforcement learning of a fleet of military trucks, Journal of Intelligent Manufacturing, 30:1, (147-161), Online publication date: 1-Jan-2019.
  53. Punčochář I and Straka O Multiple-Model Active Fault Diagnosis with Deferred Decisions 2018 IEEE Conference on Decision and Control (CDC), (6340-6345)
  54. Mukherjee S, Bai H and Chakrabortty A On Model-Free Reinforcement Learning of Reduced-Order Optimal Control for Singularly Perturbed Systems 2018 IEEE Conference on Decision and Control (CDC), (5288-5293)
  55. Shin K and Lee T Spartan Proceedings of the 2018 Winter Simulation Conference, (4182-4183)
  56. Toscano-Palmerin S and Frazier P Effort allocation and statistical inference for 1-dimensional multistart stochastic gradient descent Proceedings of the 2018 Winter Simulation Conference, (1850-1861)
  57. Liu T, Xun J, Yin J and Xiao X Optimal Train Control by Approximate Dynamic Programming: Comparison of Three Value Function Approximation Methods* 2018 21st International Conference on Intelligent Transportation Systems (ITSC), (2741-2746)
  58. Shah A, Ganesan R, Jajodia S and Cam H (2018). Dynamic Optimization of the Level of Operational Effectiveness of a CSOC Under Adverse Conditions, ACM Transactions on Intelligent Systems and Technology, 9:5, (1-20), Online publication date: 30-Sep-2018.
  59. Davoudi H, An A, Zihayat M and Edall G Adaptive Paywall Mechanism for Digital News Media Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (205-214)
  60. Bertsekas D (2018). Proximal algorithms and temporal difference methods for solving fixed point problems, Computational Optimization and Applications, 70:3, (709-736), Online publication date: 1-Jul-2018.
  61. Péron M, Bartlett P, Becker K, Helmstedt K and Chadès I Two Approximate Dynamic Programming Algorithms for Managing Complete SIS Networks Proceedings of the 1st ACM SIGCAS Conference on Computing and Sustainable Societies, (1-10)
  62. Sisikoglu Sir E, Pariazar M and Sir M (2018). Capacitated inspection scheduling of multi-unit systems, Computers and Industrial Engineering, 120:C, (471-479), Online publication date: 1-Jun-2018.
  63. Tan Y, Kunapareddy A and Kobilarov M Gaussian Process Adaptive Sampling Using the Cross-Entropy Method for Environmental Sensing and Monitoring 2018 IEEE International Conference on Robotics and Automation (ICRA), (6220-6227)
  64. Benosman M (2018). Model‐based vs data‐driven adaptive control, International Journal of Adaptive Control and Signal Processing, 32:5, (753-776), Online publication date: 9-May-2018.
  65. Fryer R and Harms P (2018). Two-Armed Restless Bandits with Imperfect Information, Mathematics of Operations Research, 43:2, (399-427), Online publication date: 1-May-2018.
  66. HoseinyFarahabady M, Bastani S, Taheri J, Zomaya A, Tari Z and Khan S Toward designing a dynamic CPU cap manager for timely dataflow platforms Proceedings of the High Performance Computing Symposium, (1-11)
  67. Hooker J and Hoeve W (2018). Constraint programming and operations research, Constraints, 23:2, (172-195), Online publication date: 1-Apr-2018.
  68. Dalal G, Szorenyi B, Thoppe G and Mannor S Finite sample analyses for TD(0) with function approximation Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence and Thirtieth Innovative Applications of Artificial Intelligence Conference and Eighth AAAI Symposium on Educational Advances in Artificial Intelligence, (6144-6160)
  69. Tsai C, Hu Y, Lin W and Wang M Early versus Late Dimensionality Reduction of Bag-of-Words Feature Representation for Image Classification Proceedings of the 4th International Conference on Bioinformatics Research and Applications, (42-45)
  70. Roy A, Xu H and Pokutta S Reinforcement learning under model mismatch Proceedings of the 31st International Conference on Neural Information Processing Systems, (3046-3055)
  71. Lam R and Willcox K Lookahead Bayesian optimization with inequality constraints Proceedings of the 31st International Conference on Neural Information Processing Systems, (1888-1898)
  72. Berkenkamp F, Turchetta M, Schoellig A and Krause A Safe model-based reinforcement learning with stability guarantees Proceedings of the 31st International Conference on Neural Information Processing Systems, (908-919)
  73. Hassler M (2017). Heuristic decision rules for short-term trading of renewable energy with co-located energy storage, Computers and Operations Research, 83:C, (199-213), Online publication date: 1-Jul-2017.
  74. Qian X and Ukkusuri S (2017). Time-of-Day Pricing in Taxi Markets, IEEE Transactions on Intelligent Transportation Systems, 18:6, (1610-1622), Online publication date: 1-Jun-2017.
  75. Huynh T, Theoleyre F and Hwang W (2017). On the interest of opportunistic anycast scheduling for wireless low power lossy networks, Computer Communications, 104:C, (55-66), Online publication date: 15-May-2017.
  76. Jafari N, Nuse B, Moore C, Dilkina B and Hepinstall-Cymerman J (2017). Achieving full connectivity of sites in the multiperiod reserve network design problem, Computers and Operations Research, 81:C, (119-127), Online publication date: 1-May-2017.
  77. Wang S, Urgaonkar R, He T, Chan K, Zafer M and Leung K (2017). Dynamic Service Placement for Mobile Micro-Clouds with Predicted Future Costs, IEEE Transactions on Parallel and Distributed Systems, 28:4, (1002-1016), Online publication date: 1-Apr-2017.
  78. Cosgun Ö, Kula U and Kahraman C (2017). Markdown optimization for an apparel retailer under cross-price and initial inventory effects, Knowledge-Based Systems, 120:C, (186-197), Online publication date: 15-Mar-2017.
  79. Yao J, Yin B, Tan X and Jiang X (2017). A POMDP framework for forwarding mechanism in named data networking, Computer Networks: The International Journal of Computer and Telecommunications Networking, 112:C, (167-175), Online publication date: 15-Jan-2017.
  80. Van Hoof H, Neumann G and Peters J (2017). Non-parametric policy search with limited information loss, The Journal of Machine Learning Research, 18:1, (2472-2517), Online publication date: 1-Jan-2017.
  81. Koch S (2017). Least squares approximate policy iteration for learning bid prices in choice-based revenue management, Computers and Operations Research, 77:C, (240-253), Online publication date: 1-Jan-2017.
  82. Bayliss C, Bennell J, Currie C, Martinez-Sykora A and So M A simheuristic approach to the vehicle ferry revenue management problem Proceedings of the 2016 Winter Simulation Conference, (2335-2346)
  83. Lam R, Willcox K and Wolpert D Bayesian optimization with a finite budget Proceedings of the 30th International Conference on Neural Information Processing Systems, (883-891)
  84. Bertsimas D and Mišić V (2016). Decomposable Markov Decision Processes, Operations Research, 64:6, (1537-1555), Online publication date: 1-Dec-2016.
  85. Sen S and Liu Y (2016). Mitigating Uncertainty via Compromise Decisions in Two-Stage Stochastic Linear Programming, Operations Research, 64:6, (1422-1437), Online publication date: 1-Dec-2016.
  86. Petrou C and Paraskevas M Signal Processing Techniques Restructure The Big Data Era Proceedings of the 20th Pan-Hellenic Conference on Informatics, (1-6)
  87. Giuliani M, Li Y, Cominola A, Denaro S, Mason E and Castelletti A (2016). A Matlab toolbox for designing Multi-Objective Optimal Operations of water reservoir systems, Environmental Modelling & Software, 85:C, (293-298), Online publication date: 1-Nov-2016.
  88. Ganesan R, Jajodia S, Shah A and Cam H (2016). Dynamic Scheduling of Cybersecurity Analysts for Minimizing Risk Using Reinforcement Learning, ACM Transactions on Intelligent Systems and Technology, 8:1, (1-21), Online publication date: 3-Oct-2016.
  89. Huang M, Lin W, Chen C, Ke S, Tsai C and Eberle W (2016). Data preprocessing issues for incomplete medical datasets, Expert Systems: The Journal of Knowledge Engineering, 33:5, (432-438), Online publication date: 1-Oct-2016.
  90. Haykin S, Setoodeh P, Feng S and Findlay D (2016). Cognitive Dynamic System as the Brain of Complex Networks, IEEE Journal on Selected Areas in Communications, 34:10, (2791-2800), Online publication date: 1-Oct-2016.
  91. ul Hassan U and Curry E (2016). Efficient task assignment for spatial crowdsourcing, Expert Systems with Applications: An International Journal, 58:C, (36-56), Online publication date: 1-Oct-2016.
  92. Gao W, Jiang Y, Jiang Z and Chai T (2016). Output-feedback adaptive optimal control of interconnected systems based on robust adaptive dynamic programming, Automatica (Journal of IFAC), 72:C, (37-45), Online publication date: 1-Oct-2016.
  93. García Carrillo L, Vamvoudakis K and Hespanha J (2016). Approximate optimal adaptive control for weakly coupled nonlinear systems, International Journal of Adaptive Control and Signal Processing, 30:8-10, (1494-1522), Online publication date: 1-Aug-2016.
  94. Qin C, Zhang H, Wang Y and Luo Y (2016). Neural network-based online H∞ control for discrete-time affine nonlinear system using adaptive dynamic programming, Neurocomputing, 198:C, (91-99), Online publication date: 19-Jul-2016.
  95. Yang X, Liu D, Wei Q and Wang D (2016). Guaranteed cost neural tracking control for a class of uncertain nonlinear systems using adaptive dynamic programming, Neurocomputing, 198:C, (80-90), Online publication date: 19-Jul-2016.
  96. Peng M, Sun Y, Li X, Mao Z and Wang C (2016). Recent Advances in Cloud Radio Access Networks: System Architectures, Key Techniques, and Open Issues, IEEE Communications Surveys & Tutorials, 18:3, (2282-2308), Online publication date: 1-Jul-2016.
  97. Fox R, Pakman A and Tishby N Taming the noise in reinforcement learning via soft updates Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, (202-211)
  98. Memarzadeh M and Pozzi M (2016). Integrated Inspection Scheduling and Maintenance Planning for Infrastructure Systems, Computer-Aided Civil and Infrastructure Engineering, 31:6, (403-415), Online publication date: 1-Jun-2016.
  99. Basu D, Lin Q, Chen W, Vo H, Yuan Z, Senellart P and Bressan S Regularized Cost-Model Oblivious Database Tuning with Reinforcement Learning Transactions on Large-Scale Data- and Knowledge-Centered Systems XXVIII - Volume 9940, (96-132)
  100. Cheng K, Zhang K, Fei S and Wei H (2016). Potential-Based Least-Squares Policy Iteration for a Parameterized Feedback Control System, Journal of Optimization Theory and Applications, 169:2, (692-704), Online publication date: 1-May-2016.
  101. Dolinskaya I, Epelman M, Şişikoğlu Sir E and Smith R (2016). Parameter-Free Sampled Fictitious Play for Solving Deterministic Dynamic Programming Problems, Journal of Optimization Theory and Applications, 169:2, (631-655), Online publication date: 1-May-2016.
  102. Jiang J, Sekar V, Milner H, Shepherd D, Stoica I and Zhang H CFA Proceedings of the 13th Usenix Conference on Networked Systems Design and Implementation, (137-150)
  103. Wei Q, Liu D and Xu Y (2016). Neuro-optimal tracking control for a class of discrete-time nonlinear systems via generalized value iteration adaptive dynamic programming approach, Soft Computing - A Fusion of Foundations, Methodologies and Applications, 20:2, (697-706), Online publication date: 1-Feb-2016.
  104. Chang H (2016). Sleeping experts and bandits approach to constrained Markov decision processes, Automatica (Journal of IFAC), 63:C, (182-186), Online publication date: 1-Jan-2016.
  105. Nagpal A and Gaur D A New Proposed Feature Subset Selection Algorithm Based on Maximization of Gain Ratio Proceedings of the 4th International Conference on Big Data Analytics - Volume 9498, (181-197)
  106. Toosi A and Buyya R A fuzzy logic-based controller for cost and energy efficient load balancing in geo-distributed data centers Proceedings of the 8th International Conference on Utility and Cloud Computing, (186-194)
  107. Jiang H and Shanbhag U Data-driven schemes for resolving misspecified MDPs Proceedings of the 2015 Winter Simulation Conference, (3801-3812)
  108. Chen Z, Lin W, Ke S and Tsai C (2015). Evolutionary feature and instance selection for traffic sign recognition, Computers in Industry, 74:C, (201-211), Online publication date: 1-Dec-2015.
  109. Luo B, Wu H, Huang T and Liu D (2015). Reinforcement learning solution for HJB equation arising in constrained optimal control problem, Neural Networks, 71:C, (150-158), Online publication date: 1-Nov-2015.
  110. Veatch M (2015). Approximate linear programming for networks, Computers and Operations Research, 63:C, (32-45), Online publication date: 1-Nov-2015.
  111. Rabbi M, Aung M, Zhang M and Choudhury T MyBehavior Proceedings of the 2015 ACM International Joint Conference on Pervasive and Ubiquitous Computing, (707-718)
  112. Alagoz O, Ayvaci M and Linderoth J (2015). Optimally solving Markov decision processes with total expected discounted reward function, Computers and Industrial Engineering, 87:C, (311-316), Online publication date: 1-Sep-2015.
  113. Basu D, Lin Q, Chen W, Vo H, Yuan Z, Senellart P and Bressan S Cost-Model Oblivious Database Tuning with Reinforcement Learning Proceedings, Part I, of the 26th International Conference on Database and Expert Systems Applications - Volume 9261, (253-268)
  114. Fernandez-Bes J, Cid-Sueiro J and Marques A (2015). An MDP Model for Censoring in Harvesting Sensors: Optimal and Approximated Solutions, IEEE Journal on Selected Areas in Communications, 33:8, (1717-1729), Online publication date: 1-Aug-2015.
  115. Fernández C, Manyà F, Mateu C and Sole-Mauri F (2015). Approximate dynamic programming for automated vacuum waste collection systems, Environmental Modelling & Software, 67:C, (128-137), Online publication date: 1-May-2015.
  116. (2015). Approximating convex functions via non-convex oracles under the relative noise model, Discrete Optimization, 16:C, (1-16), Online publication date: 1-May-2015.
  117. Chen X, Lin Q and Zhou D (2015). Statistical decision making for optimal budget allocation in crowd labeling, The Journal of Machine Learning Research, 16:1, (1-46), Online publication date: 1-Jan-2015.
  118. Huynh T, Pham N, Lee S and Hwang W (2015). Dynamic Control Policy for Delay Guarantees in Multi-hop Wireless Networks, Wireless Personal Communications: An International Journal, 80:2, (647-670), Online publication date: 1-Jan-2015.
  119. Liu D, Yan P and Wei Q (2014). Data-based analysis of discrete-time linear systems in noisy environment, Information Sciences: an International Journal, 288:C, (314-329), Online publication date: 20-Dec-2014.
  120. Hu W, Frazier P and Xie J Parallel bayesian policies for finite-horizon multiple comparisons with a known standard Proceedings of the 2014 Winter Simulation Conference, (3904-3915)
  121. Coelho L, Cordeau J and Laporte G (2014). Heuristics for dynamic and stochastic inventory-routing, Computers and Operations Research, 52:PA, (55-67), Online publication date: 1-Dec-2014.
  122. Abdulwahab U and Wahab M (2014). Approximate dynamic programming modeling for a typical blood platelet bank, Computers and Industrial Engineering, 78:C, (259-270), Online publication date: 1-Dec-2014.
  123. Bian T, Jiang Y and Jiang Z (2014). Adaptive dynamic programming and optimal control of nonlinear nonaffine systems, Automatica (Journal of IFAC), 50:10, (2624-2632), Online publication date: 1-Oct-2014.
  124. Govindarajan N, de Visser C and Krishnakumar K (2014). A sparse collocation method for solving time-dependent HJB equations using multivariate B-splines, Automatica (Journal of IFAC), 50:9, (2234-2244), Online publication date: 1-Sep-2014.
  125. Adelman D and Barz C (2014). A Unifying Approximate Dynamic Programming Model for the Economic Lot Scheduling Problem, Mathematics of Operations Research, 39:2, (374-402), Online publication date: 1-May-2014.
  126. Jung T, Wehenkel L, Ernst D and Maes F (2014). Optimized look-ahead tree policies, International Journal of Adaptive Control and Signal Processing, 28:3-5, (255-289), Online publication date: 1-Mar-2014.
  127. Boyd S, Mueller M, O'Donoghue B and Wang Y (2014). Performance Bounds and Suboptimal Policies for Multi–Period Investment, Foundations and Trends in Optimization, 1:1, (1-72), Online publication date: 1-Jan-2014.
  128. Ahner D and Parson C Weapon tradeoff analysis using dynamic programming for a dynamic weapon target assignment problem within a simulation Proceedings of the 2013 Winter Simulation Conference: Simulation: Making Decisions in a Complex World, (2831-2841)
  129. Xie J and Frazier P Upper bounds on the Bayes-optimal procedure for ranking & selection with independent normal priors Proceedings of the 2013 Winter Simulation Conference: Simulation: Making Decisions in a Complex World, (877-887)
  130. Barahona F, Ettl M, Petrik M and Rimshnick P Agile logistics simulation and optimization for managing disaster responses Proceedings of the 2013 Winter Simulation Conference: Simulation: Making Decisions in a Complex World, (3340-3351)
  131. Coşgun Ö, Kula U and Kahraman C (2013). Analysis of cross-price effects on markdown policies by using function approximation techniques, Knowledge-Based Systems, 53, (173-184), Online publication date: 1-Nov-2013.
  132. Escudero L, Monge J, Morales D and Wang J (2013). Expected Future Value Decomposition Based Bid Price Generation for Large-Scale Network Revenue Management, Transportation Science, 47:2, (181-197), Online publication date: 1-May-2013.
  133. Fang J, Zhao L, Fransoo J and Van Woensel T (2013). Sourcing strategies in supply risk management, Computers and Operations Research, 40:5, (1371-1382), Online publication date: 1-May-2013.
  134. Adelman D and Mersereau A (2013). Dynamic Capacity Allocation to Customers Who Remember Past Service, Management Science, 59:3, (592-612), Online publication date: 1-Mar-2013.
  135. Wu H and Luo B (2013). Simultaneous policy update algorithms for learning the solution of linear continuous-time H∞ state feedback control, Information Sciences: an International Journal, 222, (472-485), Online publication date: 1-Feb-2013.
  136. Gaggero M, Gnecco G and Sanguineti M (2013). Dynamic Programming and Value-Function Approximation in Sequential Decision Problems, Journal of Optimization Theory and Applications, 156:2, (380-416), Online publication date: 1-Feb-2013.
  137. Goodson J, Ohlmann J and Thomas B (2013). Rollout Policies for Dynamic Solutions to the Multivehicle Routing Problem with Stochastic Demand and Duration Limits, Operations Research, 61:1, (138-154), Online publication date: 1-Jan-2013.
  138. Zhang D and Lu Z (2013). Assessing the Value of Dynamic Pricing in Network Revenue Management, INFORMS Journal on Computing, 25:1, (102-115), Online publication date: 1-Jan-2013.
  139. Chen X, Fernandez E and Kelton W Optimization model selection for simulation-based approximate dynamic programming approaches in semiconductor manufacturing operations Proceedings of the Winter Simulation Conference, (1-12)
  140. Haijema R, van Dijk D, Hendrix E and van der Wal J Simulation to discover structure in optimal dynamic control policies Proceedings of the Winter Simulation Conference, (1-12)
  141. Bigus J, Chen-Ritzo C, Hermiz K, Tesauro G and Sorrentino R Applying a framework for healthcare incentives simulation Proceedings of the Winter Simulation Conference, (1-12)
  142. Frazier P, Jedynak B and Chen L Sequential screening Proceedings of the Winter Simulation Conference, (1-12)
  143. Frazier P Optimization via simulation with Bayesian statistics and dynamic programming Proceedings of the Winter Simulation Conference, (1-16)
  144. Gouberman A and Siegle M Markov Reward Models and Markov Decision Processes in Discrete and Continuous Time Advanced Lectures of the International Autumn School on Stochastic Model Checking. Rigorous Dependability Analysis Using Model Checking Techniques for Stochastic Systems - Volume 8453, (156-241)
  145. Powell W, George A, Simão H, Scott W, Lamont A and Stewart J (2012). SMART, INFORMS Journal on Computing, 24:4, (665-682), Online publication date: 1-Oct-2012.
  146. Cooper W and Rangarajan B (2012). Performance Guarantees for Empirical Markov Decision Processes with Applications to Multiperiod Inventory Models, Operations Research, 60:5, (1267-1281), Online publication date: 1-Sep-2012.
  147. Petrik M and Subramanian D An approximate solution method for large risk-averse Markov decision processes Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence, (805-814)
  148. McMillan C, Grechanik M and Poshyvanyk D Detecting similar software applications Proceedings of the 34th International Conference on Software Engineering, (364-374)
  149. Desai V, Farias V and Moallemi C (2012). Approximate Dynamic Programming via a Smoothed Linear Program, Operations Research, 60:3, (655-674), Online publication date: 1-May-2012.
  150. Ilyas M and Radha H (2012). A dynamic programming approach to maximizing a statistical measure of the lifetime of sensor networks, ACM Transactions on Sensor Networks, 8:2, (1-21), Online publication date: 1-Mar-2012.
  151. Bertsekas D and Yu H (2012). Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming, Mathematics of Operations Research, 37:1, (66-94), Online publication date: 1-Feb-2012.
  152. Ryzhov I, Powell W and Frazier P (2012). The Knowledge Gradient Algorithm for a General Class of Online Learning Problems, Operations Research, 60:1, (180-195), Online publication date: 1-Jan-2012.
  153. Sen S and Zhou Z Optimization simulation Proceedings of the Winter Simulation Conference, (4103-4114)
  154. Sisikoglu E, Epelman M and Smith R A sampled fictitious play based learning algorithm for infinite horizon Markov decision processes Proceedings of the Winter Simulation Conference, (4091-4102)
  155. Powell W, Bouzaiene-Ayari B, Berger J, Boukhtouta A and George A (2011). The Effect of Robust Decisions on the Cost of Uncertainty in Military Airlift Operations, ACM Transactions on Modeling and Computer Simulation, 22:1, (1-19), Online publication date: 1-Dec-2011.
  156. Epelman M, Ghate A and Smith R (2011). Sampled fictitious play for approximate dynamic programming, Computers and Operations Research, 38:12, (1705-1718), Online publication date: 1-Dec-2011.
  157. Cai C, Wang Y and Geers G Quantifying the exact impact of state estimation error on traffic signal control Proceedings of the 4th ACM SIGSPATIAL International Workshop on Computational Transportation Science, (39-44)
  158. Elkan C Reinforcement learning with a bilinear q function Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning, (78-88)
  159. Mirhoseini A and Koushanfar F Learning to manage combined energy supply systems Proceedings of the 17th IEEE/ACM international symposium on Low-power electronics and design, (229-234)
  160. Osogami T Iterated risk measures for risk-sensitive Markov decision processes with discounted cost Proceedings of the Twenty-Seventh Conference on Uncertainty in Artificial Intelligence, (573-580)
  161. Engel Y and Etzion O Towards proactive event-driven computing Proceedings of the 5th ACM international conference on Distributed event-based system, (125-136)
  162. Levina T, Levin Y, McGill J and Nediak M (2011). Network Cargo Capacity Management, Operations Research, 59:4, (1008-1023), Online publication date: 1-Jul-2011.
  163. Negoescu D, Frazier P and Powell W (2011). The Knowledge-Gradient Algorithm for Sequencing Experiments in Drug Discovery, INFORMS Journal on Computing, 23:3, (346-363), Online publication date: 1-Jul-2011.
  164. Lai G, Wang M, Kekre S, Scheller-Wolf A and Secomandi N (2011). Valuation of Storage at a Liquefied Natural Gas Terminal, Operations Research, 59:3, (602-616), Online publication date: 1-May-2011.
  165. Zhang D (2011). An Improved Dynamic Programming Decomposition Approach for Network Revenue Management, Manufacturing & Service Operations Management, 13:1, (35-52), Online publication date: 1-Jan-2011.
  166. Hannah L, Powell W and Blei D Nonparametric density estimation for stochastic optimization with an observable state variable Proceedings of the 23rd International Conference on Neural Information Processing Systems - Volume 1, (820-828)
  167. Maxwell M, Henderson S and Topaloglu H Identifying effective policies in approximate dynamic programming Proceedings of the Winter Simulation Conference, (1079-1087)
  168. Huang H and Lau V (2010). Delay-optimal user scheduling and inter-cell interference management in cellular network via distributive stochastic learning, IEEE Transactions on Wireless Communications, 9:12, (3790-3797), Online publication date: 1-Dec-2010.
  169. Cai C, Wang Y and Geers G Adaptive traffic signal control using vehicle-to-infrastructure communication Proceedings of the Third International Workshop on Computational Transportation Science, (43-47)
  170. Rimmel A, Teytaud F and Teytaud O Biasing Monte-Carlo simulations through RAVE values Proceedings of the 7th international conference on Computers and games, (59-68)
  171. Kunnumkal S and Topaloglu H (2010). A stochastic approximation method with max-norm projections and its applications to the Q-learning algorithm, ACM Transactions on Modeling and Computer Simulation, 20:3, (1-26), Online publication date: 1-Sep-2010.
  172. Volos H and Buehrer R (2010). Cognitive engine design for link adaptation, IEEE Transactions on Wireless Communications, 9:9, (2902-2913), Online publication date: 1-Sep-2010.
  173. Gnecco G and Sanguineti M (2010). Suboptimal Solutions to Dynamic Optimization Problems via Approximations of the Policy Functions, Journal of Optimization Theory and Applications, 146:3, (764-794), Online publication date: 1-Sep-2010.
  174. Kim S and Giannakis G (2010). Sequential and cooperative sensing for multi-channel cognitive radios, IEEE Transactions on Signal Processing, 58:8, (4239-4253), Online publication date: 1-Aug-2010.
  175. Adhikari P and Hollmén J Patterns from multiresolution 0-1 data Proceedings of the ACM SIGKDD Workshop on Useful Patterns, (8-16)
  176. Branavan S, Zettlemoyer L and Barzilay R Reading between the lines Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, (1268-1277)
  177. Hendzel Z and Szuster M Discrete model-based adaptive critic designs in wheeled mobile robot control Proceedings of the 10th international conference on Artificial intelligence and soft computing: Part II, (264-271)
  178. Gierlak P, Szuster M and Żylski W Discrete dual-heuristic programming in 3DOF manipulator control Proceedings of the 10th international conference on Artificial intelligence and soft computing: Part II, (256-263)
  179. Lai G, Margot F and Secomandi N (2010). An Approximate Dynamic Programming Approach to Benchmark Practice-Based Heuristics for Natural Gas Storage Valuation, Operations Research, 58:3, (564-582), Online publication date: 1-May-2010.
  180. Nascimento J and Powell W (2010). Dynamic Programming Models and Algorithms for the Mutual Fund Cash Balance Problem, Management Science, 56:5, (801-815), Online publication date: 1-May-2010.
  181. Ramponi F, Chatterjee D, Summers S and Lygeros J On the connections between PCTL and dynamic programming Proceedings of the 13th ACM international conference on Hybrid systems: computation and control, (253-262)
  182. Akuiyibo E and Boyd S (2010). Adaptive modulation with smoothed flow utility, EURASIP Journal on Wireless Communications and Networking, 2010, (1-9), Online publication date: 1-Apr-2010.
  183. Kim B, Park J, Park S and Kang S (2010). Impedance learning for robotic contact tasks using natural actor-critic algorithm, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics, 40:2, (433-443), Online publication date: 1-Apr-2010.
  184. Kobayashi K A linear approximation of the value function of an approximate dynamic programming approach for the ship scheduling problem Proceedings of the 4th international conference on Learning and intelligent optimization, (184-187)
  185. Powell W (2010). Rejoinder---The Languages of Stochastic Optimization, INFORMS Journal on Computing, 22:1, (23-25), Online publication date: 1-Jan-2010.
  186. Powell W (2010). Feature Article---Merging AI and OR to Solve High-Dimensional Stochastic Optimization Problems Using Approximate Dynamic Programming, INFORMS Journal on Computing, 22:1, (2-17), Online publication date: 1-Jan-2010.
  187. Maxwell M, Henderson S and Topaloglu H Ambulance redeployment Winter Simulation Conference, (1850-1860)
  188. Ramírez-Hernández J and Fernandez E A simulation-based approximate dynamic programming approach for the control of the Intel Mini-Fab benchmark model Winter Simulation Conference, (1634-1645)
  189. Ryzhov I and Powell W A Monte Carlo knowledge gradient method for learning abatement potential of emissions reduction technologies Winter Simulation Conference, (1492-1502)
  190. Volos H and Buehrer R On balancing exploration vs. exploitation in a cognitive engine for multi-antenna systems Proceedings of the 28th IEEE conference on Global telecommunications, (4238-4243)
  191. Hover F Path planning for data assimilation in mobile environmental monitoring systems Proceedings of the 2009 IEEE/RSJ international conference on Intelligent robots and systems, (213-218)
  192. Zhang D and Adelman D (2009). An Approximate Dynamic Programming Approach to Network Revenue Management with Customer Choice, Transportation Science, 43:3, (381-394), Online publication date: 1-Aug-2009.
  193. Petrik M and Zilberstein S Constraint relaxation in approximate linear programs Proceedings of the 26th Annual International Conference on Machine Learning, (809-816)
  194. Wu T, Powell W and Whisman A (2009). The optimizing-simulator, ACM Transactions on Modeling and Computer Simulation, 19:3, (1-31), Online publication date: 1-Jun-2009.
  195. Simão H, Day J, George A, Gifford T, Nienow J and Powell W (2009). An Approximate Dynamic Programming Algorithm for Large-Scale Fleet Management, Transportation Science, 43:2, (178-197), Online publication date: 1-May-2009.
  196. Mahadevan S (2009). Learning Representation and Control in Markov Decision Processes, Foundations and Trends® in Machine Learning, 1:4, (403-565), Online publication date: 1-Apr-2009.
  197. Hvattum L, Løkketangen A and Laporte G (2009). Scenario Tree-Based Heuristics for Stochastic Inventory-Routing Problems, INFORMS Journal on Computing, 21:2, (268-285), Online publication date: 1-Apr-2009.
  198. Borkar V Opportunistic Transmission over Randomly Varying Channels Network Control and Optimization, (62-69)
  199. Nascimento J and Powell W (2009). An Optimal Approximate Dynamic Programming Algorithm for the Lagged Asset Acquisition Problem, Mathematics of Operations Research, 34:1, (210-237), Online publication date: 1-Feb-2009.
  200. Kang Q, Wang L and Wu Q (2009). Swarm-based approximate dynamic optimization process for discrete particle swarm optimization system, International Journal of Bio-Inspired Computation, 1:1/2, (61-70), Online publication date: 1-Jan-2009.
  201. Secomandi N and Margot F (2009). Reoptimization Approaches for the Vehicle-Routing Problem with Stochastic Demands, Operations Research, 57:1, (214-230), Online publication date: 1-Jan-2009.
  202. Powell W Approximate dynamic programming Proceedings of the 40th Conference on Winter Simulation, (205-214)
  203. Fu M, Chen C and Shi L Some topics for simulation optimization Proceedings of the 40th Conference on Winter Simulation, (27-38)
  204. Petrik M and Zilberstein S Learning heuristic functions through approximate linear programming Proceedings of the Eighteenth International Conference on International Conference on Automated Planning and Scheduling, (248-255)
  205. Powell W The optimizing-simulator Proceedings of the 39th conference on Winter simulation: 40 years! The best is yet to come, (43-53)
  206. Shin J and Lee J Procurement scheduling under supply and demand uncertainty: Case study for comparing classical, reactive, and proactive scheduling 2015 15th International Conference on Control, Automation and Systems (ICCAS), (636-641)
  207. Abd-Elmagid M, Ferdowsi A, Dhillon H and Saad W Deep Reinforcement Learning for Minimizing Age-of-Information in UAV-Assisted Networks 2019 IEEE Global Communications Conference (GLOBECOM), (1-6)
  208. Barde S, Shin H and Yacout S Opportunistic preventive maintenance strategy of a multi-component system with hierarchical structure by simulation and evaluation 2016 IEEE 21st International Conference on Emerging Technologies and Factory Automation (ETFA), (1-8)
  209. Pavez E, Michelusi N, Anis A, Mitra U and Ortega A Markov chain sparsification with independent sets for approximate value iteration 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), (1399-1405)
  210. Saldi N, Yüksel S and Linder T Finite-state approximations to constrained Markov decision processes with Borel spaces 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton), (567-572)
Contributors
  • Princeton University

Reviews

Renato Leone

Finally, a book devoted to dynamic programming and written in the language of operations research (OR)! This beautiful book fills a gap in the libraries of OR specialists and practitioners. The book is subdivided into three parts. Part 1 is an introduction to dynamic programming using a flat-state representation, and contains four chapters. The meaning of the subtitle is explained in the first chapter by means of a simple example. Of particular interest is chapter 2, where the characteristics of dynamic programming are introduced through various examples: discrete and continuous budgeting problems, a stochastic shortest-path problem, a gambling problem, an asset pricing and asset acquisition problem, and a dynamic assignment problem. Chapter 3 is devoted to Markov decision theory. Lastly, approximate dynamic programming is discussed in chapter 4.

Chapters 5 through 9 make up Part 2, which focuses on approximate dynamic programming. Approximate dynamic programming offers an important set of strategies and methods for solving problems that are difficult because of their size, the lack of a formal model of the information process, or an unknown transition function. Chapter 5 is devoted to time and resource modeling techniques. The next two chapters cover stochastic approximation methods and approximating value functions, respectively. Chapter 8 is the bulk of the book: starting from the introduction presented in chapter 4, various algorithms are described in detail, with an emphasis on the finite-horizon case.

Finally, Part 3, chapters 10 to 13, is devoted to special topics and applications. In particular, the last chapter describes a number of issues arising in real-world applications of approximate dynamic programming: convergence, modeling (online versus offline), and debugging.
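To make the strategy concrete: the approximate dynamic programming approach described above can be reduced to sampled forward passes that smooth observed values into a value-function estimate via stochastic approximation. The sketch below illustrates the idea on a toy finite-horizon inventory (asset acquisition) problem; the model, parameters, and stepsize rule are all hypothetical choices for illustration, not the book's code.

```python
import random

# Illustrative sketch (hypothetical setup): approximate value iteration with
# stochastic-approximation updates for a toy finite-horizon inventory problem.
# State = units on hand; decision = units to buy; demand is random.
random.seed(0)

T = 10                     # planning horizon
MAX_S = 20                 # cap on inventory
PRICE, COST = 3.0, 1.0     # revenue per unit sold, cost per unit bought
V = [[0.0] * (MAX_S + 1) for _ in range(T + 1)]  # value-function estimates

def step(s, x, d):
    """Transition: buy x, sell min(s + x, d); return (reward, next state)."""
    have = min(s + x, MAX_S)
    sold = min(have, d)
    return PRICE * sold - COST * x, have - sold

for n in range(2000):                      # sampled forward passes
    alpha = 1.0 / (n / 50 + 1)             # declining stepsize
    s = 0
    for t in range(T):
        d = random.randint(0, 10)          # one sampled demand; using it for
        # both the decision and the update is a simplification of the scheme
        best = max(range(MAX_S + 1 - s),
                   key=lambda x: step(s, x, d)[0] + V[t + 1][step(s, x, d)[1]])
        r, s_next = step(s, best, d)
        # smooth the observed value into the estimate (Robbins-Monro update)
        V[t][s] = (1 - alpha) * V[t][s] + alpha * (r + V[t + 1][s_next])
        s = s_next
```

After enough passes, `V[0][0]` approximates the value of starting empty; the terminal estimates `V[T]` stay at zero by construction. The point of the exercise is that the value function is never computed exactly over all states: it is learned only along sampled trajectories.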
As a last point, it is worth mentioning that additional material is available online (additional exercises and solutions to selected problems); at the moment, only some datasets are downloadable. The intended audience of the book is OR practitioners and undergraduate or master's students. The more advanced material, suitable for graduate students, is clearly marked. The sections titled "Why Does It Work?" are remarkable: these are where the proofs of the various results are collected. In my opinion, the audience for this book is much larger than the author had in mind. In fact, this book should be on the shelf of every OR researcher and practitioner as a standard reference. My overall impression of this book is extremely positive. The author succeeds in focusing on the core of each aspect of dynamic programming with a simple and clear exposition of the material, emphasizing computation and applications, while maintaining and elevating the standard of the theory.

Online Computing Reviews Service
