Introduction to Reinforcement Learning
March 1998
Publisher: MIT Press, 55 Hayward St., Cambridge, MA, United States
ISBN: 978-0-262-19398-6
Published: 01 March 1998
Pages: 342
Abstract

From the Publisher:

In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability.
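To make the abstract's mention of "key ideas and algorithms" concrete, below is a minimal sketch of tabular Q-learning, one of the temporal-difference algorithms the book covers. The toy chain environment, hyperparameter values, and all names here are illustrative assumptions for this page, not material from the book:

    # Illustrative sketch only: tabular Q-learning on an assumed 5-state chain.
    # Environment, hyperparameters, and names are hypothetical, not from the book.
    import random

    N_STATES = 5          # states 0..4; reaching state 4 ends the episode
    ACTIONS = [0, 1]      # 0 = move left, 1 = move right
    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1

    def step(state, action):
        """Move along the chain; reward 1 only when the goal state is reached."""
        nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
        reward = 1.0 if nxt == N_STATES - 1 else 0.0
        return nxt, reward, nxt == N_STATES - 1

    Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

    for episode in range(500):
        s, done = 0, False
        while not done:
            # epsilon-greedy action selection, breaking ties randomly
            if random.random() < EPSILON:
                a = random.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda act: (Q[(s, act)], random.random()))
            s_next, r, done = step(s, a)
            # Q-learning update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
            best_next = max(Q[(s_next, act)] for act in ACTIONS)
            Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
            s = s_next

    print("greedy policy:", {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES)})

After training, the greedy policy should choose "right" in every state, since that is the shortest path to the rewarded goal on this assumed chain.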

Cited By

  1. Luo B Adaptive Decision-Making in Non-Stationary Markov Decision Processes Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, (2755-2757)
  2. Nafi N, Ali R, Hsu W, Duong K and Vick M Policy Optimization using Horizon Regularized Advantage to Improve Generalization in Reinforcement Learning Proceedings of the 23rd International Conference on Autonomous Agents and Multiagent Systems, (1427-1435)
  3. Xu Y, Wang L, Xu H, Liu J, Wang Z and Huang L (2024). Enhancing Federated Learning With Server-Side Unlabeled Data by Adaptive Client and Data Selection, IEEE Transactions on Mobile Computing, 23:4, (2813-2831), Online publication date: 1-Apr-2024.
  4. Ouyang W, Wang Y, Weng P and Han S (2024). Generalization in Deep RL for TSP Problems via Equivariance and Local Search, SN Computer Science, 5:4, Online publication date: 29-Mar-2024.
  5. Li D, Zhu F, Wu J, Wong Y and Chen T (2024). Managing mixed traffic at signalized intersections, Expert Systems with Applications: An International Journal, 238:PC, Online publication date: 15-Mar-2024.
  6. Wang Q and Huang Z (2024). Load Frequency Control Strategy for Islanded Microgrid Based on SCQ(λ) Algorithm, International Journal of Gaming and Computer-Mediated Simulations, 16:1, (1-16), Online publication date: 7-Mar-2024.
  7. Ji J, Cai L, Zhu K and Niyato D (2024). Decoupled Association With Rate Splitting Multiple Access in UAV-Assisted Cellular Networks Using Multi-Agent Deep Reinforcement Learning, IEEE Transactions on Mobile Computing, 23:3, (2186-2201), Online publication date: 1-Mar-2024.
  8. Tripathi S, Puligheddu C, Pramanik S, Garcia-Saavedra A and Chiasserini C (2024). Fair and Scalable Orchestration of Network and Compute Resources for Virtual Edge Services, IEEE Transactions on Mobile Computing, 23:3, (2202-2218), Online publication date: 1-Mar-2024.
  9. de Lellis Rossi L, Rohmer E, Dornhofer Paro Costa P, Colombini E, da Silva Simões A and Gudwin R (2024). A Procedural Constructive Learning Mechanism with Deep Reinforcement Learning for Cognitive Agents, Journal of Intelligent and Robotic Systems, 110:1, Online publication date: 1-Mar-2024.
  10. Weng Z, Wu Z, Li H, Chen J and Jiang Y (2023). HCMS: Hierarchical and Conditional Modality Selection for Efficient Video Recognition, ACM Transactions on Multimedia Computing, Communications, and Applications, 20:2, (1-18), Online publication date: 29-Feb-2024.
  11. Tappler M, Pferscher A, Aichernig B and Könighofer B Learning and Repair of Deep Reinforcement Learning Policies from Fuzz-Testing Data Proceedings of the 46th IEEE/ACM International Conference on Software Engineering, (1-13)
  12. Sheng H, Zhou W, Zheng J, Zhao Y and Ma W (2024). Transfer Reinforcement Learning for Dynamic Spectrum Environment, IEEE Transactions on Wireless Communications, 23:2, (1447-1458), Online publication date: 1-Feb-2024.
  13. Ji B, Zhang M, Huang J, Wang Y, Xing L, Li T, Han C and Mumtaz S (2024). Research on Offloading Strategy of Twin UAVs Edge Computing Tasks for Emergency Communication, IEEE Transactions on Network and Service Management, 21:1, (684-696), Online publication date: 1-Feb-2024.
  14. Hammar K and Stadler R (2024). Learning Near-Optimal Intrusion Responses Against Dynamic Attackers, IEEE Transactions on Network and Service Management, 21:1, (1158-1177), Online publication date: 1-Feb-2024.
  15. Zhang X, Zuo J, Huang Z, Zhou Z, Chen X and Joe-Wong C (2024). Learning With Side Information: Elastic Multi-Resource Control for the Open RAN, IEEE Journal on Selected Areas in Communications, 42:2, (295-309), Online publication date: 1-Feb-2024.
  16. Oh M, Das A, Hosseinalipour S, Kim T, Love D and Brinton C (2024). A Decentralized Pilot Assignment Algorithm for Scalable O-RAN Cell-Free Massive MIMO, IEEE Journal on Selected Areas in Communications, 42:2, (373-388), Online publication date: 1-Feb-2024.
  17. Alilou M, Babazadeh Sangar A, Majidzadeh K and Masdari M (2024). QFS-RPL: mobility and energy aware multi path routing protocol for the internet of mobile things data transfer infrastructures, Telecommunications Systems, 85:2, (289-312), Online publication date: 1-Feb-2024.
  18. Hung Y (2024). A review of Monte Carlo and quasi‐Monte Carlo sampling techniques, WIREs Computational Statistics, 16:1, Online publication date: 21-Jan-2024.
  19. Xiong L, Chen Y, Peng Y and Ghadi Y (2024). Improving Robot-Assisted Virtual Teaching Using Transformers, GANs, and Computer Vision, Journal of Organizational and End User Computing, 36:1, (1-32), Online publication date: 17-Jan-2024.
  20. Xie C, El‐Hajjar M and Ng S (2023). Machine learning assisted adaptive LDPC coded system design and analysis, IET Communications, 18:1, (1-10), Online publication date: 16-Jan-2024.
  21. Ma Z, Li Z and Tu Y (2023). A Sample-Aware Database Tuning System With Deep Reinforcement Learning, Journal of Database Management, 35:1, (1-25), Online publication date: 7-Jan-2024.
  22. Zhang T and Jia Y (2024). Input‐constrained optimal output synchronization of heterogeneous multiagent systems via observer‐based model‐free reinforcement learning, Asian Journal of Control, 26:1, (98-113), Online publication date: 7-Jan-2024.
  23. Lin W, Song Y, Ruan B, Shuai H, Shen C, Wang L and Li Y (2024). Temporal Difference-Aware Graph Convolutional Reinforcement Learning for Multi-Intersection Traffic Signal Control, IEEE Transactions on Intelligent Transportation Systems, 25:1, (327-337), Online publication date: 1-Jan-2024.
  24. Stöckermann P, Immordino A, Altenmüller T, Seidel G, Gebser M, Tassel P, Chan C and Zhang F Dispatching in Real Frontend Fabs with Industrial Grade Discrete-Event Simulations by Deep Reinforcement Learning with Evolution Strategies Proceedings of the Winter Simulation Conference, (3047-3058)
  25. Möbius M, Kallfass D, Flock M, Doll T and Kunde D Incorporation of Military Doctrines and Objectives into an AI Agent via Natural Language and Reward in Reinforcement Learning Proceedings of the Winter Simulation Conference, (2357-2378)
  26. Zhang T, Kabak K, Heavey C and Rose O A Reinforcement Learning Approach for Improved Photolithography Schedules Proceedings of the Winter Simulation Conference, (2136-2147)
  27. Kannan K, Pamuru V and Rosokha Y (2023). Analyzing Frictions in Generalized Second-Price Auction Markets, Information Systems Research, 34:4, (1437-1454), Online publication date: 1-Dec-2023.
  28. Suzuki A, Kobayashi M and Oki E (2023). Multi-Agent Deep Reinforcement Learning for Cooperative Computing Offloading and Route Optimization in Multi Cloud-Edge Networks, IEEE Transactions on Network and Service Management, 20:4, (4416-4434), Online publication date: 1-Dec-2023.
  29. Morcego B, Yin W, Boersma S, van Henten E, Puig V and Sun C (2024). Reinforcement Learning versus Model Predictive Control on greenhouse climate control, Computers and Electronics in Agriculture, 215:C, Online publication date: 1-Dec-2023.
  30. Könighofer B, Rudolf J, Palmisano A, Tappler M and Bloem R (2023). Online shielding for reinforcement learning, Innovations in Systems and Software Engineering, 19:4, (379-394), Online publication date: 1-Dec-2023.
  31. Nilsen M, Nygaard T and Ellefsen K (2023). Reward tampering and evolutionary computation: a study of concrete AI-safety problems using evolutionary algorithms, Genetic Programming and Evolvable Machines, 24:2, Online publication date: 1-Dec-2023.
  32. Raveendran M, Srikanth K, Ungureanu T and Zheng G (2023). How Do Performance Goals Influence Exploration-Exploitation Choices?, Organization Science, 34:6, (2464-2486), Online publication date: 1-Nov-2023.
  33. Zhang L, Peng J, Zheng J and Xiao M (2023). Intelligent Cloud-Edge Collaborations Assisted Energy-Efficient Power Control in Heterogeneous Networks, IEEE Transactions on Wireless Communications, 22:11, (7743-7755), Online publication date: 1-Nov-2023.
  34. Yavas M, Kumbasar T and Ure N (2023). A Real-World Reinforcement Learning Framework for Safe and Human-Like Tactical Decision-Making, IEEE Transactions on Intelligent Transportation Systems, 24:11, (11773-11784), Online publication date: 1-Nov-2023.
  35. Valadares J, Villela S, Bernardino H, Gonçalves G and Vieira A (2023). Mapping user behaviors to identify professional accounts in Ethereum using semi-supervised learning, Expert Systems with Applications: An International Journal, 229:PB, Online publication date: 1-Nov-2023.
  36. Naderi M, Chakareski J and Ghanbari M (2023). Hierarchical Q-learning-enabled neutrosophic AHP scheme in candidate relay set size adaption in vehicular networks, Computer Networks: The International Journal of Computer and Telecommunications Networking, 235:C, Online publication date: 1-Nov-2023.
  37. Zukerman I, Partovi A and Hohwy J (2023). Influence of Device Performance and Agent Advice on User Trust and Behaviour in a Care-taking Scenario, User Modeling and User-Adapted Interaction, 33:5, (1015-1063), Online publication date: 1-Nov-2023.
  38. Li X, Wang P, Jin X, Jiang Q, Zhou W and Yao S (2023). Reinforcement learning architecture for cyber–physical–social AI: state-of-the-art and perspectives, Artificial Intelligence Review, 56:11, (12655-12688), Online publication date: 1-Nov-2023.
  39. Turgut O, Turgut M and Kırtepe E (2023). Q-learning-based metaheuristic algorithm for thermoeconomic optimization of a shell-and-tube evaporator working with refrigerant mixtures, Soft Computing - A Fusion of Foundations, Methodologies and Applications, 27:21, (16201-16241), Online publication date: 1-Nov-2023.
  40. Bailly G, Khamassi M and Girard B (2023). Computational Model of the Transition from Novice to Expert Interaction Techniques, ACM Transactions on Computer-Human Interaction, 30:5, (1-33), Online publication date: 31-Oct-2023.
  41. Yang S, Wang Z, Wu Z, Li M, Zhang Z, Huang Q, Hao L, Xu S, Wu X, Yang C and Dai Z UnifiedGesture: A Unified Gesture Synthesis Model for Multiple Skeletons Proceedings of the 31st ACM International Conference on Multimedia, (1033-1044)
  42. Savinov V and Yakovlev K DHC-R: Evaluating “Distributed Heuristic Communication” and Improving Robustness for Learnable Decentralized PO-MAPF Interactive Collaborative Robotics, (151-163)
  43. Ji J, Zhu K and Cai L (2023). Trajectory and Communication Design for Cache- Enabled UAVs in Cellular Networks: A Deep Reinforcement Learning Approach, IEEE Transactions on Mobile Computing, 22:10, (6190-6204), Online publication date: 1-Oct-2023.
  44. Vicente Ó, Fernández F and García J (2023). Automated market maker inventory management with deep reinforcement learning, Applied Intelligence, 53:19, (22249-22266), Online publication date: 1-Oct-2023.
  45. Clempner J (2023). A Bayesian reinforcement learning approach in markov games for computing near-optimal policies, Annals of Mathematics and Artificial Intelligence, 91:5, (675-690), Online publication date: 1-Oct-2023.
  46. Prestwich S Solving Mixed Influence Diagrams by Reinforcement Learning Machine Learning, Optimization, and Data Science, (255-269)
  47. Spieck J, Sixdenier P, Esper K, Wildermann S and Teich J Hybrid Genetic Reinforcement Learning for Generating Run-Time Requirement Enforcers Proceedings of the 21st ACM-IEEE International Conference on Formal Methods and Models for System Design, (23-35)
  48. Sabbioni L, Corda F and Restelli M Stepsize Learning for Policy Gradient Methods in Contextual Markov Decision Processes Machine Learning and Knowledge Discovery in Databases: Research Track, (506-523)
  49. De Biasio A, Montagna A, Aiolli F and Navarin N (2023). A systematic review of value-aware recommender systems, Expert Systems with Applications: An International Journal, 226:C, Online publication date: 15-Sep-2023.
  50. Khalid A, Mushtaq Z, Arif S, Zeb K, Khan M and Bakshi S (2023). Control Schemes for Quadrotor UAV: Taxonomy and Survey, ACM Computing Surveys, 0:0
  51. Alós-Ferrer C and Garagnani M (2023). Part-Time Bayesians, Management Science, 69:9, (5523-5542), Online publication date: 1-Sep-2023.
  52. Asheralieva A and Niyato D (2023). Secure and Efficient Coded Multi-Access Edge Computing With Generalized Graph Neural Networks, IEEE Transactions on Mobile Computing, 22:9, (5504-5524), Online publication date: 1-Sep-2023.
  53. Rui L, Yan Z, Tan Z, Gao Z, Yang Y, Chen X and Liu H (2023). An Intersection-Based QoS Routing for Vehicular Ad Hoc Networks With Reinforcement Learning, IEEE Transactions on Intelligent Transportation Systems, 24:9, (9068-9083), Online publication date: 1-Sep-2023.
  54. Du X, Chen H, Yang B, Long C and Zhao S (2023). HRL4EC, Information Sciences: an International Journal, 640:C, Online publication date: 1-Sep-2023.
  55. Huang F, Deng X, He Y and Jiang W (2023). A novel policy based on action confidence limit to improve exploration efficiency in reinforcement learning, Information Sciences: an International Journal, 640:C, Online publication date: 1-Sep-2023.
  56. Bai Y, Lv Y and Zhang J (2023). Smart mobile robot fleet management based on hierarchical multi-agent deep Q network towards intelligent manufacturing, Engineering Applications of Artificial Intelligence, 124:C, Online publication date: 1-Sep-2023.
  57. Lee D, Lee D and Kim K (2023). Self-growth learning-based machine scheduler to minimize setup time and tardiness in OLED display semiconductor manufacturing, Applied Soft Computing, 145:C, Online publication date: 1-Sep-2023.
  58. Kundaliya A, Kumar S and Lobiyal D (2023). Throughput and Lifetime Enhancement of WSNs Using Transmission Power Control and Q-learning, Wireless Personal Communications: An International Journal, 132:2, (799-821), Online publication date: 1-Sep-2023.
  59. Gonzalez-Soto M, Feliciano-Avelino I, Sucar L and Escalante H (2023). Learning a causal structure: a Bayesian random graph approach, Neural Computing and Applications, 35:25, (18147-18159), Online publication date: 1-Sep-2023.
  60. Gunarathna U, Xie H, Tanin E, Karunasekera S and Borovica-Gajic R (2023). Real-time Road Network Optimization with Coordinated Reinforcement Learning, ACM Transactions on Intelligent Systems and Technology, 14:4, (1-30), Online publication date: 31-Aug-2023.
  61. Fassih M, Capelle-Laizé A, Carré P and Boisbunon P Reinforcement Learning for Truck Eco-Driving: A Serious Game as Driving Assistance System Advanced Concepts for Intelligent Vision Systems, (299-310)
  62. Mahmud S, Saisubramanian S and Zilberstein S Explanation-guided reward alignment Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, (473-482)
  63. Sonuç E and Özcan E (2023). An adaptive parallel evolutionary algorithm for solving the uncapacitated facility location problem, Expert Systems with Applications: An International Journal, 224:C, Online publication date: 15-Aug-2023.
  64. Mussi M, Lombarda D, Metelli A, Trovó F and Restelli M (2023). ARLO, Expert Systems with Applications: An International Journal, 224:C, Online publication date: 15-Aug-2023.
  65. Li Y, Ren J, Zhang T, Fang Y and Chen F (2023). MER, Knowledge-Based Systems, 273:C, Online publication date: 3-Aug-2023.
  66. Jin C, Yang Z, Wang Z and Jordan M (2023). Provably Efficient Reinforcement Learning with Linear Function Approximation, Mathematics of Operations Research, 48:3, (1496-1521), Online publication date: 1-Aug-2023.
  67. Duan X, Li H, Wang P, Wang T, Liu B and Zhang B (2023). Bandit Interpretability of Deep Models via Confidence Selection, Neurocomputing, 544:C, Online publication date: 1-Aug-2023.
  68. Penney D, Li B, Sydir J, Chen L, Tai C, Lee S, Walsh E and Long T (2023). PROMPT, Future Generation Computer Systems, 145:C, (164-175), Online publication date: 1-Aug-2023.
  69. Vásconez J, Barona López L, Valdivieso Caraguay Á and Benalcázar M (2023). A comparison of EMG-based hand gesture recognition systems based on supervised and reinforcement learning, Engineering Applications of Artificial Intelligence, 123:PB, Online publication date: 1-Aug-2023.
  70. An J, Wu S, Gui X, He X and Zhang X (2023). A blockchain-based framework for data quality in edge-computing-enabled crowdsensing, Frontiers of Computer Science: Selected Publications from Chinese Universities, 17:4, Online publication date: 1-Aug-2023.
  71. Shimokawa D, Yoshida N, Koyama S and Kurihara S (2023). Automatic parameter learning method for agent activation spreading network by evolutionary computation, Artificial Life and Robotics, 28:3, (571-582), Online publication date: 1-Aug-2023.
  72. Karine K, Klasnja P, Murphy S and Marlin B Assessing the impact of context inference error and partial observability on RL methods for just-in-time adaptive interventions Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, (1047-1057)
  73. Zenati H, Diemert E, Martin M, Mairal J and Gaillard P Sequential counterfactual risk minimization Proceedings of the 40th International Conference on Machine Learning, (40681-40706)
  74. Yamagata T, Khalil A and Santos-Rodríguez R Q-learning decision transformer Proceedings of the 40th International Conference on Machine Learning, (38989-39007)
  75. Wu P, Majumdar A, Stone K, Lin Y, Mordatch I, Abbeel P and Rajeswaran A Masked trajectory models for prediction, representation, and control Proceedings of the 40th International Conference on Machine Learning, (37607-37623)
  76. Schwarzer M, Obando-Ceron J, Courville A, Bellemare M, Agarwal R and Castro P Bigger, better, faster Proceedings of the 40th International Conference on Machine Learning, (30365-30380)
  77. O'Donoghue B Efficient exploration via epistemic-risk-seeking policy optimization Proceedings of the 40th International Conference on Machine Learning, (26382-26402)
  78. Le Lan C, Tu S, Rowland M, Harutyunyan A, Agarwal R, Bellemare M and Dabney W Bootstrapped representations in reinforcement learning Proceedings of the 40th International Conference on Machine Learning, (18686-18713)
  79. Laroche R and Des Combes R On the occupancy measure of non-Markovian policies in continuous MDPs Proceedings of the 40th International Conference on Machine Learning, (18548-18562)
  80. Pigozzi F, Camerota Verdù F and Medvet E How the Morphology Encoding Influences the Learning Ability in Body-Brain Co-Optimization Proceedings of the Genetic and Evolutionary Computation Conference, (1045-1054)
  81. Lim B, Flageat M and Cully A Understanding the Synergies between Quality-Diversity and Deep Reinforcement Learning Proceedings of the Genetic and Evolutionary Computation Conference, (1212-1220)
  82. Fayyazi M, Abdoos M, Phan D, Golafrouz M, Jalili M, Jazar R, Langari R and Khayyam H (2023). Real-time self-adaptive Q-learning controller for energy management of conventional autonomous vehicles, Expert Systems with Applications: An International Journal, 222:C, Online publication date: 15-Jul-2023.
  83. Silva D, Carvalho S and Felipe Da Silva N On identifying early blockable taxpayers on goods and services trading operations Proceedings of the 24th Annual International Conference on Digital Government Research, (405-413)
  84. Tassel P, Gebser M and Schekotihin K An end-to-end reinforcement learning approach for job-shop scheduling problems based on constraint programming Proceedings of the Thirty-Third International Conference on Automated Planning and Scheduling, (614-622)
  85. Reagans R, Volvovsky H and Burt R (2023). Shared language in the team network-performance association: Reconciling conflicting views of the network centralization effect on team performance, Collective Intelligence, 2:3, Online publication date: 1-Jul-2023.
  86. Li X, Chen J, Ling X and Wu T (2023). Deep Reinforcement Learning-Based Anti-Jamming Algorithm Using Dual Action Network, IEEE Transactions on Wireless Communications, 22:7, (4625-4637), Online publication date: 1-Jul-2023.
  87. Liu T, Ni S, Li X, Zhu Y, Kong L and Yang Y (2023). Deep Reinforcement Learning Based Approach for Online Service Placement and Computation Resource Allocation in Edge Computing, IEEE Transactions on Mobile Computing, 22:7, (3870-3881), Online publication date: 1-Jul-2023.
  88. Nikpour B and Armanfard N (2023). Spatio-temporal hard attention learning for skeleton-based activity recognition, Pattern Recognition, 139:C, Online publication date: 1-Jul-2023.
  89. Zhou P, Xu Z, Zhu X, Zhao J, Song C and Shao Z (2023). Safe reinforcement learning method integrating process knowledge for real-time scheduling of gas supply network, Information Sciences: an International Journal, 633:C, (280-304), Online publication date: 1-Jul-2023.
  90. Terrén-Serrano G and Martínez-Ramón M (2023). Deep learning for intra-hour solar forecasting with fusion of features extracted from infrared sky images, Information Fusion, 95:C, (42-61), Online publication date: 1-Jul-2023.
  91. Hore S, Shah A and Bastian N (2023). Deep VULMAN, Expert Systems with Applications: An International Journal, 221:C, Online publication date: 1-Jul-2023.
  92. Birabwa D, Ramotsoela D and Ventura N (2023). Multi-agent deep reinforcement learning for user association and resource allocation in integrated terrestrial and non-terrestrial networks, Computer Networks: The International Journal of Computer and Telecommunications Networking, 231:C, Online publication date: 1-Jul-2023.
  93. Shu Z, Feng H, Taleb T and Zhang Z (2023). A novel combinatorial multi-armed bandit game to identify online the changing top-K flows in software-defined networks, Computer Networks: The International Journal of Computer and Telecommunications Networking, 230:C, Online publication date: 1-Jul-2023.
  94. Chen C, Lewis F, Xie K, Lyu Y and Xie S (2023). Distributed output data-driven optimal robust synchronization of heterogeneous multi-agent systems, Automatica (Journal of IFAC), 153:C, Online publication date: 1-Jul-2023.
  95. Priya B and Malhotra J (2023). Intelligent Multi-connectivity Based Energy-Efficient Framework for Smart City, Journal of Network and Systems Management, 31:3, Online publication date: 1-Jul-2023.
  96. Sharma H, Suetterlein J, Lakshmiranganatha S, Flynn T, Vrabie D, Sweeney C and Ramakrishniah V EXARL-PARS: Parallel Augmented Random Search Using Reinforcement Learning at Scale for Applications in Power Systems Companion Proceedings of the 14th ACM International Conference on Future Energy Systems, (1-1)
  97. Liu H, Balaji B, Gupta R and Hong D Rule-based Policy Regularization for Reinforcement Learning-based Building Control Proceedings of the 14th ACM International Conference on Future Energy Systems, (242-265)
  98. Zhang Q, Cho J, Moore T, Kim D, Lim H and Nelson F EVADE: Efficient Moving Target Defense for Autonomous Network Topology Shuffling Using Deep Reinforcement Learning Applied Cryptography and Network Security, (555-582)
  99. Kakarla N and Mahendran V How to Learn on the Fly? On Improving the Uplink Throughput Performance of UAV-Assisted Sensor Networks Proceedings of the Ninth Workshop on Micro Aerial Vehicle Networks, Systems, and Applications, (27-32)
  100. Kraus M, Wagner N, Riekenbrauck R and Minker W Improving Proactive Dialog Agents Using Socially-Aware Reinforcement Learning Proceedings of the 31st ACM Conference on User Modeling, Adaptation and Personalization, (146-155)
  101. Prihar E, Sales A and Heffernan N A Bandit You Can Trust Proceedings of the 31st ACM Conference on User Modeling, Adaptation and Personalization, (106-115)
  102. Alyazidi N, Hassanine A and Mahmoud M (2023). An Online Adaptive Policy Iteration-Based Reinforcement Learning for a Class of a Nonlinear 3D Overhead Crane, Applied Mathematics and Computation, 447:C, Online publication date: 15-Jun-2023.
  103. Cesar Bonini R and Correa Martins-Jr D Gene Networks Inference by Reinforcement Learning Advances in Bioinformatics and Computational Biology, (136-147)
  104. Chan A, Salganik R, Markelius A, Pang C, Rajkumar N, Krasheninnikov D, Langosco L, He Z, Duan Y, Carroll M, Lin M, Mayhew A, Collins K, Molamohammadi M, Burden J, Zhao W, Rismani S, Voudouris K, Bhatt U, Weller A, Krueger D and Maharaj T Harms from Increasingly Agentic Algorithmic Systems Proceedings of the 2023 ACM Conference on Fairness, Accountability, and Transparency, (651-666)
  105. Sacco A, Flocco M, Esposito F and Marchetto G (2023). Partially Oblivious Congestion Control for the Internet via Reinforcement Learning, IEEE Transactions on Network and Service Management, 20:2, (1644-1659), Online publication date: 1-Jun-2023.
  106. Lou J, Tang Z and Jia W (2023). Energy-Efficient Joint Task Assignment and Migration in Data Centers: A Deep Reinforcement Learning Approach, IEEE Transactions on Network and Service Management, 20:2, (961-973), Online publication date: 1-Jun-2023.
  107. Kim J, Lee J, Kim T and Pack S (2023). Deep Q-Network-Based Cloud-Native Network Function Placement in Edge Cloud-Enabled Non-Public Networks, IEEE Transactions on Network and Service Management, 20:2, (1804-1816), Online publication date: 1-Jun-2023.
  108. Augello A, Gaglio S, Infantino I, Maniscalco U, Pilato G and Vella F (2023). Roboception and adaptation in a cognitive robot, Robotics and Autonomous Systems, 164:C, Online publication date: 1-Jun-2023.
  109. Lan Y, Ren J, Tang T, Xu X, Shi Y and Tang Z (2023). Efficient reinforcement learning with least-squares soft Bellman residual for robotic grasping, Robotics and Autonomous Systems, 164:C, Online publication date: 1-Jun-2023.
  110. Elguea-Aguinaco Í, Serrano-Muñoz A, Chrysostomou D, Inziarte-Hidalgo I, Bøgh S and Arana-Arexolaleiba N (2023). A review on reinforcement learning for contact-rich robotic manipulation tasks, Robotics and Computer-Integrated Manufacturing, 81:C, Online publication date: 1-Jun-2023.
  111. Bhola S, Pawar S, Balaprakash P and Maulik R (2023). Multi-fidelity reinforcement learning framework for shape optimization, Journal of Computational Physics, 482:C, Online publication date: 1-Jun-2023.
  112. Zhang H, Sun J, Xu Z and Shi J (2023). Learning unified mutation operator for differential evolution by natural evolution strategies, Information Sciences: an International Journal, 632:C, (594-616), Online publication date: 1-Jun-2023.
  113. Yang J, Tan K, Feng L and Li Z (2023). A model-based deep reinforcement learning approach to the nonblocking coordination of modular supervisors of discrete event systems, Information Sciences: an International Journal, 630:C, (305-321), Online publication date: 1-Jun-2023.
  114. Li Y, Sun H, Fang W, Ma Q, Han S, Wang-Sattler R, Du W and Yu Q (2023). SURE, Information Sciences: an International Journal, 629:C, (299-312), Online publication date: 1-Jun-2023.
  115. Yang S, Yang B, Zeng Z and Kang Z (2023). Causal inference multi-agent reinforcement learning for traffic signal control, Information Fusion, 94:C, (243-256), Online publication date: 1-Jun-2023.
  116. Jafari M, Shoeibi A, Khodatars M, Ghassemi N, Moridian P, Alizadehsani R, Khosravi A, Ling S, Delfan N, Zhang Y, Wang S, Gorriz J, Alinejad-Rokny H and Acharya U (2023). Automated diagnosis of cardiovascular diseases from cardiac magnetic resonance imaging using deep learning models, Computers in Biology and Medicine, 160:C, Online publication date: 1-Jun-2023.
  117. Wu C, Pan W, Staa R, Liu J, Sun G and Wu L (2023). Deep reinforcement learning control approach to mitigating actuator attacks, Automatica (Journal of IFAC), 152:C, Online publication date: 1-Jun-2023.
  118. Gu S, Grudzien Kuba J, Chen Y, Du Y, Yang L, Knoll A and Yang Y (2023). Safe multi-agent reinforcement learning for multi-robot control, Artificial Intelligence, 319:C, Online publication date: 1-Jun-2023.
  119. Zhu C, Cai Y, Hu S, Leung H and Chiu D (2022). Learning by reusing previous advice: a memory-based teacher–student framework, Autonomous Agents and Multi-Agent Systems, 37:1, Online publication date: 1-Jun-2023.
  120. Cairo M, Eldaphonse B, Mousavi P, Sahir S, Jubair S, Taylor M, Doerksen G, Kummer N, Maretzki J, Mohhar G, Murphy S, Gunther J, Petrich L and Syed T Multi-Robot Warehouse Optimization: Leveraging Machine Learning for Improved Performance Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (3047-3049)
  121. Sunehag P, Vezhnevets A, Duéñez-Guzmán E, Mordatch I and Leibo J Diversity Through Exclusion (DTE): Niche Identification for Reinforcement Learning through Value-Decomposition Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (2827-2829)
  122. Källström J and Heintz F Model-Based Actor-Critic for Multi-Objective Reinforcement Learning with Dynamic Utility Functions Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (2818-2820)
  123. Fan F, Ma Y, Dai Z, Tan C and Low B FedHQL: Federated Heterogeneous Q-Learning Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (2810-2812)
  124. Hua Y, Gao S, Li W, Jin B, Wang X and Zha H Learning Optimal "Pigovian Tax" in Sequential Social Dilemmas Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (2784-2786)
  125. Dohmen T and Trivedi A Reinforcement Learning with Depreciating Assets Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (2628-2630)
  126. Wu J, Yang T, Hao X, Hao J, Zheng Y, Wang W and Taylor M PORTAL: Automatic Curricula Generation for Multiagent Reinforcement Learning Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (2460-2462)
  127. Poletti S, Testolin A and Tschiatschek S Learning Constraints From Human Stop-Feedback in Reinforcement Learning Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (2328-2330)
  128. Pendurkar S, Chow C, Jie L and Sharon G Bilevel Entropy based Mechanism Design for Balancing Meta in Video Games Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (2134-2142)
  129. Hayes C, Rădulescu R, Bargiacchi E, Kallstrom J, Macfarlane M, Reymond M, Verstraeten T, Zintgraf L, Dazeley R, Heintz F, Howley E, Irissappane A, Mannion P, Nowé A, Ramos G, Restelli M, Vamplew P and Roijers D A Brief Guide to Multi-Objective Reinforcement Learning and Planning Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (1988-1990)
  130. Zeng H, Wu Q, Han K, He J and Hu H A Deep Reinforcement Learning Approach for Online Parcel Assignment Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (1961-1968)
  131. Yu Y, Yin Q, Zhang J and Huang K Prioritized Tasks Mining for Multi-Task Cooperative Multi-Agent Reinforcement Learning Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (1615-1623)
  132. Oh J, Kim J, Jeong M and Yun S Toward Risk-based Optimistic Exploration for Cooperative Multi-Agent Reinforcement Learning Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (1597-1605)
  133. Shi L, Zhang Z, Wang S, Zhou B, Wu M, Yang C and Li S Efficient Interactive Recommendation via Huffman Tree-based Policy Learning Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (1495-1503)
  134. Chakraborty D, Busatto-Gaston D, Raskin J and Pérez G Formally-Sharp DAgger for MCTS: Lower-Latency Monte Carlo Tree Search using Data Aggregation with Formal Methods Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (1354-1362)
  135. Mittal D, Aravindan S and Lee W ExPoSe: Combining State-Based Exploration with Gradient-Based Online Search Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (1345-1353)
  136. Chandak S, Bistritz I and Bambos N Equilibrium Bandits: Learning Optimal Equilibria of Unknown Dynamics Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (1336-1344)
  137. Sun H and Wu F Less Is More: Refining Datasets for Offline Reinforcement Learning with Reward Machines Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (1239-1247)
  138. Cai Y, Zhang C, Zhao H, Zhao L and Bian J Curriculum Offline Reinforcement Learning Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (1221-1229)
  139. Ganapathi Subramanian S, Taylor M, Larson K and Crowley M Learning from Multiple Independent Advisors in Multi-agent Reinforcement Learning Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (1144-1153)
  140. Trudeau A and Bowling M Targeted Search Control in AlphaZero for Effective Policy Improvement Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (842-850)
  141. Daoudi P, Robu B, Prieur C, Dos Santos L and Barlier M Enhancing Reinforcement Learning Agents with Local Guides Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (829-838)
  142. Zhang S, Cao J, Yuan L, Yu Y and Zhan D Self-Motivated Multi-Agent Exploration Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (476-484)
  143. Seo S, Han B and Unhelkar V Automated Task-Time Interventions to Improve Teamwork using Imitation Learning Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (335-344)
  144. Zhang Y and Yu C EXPODE: EXploiting POlicy Discrepancy for Efficient Exploration in Multi-agent Reinforcement Learning Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (58-66)
  145. Ivanov D, Zisman I and Chernyshev K Mediated Multi-Agent Reinforcement Learning Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (49-57)
  146. Yang J, Mittal K, Dzanic T, Petrides S, Keith B, Petersen B, Faissol D and Anderson R Multi-Agent Reinforcement Learning for Adaptive Mesh Refinement Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, (14-22)
  147. Zhang Y, Zheng H and Zhai X Deep Reinforcement Learning Based UAV Mission Planning with Charging Module Proceedings of the 2023 4th International Conference on Computing, Networks and Internet of Things, (658-662)
  148. Xianjia W, Linzhao X, Yang Z and Liu Y (2023). Rationality-bounded adaptive learning in multi-agent dynamic games, Knowledge-Based Systems, 268:C, Online publication date: 23-May-2023.
  149. Jang J and Seong N (2023). Deep reinforcement learning for stock portfolio optimization by connecting with modern portfolio theory, Expert Systems with Applications: An International Journal, 218:C, Online publication date: 15-May-2023.
  150. Bouktif S, Cheniki A, Ouni A and El-Sayed H (2023). Deep reinforcement learning for traffic signal control with consistent state and reward design approach, Knowledge-Based Systems, 267:C, Online publication date: 12-May-2023.
  151. Tasfi N, Santana E, Liboni L and Capretz M (2023). Dynamic Successor Features for transfer learning and guided exploration, Knowledge-Based Systems, 267:C, Online publication date: 12-May-2023.
  152. Fahmida S, Modekurthy V, Rahman M and Saifullah A Handling Coexistence of LoRa with Other Networks through Embedded Reinforcement Learning Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation, (410-423)
  153. Taherisadr M, Stavroulakis S and Elmalaki S adaPARL: Adaptive Privacy-Aware Reinforcement Learning for Sequential Decision Making Human-in-the-Loop Systems Proceedings of the 8th ACM/IEEE Conference on Internet of Things Design and Implementation, (262-274)
  154. Robinette P, Hamilton N and Johnson T DEMO: Self-Preserving Genetic Algorithms vs. Safe Reinforcement Learning in Discrete Action Spaces Proceedings of the ACM/IEEE 14th International Conference on Cyber-Physical Systems (with CPS-IoT Week 2023), (278-279)
  155. Robinette P, Hamilton N and Johnson T Self-Preserving Genetic Algorithms for Safe Learning in Discrete Action Spaces Proceedings of the ACM/IEEE 14th International Conference on Cyber-Physical Systems (with CPS-IoT Week 2023), (110-119)
  156. Wang Y, Zhan S, Wang Z, Huang C, Wang Z, Yang Z and Zhu Q Joint Differentiable Optimization and Verification for Certified Reinforcement Learning Proceedings of the ACM/IEEE 14th International Conference on Cyber-Physical Systems (with CPS-IoT Week 2023), (132-141)
  157. Li J, Wang Z, Cong G, Long C, Kiah H and Cui B (2023). Towards Designing and Learning Piecewise Space-Filling Curves, Proceedings of the VLDB Endowment, 16:9, (2158-2171), Online publication date: 1-May-2023.
  158. Lefebvre T and Crevecoeur G (2023). A posteriori control densities, Pattern Recognition Letters, 169:C, (87-94), Online publication date: 1-May-2023.
  159. Li W, Özcan E, Drake J and Maashi M (2023). A generality analysis of multiobjective hyper-heuristics, Information Sciences: an International Journal, 627:C, (34-51), Online publication date: 1-May-2023.
  160. Zhang Y, Zhao M, Li T, Wang Y and Liang T (2023). Achieving optimal rewards in cryptocurrency stubborn mining with state transition analysis, Information Sciences: an International Journal, 625:C, (299-313), Online publication date: 1-May-2023.
  161. Ling X, Wu L, Zhang J, Qu Z, Deng W, Chen X, Qian Y, Wu C, Ji S, Luo T, Wu J and Wu Y (2023). Adversarial attacks against Windows PE malware detection, Computers and Security, 128:C, Online publication date: 1-May-2023.
  162. Xing Y, Shu H and Kang F (2023). PeerRemove, Computers and Security, 128:C, Online publication date: 1-May-2023.
  163. Choi J, Seo S, Choi S, Piao S, Park C, Ryu S, Kim B and Park S (2023). ReBADD-SE, Computers in Biology and Medicine, 157:C, Online publication date: 1-May-2023.
  164. Sarah A, Nencioni G and Khan M (2023). Resource Allocation in Multi-access Edge Computing for 5G-and-beyond networks, Computer Networks: The International Journal of Computer and Telecommunications Networking, 227:C, Online publication date: 1-May-2023.
  165. Stanković M, Beko M and Stanković S (2023). Distributed consensus-based multi-agent temporal-difference learning, Automatica (Journal of IFAC), 151:C, Online publication date: 1-May-2023.
  166. Malikopoulos A (2023). Separation of learning and control for cyber–physical systems, Automatica (Journal of IFAC), 151:C, Online publication date: 1-May-2023.
  167. Ueda M (2023). Memory-two strategies forming symmetric mutual reinforcement learning equilibrium in repeated prisoners’ dilemma game, Applied Mathematics and Computation, 444:C, Online publication date: 1-May-2023.
  168. Lin M, Zhao B and Liu D (2023). Policy gradient adaptive dynamic programming for nonlinear discrete-time zero-sum games with unknown dynamics, Soft Computing - A Fusion of Foundations, Methodologies and Applications, 27:9, (5781-5795), Online publication date: 1-May-2023.
  169. Schulam P and Muslea I Improving the Exploration/Exploitation Trade-Off in Web Content Discovery Companion Proceedings of the ACM Web Conference 2023, (1183-1189)
  170. Chen J, Nie J, Xu M, Lyu L, Xiong Z, Kang J, Tong Y and Jiang W Multiple-Agent Deep Reinforcement Learning for Avatar Migration in Vehicular Metaverses Companion Proceedings of the ACM Web Conference 2023, (1258-1265)
  171. Luo J, Hazra K, Huo W, Li R and Mahabal A Personalized style recommendation via reinforcement learning Companion Proceedings of the ACM Web Conference 2023, (290-293)
  172. Wang X, Ma G, Eden A, Li C, Trott A, Zheng S and Parkes D Platform Behavior under Market Shocks: A Simulation Framework and Reinforcement-Learning Based Study Proceedings of the ACM Web Conference 2023, (3592-3602)
  173. Liu Z, Tian J, Cai Q, Zhao X, Gao J, Liu S, Chen D, He T, Zheng D, Jiang P and Gai K Multi-Task Recommendations with Reinforcement Learning Proceedings of the ACM Web Conference 2023, (1273-1282)
  174. Cai Q, Xue Z, Zhang C, Xue W, Liu S, Zhan R, Wang X, Zuo T, Xie W, Zheng D, Jiang P and Gai K Two-Stage Constrained Actor-Critic for Short Video Recommendation Proceedings of the ACM Web Conference 2023, (865-875)
  175. Liu S, Cai Q, Sun B, Wang Y, Jiang J, Zheng D, Jiang P, Gai K, Zhao X and Zhang Y Exploration and Regularization of the Latent Action Space in Recommendation Proceedings of the ACM Web Conference 2023, (833-844)
  176. Jiang W, Ren Y and Wang Y (2023). Improving anti-jamming decision-making strategies for cognitive radar via multi-agent deep reinforcement learning, Digital Signal Processing, 135:C, Online publication date: 30-Apr-2023.
  177. Gupta T and Gori J Modeling reciprocal adaptation in HCI: a Multi-Agent Reinforcement Learning Approach Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, (1-6)
  178. Shakerimov A, Sarmonov S, Amirova A, Oralbayeva N, Zhanatkyzy A, Telisheva Z, Aimysheva A and Sandygulova A QWriter: Technology-Enhanced Alphabet Acquisition based on Reinforcement Learning Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, (1-7)
  179. Lloyd-Roberts B, James P, Edwards M, Robinson S and Werner T Improving Railway Safety: Human-in-the-loop Invariant Finding Extended Abstracts of the 2023 CHI Conference on Human Factors in Computing Systems, (1-8)
  180. Mutalova R, Sarrazin-Gendron R, Cai E, Richard G, Ghasemloo Gheidari P, Caisse S, Knight R, Blanchette M, Szantner A and Waldispühl J Playing the System: Can Puzzle Players Teach us How to Solve Hard Problems? Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, (1-15)
  181. Milani S, Juliani A, Momennejad I, Georgescu R, Rzepecki J, Shaw A, Costello G, Fang F, Devlin S and Hofmann K Navigates Like Me: Understanding How People Evaluate Human-Like AI in Video Games Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, (1-18)
  182. Malekzadeh P, Hou M and Plataniotis K (2023). Uncertainty-aware transfer across tasks using hybrid model-based successor feature reinforcement learning, Neurocomputing, 530:C, (165-187), Online publication date: 14-Apr-2023.
  183. Chetitah M, Müller J, Deserno L, Waltmann M and von Mammen S Gamification Framework for Reinforcement Learning-based Neuropsychology Experiments Proceedings of the 18th International Conference on the Foundations of Digital Games, (1-4)
  184. Liu M, Cai Q, Li D, Meng W and Fu M (2023). Output feedback Q-learning for discrete-time finite-horizon zero-sum games with application to the H∞ control, Neurocomputing, 529:C, (48-55), Online publication date: 7-Apr-2023.
  185. Bao Y, Peng Y and Wu C (2023). Deep Learning-Based Job Placement in Distributed Machine Learning Clusters With Heterogeneous Workloads, IEEE/ACM Transactions on Networking, 31:2, (634-647), Online publication date: 1-Apr-2023.
  186. Zhao Y, Liu C, Zhu K, Zhang S and Wu J (2023). GSMAC: GAN-Based Signal map Construction With Active Crowdsourcing, IEEE Transactions on Mobile Computing, 22:4, (2190-2204), Online publication date: 1-Apr-2023.
  187. Chen H, Zhu C, Tang R, Zhang W, He X and Yu Y (2023). Large-Scale Interactive Recommendation With Tree-Structured Reinforcement Learning, IEEE Transactions on Knowledge and Data Engineering, 35:4, (4018-4032), Online publication date: 1-Apr-2023.
  188. Li C, Zheng P, Yin Y, Pang Y and Huo S (2023). An AR-assisted Deep Reinforcement Learning-based approach towards mutual-cognitive safe human-robot interaction, Robotics and Computer-Integrated Manufacturing, 80:C, Online publication date: 1-Apr-2023.
  189. Ying C, Qiaoben Y, Zhou X, Su H, Ding W and Ai J (2023). Consistent attack, Pattern Recognition Letters, 168:C, (57-63), Online publication date: 1-Apr-2023.
  190. Yang L, Tao J, Liu Y, Xu Y and Su C (2023). Energy scheduling for DoS attack over multi-hop networks, Neural Networks, 161:C, (735-745), Online publication date: 1-Apr-2023.
  191. Yang H, Zhao M, Yuan L, Yu Y, Li Z and Gu M (2023). Memory-efficient Transformer-based network model for Traveling Salesman Problem, Neural Networks, 161:C, (589-597), Online publication date: 1-Apr-2023.
  192. Ming F, Gao F, Liu K and Zhao C (2023). Cooperative modular reinforcement learning for large discrete action space problem, Neural Networks, 161:C, (281-296), Online publication date: 1-Apr-2023.
  193. Alabi A, Vanderelst D and Minai A (2023). Rapid learning of spatial representations for goal-directed navigation based on a novel model of hippocampal place fields, Neural Networks, 161:C, (116-128), Online publication date: 1-Apr-2023.
  194. Kwak D, Choi S and Chang W (2023). Self-attention based deep direct recurrent reinforcement learning with hybrid loss for trading signal generation, Information Sciences: an International Journal, 623:C, (592-606), Online publication date: 1-Apr-2023.
  195. Jiang T, Liu Y, Wu X, Xu M and Cui X (2023). Application of deep reinforcement learning in attacking and protecting structural features-based malicious PDF detector, Future Generation Computer Systems, 141:C, (325-338), Online publication date: 1-Apr-2023.
  196. Seong M, Jo O and Shin K (2023). Multi-UAV trajectory optimizer, Engineering Applications of Artificial Intelligence, 120:C, Online publication date: 1-Apr-2023.
  197. Norouzi A, Heidarifar H, Borhan H, Shahbakhti M and Koch C (2023). Integrating Machine Learning and Model Predictive Control for automotive applications, Engineering Applications of Artificial Intelligence, 120:C, Online publication date: 1-Apr-2023.
  198. Wang J, Wang L and Xiu X (2023). A cooperative memetic algorithm for energy-aware distributed welding shop scheduling problem, Engineering Applications of Artificial Intelligence, 120:C, Online publication date: 1-Apr-2023.
  199. Li X, Li Y and Wu X (2023). Empirical Gittins index strategies with ε-explorations for multi-armed bandit problems, Computational Statistics & Data Analysis, 180:C, Online publication date: 1-Apr-2023.
  200. Tokuda K, Sato T and Oki E (2023). Network slice reconfiguration with deep reinforcement learning under variable number of service function chains, Computer Networks: The International Journal of Computer and Telecommunications Networking, 224:C, Online publication date: 1-Apr-2023.
  201. Aghasi A, Jamshidi K, Bohlooli A and Javadi B (2023). A decentralized adaptation of model-free Q-learning for thermal-aware energy-efficient virtual machine placement in cloud data centers, Computer Networks: The International Journal of Computer and Telecommunications Networking, 224:C, Online publication date: 1-Apr-2023.
  202. Bakhshi B, Mangues-Bafalluy J and Baranda J (2023). Multi-provider NFV network service delegation via average reward reinforcement learning, Computer Networks: The International Journal of Computer and Telecommunications Networking, 224:C, Online publication date: 1-Apr-2023.
  203. Raju M and Mothku S (2023). Delay and energy aware task scheduling mechanism for fog-enabled IoT applications, Computer Networks: The International Journal of Computer and Telecommunications Networking, 224:C, Online publication date: 1-Apr-2023.
  204. Quadri C, Ceselli A and Rossi G (2023). Multi-user edge service orchestration based on Deep Reinforcement Learning, Computer Communications, 203:C, (30-47), Online publication date: 1-Apr-2023.
  205. Moyalan J, Choi H, Chen Y and Vaidya U (2023). Data-driven optimal control via linear transfer operators, Automatica (Journal of IFAC), 150:C, Online publication date: 1-Apr-2023.
  206. Sun Z and Jia G (2023). Reinforcement learning for exploratory linear-quadratic two-person zero-sum stochastic differential games, Applied Mathematics and Computation, 442:C, Online publication date: 1-Apr-2023.
  207. Zang W, Luan X and Song D (2023). Cycles optimization to underwater glider standoff tracking profiles with the objective of conserving energy, Advanced Engineering Informatics, 56:C, Online publication date: 1-Apr-2023.
  208. Hu Y, Deng X, Zhu C, Chen X and Chi L (2022). Resource Allocation for Heterogeneous Computing Tasks in Wirelessly Powered MEC-enabled IIOT Systems, ACM Transactions on Management Information Systems, 14:1, (1-17), Online publication date: 31-Mar-2023.
  209. Weidele D, Afzal S, Valente A, Makuch C, Cornec O, Vu L, Subramanian D, Geyer W, Nair R, Vejsbjerg I, Marinescu R, Palmes P, Daly E, Franke L and Haehn D AutoDOViz: Human-Centered Automation for Decision Optimization Proceedings of the 28th International Conference on Intelligent User Interfaces, (664-680)
  210. Meggetto F, Revie C, Levine J and Moshfeghi Y Why People Skip Music? On Predicting Music Skips using Deep Reinforcement Learning Proceedings of the 2023 Conference on Human Information Interaction and Retrieval, (95-106)
  211. Zhao B, Dong H, Wang Y and Pan T (2023). PPO-TA, Knowledge-Based Systems, 264:C, Online publication date: 15-Mar-2023.
  212. Gholami H and Sun H (2023). Toward automated algorithm configuration for distributed hybrid flow shop scheduling with multiprocessor tasks, Knowledge-Based Systems, 264:C, Online publication date: 15-Mar-2023.
  213. Taghavi M, Bentahar J, Otrok H and Bakhtiyari K (2023). A reinforcement learning model for the reliability of blockchain oracles, Expert Systems with Applications: An International Journal, 214:C, Online publication date: 15-Mar-2023.
  214. Zhang Q, Chng C, Chen K, Lee P and Chui C (2023). DRL-S, Expert Systems with Applications: An International Journal, 214:C, Online publication date: 15-Mar-2023.
  215. Dutta H, Bhuyan A and Biswas S (2023). Reinforcement learning based flow and energy management in resource-constrained wireless networks, Computer Communications, 202:C, (73-86), Online publication date: 15-Mar-2023.
  216. Jamil B, Ijaz H, Shojafar M and Munir K (2023). IRATS, Ad Hoc Networks, 141:C, Online publication date: 15-Mar-2023.
  217. Sadiki A, Bentahar J, Dssouli R, En-Nouaary A and Otrok H (2023). Deep reinforcement learning for the computation offloading in MIMO-based Edge Computing, Ad Hoc Networks, 141:C, Online publication date: 15-Mar-2023.
  218. Sarmonov S, Shakerimov A, Aimysheva A, Amirova A, Oralbayeva N, Zhanatkyzy A, Telisheva Z and Sandygulova A Robot-Assisted First Language Learning in a New Latin Alphabet Companion of the 2023 ACM/IEEE International Conference on Human-Robot Interaction, (677-681)
  219. Sanneman L and Shah J Transparent Value Alignment Companion of the 2023 ACM/IEEE International Conference on Human-Robot Interaction, (557-560)
  220. Tian R, Tomizuka M, Dragan A and Bajcsy A Towards Modeling and Influencing the Dynamics of Human Learning Proceedings of the 2023 ACM/IEEE International Conference on Human-Robot Interaction, (350-358)
  221. Han C, Peng Z, Liu Y, Tang J, Yu Y and Zhou Z (2023). Overfitting-avoiding goal-guided exploration for hard-exploration multi-goal reinforcement learning, Neurocomputing, 525:C, (76-87), Online publication date: 7-Mar-2023.
  222. Jansen N Intelligent and Dependable Decision-Making Under Uncertainty Formal Methods, (26-36)
  223. Baek J and Kaddoum G (2023). FLoadNet: Load Balancing in Fog Networks With Cooperative Multiagent Using Actor–Critic Method, IEEE Transactions on Network and Service Management, 20:1, (400-414), Online publication date: 1-Mar-2023.
  224. Tian D, Fang H, Yang Q, Yu H, Liang W and Wu Y (2023). Reinforcement learning under temporal logic constraints as a sequence modeling problem, Robotics and Autonomous Systems, 161:C, Online publication date: 1-Mar-2023.
  225. Kopa M and Šmíd M (2023). Contractivity of Bellman operator in risk averse dynamic programming with infinite horizon, Operations Research Letters, 51:2, (133-136), Online publication date: 1-Mar-2023.
  226. Lee Y, Kim J, Kwak M, Park Y and Kim S (2023). STACoRe, Neural Networks, 160:C, (1-11), Online publication date: 1-Mar-2023.
  227. Bi X, Nie H, Zhang G, Hu L, Ma Y, Zhao X, Yuan Y and Wang G (2023). Boosting question answering over knowledge graph with reward integration and policy evaluation under weak supervision, Information Processing and Management: an International Journal, 60:2, Online publication date: 1-Mar-2023.
  228. Bany Salameh H, Alhafnawi M, Masadeh A and Jararweh Y (2023). Federated reinforcement learning approach for detecting uncertain deceptive target using autonomous dual UAV system, Information Processing and Management: an International Journal, 60:2, Online publication date: 1-Mar-2023.
  229. Vargas-Pérez V, Mesejo P, Chica M and Cordón O (2023). Deep reinforcement learning in agent-based simulations for optimal media planning, Information Fusion, 91:C, (644-664), Online publication date: 1-Mar-2023.
  230. Zhao S, Wu Y, Tan S, Wu J, Cui Z and Wang Y (2023). QQLMPA, Expert Systems with Applications: An International Journal, 213:PC, Online publication date: 1-Mar-2023.
  231. Tufenkci S, Baykant Alagoz B, Kavuran G, Yeroglu C, Herencsar N and Mahata S (2023). A theoretical demonstration for reinforcement learning of PI control dynamics for optimal speed control of DC motors by using Twin Delay Deep Deterministic Policy Gradient Algorithm, Expert Systems with Applications: An International Journal, 213:PC, Online publication date: 1-Mar-2023.
  232. Liu H, Cai K, Li P, Qian C, Zhao P and Wu X (2023). REDRL, Expert Systems with Applications: An International Journal, 213:PA, Online publication date: 1-Mar-2023.
  233. Kalatzantonakis P, Sifaleras A and Samaras N (2023). A reinforcement learning-Variable neighborhood search method for the capacitated Vehicle Routing Problem, Expert Systems with Applications: An International Journal, 213:PA, Online publication date: 1-Mar-2023.
  234. Muy S and Lee J (2023). Spectrum efficiency maximization for multi-hop D2D communication underlaying cellular networks, Expert Systems with Applications: An International Journal, 213:PA, Online publication date: 1-Mar-2023.
  235. Cai W, Kordabad A and Gros S (2023). Energy management in residential microgrid using model predictive control-based reinforcement learning and Shapley value, Engineering Applications of Artificial Intelligence, 119:C, Online publication date: 1-Mar-2023.
  236. Chen X and Qian Q (2023). subGE, Engineering Applications of Artificial Intelligence, 119:C, Online publication date: 1-Mar-2023.
  237. Wang Z, Cai B, Li J, Yang D, Zhao Y and Xie H (2023). Solving non-permutation flow-shop scheduling problem via a novel deep reinforcement learning approach, Computers and Operations Research, 151:C, Online publication date: 1-Mar-2023.
  238. Deng X, Li J, Ma Y, Guan P and Ding H (2023). Allocation of edge computing tasks for UAV-aided target tracking, Computer Communications, 201:C, (123-130), Online publication date: 1-Mar-2023.
  239. Bonetti M, Bisi L and Restelli M (2023). Risk-averse optimization of reward-based coherent risk measures, Artificial Intelligence, 316:C, Online publication date: 1-Mar-2023.
  240. Basich C, Svegliato J, Wray K, Witwicki S, Biswas J and Zilberstein S (2023). Competence-aware systems, Artificial Intelligence, 316:C, Online publication date: 1-Mar-2023.
  241. Knox W, Allievi A, Banzhaf H, Schmitt F and Stone P (2023). Reward (Mis)design for autonomous driving, Artificial Intelligence, 316:C, Online publication date: 1-Mar-2023.
  242. Priya B and Malhotra J (2022). iMnet: Intelligent RAT Selection Framework for 5G Enabled IoMT Network, Wireless Personal Communications: An International Journal, 129:2, (911-932), Online publication date: 1-Mar-2023.
  243. Eva B, Ried K, Müller T and Briegel H (2023). How a Minimal Learning Agent can Infer the Existence of Unobserved Variables in a Complex Environment, Minds and Machines, 33:1, (185-219), Online publication date: 1-Mar-2023.
  244. Kayhan B and Yildiz G (2023). Reinforcement learning applications to machine scheduling problems: a comprehensive literature review, Journal of Intelligent Manufacturing, 34:3, (905-929), Online publication date: 1-Mar-2023.
  245. Velasquez A, Alkhouri I, Subramani K, Wojciechowski P and Atia G (2023). Optimal Deterministic Controller Synthesis from Steady-State Distributions, Journal of Automated Reasoning, 67:1, Online publication date: 1-Mar-2023.
  246. Jutury D, Kumar N, Sachan A, Daultani Y and Dhakad N (2023). Adaptive neuro-fuzzy enabled multi-mode traffic light control system for urban transport network, Applied Intelligence, 53:6, (7132-7153), Online publication date: 1-Mar-2023.
  247. ACM
    Zhang Y, Qu G, Xu P, Lin Y, Chen Z and Wierman A (2023). Global Convergence of Localized Policy Iteration in Networked Multi-Agent Reinforcement Learning, Proceedings of the ACM on Measurement and Analysis of Computing Systems, 7:1, (1-51), Online publication date: 27-Feb-2023.
  248. ACM
    Cai T, Jiang J, Zhang W, Zhou S, Song X, Yu L, Gu L, Zeng X, Gu J and Zhang G Marketing Budget Allocation with Offline Constrained Deep Reinforcement Learning Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, (186-194)
  249. ACM
    Zhao C, Ze Y, Dong J, Wang B and Li S Differentially Private Temporal Difference Learning with Stochastic Nonconvex-Strongly-Concave Optimization Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, (985-993)
  250. ACM
    Zhao C, Deng C, Liu Z, Zhang J, Wu Y, Wang Y and Yi X Interpretable Reinforcement Learning of Behavior Trees Proceedings of the 2023 15th International Conference on Machine Learning and Computing, (492-499)
  251. Costa A and Ralha C (2023). AC2CD, Knowledge-Based Systems, 261:C, Online publication date: 15-Feb-2023.
  252. Schmitt S, Shawe-Taylor J and van Hasselt H Exploration via epistemic value estimation Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, (9742-9751)
  253. Zhong F, Bi X, Zhang Y, Zhang W and Wang Y RSPT Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, (3705-3714)
  254. Brafman R, Tolpin D and Wertheim O Probabilistic programs as an action description language Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, (15351-15358)
  255. Carr S, Jansen N, Junges S and Topcu U Safe reinforcement learning via shielding under partial observability Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, (14748-14756)
  256. Wang R, Wang S and Zhang W (2022). Joint power and hopping rate adaption against follower jammer based on deep reinforcement learning, Transactions on Emerging Telecommunications Technologies, 34:2, Online publication date: 5-Feb-2023.
  257. Ravipudi J and Brandt-Pearce M (2023). Impairment- and fragmentation-aware dynamic routing, modulation and spectrum allocation in C+L band elastic optical networks using Q-learning, Optical Switching and Networking, 47:C, Online publication date: 1-Feb-2023.
  258. Zhao T, Wang Y, Sun W, Chen Y, Niu G and Sugiyama M (2023). Representation learning for continuous action spaces is beneficial for efficient policy learning, Neural Networks, 159:C, (137-152), Online publication date: 1-Feb-2023.
  259. Guo L and Zhao H (2023). Online adaptive optimal control algorithm based on synchronous integral reinforcement learning with explorations, Neurocomputing, 520:C, (250-261), Online publication date: 1-Feb-2023.
  260. Fu J, Huang C, Li Y, Mei J, Xu M and Zhang L (2023). Quantitative controller synthesis for consumption Markov decision processes, Information Processing Letters, 180:C, Online publication date: 1-Feb-2023.
  261. Mohammadi H and Nazerfard E (2023). Video violence recognition and localization using a semi-supervised hard attention model, Expert Systems with Applications: An International Journal, 212:C, Online publication date: 1-Feb-2023.
  262. Jooken J, Leyman P, Wauters T and De Causmaecker P (2023). Exploring search space trees using an adapted version of Monte Carlo tree search for combinatorial optimization problems, Computers and Operations Research, 150:C, Online publication date: 1-Feb-2023.
  263. Peng S and Feng Q (2023). Data-driven optimal control of wind turbines using reinforcement learning with function approximation, Computers and Industrial Engineering, 176:C, Online publication date: 1-Feb-2023.
  264. He F, Chen C and Huang S (2023). A multi-agent virtual market model for generalization in reinforcement learning based trading strategies, Applied Soft Computing, 134:C, Online publication date: 1-Feb-2023.
  265. Zhao L, Chang T, Zhang J, Zhang L, Chu K, Guo L and Kong D (2023). A policy optimization algorithm based on sample adaptive reuse and dual-clipping for robotic action control, Applied Soft Computing, 134:C, Online publication date: 1-Feb-2023.
  266. ACM
    Zheng Z, Wang C, Xu T, Shen D, Qin P, Zhao X, Huai B, Wu X and Chen E (2022). Interaction-aware Drug Package Recommendation via Policy Gradient, ACM Transactions on Information Systems, 41:1, (1-32), Online publication date: 31-Jan-2023.
  267. Chen Z, Feng X, Liu S and Yan X (2023). Optimal distributions of rewards for a two-armed slot machine, Neurocomputing, 518:C, (401-407), Online publication date: 21-Jan-2023.
  268. Liu S, Liu L and Yu Z (2023). Safe reinforcement learning for affine nonlinear systems with state constraints and input saturation using control barrier functions, Neurocomputing, 518:C, (562-576), Online publication date: 21-Jan-2023.
  269. ACM
    Yeh Y, Chen S, Chen H, Tu D, Fang G, Kuo Y and Chen P DPRoute Proceedings of the 28th Asia and South Pacific Design Automation Conference, (277-282)
  270. Liu S, Liu L and Yu Z (2023). Safe reinforcement learning for discrete-time fully cooperative games with partial state and control constraints using control barrier functions, Neurocomputing, 517:C, (118-132), Online publication date: 14-Jan-2023.
  271. Asgharnia A, Schwartz H and Atia M (2023). Multi-objective fuzzy Q-learning to solve continuous state-action problems, Neurocomputing, 516:C, (115-132), Online publication date: 7-Jan-2023.
  272. Li P, Zou W, Guo J and Xiang Z (2023). Optimal consensus of a class of discrete-time linear multi-agent systems via value iteration with guaranteed admissibility, Neurocomputing, 516:C, (1-10), Online publication date: 7-Jan-2023.
  273. Shi C, Xiong W, Shen C and Yang J (2023). Reward Teaching for Federated Multiarmed Bandits, IEEE Transactions on Signal Processing, 71, (4407-4422), Online publication date: 1-Jan-2023.
  274. Li W, Huang J, Lyu W, Guo B, Jiang W and Wang J (2023). RAV: Learning-Based Adaptive Streaming to Coordinate the Audio and Video Bitrate Selections, IEEE Transactions on Multimedia, 25, (5662-5675), Online publication date: 1-Jan-2023.
  275. Milarokostas C, Tsolkas D, Passas N and Merakos L (2023). A Comprehensive Study on LPWANs With a Focus on the Potential of LoRa/LoRaWAN Systems, IEEE Communications Surveys & Tutorials, 25:1, (825-867), Online publication date: 1-Jan-2023.
  276. Cao B, Wang Z, Zhang L, Feng D, Peng M, Zhang L and Han Z (2023). Blockchain Systems, Technologies, and Applications: A Methodology Perspective, IEEE Communications Surveys & Tutorials, 25:1, (353-385), Online publication date: 1-Jan-2023.
  277. Wu C, Kim I and Ma Z (2023). Deep Reinforcement Learning Based Traffic Signal Control, Procedia Computer Science, 220:C, (275-282), Online publication date: 1-Jan-2023.
  278. Chiurco A, Elbasheer M, Longo F, Nicoletti L and Solina V (2023). Data Modeling and ML Practice for Enabling Intelligent Digital Twins in Adaptive Production Planning and Control, Procedia Computer Science, 217:C, (1908-1917), Online publication date: 1-Jan-2023.
  279. Pan J, Huang J, Cheng G and Zeng Y (2023). Reinforcement learning for automatic quadrilateral mesh generation, Neural Networks, 157:C, (288-304), Online publication date: 1-Jan-2023.
  280. Silvetti M, Lasaponara S, Daddaoua N, Horan M and Gottlieb J (2023). A Reinforcement Meta-Learning framework of executive function and information demand, Neural Networks, 157:C, (103-113), Online publication date: 1-Jan-2023.
  281. Anzabi Zadeh S, Street W and Thomas B (2023). Optimizing warfarin dosing using deep reinforcement learning, Journal of Biomedical Informatics, 137:C, Online publication date: 1-Jan-2023.
  282. Xie S, Zhang Z, Yu H and Luo X (2023). Recurrent prediction model for partially observable MDPs, Information Sciences: an International Journal, 620:C, (125-141), Online publication date: 1-Jan-2023.
  283. Yerudkar A, Chatzaroulas E, Del Vecchio C and Moschoyiannis S (2023). Sampled-data Control of Probabilistic Boolean Control Networks, Information Sciences: an International Journal, 619:C, (374-389), Online publication date: 1-Jan-2023.
  284. Huang F, Xu J, Wu D, Cui Y, Yan Z, Xing W and Zhang X (2023). A general motion controller based on deep reinforcement learning for an autonomous underwater vehicle with unknown disturbances, Engineering Applications of Artificial Intelligence, 117:PA, Online publication date: 1-Jan-2023.
  285. Robles-Enciso A and Skarmeta A (2023). A multi-layer guided reinforcement learning-based tasks offloading in edge computing, Computer Networks: The International Journal of Computer and Telecommunications Networking, 220:C, Online publication date: 1-Jan-2023.
  286. Kovařík V, Seitz D, Lisý V, Rudolf J, Sun S and Ha K (2023). Value functions for depth-limited solving in zero-sum imperfect-information games, Artificial Intelligence, 314:C, Online publication date: 1-Jan-2023.
  287. Wang Z, Zhang Q, Tang L, Shi T and Xuan J (2023). Transfer reinforcement learning method with multi-label learning for compound fault recognition, Advanced Engineering Informatics, 55:C, Online publication date: 1-Jan-2023.
  288. Beyret B, Shafti A and Faisal A Dot-to-Dot: Explainable Hierarchical Reinforcement Learning for Robotic Manipulation 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (5014-5019)
  289. Khan A, Zhang C, Li S, Wu J, Schlotfeldt B, Tang S, Ribeiro A, Bastani O and Kumar V Learning Safe Unlabeled Multi-Robot Planning with Motion Constraints 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (7558-7565)
  290. Graves D, Rezaee K and Scheideman S Perception as prediction using general value functions in autonomous driving applications 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (1202-1209)
  291. Amiri S, Bajracharya S, Goktolgal C, Thomason J and Zhang S Augmenting Knowledge through Statistical, Goal-oriented Human-Robot Dialog 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (744-750)
  292. Wu B, Akinola I and Allen P Pixel-Attentive Policy Gradient for Multi-Fingered Grasping in Cluttered Scenes 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (1789-1796)
  293. Zapf M, Kawanabe M and Saiki L Pedestrian Density Prediction for Efficient Mobile Robot Exploration 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (4615-4622)
  294. Cui Y, Osaki S and Matsubara T Reinforcement Learning Boat Autopilot: A Sample-efficient and Model Predictive Control based Approach 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (2868-2875)
  295. Gschwindt M, Camci E, Bonatti R, Wang W, Kayacan E and Scherer S Can a Robot Become a Movie Director? Learning Artistic Principles for Aerial Cinematography 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (1107-1114)
  296. Li L and Fu J Topological Approximate Dynamic Programming under Temporal Logic Constraints 2019 IEEE 58th Conference on Decision and Control (CDC), (5330-5337)
  297. Yaghmaie F and Gustafsson F Using Reinforcement Learning for Model-free Linear Quadratic Control with Process and Measurement Noises 2019 IEEE 58th Conference on Decision and Control (CDC), (6510-6517)
  298. Murray R and Palladino M Modelling uncertainty in reinforcement learning 2019 IEEE 58th Conference on Decision and Control (CDC), (2436-2441)
  299. Yaghmaie F and Gunnarsson S A New Result on Robust Adaptive Dynamic Programming for Uncertain Partially Linear Systems 2019 IEEE 58th Conference on Decision and Control (CDC), (7480-7485)
  300. Greene M, Deptula P, Nivison S and Dixon W Reinforcement Learning with Sparse Bellman Error Extrapolation for Infinite-Horizon Approximate Optimal Regulation 2019 IEEE 58th Conference on Decision and Control (CDC), (1959-1964)
  301. Wu G, Li Y and Luo J Transforming Policy via Reward Advancement 2019 IEEE 58th Conference on Decision and Control (CDC), (4609-4614)
  302. Hasanbeig M, Kantaros Y, Abate A, Kroening D, Pappas G and Lee I Reinforcement Learning for Temporal Logic Control Synthesis with Probabilistic Satisfaction Guarantees 2019 IEEE 58th Conference on Decision and Control (CDC), (5338-5343)
  303. Mohammadi M, Motamedi S and Sharifian S (2023). Drops on surface optimization (DSO), Computer Networks: The International Journal of Computer and Telecommunications Networking, 219:C, Online publication date: 24-Dec-2022.
  304. Sun W, Hao J, Li W and Wu Q (2022). An adaptive memetic algorithm for the bidirectional loop layout problem, Knowledge-Based Systems, 258:C, Online publication date: 22-Dec-2022.
  305. Guarino A, Malandrino D, Marzullo F, Torre A and Zaccagnino R (2022). Adaptive talent journey, Expert Systems with Applications: An International Journal, 209:C, Online publication date: 15-Dec-2022.
  306. Yu T and Chang Q (2022). User-guided motion planning with reinforcement learning for human-robot collaboration in smart manufacturing, Expert Systems with Applications: An International Journal, 209:C, Online publication date: 15-Dec-2022.
  307. Venugopalan M and Gupta D (2022). A reinforced active learning approach for optimal sampling in aspect term extraction for sentiment analysis, Expert Systems with Applications: An International Journal, 209:C, Online publication date: 15-Dec-2022.
  308. Bartmeyer P, Oliveira L, Leão A and Toledo F (2022). An expert system to react to defective areas in nesting problems, Expert Systems with Applications: An International Journal, 209:C, Online publication date: 15-Dec-2022.
  309. ACM
    De Los Santos B and Intal G Proposed Conceptual Design of Artificial Intelligence Integration into a Pressure Oxidation Facility Proceedings of the 2022 6th International Conference on Software and e-Business, (74-82)
  310. ACM
    Yu M, Zheng Z and Zeng D Continuous Self-adaptive Calibration by Reinforcement Learning Proceedings of the 2022 6th International Conference on Computer Science and Artificial Intelligence, (189-194)
  311. ACM
    Landen M, Chung K, Ike M, Mackay S, Watson J and Lee W DRAGON: Deep Reinforcement Learning for Autonomous Grid Operation and Attack Detection Proceedings of the 38th Annual Computer Security Applications Conference, (13-27)
  312. ACM
    Gong C, Yang Z, Bai Y, Shi J, Sinha A, Xu B, Lo D, Hou X and Fan G Curiosity-Driven and Victim-Aware Adversarial Policies Proceedings of the 38th Annual Computer Security Applications Conference, (186-200)
  313. Zheng X, Yu C and Zhang M (2022). Lifelong reinforcement learning with temporal logic formulas and reward machines, Knowledge-Based Systems, 257:C, Online publication date: 5-Dec-2022.
  314. ACM
    Tian Y, Xu J, Li Y, Luo J, Sueda S, Li H, Willis K and Matusik W (2022). Assemble Them All, ACM Transactions on Graphics, 41:6, (1-11), Online publication date: 1-Dec-2022.
  315. Gu R, Jensen P, Seceleanu C, Enoiu E and Lundqvist K (2022). Correctness-guaranteed strategy synthesis and compression for multi-agent autonomous systems, Science of Computer Programming, 224:C, Online publication date: 1-Dec-2022.
  316. Abbas K, Hong J, Tu N, Yoo J and Hong J (2023). Autonomous DRL-based energy efficient VM consolidation for cloud data centers, Physical Communication, 55:C, Online publication date: 1-Dec-2022.
  317. Zhang F, Yang Q and An D (2022). A leader-following paradigm based deep reinforcement learning method for multi-agent cooperation games, Neural Networks, 156:C, (1-12), Online publication date: 1-Dec-2022.
  318. Sellami B, Hakiri A and Ben Yahia S (2022). Deep Reinforcement Learning for energy-aware task offloading in join SDN-Blockchain 5G massive IoT edge network, Future Generation Computer Systems, 137:C, (363-379), Online publication date: 1-Dec-2022.
  319. Lee Y, Masood A, Noh W and Cho S (2022). DQN based user association control in hierarchical mobile edge computing systems for mobile IoT services, Future Generation Computer Systems, 137:C, (53-69), Online publication date: 1-Dec-2022.
  320. Wang H, Feng J, Li K and Chen L (2022). Deep understanding of big geospatial data for self-driving, Future Generation Computer Systems, 137:C, (146-163), Online publication date: 1-Dec-2022.
  321. Jayanetti A, Halgamuge S and Buyya R (2022). Deep reinforcement learning for energy and time optimized scheduling of precedence-constrained tasks in edge–cloud computing environments, Future Generation Computer Systems, 137:C, (14-30), Online publication date: 1-Dec-2022.
  322. Ouyang L, Zhang W and Wang F (2022). Intelligent contracts, Computers and Electrical Engineering, 104:PB, Online publication date: 1-Dec-2022.
  323. Almasan P, Suárez-Varela J, Rusek K, Barlet-Ros P and Cabellos-Aparicio A (2023). Deep reinforcement learning meets graph neural networks, Computer Communications, 196:C, (184-194), Online publication date: 1-Dec-2022.
  324. Mahmoud S, Billing E, Svensson H and Thill S (2022). Where to from here? On the future development of autonomous vehicles from a cognitive systems perspective, Cognitive Systems Research, 76:C, (63-77), Online publication date: 1-Dec-2022.
  325. Tao L, Dong Y, Chen W, Yang Y, Su L, Guo Q and Wang G (2022). A differential evolution with reinforcement learning for multi-objective assembly line feeding problem, Computers and Industrial Engineering, 174:C, Online publication date: 1-Dec-2022.
  326. Chen Z, Zhang S, Doan T, Clarke J and Maguluri S (2022). Finite-sample analysis of nonlinear stochastic approximation with applications in reinforcement learning, Automatica (Journal of IFAC), 146:C, Online publication date: 1-Dec-2022.
  327. Chen C, Xie L, Xie K, Lewis F and Xie S (2022). Adaptive optimal output tracking of continuous-time systems via output-feedback-based reinforcement learning, Automatica (Journal of IFAC), 146:C, Online publication date: 1-Dec-2022.
  328. Xie K, Yu X and Lan W (2022). Optimal output regulation for unknown continuous-time linear systems by internal model and adaptive dynamic programming, Automatica (Journal of IFAC), 146:C, Online publication date: 1-Dec-2022.
  329. Chevtchenko S, Barbosa E, Cavalcanti M, Azevedo G and Ludermir T (2022). Combining PPO and incremental conductance for MPPT under dynamic shading and temperature, Applied Soft Computing, 131:C, Online publication date: 1-Dec-2022.
  330. Zhang W, Lin Y, Liu Y, You H, Wu P, Lin F and Zhou X (2022). Self-Supervised Reinforcement Learning with dual-reward for knowledge-aware recommendation, Applied Soft Computing, 131:C, Online publication date: 1-Dec-2022.
  331. Ahmad S, Beneyto A, Contreras I and Vehi J (2022). Bolus Insulin calculation without meal information. A reinforcement learning approach, Artificial Intelligence in Medicine, 134:C, Online publication date: 1-Dec-2022.
  332. ACM
    Tian H, Liao X, Zeng C, Zhang J and Chen K Spine Proceedings of the 18th International Conference on emerging Networking EXperiments and Technologies, (261-275)
  333. Devagiri J, Paheding S, Niyaz Q, Yang X and Smith S (2022). Augmented Reality and Artificial Intelligence in industry, Expert Systems with Applications: An International Journal, 207:C, Online publication date: 30-Nov-2022.
  334. Sun H, Han L, Yang R, Ma X, Guo J and Zhou B Exploit reward shifting in value-based deep-RL Proceedings of the 36th International Conference on Neural Information Processing Systems, (37719-37734)
  335. Zhong R, Zhang D, Schäfer L, Albrecht S and Hanna J Robust on-policy sampling for data-efficient policy evaluation in reinforcement learning Proceedings of the 36th International Conference on Neural Information Processing Systems, (37376-37388)
  336. Mu J, Zhong V, Raileanu R, Jiang M, Goodman N, Rocktäschel T and Grefenstette E Improving intrinsic exploration with language abstractions Proceedings of the 36th International Conference on Neural Information Processing Systems, (33947-33960)
  337. Jiang M, Dennis M, Parker-Holder J, Lupu A, Küttler H, Grefenstette E, Rocktäschel T and Foerster J Grounding aleatoric uncertainty for unsupervised environment design Proceedings of the 36th International Conference on Neural Information Processing Systems, (32868-32881)
  338. Fu H, Yu S, Littman M and Konidaris G Model-based lifelong reinforcement learning with Bayesian exploration Proceedings of the 36th International Conference on Neural Information Processing Systems, (32369-32382)
  339. Guo Z, Thakoor S, Pîslar M, Pires B, Altché F, Tallec C, Saade A, Calandriello D, Grill J, Tang Y, Valko M, Munos R, Azar M and Piot B BYOL-explore Proceedings of the 36th International Conference on Neural Information Processing Systems, (31855-31870)
  340. Turner A and Tadepalli P Parametrically retargetable decision-makers tend to seek power Proceedings of the 36th International Conference on Neural Information Processing Systems, (31391-31401)
  341. Tang Y, Rowland M, Munos R, Pires B, Dabney W and Bellemare M The nature of temporal difference errors in multi-step distributional reinforcement learning Proceedings of the 36th International Conference on Neural Information Processing Systems, (30265-30276)
  342. Mehta V, Char I, Abbate J, Conlin R, Boyer M, Ermon S, Schneider J and Neiswanger W Exploration via planning for information about the optimal trajectory Proceedings of the 36th International Conference on Neural Information Processing Systems, (28761-28775)
  343. Suau M, He J, Çelikok M, Spaan M and Oliehoek F Distributed influence-augmented local simulators for parallel MARL in large networked systems Proceedings of the 36th International Conference on Neural Information Processing Systems, (28305-28318)
  344. Xu Y, Parker-Holder J, Pacchiano A, Ball P, Rybkin O, Roberts S, Rocktäschel T and Grefenstette E Learning general world models in a handful of reward-free deployments Proceedings of the 36th International Conference on Neural Information Processing Systems, (26820-26838)
  345. Anselmi J, Gaujal B and Rebuffi L Reinforcement learning in a birth and death process Proceedings of the 36th International Conference on Neural Information Processing Systems, (14464-14474)
  346. Tiapkin D, Belomestny D, Calandriello D, Moulines É, Munos R, Naumov A, Rowland M, Valko M and Ménard P Optimistic posterior sampling for reinforcement learning with few samples and tight guarantees Proceedings of the 36th International Conference on Neural Information Processing Systems, (10737-10751)
  347. Yang L, Ji J, Dai J, Zhang L, Zhou B, Li P, Yang Y and Pan G Constrained update projection approach to safe policy optimization Proceedings of the 36th International Conference on Neural Information Processing Systems, (9111-9124)
  348. Kim G, Lee J, Jang Y, Yang H and Kim K LobsDICE Proceedings of the 36th International Conference on Neural Information Processing Systems, (8252-8264)
  349. Giannou A, Lotidis K, Mertikopoulos P and Vlatakis-Gkaragkounis E On the convergence of policy gradient methods to Nash equilibria in general stochastic games Proceedings of the 36th International Conference on Neural Information Processing Systems, (7128-7141)
  350. ACM
    Wang H, Liang D and Xi Y Urban Path Planning Based on Improved Model-based Reinforcement Learning Algorithm Proceedings of the 4th International Conference on Advanced Information Science and System, (1-6)
  351. Zhou X, Zhu F and Zhao P (2022). Within the scope of prediction, Expert Systems with Applications: An International Journal, 206:C, Online publication date: 15-Nov-2022.
  352. Wang W, Xie X and Feng C (2022). Model-free finite-horizon optimal tracking control of discrete-time linear systems, Applied Mathematics and Computation, 433:C, Online publication date: 15-Nov-2022.
  353. ACM
    Kurte K, Amasyali K, Munk J and Zandi H Deep reinforcement learning with online data augmentation to improve sample efficiency for intelligent HVAC control Proceedings of the 9th ACM International Conference on Systems for Energy-Efficient Buildings, Cities, and Transportation, (479-483)
  354. Yang Y, Li F, Zhang X, Liu Z and Chan K (2022). Dynamic power allocation in cellular network based on multi-agent double deep reinforcement learning, Computer Networks: The International Journal of Computer and Telecommunications Networking, 217:C, Online publication date: 9-Nov-2022.
  355. Rago A, Martiradonna S, Piro G, Abrardo A and Boggia G (2022). A tenant-driven slicing enforcement scheme based on Pervasive Intelligence in the Radio Access Network, Computer Networks: The International Journal of Computer and Telecommunications Networking, 217:C, Online publication date: 9-Nov-2022.
  356. ACM
    Zhang Y, Zong R, Shang L, Kou Z, Zeng H and Wang D (2022). CrowdOptim: A Crowd-driven Neural Network Hyperparameter Optimization Approach to AI-based Smart Urban Sensing, Proceedings of the ACM on Human-Computer Interaction, 6:CSCW2, (1-27), Online publication date: 7-Nov-2022.
  357. ACM
    Gohil V, Guo H, Patnaik S and Rajendran J ATTRITION: Attacking Static Hardware Trojan Detection Techniques Using Reinforcement Learning Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security, (1275-1289)
  358. ACM
    Pan L, Qian J, Xia W, Mao H, Yao J, Li P and Xiao Z Optimizing communication in deep reinforcement learning with XingTian Proceedings of the 23rd ACM/IFIP International Middleware Conference, (255-268)
  359. Zhang X, Liu Y, Mao H and Yu C (2022). Common belief multi-agent reinforcement learning based on variational recurrent models, Neurocomputing, 513:C, (341-350), Online publication date: 7-Nov-2022.
  360. Xue S, Luo B, Liu D and Gao Y (2022). Neural network-based event-triggered integral reinforcement learning for constrained H ∞ tracking control with experience replay, Neurocomputing, 513:C, (25-35), Online publication date: 7-Nov-2022.
  361. Song Z, Guo H, Jia D, Perc M, Li X and Wang Z (2022). Reinforcement learning facilitates an optimal interaction intensity for cooperation, Neurocomputing, 513:C, (104-113), Online publication date: 7-Nov-2022.
  362. ACM
    Wang W, Mottola L, He Y, Li J, Sun Y, Li S, Jing H and Wang Y MicNest Proceedings of the 20th ACM Conference on Embedded Networked Sensor Systems, (504-517)
  363. ACM
    Gunarathna U, Borovica-Gajic R, Karunasekera S and Tanin E Dynamic graph combinatorial optimization with multi-attention deep reinforcement learning Proceedings of the 30th International Conference on Advances in Geographic Information Systems, (1-12)
  364. Li M, Huang T and Zhu W (2022). Clustering experience replay for the effective exploitation in reinforcement learning, Pattern Recognition, 131:C, Online publication date: 1-Nov-2022.
  365. Chraibi Kaadoud I, Bennetot A, Mawhin B, Charisi V and Díaz-Rodríguez N (2022). Explaining Aha! moments in artificial agents through IKE-XAI, Neural Networks, 155:C, (95-118), Online publication date: 1-Nov-2022.
  366. Kumar A, Das S and Snášel V (2022). Improved spherical search with local distribution induced self-adaptation for hard non-convex optimization with and without constraints, Information Sciences: an International Journal, 615:C, (604-637), Online publication date: 1-Nov-2022.
  367. Kordabad A and Gros S (2023). Q-learning of the storage function in Economic Nonlinear Model Predictive Control, Engineering Applications of Artificial Intelligence, 116:C, Online publication date: 1-Nov-2022.
  368. Liu B, Xie Y, Feng L and Fu P (2023). Correcting biased value estimation in mixing value-based multi-agent reinforcement learning by multiple choice learning, Engineering Applications of Artificial Intelligence, 116:C, Online publication date: 1-Nov-2022.
  369. Malhotra A and Jindal R (2022). Deep learning techniques for suicide and depression detection from online social media, Applied Soft Computing, 130:C, Online publication date: 1-Nov-2022.
  370. García J, Visús Á and Fernández F (2022). A taxonomy for similarity metrics between Markov decision processes, Machine Learning, 111:11, (4217-4247), Online publication date: 1-Nov-2022.
  371. ACM
    Tang Z and Yang G (2022). A Re-classification of Information Seeking Tasks and Their Computational Solutions, ACM Transactions on Information Systems, 40:4, (1-32), Online publication date: 31-Oct-2022.
  372. Elkael M, Ait Aba M, Araldo A, Castel-Taleb H and Jouaber B (2022). Monkey Business, Computer Networks: The International Journal of Computer and Telecommunications Networking, 216:C, Online publication date: 24-Oct-2022.
  373. ACM
    Saha S, Das M and Bandyopadhyay S A Model-Centric Explainer for Graph Neural Network based Node Classification Proceedings of the 31st ACM International Conference on Information & Knowledge Management, (4434-4438)
  374. ACM
    Zha D, Lai K, Tan Q, Ding S, Zou N and Hu X Towards Automated Imbalanced Learning with Deep Hierarchical Reinforcement Learning Proceedings of the 31st ACM International Conference on Information & Knowledge Management, (2476-2485)
  375. ACM
    Glake D, Panse F, Lenfers U, Clemen T and Ritter N Spatio-temporal Trajectory Learning using Simulation Systems Proceedings of the 31st ACM International Conference on Information & Knowledge Management, (592-602)
  376. ACM
    Xia Y, Liu S, Chen X, Xu Z, Zheng K and Su H RISE: A Velocity Control Framework with Minimal Impacts based on Reinforcement Learning Proceedings of the 31st ACM International Conference on Information & Knowledge Management, (2210-2219)
  377. ACM
    Chen Z, Silvestri F, Wang J, Zhu H, Ahn H and Tolomei G ReLAX: Reinforcement Learning Agent Explainer for Arbitrary Predictive Models Proceedings of the 31st ACM International Conference on Information & Knowledge Management, (252-261)
  378. ACM
    Rakaraddi A, Siew Kei L, Pratama M and de Carvalho M Reinforced Continual Learning for Graphs Proceedings of the 31st ACM International Conference on Information & Knowledge Management, (1666-1674)
  379. ACM
    Duan Z, Chen C, Cheng D, Liang Y and Qian W Optimal Action Space Search Proceedings of the 31st ACM International Conference on Information & Knowledge Management, (406-415)
  380. ACM
    Xu T, Hua W, Qu J, Li Z, Xu J, Liu A and Zhao L Evidence-aware Document-level Relation Extraction Proceedings of the 31st ACM International Conference on Information & Knowledge Management, (2311-2320)
  381. ACM
    Omidvar-Tehrani B, Personnaz A and Amer-Yahia S Guided Text-based Item Exploration Proceedings of the 31st ACM International Conference on Information & Knowledge Management, (3410-3420)
  382. ACM
    Tao W, Fu Z, Li L, Chen Z, Wen H, Liu Y, Shen Q and Chen P A Dual Channel Intent Evolution Network for Predicting Period-Aware Travel Intentions at Fliggy Proceedings of the 31st ACM International Conference on Information & Knowledge Management, (3524-3533)
  383. ACM
    Chen D, Yan Q, Chen C, Zheng Z, Liu Y, Ma Z, Yu C, Xu J and Zheng B Hierarchically Constrained Adaptive Ad Exposure in Feeds Proceedings of the 31st ACM International Conference on Information & Knowledge Management, (3003-3012)
  384. ACM
    Yuan Y, Muralidharan A, Nandy P, Cheng M and Prabhakar P Offline Reinforcement Learning for Mobile Notifications Proceedings of the 31st ACM International Conference on Information & Knowledge Management, (3614-3623)
  385. ACM
    Yun W, Mohaisen D, Jung S, Kim J and Kim J Hierarchical Reinforcement Learning using Gaussian Random Trajectory Generation in Autonomous Furniture Assembly Proceedings of the 31st ACM International Conference on Information & Knowledge Management, (3624-3633)
  386. ACM
    Ferdous R, Kifetew F, Prandi D and Susi A Towards Agent-Based Testing of 3D Games using Reinforcement Learning Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, (1-8)
  387. ACM
    Wang J and Wang C Learning to Synthesize Relational Invariants Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering, (1-12)
  388. Li M, Hu Z, Huang H, Lu Z and Wen X A Hierarchical Spatio-Temporal Cooperative Reinforcement Learning Approach for Traffic Signal Control 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), (3411-3416)
  389. Mallpress D (2022). Towards a functional classification of behaviour, Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems, 30:5, (417-450), Online publication date: 1-Oct-2022.
  390. Akrour R, Tateo D and Peters J (2022). Continuous Action Reinforcement Learning From a Mixture of Interpretable Experts, IEEE Transactions on Pattern Analysis and Machine Intelligence, 44:10 Part 2, (6795-6806), Online publication date: 1-Oct-2022.
  391. Xu T, Li Z and Yu Y (2022). Error Bounds of Imitating Policies and Environments for Reinforcement Learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, 44:10 Part 2, (6968-6980), Online publication date: 1-Oct-2022.
  392. Yao J, Dou Z, Nie J and Wen J (2022). Looking Back on the Past: Active Learning With Historical Evaluation Results, IEEE Transactions on Knowledge and Data Engineering, 34:10, (4921-4932), Online publication date: 1-Oct-2022.
  393. Huang T, Cai Z, Li R, Wang S and Zhu W (2022). Consolidation of structure of high noise data by a new noise index and reinforcement learning, Information Sciences: an International Journal, 614:C, (206-222), Online publication date: 1-Oct-2022.
  394. Cui Y, Hu W and Rahmani A (2022). A reinforcement learning based artificial bee colony algorithm with application in robot path planning, Expert Systems with Applications: An International Journal, 203:C, Online publication date: 1-Oct-2022.
  395. AlMahamid F and Grolinger K (2022). Autonomous Unmanned Aerial Vehicle navigation using Reinforcement Learning, Engineering Applications of Artificial Intelligence, 115:C, Online publication date: 1-Oct-2022.
  396. Wei Y, Chen Z, Zhao C, Chen X, Yang R, He J, Zhang C and Wu S (2022). Deterministic and probabilistic ship pitch prediction using a multi-predictor integration model based on hybrid data preprocessing, reinforcement learning and improved QRNN, Advanced Engineering Informatics, 54:C, Online publication date: 1-Oct-2022.
  397. Huang B and Jin Y (2022). Reward shaping in multiagent reinforcement learning for self-organizing systems in assembly tasks, Advanced Engineering Informatics, 54:C, Online publication date: 1-Oct-2022.
  398. Manuel Davila Delgado J and Oyedele L (2022). Robotics in construction, Advanced Engineering Informatics, 54:C, Online publication date: 1-Oct-2022.
  399. ACM
    Spychiger F, Tessone C, Zavolokina L and Schwabe G (2022). Incentivizing Data Quality in Blockchain-Based Systems—The Case of the Digital Cardossier, Distributed Ledger Technologies: Research and Practice, 1:1, (1-27), Online publication date: 30-Sep-2022.
  400. ACM
    Shiri A, Kallakuri U, Rashid H, Prakash B, Waytowich N, Oates T and Mohsenin T (2022). E2HRL: An Energy-efficient Hardware Accelerator for Hierarchical Deep Reinforcement Learning, ACM Transactions on Design Automation of Electronic Systems, 27:5, (1-19), Online publication date: 30-Sep-2022.
  401. Li F, Liu Z, Zhang X and Yang Y (2022). Dynamic power allocation in IIoT based on multi-agent deep reinforcement learning, Neurocomputing, 505:C, (10-18), Online publication date: 21-Sep-2022.
  402. Ghazali R, Adabi S, Rezaee A, Down D and Movaghar A (2022). CLQLMRS: improving cache locality in MapReduce job scheduling using Q-learning, Journal of Cloud Computing: Advances, Systems and Applications, 11:1, Online publication date: 19-Sep-2022.
  403. ACM
    Maleas Z, Dais S, Ayfantopoulou G and Salanova Grau J Online policies and multimodal data processing for the same day delivery problem Proceedings of the 12th Hellenic Conference on Artificial Intelligence, (1-3)
  404. Yang H, Zhao J, Lam K, Xiong Z, Wu Q and Xiao L (2022). Distributed Deep Reinforcement Learning-Based Spectrum and Power Allocation for Heterogeneous Networks, IEEE Transactions on Wireless Communications, 21:9, (6935-6948), Online publication date: 1-Sep-2022.
  405. Wei Z, Zhao B and Su J (2022). Event-Driven Computation Offloading in IoT With Edge Computing, IEEE Transactions on Wireless Communications, 21:9, (6847-6860), Online publication date: 1-Sep-2022.
  406. Bai Z, Hao P, ShangGuan W, Cai B and Barth M (2022). Hybrid Reinforcement Learning-Based Eco-Driving Strategy for Connected and Automated Vehicles at Signalized Intersections, IEEE Transactions on Intelligent Transportation Systems, 23:9, (15850-15863), Online publication date: 1-Sep-2022.
  407. Dorabiala O, Kutz J and Aravkin A (2022). Robust trimmed k-means, Pattern Recognition Letters, 161:C, (9-16), Online publication date: 1-Sep-2022.
  408. Yao Z, Yu J, Zhang J and He W (2022). Graph and dynamics interpretation in robotic reinforcement learning task, Information Sciences: an International Journal, 611:C, (317-334), Online publication date: 1-Sep-2022.
  409. Shi H, Li J, Chen S and Hwang K (2022). A behavior fusion method based on inverse reinforcement learning, Information Sciences: an International Journal, 609:C, (429-444), Online publication date: 1-Sep-2022.
  410. Umer M, Junejo K, Jilani M and Mathur A (2022). Machine learning for intrusion detection in industrial control systems, International Journal of Critical Infrastructure Protection, 38:C, Online publication date: 1-Sep-2022.
  411. Zhou Y and Ho H (2022). Online robot guidance and navigation in non-stationary environment with hybrid Hierarchical Reinforcement Learning, Engineering Applications of Artificial Intelligence, 114:C, Online publication date: 1-Sep-2022.
  412. Gautron R, Maillard O, Preux P, Corbeels M and Sabbadin R (2022). Reinforcement learning for crop management support, Computers and Electronics in Agriculture, 200:C, Online publication date: 1-Sep-2022.
  413. Bardou A, Begin T and Busson A (2022). Analysis of a Multi-Armed Bandit solution to improve the spatial reuse of next-generation WLANs, Computer Communications, 193:C, (279-292), Online publication date: 1-Sep-2022.
  414. Hwang H, Lee M and Seok J (2022). Deep reinforcement learning with a critic-value-based branch tree for the inverse design of two-dimensional optical devices, Applied Soft Computing, 127:C, Online publication date: 1-Sep-2022.
  415. Zhang J and Xiao L (2022). Stochastic variance-reduced prox-linear algorithms for nonconvex composite optimization, Mathematical Programming: Series A and B, 195:1-2, (649-691), Online publication date: 1-Sep-2022.
  416. ACM
    Nguyen N, Nguyen P, Nguyen T, Nguyen T, Nguyen D, Nguyen T, Pham H and Truong T FedDRL: Deep Reinforcement Learning-based Adaptive Aggregation for Non-IID Data in Federated Learning Proceedings of the 51st International Conference on Parallel Processing, (1-11)
  417. ACM
    Tahir A, Cui K and Koeppl H Learning Mean-Field Control for Delayed Information Load Balancing in Large Queuing Systems Proceedings of the 51st International Conference on Parallel Processing, (1-11)
  418. ACM
    Lu K, Li G, Wan J, Ma R and Zhao W ADSTS: Automatic Distributed Storage Tuning System Using Deep Reinforcement Learning Proceedings of the 51st International Conference on Parallel Processing, (1-13)
  419. Kirtay M, Oztop E, Kuhlen A, Asada M and Hafner V Trustworthiness assessment in multimodal human-robot interaction based on cognitive load 2022 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), (469-476)
  420. Pynadath D, Gurney N and Wang N Explainable Reinforcement Learning in Human-Robot Teams: The Impact of Decision-Tree Explanations on Transparency 2022 31st IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), (749-756)
  421. Gallotta R, Arulkumaran K and Soros L Surrogate Infeasible Fitness Acquirement FI-2Pop for Procedural Content Generation 2022 IEEE Conference on Games (CoG), (500-503)
  422. Mayr M, Hvarfner C, Chatzilygeroudis K, Nardi L and Krueger V Learning Skill-based Industrial Robot Tasks with User Priors 2022 IEEE 18th International Conference on Automation Science and Engineering (CASE), (1485-1492)
  423. Perrusquía A and Guo W Performance Objective Extraction of Optimal Controllers: A Hippocampal Learning Approach 2022 IEEE 18th International Conference on Automation Science and Engineering (CASE), (1545-1550)
  424. Wang Y, Jia Y, Tian Y and Xiao J (2022). Deep reinforcement learning with the confusion-matrix-based dynamic reward function for customer credit scoring, Expert Systems with Applications: An International Journal, 200:C, Online publication date: 15-Aug-2022.
  425. ACM
    Li H, Fu X, Wu R, Xu J, Xiao K, Chang X, Wang W, Chen S, Shi L, Xiong T and Qi Y Design Domain Specific Neural Network via Symbolic Testing Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, (3219-3229)
  426. ACM
    Wang Y, Tong Y, Zhou Z, Ren Z, Xu Y, Wu G and Lv W Fed-LTD: Towards Cross-Platform Ride Hailing via Federated Learning to Dispatch Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, (4079-4089)
  427. Wang J, Wei J, Pang J, Zhang F and Li S (2022). Security Enhancement Through Compiler-Assisted Software Diversity With Deep Reinforcement Learning, International Journal of Digital Crime and Forensics, 14:2, (1-18), Online publication date: 11-Aug-2022.
  428. Xiao X, Wang Z, Xu Z, Liu B, Warnell G, Dhamankar G, Nair A and Stone P (2022). APPL, Robotics and Autonomous Systems, 154:C, Online publication date: 1-Aug-2022.
  429. Kim S, Kim I and You D (2022). Multi-condition multi-objective optimization using deep reinforcement learning, Journal of Computational Physics, 462:C, Online publication date: 1-Aug-2022.
  430. Moran M, Cohen T, Ben-Zion Y and Gordon G (2022). Curious instance selection, Information Sciences: an International Journal, 608:C, (794-808), Online publication date: 1-Aug-2022.
  431. Gao W, Deng C, Jiang Y and Jiang Z (2022). Resilient reinforcement learning and robust output regulation under denial-of-service attacks, Automatica (Journal of IFAC), 142:C, Online publication date: 1-Aug-2022.
  432. Lee D, Lee S, Masoud N, Krishnan M and Li V (2022). Digital twin-driven deep reinforcement learning for adaptive task allocation in robotic construction, Advanced Engineering Informatics, 53:C, Online publication date: 1-Aug-2022.
  433. Wang C, Ding Y, Yan N, Ma L, Ma J, Lu C, Yang C, Su Y, Chong J, Jin H and Lin Y (2022). A novel Long-term degradation trends predicting method for Multi-Formulation Li-ion batteries based on deep reinforcement learning, Advanced Engineering Informatics, 53:C, Online publication date: 1-Aug-2022.
  434. Adams S, Cody T and Beling P (2022). A survey of inverse reinforcement learning, Artificial Intelligence Review, 55:6, (4307-4346), Online publication date: 1-Aug-2022.
  435. Nakamura M, Hagiwara S and Matoba R (2022). Simulation for labor market using a multi-agent model toward validation of the Amended Labor Contract Act, Artificial Life and Robotics, 27:3, (472-479), Online publication date: 1-Aug-2022.
  436. ACM
    Tuli S and Casale G Optimizing the Performance of Fog Computing Environments Using AI and Co-Simulation Companion of the 2022 ACM/SPEC International Conference on Performance Engineering, (25-28)
  437. ACM
    Unold O, Kozłowski N and Śmierzchała Ł Preliminary tests of an anticipatory classifier system with experience replay Proceedings of the Genetic and Evolutionary Computation Conference Companion, (2095-2103)
  438. ACM
    Tessari M and Iacca G Reinforcement learning based adaptive metaheuristics Proceedings of the Genetic and Evolutionary Computation Conference Companion, (1854-1861)
  439. ACM
    Aydeniz A, Nickelson A and Tumer K Entropy-based local fitnesses for evolutionary multiagent systems Proceedings of the Genetic and Evolutionary Computation Conference Companion, (212-215)
  440. ACM
    Abramowitz S and Nitschke G Scalable evolutionary hierarchical reinforcement learning Proceedings of the Genetic and Evolutionary Computation Conference Companion, (272-275)
  441. ACM
    Biedenkapp A, Dang N, Krejca M, Hutter F and Doerr C Theory-inspired parameter control benchmarks for dynamic algorithm configuration Proceedings of the Genetic and Evolutionary Computation Conference, (766-775)
  442. ACM
    Allard M, Smith S, Chatzilygeroudis K and Cully A Hierarchical quality-diversity for online damage recovery Proceedings of the Genetic and Evolutionary Computation Conference, (58-67)
  443. ACM
    Wu J, Xie Z, Yu T, Zhao H, Zhang R and Li S Dynamics-Aware Adaptation for Reinforcement Learning Based Cross-Domain Interactive Recommendation Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, (290-300)
  444. ACM
    Huang J, Oosterhuis H, Cetinkaya B, Rood T and de Rijke M State Encoders in Reinforcement Learning for Recommendation Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, (2738-2748)
  445. ACM
    Ruiz-Dolz R, Taverner J, Heras S, Garcia-Fornes A and Botti V A Qualitative Analysis of the Persuasive Properties of Argumentation Schemes Proceedings of the 30th ACM Conference on User Modeling, Adaptation and Personalization, (1-11)
  446. Gilavert P and Freire V Computerized Adaptive Testing: A Unified Approach Under Markov Decision Process Computational Science and Its Applications – ICCSA 2022, (591-602)
  447. Zhu Z, Li X, Wang M and Zhang A (2022). Learning Markov Models Via Low-Rank Optimization, Operations Research, 70:4, (2384-2398), Online publication date: 1-Jul-2022.
  448. Koçak Ö and Puranam P (2022). Separated by a Common Language, Management Science, 68:7, (5287-5310), Online publication date: 1-Jul-2022.
  449. ACM
    Peng X, Guo Y, Halper L, Levine S and Fidler S (2022). ASE, ACM Transactions on Graphics, 41:4, (1-17), Online publication date: 1-Jul-2022.
  450. ACM
    She Q, Hu R, Xu J, Liu M, Xu K and Huang H (2022). Learning high-DOF reaching-and-grasping via dynamic representation of gripper-object interaction, ACM Transactions on Graphics, 41:4, (1-14), Online publication date: 1-Jul-2022.
  451. Xiao J and Lou Y (2022). An Online Reinforcement Learning Approach for User-Optimal Parking Searching Strategy Exploiting Unique Problem Property and Network Topology, IEEE Transactions on Intelligent Transportation Systems, 23:7, (8157-8169), Online publication date: 1-Jul-2022.
  452. Mach P and Becvar Z (2022). Device-to-Device Relaying: Optimization, Performance Perspectives, and Open Challenges Towards 6G Networks, IEEE Communications Surveys & Tutorials, 24:3, (1336-1393), Online publication date: 1-Jul-2022.
  453. Shaw R, Howley E and Barrett E (2022). Applying Reinforcement Learning towards automating energy efficient virtual machine consolidation in cloud data centers, Information Systems, 107:C, Online publication date: 1-Jul-2022.
  454. ACM
    Zhang T, S A, Afshari M, Musilek P, Taylor M and Ardakanian O Diversity for transfer in learning-based control of buildings Proceedings of the Thirteenth ACM International Conference on Future Energy Systems, (556-564)
  455. ACM
    Zhang D, Dai D and Xie B SchedInspector Proceedings of the 31st International Symposium on High-Performance Parallel and Distributed Computing, (97-109)
  456. ACM
    Dietz G, King Chen J, Beason J, Tarrow M, Hilliard A and Shapiro R ARtonomous: Introducing Middle School Students to Reinforcement Learning Through Virtual Robotics Proceedings of the 21st Annual ACM Interaction Design and Children Conference, (430-441)
  457. Massimo D and Ricci F (2022). Building effective recommender systems for tourists, AI Magazine, 43:2, (209-224), Online publication date: 23-Jun-2022.
  458. ACM
    Liu J and Shah C Leveraging user interaction signals and task state information in adaptively optimizing usefulness-oriented search sessions Proceedings of the 22nd ACM/IEEE Joint Conference on Digital Libraries, (1-11)
  459. ACM
    Cai T, Wallace S, Rezvanian T, Dobres J, Kerr B, Berlow S, Huang J, Sawyer B and Bylinskii Z Personalized Font Recommendations: Combining ML and Typographic Guidelines to Optimize Readability Proceedings of the 2022 ACM Designing Interactive Systems Conference, (1-25)
  460. ACM
    Wu W, Wang C, Siddiqui T, Wang J, Narasayya V, Chaudhuri S and Bernstein P Budget-aware Index Tuning with Reinforcement Learning Proceedings of the 2022 International Conference on Management of Data, (1528-1541)
  461. ACM
    Zhang W, Interlandi M, Mineiro P, Qiao S, Ghazanfari N, Lie K, Friedman M, Hosn R, Patel H and Jindal A Deploying a Steered Query Optimizer in Production at Microsoft Proceedings of the 2022 International Conference on Management of Data, (2299-2311)
  462. ACM
    Yang Z, Chiang W, Luan S, Mittal G, Luo M and Stoica I Balsa: Learning a Query Optimizer Without Expert Demonstrations Proceedings of the 2022 International Conference on Management of Data, (931-944)
  463. Gunarathna U, Karunasekera S, Borovica-Gajic R and Tanin E Real-Time Intelligent Autonomous Intersection Management Using Reinforcement Learning 2022 IEEE Intelligent Vehicles Symposium (IV), (135-144)
  464. Tiwari A, Saha T, Saha S, Sengupta S, Maitra A, Ramnani R and Bhattacharyya P (2022). A persona aware persuasive dialogue policy for dynamic and co-operative goal setting, Expert Systems with Applications: An International Journal, 195:C, Online publication date: 1-Jun-2022.
  465. Li M, Qin J, Zheng W, Wang Y and Kang Y (2022). Model-free design of stochastic LQR controller from a primal–dual optimization perspective, Automatica (Journal of IFAC), 140:C, Online publication date: 1-Jun-2022.
  466. Qiao G, Leng S and Zhang Y (2022). Online Learning and Optimization for Computation Offloading in D2D Edge Computing and Networks, Mobile Networks and Applications, 27:3, (1111-1122), Online publication date: 1-Jun-2022.
  467. Hummaida A, Paton N and Sakellariou R (2022). Scalable Virtual Machine Migration using Reinforcement Learning, Journal of Grid Computing, 20:2, Online publication date: 1-Jun-2022.
  468. ACM
    Németh M and Szűcs G Split Feature Space Ensemble Method using Deep Reinforcement Learning for Algorithmic Trading Proceedings of the 2022 8th International Conference on Computer Technology Applications, (188-194)
  469. Schäfer L Task Generalisation in Multi-Agent Reinforcement Learning Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1863-1865)
  470. Ricci A "Go to the Children": Rethinking Intelligent Agent Design and Programming in a Developmental Learning Perspective Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1809-1813)
  471. Zawalski M, Osiński B, Michalewski H and Miłoś P Off-Policy Correction For Multi-Agent Reinforcement Learning Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1774-1776)
  472. Singh I, Singh G and Modi A Pre-trained Language Models as Prior Knowledge for Playing Text-based Games Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1729-1731)
  473. Bagga P, Paoletti N and Stathis K Deep Learnable Strategy Templates for Multi-Issue Bilateral Negotiation Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1533-1535)
  474. Yang J, Wang E, Trivedi R, Zhao T and Zha H Adaptive Incentive Design with Multi-Agent Meta-Gradient Reinforcement Learning Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1436-1445)
  475. Wang W, Wu G, Wu W, Jiang Y and An B Online Collective Multiagent Planning by Offline Policy Reuse with Applications to City-Scale Mobility-on-Demand Systems Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1364-1372)
  476. Vasco M, Yin H, Melo F and Paiva A How to Sense the World: Leveraging Hierarchy in Multimodal Perception for Robust Reinforcement Learning Agents Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1301-1309)
  477. Seraj E, Wang Z, Paleja R, Martin D, Sklar M, Patel A and Gombolay M Learning Efficient Diverse Communication for Cooperative Heterogeneous Teaming Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1173-1182)
  478. Senadeera M, Karimpanal T, Gupta S and Rana S Sympathy-based Reinforcement Learning Agents Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1164-1172)
  479. Schäfer L, Christianos F, Hanna J and Albrecht S Decoupled Reinforcement Learning to Stabilise Intrinsically-Motivated Exploration Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1146-1154)
  480. Qian P and Unhelkar V Evaluating the Role of Interactivity on Improving Transparency in Autonomous Agents Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (1083-1091)
  481. Katt S, Nguyen H, Oliehoek F and Amato C BADDr: Bayes-Adaptive Deep Dropout RL for POMDPs Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (723-731)
  482. Egorov V and Shpilman A Scalable Multi-Agent Model-Based Reinforcement Learning Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (381-390)
  483. Browne A and Forney A Exploiting Causal Structure for Transportability in Online, Multi-Agent Environments Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (199-207)
  484. Agarwal M, Aggarwal V and Lan T Multi-Objective Reinforcement Learning with Non-Linear Scalarization Proceedings of the 21st International Conference on Autonomous Agents and Multiagent Systems, (9-17)
  485. Liu D, Kong H, Luo X, Liu W and Subramaniam R (2022). Bringing AI to edge, Neurocomputing, 485:C, (297-320), Online publication date: 7-May-2022.
  486. Han R, Wen S, Liu C, Yuan Y, Wang G and Chen L EdgeTuner: Fast Scheduling Algorithm Tuning for Dynamic Edge-Cloud Workloads and Resources IEEE INFOCOM 2022 - IEEE Conference on Computer Communications, (880-889)
  487. Piezunka H, Aggarwal V and Posen H (2022). The Aggregation–Learning Trade-off, Organization Science, 33:3, (1094-1115), Online publication date: 1-May-2022.
  488. Zhang T, Zhu K, Wang J and Han Z (2022). Cost-Efficient Beam Management and Resource Allocation in Millimeter Wave Backhaul HetNets With Hybrid Energy Supply, IEEE Transactions on Wireless Communications, 21:5, (3291-3306), Online publication date: 1-May-2022.
  489. Wang G, Hu J, Li Z and Li L (2022). Harmonious Lane Changing via Deep Reinforcement Learning, IEEE Transactions on Intelligent Transportation Systems, 23:5, (4642-4650), Online publication date: 1-May-2022.
  490. Zhang W, Zhang N, Yan J, Li G and Yang X (2022). Auto tuning of price prediction models for high-frequency trading via reinforcement learning, Pattern Recognition, 125:C, Online publication date: 1-May-2022.
  491. Muzio A, Maximo M and Yoneyama T (2022). Deep Reinforcement Learning for Humanoid Robot Behaviors, Journal of Intelligent and Robotic Systems, 105:1, Online publication date: 1-May-2022.
  492. Bougie N and Ichise R (2022). Hierarchical learning from human preferences and curiosity, Applied Intelligence, 52:7, (7459-7479), Online publication date: 1-May-2022.
  493. ACM
    Yu H, Wang H, Li J, Yuan X and Park S Accelerating Serverless Computing by Harvesting Idle Resources Proceedings of the ACM Web Conference 2022, (1741-1751)
  494. ACM
    Pires S, Ribeiro A and Sampaio L A meta-policy approach for learning suitable caching replacement policies in information-centric networks Proceedings of the 37th ACM/SIGAPP Symposium on Applied Computing, (1950-1959)
  495. Chen H, Liu Q, Fu K, Huang J, Wang C and Gong J (2022). Accurate policy detection and efficient knowledge reuse against multi-strategic opponents, Knowledge-Based Systems, 242:C, Online publication date: 22-Apr-2022.
  496. Chen L, Jiang S, Liu J, Wang C, Zhang S, Xie C, Liang J, Xiao Y and Song R (2022). Rule mining over knowledge graphs via reinforcement learning, Knowledge-Based Systems, 242:C, Online publication date: 22-Apr-2022.
  497. Tiwari A, Saha S and Bhattacharyya P (2022). A knowledge infused context driven dialogue agent for disease diagnosis using hierarchical reinforcement learning, Knowledge-Based Systems, 242:C, Online publication date: 22-Apr-2022.
  498. Lan Y, Xu X, Fang Q, Zeng Y, Liu X and Zhang X (2022). Transfer reinforcement learning via meta-knowledge extraction using auto-pruned decision trees, Knowledge-Based Systems, 242:C, Online publication date: 22-Apr-2022.
  499. ACM
    Yilmaz E, Ji T, Ayday E and Li P Genomic Data Sharing under Dependent Local Differential Privacy Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy, (77-88)
  500. ACM
    Chen P, Ke B, Lee T, Tsai I, Kung T, Lin L, Liu E, Chang Y, Li Y and Chao M A Reinforcement Learning Agent for Obstacle-Avoiding Rectilinear Steiner Tree Construction Proceedings of the 2022 International Symposium on Physical Design, (107-115)
  501. Zhu X, Zhang R and Zhu W (2022). MDMD options discovery for accelerating exploration in sparse-reward domains, Knowledge-Based Systems, 241:C, Online publication date: 6-Apr-2022.
  502. ACM
    Alabed S and Yoneki E BoGraph Proceedings of the 2nd European Workshop on Machine Learning and Systems, (45-53)
  503. ACM
    Qiu H, Mao W, Patke A, Wang C, Franke H, Kalbarczyk Z, Başar T and Iyer R Reinforcement learning for resource management in multi-tenant serverless platforms Proceedings of the 2nd European Workshop on Machine Learning and Systems, (20-28)
  504. Krishna Moorthy S, Mcmanus M and Guan Z (2021). ESN Reinforcement Learning for Spectrum and Flight Control in THz-Enabled Drone Networks, IEEE/ACM Transactions on Networking, 30:2, (782-795), Online publication date: 1-Apr-2022.
  505. Zhang W, Wang L, Xie L, Feng K and Liu X (2022). TradeBot, Pattern Recognition, 124:C, Online publication date: 1-Apr-2022.
  506. Fernandez-Gauna B, Graña M, Osa-Amilibia J and Larrucea X (2022). Actor-critic continuous state reinforcement learning for wind-turbine control robust optimization, Information Sciences: an International Journal, 591:C, (365-380), Online publication date: 1-Apr-2022.
  507. Yang L, Rao H, Lin M, Xu Y and Shi P (2022). Optimal sensor scheduling for remote state estimation with limited bandwidth, Information Sciences: an International Journal, 588:C, (279-292), Online publication date: 1-Apr-2022.
  508. Zhao Y, Chen B, Wang X, Zhu Z, Wang Y, Cheng G, Wang R, Wang R, He M and Liu Y (2022). A deep reinforcement learning based searching method for source localization, Information Sciences: an International Journal, 588:C, (67-81), Online publication date: 1-Apr-2022.
  509. Chen C, Lewis F and Li B (2022). Homotopic policy iteration-based learning design for unknown linear continuous-time systems, Automatica (Journal of IFAC), 138:C, Online publication date: 1-Apr-2022.
  510. Gaujal B (2022). Learning in queues, Queueing Systems: Theory and Applications, 100:3-4, (521-523), Online publication date: 1-Apr-2022.
  511. Akintunde M, Botoeva E, Kouvaros P and Lomuscio A (2021). Formal verification of neural agents in non-deterministic environments, Autonomous Agents and Multi-Agent Systems, 36:1, Online publication date: 1-Apr-2022.
  512. ACM
    Zini F, Le Piane F and Gaspari M (2022). Adaptive Cognitive Training with Reinforcement Learning, ACM Transactions on Interactive Intelligent Systems, 12:1, (1-29), Online publication date: 31-Mar-2022.
  513. ACM
    Louie R, Engel J and Huang C Expressive Communication: Evaluating Developments in Generative Models and Steering Interfaces for Music Creation 27th International Conference on Intelligent User Interfaces, (405-417)
  514. Hummaida A, Paton N and Sakellariou R Dynamic Threshold Setting for VM Migration Service-Oriented and Cloud Computing, (31-46)
  515. ACM
    Wang H, Tang Z, Zhang C, Zhao J, Cummins C, Leather H and Wang Z Automating reinforcement learning architecture design for code optimization Proceedings of the 31st ACM SIGPLAN International Conference on Compiler Construction, (129-143)
  516. Mallappa U, Pratty S and Brown D RLPlace Proceedings of the 2022 Conference & Exhibition on Design, Automation & Test in Europe, (120-123)
  517. Chai C, Liu J, Tang N, Li G and Luo Y (2022). Selective data acquisition in the wild for model charging, Proceedings of the VLDB Endowment, 15:7, (1466-1478), Online publication date: 1-Mar-2022.
  518. Zhang Y, Luo F and Yu Y (2022). Improve generated adversarial imitation learning with reward variance regularization, Machine Learning, 111:3, (977-995), Online publication date: 1-Mar-2022.
  519. Almulla H and Gay G (2022). Learning how to search: generating effective test cases through adaptive fitness function selection, Empirical Software Engineering, 27:2, Online publication date: 1-Mar-2022.
  520. Huang J, Tan Q, Li H, Li A and Huang L (2022). Monte carlo tree search for dynamic bike repositioning in bike-sharing systems, Applied Intelligence, 52:4, (4610-4625), Online publication date: 1-Mar-2022.
  521. Zou B, You J, Wang Q, Wen X and Jia L (2022). Survey on Learnable Databases, Big Data Research, 27:C, Online publication date: 28-Feb-2022.
  522. ACM
    Kholkar D, Roychoudhury S, Kulkarni V and Reddy S Learning to Adapt – Software Engineering for Uncertainty Proceedings of the 15th Innovations in Software Engineering Conference, (1-5)
  523. ACM
    Chen Z, Mou S and Maguluri S (2022). Stationary Behavior of Constant Stepsize SGD Type Algorithms, Proceedings of the ACM on Measurement and Analysis of Computing Systems, 6:1, (1-24), Online publication date: 24-Feb-2022.
  524. ACM
    Montazeralghaem A and Allan J Learning Relevant Questions for Conversational Product Search using Deep Reinforcement Learning Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, (746-754)
  525. ACM
    Luo Z and Miao C RLMob Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, (648-656)
  526. ACM
    Kiyohara H, Saito Y, Matsuhiro T, Narita Y, Shimizu N and Yamamoto Y Doubly Robust Off-Policy Evaluation for Ranking Policies under the Cascade Behavior Model Proceedings of the Fifteenth ACM International Conference on Web Search and Data Mining, (487-497)
  527. Baldi S, Zhang Z and Liu D (2022). Eligibility traces and forgetting factor in recursive least-squares-based temporal difference, International Journal of Adaptive Control and Signal Processing, 36:2, (334-353), Online publication date: 8-Feb-2022.
  528. Carius J, Ranftl R, Farshidian F and Hutter M (2022). Constrained stochastic optimal control with learned importance sampling, International Journal of Robotics Research, 41:2, (189-209), Online publication date: 1-Feb-2022.
  529. Kim B, Shimanuki L, Kaelbling L and Lozano-Pérez T (2022). Representation, learning, and planning algorithms for geometric task and motion planning, International Journal of Robotics Research, 41:2, (210-231), Online publication date: 1-Feb-2022.
  530. Zha Z, Liu D, Zhang H, Zhang Y and Wu F (2022). Context-Aware Visual Policy Network for Fine-Grained Image Captioning, IEEE Transactions on Pattern Analysis and Machine Intelligence, 44:2, (710-722), Online publication date: 1-Feb-2022.
  531. Noor-A-Rahim M, Liu Z, Lee H, Ali G, Pesch D and Xiao P (2022). A Survey on Resource Allocation in Vehicular Networks, IEEE Transactions on Intelligent Transportation Systems, 23:2, (701-721), Online publication date: 1-Feb-2022.
  532. Zhou C, Huang B and Fränti P (2022). A review of motion planning algorithms for intelligent robots, Journal of Intelligent Manufacturing, 33:2, (387-424), Online publication date: 1-Feb-2022.
  533. Ushida Y, Razan H, Ishizuya S, Sakuma T and Kato S (2022). Using sim-to-real transfer learning to close gaps between simulation and real environments through reinforcement learning, Artificial Life and Robotics, 27:1, (130-136), Online publication date: 1-Feb-2022.
  534. Kanda T, Koda Y, Yamamoto K and Nishio T ACK-Less Rate Adaptation for IEEE 802.11bc Enhanced Broadcast Services Using Sim-to-Real Deep Reinforcement Learning 2022 IEEE 19th Annual Consumer Communications & Networking Conference (CCNC), (139-143)
  535. Omoniwa B, Galkin B and Dusparic I Energy-aware optimization of UAV base stations placement via decentralized multi-agent Q-learning 2022 IEEE 19th Annual Consumer Communications & Networking Conference (CCNC), (216-222)
  536. Zhao H, Hua J, Zhang Z, Zhu J and Ardagna C (2022). Deep Reinforcement Learning-Based Task Offloading for Parked Vehicle Cooperation in Vehicular Edge Computing, Mobile Information Systems, 2022, Online publication date: 1-Jan-2022.
  537. Okada N, Yamagami T, Chauvet N, Ito Y, Hasegawa M, Naruse M and Sayama H (2022). Theory of Acceleration of Decision-Making by Correlated Time Sequences, Complexity, 2022, Online publication date: 1-Jan-2022.
  538. Li S, Guo W and Liu L (2022). Supervised Reinforcement Learning for ULV Path Planning in Complex Warehouse Environment, Wireless Communications & Mobile Computing, 2022, Online publication date: 1-Jan-2022.
  539. Erkan E, Arserim M and Javed A (2022). Mobile Robot Application with Hierarchical Start Position DQN, Computational Intelligence and Neuroscience, 2022, Online publication date: 1-Jan-2022.
  540. Wang X, Chen H and Barcelo-Ordinas J (2022). A Reinforcement Learning-Based Dynamic Clustering Algorithm for Compressive Data Gathering in Wireless Sensor Networks, Mobile Information Systems, 2022, Online publication date: 1-Jan-2022.
  541. Lu M, Huang Z, Li B, Zhao Y, Qin Z and Li D (2022). SIFTER: A Framework for Robust Rumor Detection, IEEE/ACM Transactions on Audio, Speech and Language Processing, 30, (429-442), Online publication date: 1-Jan-2022.
  542. Chen X, Wu C, Chen T, Liu Z, Zhang H, Bennis M, Liu H and Ji Y (2022). Information Freshness-Aware Task Offloading in Air-Ground Integrated Edge Computing Systems, IEEE Journal on Selected Areas in Communications, 40:1, (243-258), Online publication date: 1-Jan-2022.
  543. Mao K, Dong Q, Wang Y and Honga D (2022). An Exploratory Approach to Intelligent Quiz Question Recommendation, Procedia Computer Science, 207:C, (4065-4074), Online publication date: 1-Jan-2022.
  544. Gergely M (2022). Finding Cooperation in the N-Player Iterated Prisoner's Dilemma with Deep Reinforcement Learning Over Dynamic Complex Networks, Procedia Computer Science, 207:C, (465-474), Online publication date: 1-Jan-2022.
  545. Zabin A, González V, Zou Y and Amor R (2023). Applications of machine learning to BIM, Advanced Engineering Informatics, 51:C, Online publication date: 1-Jan-2022.
  546. ACM
    Gyarteng E, Shi R and Long Y Joint Control of Lane Allocation and Traffic Light for Changeable-Lane Intersection Based on Reinforcement Learning Proceedings of the 2021 4th International Conference on Algorithms, Computing and Artificial Intelligence, (1-5)
  547. ACM
    Bertino E and Karim I AI-powered Network Security: Approaches and Research Directions Proceedings of the 8th International Conference on Networking, Systems and Security, (97-105)
  548. ACM
    Zhang C, Dang S, Chen Y and Ling C A Survey of Motion Planning Algorithms Based on Fast Searching Random Tree Proceedings of the 7th International Conference on Communication and Information Processing, (4-8)
  549. Perrusquía A and Yu W Human-Behavior Learning for Infinite-Horizon Optimal Tracking Problems of Robot Manipulators 2021 60th IEEE Conference on Decision and Control (CDC), (57-62)
  550. Gammelli D, Yang K, Harrison J, Rodrigues F, Pereira F and Pavone M Graph Neural Network Reinforcement Learning for Autonomous Mobility-on-Demand Systems 2021 60th IEEE Conference on Decision and Control (CDC), (2996-3003)
  551. Tsukamoto H, Chung S, Slotine J and Fan C A Theoretical Overview of Neural Contraction Metrics for Learning-based Control with Guaranteed Stability 2021 60th IEEE Conference on Decision and Control (CDC), (2949-2954)
  552. Peng Z, Cheng H, Huang R, Hu J, Luo R, Shi K and Ghosh B Adaptive Event-Triggered Motion Tracking Control Strategy for a Lower Limb Rehabilitation Exoskeleton 2021 60th IEEE Conference on Decision and Control (CDC), (1795-1801)
  553. Roy R, Raiman J, Kant N, Elkin I, Kirby R, Siu M, Oberman S, Godil S and Catanzaro B PrefixRL: Optimization of Parallel Prefix Circuits using Deep Reinforcement Learning 2021 58th ACM/IEEE Design Automation Conference (DAC), (853-858)
  554. Myasnikov A, Konoplev A, Suprun A, Anisimov V, Kasatkin V and Los’ V (2021). Constructing the Model of an Information System for the Automatization of Penetration Testing, Automatic Control and Computer Sciences, 55:8, (949-955), Online publication date: 1-Dec-2021.
  555. Liu L and Tan R (2021). Certainty driven consistency loss on multi-teacher networks for semi-supervised learning, Pattern Recognition, 120:C, Online publication date: 1-Dec-2021.
  556. Brammer J, Lutz B and Neumann D (2022). Solving the mixed model sequencing problem with reinforcement learning and metaheuristics, Computers and Industrial Engineering, 162:C, Online publication date: 1-Dec-2021.
  557. Qian H and Yu Y (2021). Derivative-free reinforcement learning: a review, Frontiers of Computer Science: Selected Publications from Chinese Universities, 15:6, Online publication date: 1-Dec-2021.
  558. Habet D and Terrioux C (2021). Conflict history based heuristic for constraint satisfaction problem solving, Journal of Heuristics, 27:6, (951-990), Online publication date: 1-Dec-2021.
  559. De Winter J, El Makrini I, Van de Perre G, Nowé A, Verstraten T and Vanderborght B (2021). Autonomous assembly planning of demonstrated skills with reinforcement learning in simulation, Autonomous Robots, 45:8, (1097-1110), Online publication date: 1-Dec-2021.
  560. Sartor G, Zollo D, Cialdea Mayer M, Oddi A, Rasconi R and Santucci V Option Discovery for Autonomous Generation of Symbolic Knowledge AIxIA 2021 – Advances in Artificial Intelligence, (153-167)
  561. Jia R, Li Q, Huang W, Zhang J and Li X Consistency Regularization for Ensemble Model Based Reinforcement Learning PRICAI 2021: Trends in Artificial Intelligence, (3-16)
  562. Germann T, Alexander F, Ang J, Bilbrey J, Balewski J, Casey T, Chard R, Choi J, Choudhury S, Debusschere B, DeGennaro A, Dryden N, Ellis J, Foster I, Cardona C, Ghosh S, Harrington P, Huang Y, Jha S, Johnston T, Kagawa A, Kannan R, Kumar N, Liu Z, Maruyama N, Matsuoka S, McCarthy E, Mohd-Yusof J, Nugent P, Oyama Y, Proffen T, Pugmire D, Rajamanickam S, Ramakrishniah V, Schram M, Seal S, Sivaraman G, Sweeney C, Tan L, Thakur R, Van Essen B, Ward L, Welch P, Wolf M, Xantheas S, Yager K, Yoo S and Yoon B (2021). Co-design Center for Exascale Machine Learning Technologies (ExaLearn), International Journal of High Performance Computing Applications, 35:6, (598-616), Online publication date: 1-Nov-2021.
  563. de Rosa G and Papa J (2021). A survey on text generation using generative adversarial networks, Pattern Recognition, 119:C, Online publication date: 1-Nov-2021.
  564. Wu Z, Li H, Zheng Y, Xiong C, Jiang Y and Davis L (2021). A Coarse-to-Fine Framework for Resource Efficient Video Recognition, International Journal of Computer Vision, 129:11, (2965-2977), Online publication date: 1-Nov-2021.
  565. Saha T, Gupta D, Saha S and Bhattacharyya P (2021). A hierarchical approach for efficient multi-intent dialogue policy learning, Multimedia Tools and Applications, 80:28-29, (35025-35050), Online publication date: 1-Nov-2021.
  566. Qu S, Abouheaf M, Gueaieb W and Spinello D A Policy Iteration Approach for Flock Motion Control 2021 IEEE International Symposium on Robotic and Sensors Environments (ROSE), (1-7)
  567. Abouheaf M, Gueaieb W, Spinello D and Al-Sharhan S A Data-Driven Model-Reference Adaptive Control Approach Based on Reinforcement Learning 2021 IEEE International Symposium on Robotic and Sensors Environments (ROSE), (1-7)
  568. ACM
    Tong L, Chen Y, Zhou X and Sun Y QoE-Fairness Tradeoff Scheme for Dynamic Spectrum Allocation Based on Deep Reinforcement Learning Proceedings of the 5th International Conference on Computer Science and Application Engineering, (1-7)
  569. ACM
    Wang X, Wan Z and Zhang Y A DQN-based Internet Financial Fraud Transaction Detection Method Proceedings of the 5th International Conference on Computer Science and Application Engineering, (1-5)
  570. Chu K, Zhu X and Zhu W Accelerating Lifelong Reinforcement Learning via Reshaping Rewards 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (619-624)
  571. Li X, Wang X and Zha W Online Adaptive Optimal Control of Discrete-time Linear Systems via Synchronous Q-learning 2021 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (2024-2029)
  572. ACM
    Vereschak O, Bailly G and Caramiaux B (2021). How to Evaluate Trust in AI-Assisted Decision Making? A Survey of Empirical Methodologies, Proceedings of the ACM on Human-Computer Interaction, 5:CSCW2, (1-39), Online publication date: 13-Oct-2021.
  573. Fong J, Campolo D, Acar C and Tee K Model-Based Reinforcement Learning with LSTM Networks for Non-Prehensile Manipulation Planning 2021 21st International Conference on Control, Automation and Systems (ICCAS), (1152-1158)
  574. Chetouani M Interactive Robot Learning: An Overview Human-Centered Artificial Intelligence, (140-172)
  575. Sun L, Zong T, Wang S, Liu Y and Wang Y (2021). Towards Optimal Low-Latency Live Video Streaming, IEEE/ACM Transactions on Networking, 29:5, (2327-2338), Online publication date: 1-Oct-2021.
  576. Bedewy A, Sun Y, Singh R and Shroff N (2021). Low-Power Status Updates via Sleep-Wake Scheduling, IEEE/ACM Transactions on Networking, 29:5, (2129-2141), Online publication date: 1-Oct-2021.
  577. Kicki P, Gawron T, Ćwian K, Ozay M and Skrzypczyński P (2021). Learning from experience for rapid generation of local car maneuvers, Engineering Applications of Artificial Intelligence, 105:C, Online publication date: 1-Oct-2021.
  578. Zhang L, Chen Y, Wang W, Han Z, Li S, Pan Z and Pan G (2021). A Monte Carlo Neural Fictitious Self-Play approach to approximate Nash Equilibrium in imperfect-information dynamic games, Frontiers of Computer Science: Selected Publications from Chinese Universities, 15:5, Online publication date: 1-Oct-2021.
  579. Araki B, Vodrahalli K, Leech T, Vasile C, Donahue M and Rus D (2021). Learning and planning with logical automata, Autonomous Robots, 45:7, (1013-1028), Online publication date: 1-Oct-2021.
  580. ACM
    Collins E, Neto A, Vincenzi A and Maldonado J Deep Reinforcement Learning based Android Application GUI Testing Proceedings of the XXXV Brazilian Symposium on Software Engineering, (186-194)
  581. ACM
    Ross M, Broz F and Baillie L Observing and Clustering Coaching Behaviours to Inform the Design of a Personalised Robotic Coach Proceedings of the 23rd International Conference on Mobile Human-Computer Interaction, (1-17)
  582. Tanaka K, Hamaya M, Joshi D, von Drigalski F, Yonetani R, Matsubara T and Ijiri Y Learning Robotic Contact Juggling 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (958-964)
  583. Yan C, Xiang X, Wang C and Lan Z Flocking and Collision Avoidance for a Dynamic Squad of Fixed-Wing UAVs Using Deep Reinforcement Learning 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (4738-4744)
  584. Torabi F, Warnell G and Stone P DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (2391-2397)
  585. Naveed K, Qiao Z and Dolan J Trajectory Planning for Autonomous Vehicles Using Hierarchical Reinforcement Learning 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), (601-606)
  586. Kuutti S, Fallah S and Bowden R ARC: Adversarially Robust Control Policies for Autonomous Vehicles 2021 IEEE International Intelligent Transportation Systems Conference (ITSC), (522-529)
  587. ACM
    Jokinen J and Kujala T Modelling Drivers’ Adaptation to Assistance Systems 13th International Conference on Automotive User Interfaces and Interactive Vehicular Applications, (12-19)
  588. ACM
    Peng X, Ma Z, Abbeel P, Levine S and Kanazawa A (2021). AMP, ACM Transactions on Graphics, 40:4, (1-20), Online publication date: 31-Aug-2021.
  589. Zhang Y, Xu Z, Wu J and Guan X An Event-based Optimization Method for Building Evacuation with Queuing Network Model 2021 IEEE 17th International Conference on Automation Science and Engineering (CASE), (1961-1966)
  590. ACM
    Tang X, Zhang F, Qin Z, Wang Y, Shi D, Song B, Tong Y, Zhu H and Ye J Value Function is All You Need: A Unified Learning Framework for Ride Hailing Platforms Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, (3605-3615)
  591. ACM
    Eliyahu T, Kazak Y, Katz G and Schapira M Verifying learning-augmented systems Proceedings of the 2021 ACM SIGCOMM 2021 Conference, (305-318)
  592. Kirtay M, Oztop E, Asada M and Hafner V Trust me! I am a robot: an affective computational account of scaffolding in robot-robot interaction 2021 30th IEEE International Conference on Robot & Human Interactive Communication (RO-MAN), (189-196)
  593. Gao Q, Yang X, Ji Y and Liu J Robust Controller Design Based on Policy Iteration 2021 IEEE International Conference on Mechatronics and Automation (ICMA), (182-186)
  594. Zhu X, Zheng X, Zhang Q, Chen Z, Liu Y and Liang B Natural Residual Reinforcement Learning for Bicycle Robot Control 2021 IEEE International Conference on Mechatronics and Automation (ICMA), (1201-1206)
  595. Liao H, Zhou Z, Kong W, Chen Y, Wang X, Wang Z and Al Otaibi S (2021). Learning-Based Intent-Aware Task Offloading for Air-Ground Integrated Vehicular Edge Computing, IEEE Transactions on Intelligent Transportation Systems, 22:8, (5127-5139), Online publication date: 1-Aug-2021.
  596. Kumar N, Rahman S and Dhakad N (2021). Fuzzy Inference Enabled Deep Reinforcement Learning-Based Traffic Light Control for Intelligent Transportation System, IEEE Transactions on Intelligent Transportation Systems, 22:8, (4919-4928), Online publication date: 1-Aug-2021.
  597. ACM
    Shahsavari M, Thomas D, Brown A and Luk W Neuromorphic Design Using Reward-based STDP Learning on Event-Based Reconfigurable Cluster Architecture International Conference on Neuromorphic Systems 2021, (1-8)
  598. Ni C, Zhang A, Duan Y and Wang M Learning Good State and Action Representations via Tensor Decomposition 2021 IEEE International Symposium on Information Theory (ISIT), (1682-1687)
  599. Kwon S, Oh D and Ko Y (2021). Word sense disambiguation based on context selection using knowledge-based word similarity, Information Processing and Management: an International Journal, 58:4, Online publication date: 1-Jul-2021.
  600. ACM
    Scurto H, Caramiaux B and Bevilacqua F Prototyping Machine Learning Through Diffractive Art Practice Proceedings of the 2021 ACM Designing Interactive Systems Conference, (2013-2025)
  601. ACM
    Li Z, Shi L, Cristea A and Zhou Y A Survey of Collaborative Reinforcement Learning: Interactive Methods and Design Patterns Proceedings of the 2021 ACM Designing Interactive Systems Conference, (1579-1590)
  602. ACM
    De Ath G, Everson R, Rahat A and Fieldsend J (2021). Greed Is Good: Exploration and Exploitation Trade-offs in Bayesian Optimisation, ACM Transactions on Evolutionary Learning and Optimization, 1:1, (1-22), Online publication date: 28-Jun-2021.
  603. Zhang H, Sun J and Xu Z Learning to Mutate for Differential Evolution 2021 IEEE Congress on Evolutionary Computation (CEC), (1-8)
  604. ACM
    Xie W Research on UAV Anti-Multi-type Interferences Strategy Based on Improved Q-Learning Proceedings of the 2021 3rd International Conference on Information Technology and Computer Communications, (109-114)
  605. ACM
    Sahoo S, Baranwal A, Ullah S and Kumar A MemOReL Proceedings of the 2021 on Great Lakes Symposium on VLSI, (339-346)
  606. ACM
    Wu N, Xie Y and Hao C IRONMAN Proceedings of the 2021 on Great Lakes Symposium on VLSI, (39-44)
  607. ACM
    Ma L, Zhang W, Jiao J, Wang W, Butrovich M, Lim W, Menon P and Pavlo A MB2: Decomposed Behavior Modeling for Self-Driving Database Management Systems Proceedings of the 2021 International Conference on Management of Data, (1248-1261)
  608. Zeng A, Yu H, Da Q, Zhan Y, Yu Y, Zhou J and Miao C (2021). Improving search engine efficiency through contextual factor selection, AI Magazine, 42:2, (50-58), Online publication date: 1-Jun-2021.
  609. Taylor M, Bashkirov S, Rico J, Toriyama I, Miyada N, Yanagisawa H and Ishizuka K Learning Bipedal Robot Locomotion from Human Movement 2021 IEEE International Conference on Robotics and Automation (ICRA), (2797-2803)
  610. Qiao Z, Schneider J and Dolan J Behavior Planning at Urban Intersections through Hierarchical Reinforcement Learning 2021 IEEE International Conference on Robotics and Automation (ICRA), (2667-2673)
  611. Tanaka K, Yonetani R, Hamaya M, Lee R, Drigalski F and Ijiri Y TRANS-AM: Transfer Learning by Aggregating Dynamics Models for Soft Robotic Assembly 2021 IEEE International Conference on Robotics and Automation (ICRA), (4627-4633)
  612. Vasylkiv Y, Ma Z, Li G, Sandry E, Brock H, Nakamura K, Pourang I and Gomez R Automating Behavior Selection for Affective Telepresence Robot 2021 IEEE International Conference on Robotics and Automation (ICRA), (2026-2032)
  613. ACM
    Hunt N, Fulton N, Magliacane S, Hoang T, Das S and Solar-Lezama A Verifiably safe exploration for end-to-end reinforcement learning Proceedings of the 24th International Conference on Hybrid Systems: Computation and Control, (1-11)
  614. ACM
    Ma C, Wang H, Sun H, van Huijgevoort E, Wang M and He Z Powering TV Experiences with Anytime Environmental Exploration Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, (1-6)
  615. ACM
    Liao Y Computational Workflows for Designing Input Devices Extended Abstracts of the 2021 CHI Conference on Human Factors in Computing Systems, (1-6)
  616. ACM
    Alabed S and Yoneki E High-Dimensional Bayesian Optimization with Multi-Task Learning for RocksDB Proceedings of the 1st Workshop on Machine Learning and Systems, (111-119)
  617. ACM
    Omatu N and Phillips J Benefits of combining dimensional attention and working memory for partially observable reinforcement learning problems Proceedings of the 2021 ACM Southeast Conference, (209-213)
  618. Broekens J and Chetouani M (2021). Towards Transparent Robot Learning Through TDRL-Based Emotional Expressions, IEEE Transactions on Affective Computing, 12:2, (352-362), Online publication date: 1-Apr-2021.
  619. Chu N, Hoang D, Nguyen D, Huynh N and Dutkiewicz E Fast or Slow: An Autonomous Speed Control Approach for UAV-assisted IoT Data Collection Networks 2021 IEEE Wireless Communications and Networking Conference (WCNC), (1-6)
  620. Shen X, Guo L, Lu Z, Wen X and Zhou S WiAgent: Link Selection for CSI-Based Activity Recognition in Densely Deployed Wi-Fi Environments 2021 IEEE Wireless Communications and Networking Conference (WCNC), (1-6)
  621. Peng Y, Liu Y and Zhang H Deep Reinforcement Learning based Path Planning for UAV-assisted Edge Computing Networks 2021 IEEE Wireless Communications and Networking Conference (WCNC), (1-6)
  622. Ntemos K, Kolokotronis N and Kalouptsidis N Using trust to mitigate malicious and selfish behavior of autonomous agents in CRNs 2016 IEEE 27th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), (1-7)
  623. Wang X and Tang T Optimal control of heavy haul train on steep downward slope 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), (778-783)
  624. Kirtay M, Vannucci L, Falotico E, Oztop E and Laschi C Sequential decision making based on emergent emotion for a humanoid robot 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), (1101-1106)
  625. Cui Y, Matsubara T and Sugimoto K Kernel dynamic policy programming: Practical reinforcement learning for high-dimensional robots 2016 IEEE-RAS 16th International Conference on Humanoid Robots (Humanoids), (662-667)
  626. Devraj A and Meyn S Differential TD learning for value function approximation 2016 IEEE 55th Conference on Decision and Control (CDC), (6347-6354)
  627. Farahmand A, Nabi S, Grover P and Nikovski D Learning to control partial differential equations: Regularized Fitted Q-Iteration approach 2016 IEEE 55th Conference on Decision and Control (CDC), (4578-4585)
  628. Wang M and Chen Y An online primal-dual method for discounted Markov decision processes 2016 IEEE 55th Conference on Decision and Control (CDC), (4516-4521)
  629. Bian T and Jiang Z Value iteration, adaptive dynamic programming, and optimal control of nonlinear systems 2016 IEEE 55th Conference on Decision and Control (CDC), (3375-3380)
  630. Alibekov E, Kubalík J and Babuška R Symbolic method for deriving policy in reinforcement learning 2016 IEEE 55th Conference on Decision and Control (CDC), (2789-2795)
  631. Li N, Oyler D, Zhang M, Yildiz Y, Girard A and Kolmanovsky I Hierarchical reasoning game theory based approach for evaluation and testing of autonomous vehicle control systems 2016 IEEE 55th Conference on Decision and Control (CDC), (727-733)
  632. ACM
    Abe T, Orihara R, Sei Y, Tahara Y and Ohsuga A Acquisition of Cooperative Behavior in a Soccer Task Using Reward Shaping Proceedings of the 2021 5th International Conference on Innovation in Artificial Intelligence, (145-150)
  633. Prashanth L, Korda N and Munos R (2021). Concentration bounds for temporal difference learning with linear function approximation: the case of batch data and uniform sampling, Machine Learning, 110:3, (559-618), Online publication date: 1-Mar-2021.
  634. ACM
    Pradhan A, Joy E, Jawagal H and Prasad Jayaraman S A Framework for Leveraging Contextual Information in Automated Domain Specific Comprehension 2021 International Symposium on Electrical, Electronics and Information Engineering, (263-270)
  635. ACM
    Sugimoto M, Uchida R, Tsuzuki S, Sori H, Inoue H, Kurashige K and Urushihara S An Experimental Study for Tracking Ability of Deep Q-Network under the Multi-Objective Behaviour using a Mobile Robot with LiDAR 2021 International Symposium on Electrical, Electronics and Information Engineering, (81-87)
  636. Janiar S and Pourahmadi V Deep-Reinforcement Learning for Fair Distributed Dynamic Spectrum Access in Wireless Networks 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC), (1-4)
  637. De Rango F, Cordeschi N and Ritacco F Applying Q-learning approach to CSMA Scheme to dynamically tune the contention probability 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC), (1-4)
  638. Gatimu K and Lee B qMDP: DASH Adaptation using Queueing Theory within a Markov Decision Process 2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC), (1-6)
  639. Hansen K, Misra K and Pai M (2021). Frontiers, Marketing Science, 40:1, (1-12), Online publication date: 1-Jan-2021.
  640. ACM
    Asadulaev A, Stein G and Filchenkov A Transgenerators Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence, (1-5)
  641. ACM
    Cui G, Shen R, Chen Y, Zou J, Yang S, Fan C and Zheng J Reinforced Evolutionary Algorithms for Game Difficulty Control Proceedings of the 2020 3rd International Conference on Algorithms, Computing and Artificial Intelligence, (1-7)
  642. ACM
    Shuvo S, Ahmed M, Kabir S and Shetu S Application of Machine Learning Based Hospital Up-gradation Policy for Bangladesh Proceedings of the 7th International Conference on Networking, Systems and Security, (18-24)
  643. Breschi V, Masti D, Formentin S and Bemporad A NAW-NET: neural anti-windup control for saturated nonlinear systems 2020 59th IEEE Conference on Decision and Control (CDC), (3335-3340)
  644. Greene M, Abudia M, Kamalapurkar R and Dixon W Model-Based Reinforcement Learning for Optimal Feedback Control of Switched Systems 2020 59th IEEE Conference on Decision and Control (CDC), (162-167)
  645. Westenbroek T, Mazumdar E, Fridovich-Keil D, Prabhu V, Tomlin C and Sastry S Adaptive Control for Linearizable Systems Using On-Policy Reinforcement Learning 2020 59th IEEE Conference on Decision and Control (CDC), (118-125)
  646. Ferrarotti L and Bemporad A Learning nonlinear feedback controllers from data via optimal policy search and stochastic gradient descent 2020 59th IEEE Conference on Decision and Control (CDC), (4961-4966)
  647. Chen X, Wu C, Chen T, Liu Z, Bennis M and Ji Y Age of Information-Aware Resource Management in UAV-Assisted Mobile-Edge Computing Systems GLOBECOM 2020 - 2020 IEEE Global Communications Conference, (1-6)
  648. Ye P, Wang Y, Li J, Xiao L and Zhu G (τ, ϵ)-Greedy Reinforcement Learning For Anti-Jamming Wireless Communications GLOBECOM 2020 - 2020 IEEE Global Communications Conference, (1-6)
  649. Wang Z, Wei Y, Yu F and Han Z Utility Optimization for Resource Allocation in Edge Network Slicing Using DRL GLOBECOM 2020 - 2020 IEEE Global Communications Conference, (1-6)
  650. Gao Y, Wu W, Dong J, Yin Y and Si P Deep Reinforcement Learning based Node Pairing Scheme in Edge-chain for IoT Applications GLOBECOM 2020 - 2020 IEEE Global Communications Conference, (1-6)
  651. Kidambi R, Rajeswaran A, Netrapalli P and Joachims T MOReL Proceedings of the 34th International Conference on Neural Information Processing Systems, (21810-21823)
  652. Scialom T, Dray P, Lamprier S, Piwowarski B and Staiano J ColdGANs Proceedings of the 34th International Conference on Neural Information Processing Systems, (18978-18989)
  653. Parker-Holder J, Pacchiano A, Choromanski K and Roberts S Effective diversity in population based reinforcement learning Proceedings of the 34th International Conference on Neural Information Processing Systems, (18050-18062)
  654. Zheng H, Wei P, Jiang J, Long G, Lu Q and Zhang C Cooperative heterogeneous deep reinforcement learning Proceedings of the 34th International Conference on Neural Information Processing Systems, (17455-17465)
  655. Brantley K, Dudík M, Lykouris T, Miryoosefi S, Simchowitz M, Slivkins A and Sun W Constrained episodic reinforcement learning in concave-convex and knapsack settings Proceedings of the 34th International Conference on Neural Information Processing Systems, (16315-16326)
  656. Xu T, Li Z and Yu Y Error bounds of imitating policies and environments Proceedings of the 34th International Conference on Neural Information Processing Systems, (15737-15749)
  657. Lee D and He N A unified switching system perspective and convergence analysis of Q-learning algorithms Proceedings of the 34th International Conference on Neural Information Processing Systems, (15556-15567)
  658. Chen J, Chen S and Pan S Storage efficient and dynamic flexible runtime channel pruning via deep reinforcement learning Proceedings of the 34th International Conference on Neural Information Processing Systems, (14747-14758)
  659. Curi S, Berkenkamp F and Krause A Efficient model-based reinforcement learning through optimistic policy search and planning Proceedings of the 34th International Conference on Neural Information Processing Systems, (14156-14170)
  660. Yu T, Thomas G, Yu L, Ermon S, Zou J, Levine S, Finn C and Ma T MOPO Proceedings of the 34th International Conference on Neural Information Processing Systems, (14129-14142)
  661. Tang Y Self-imitation learning via generalized lower bound Q-learning Proceedings of the 34th International Conference on Neural Information Processing Systems, (13964-13975)
  662. Zhang J, Kumor D and Bareinboim E Causal imitation learning with unobserved confounders Proceedings of the 34th International Conference on Neural Information Processing Systems, (12263-12274)
  663. Wang S, Huang L and Lui J Restless-UCB, an efficient and low-complexity algorithm for online restless bandits Proceedings of the 34th International Conference on Neural Information Processing Systems, (11878-11889)
  664. Pan L, Cai Q and Huang L Softmax deep double deterministic policy gradients Proceedings of the 34th International Conference on Neural Information Processing Systems, (11767-11777)
  665. Zhu G, Zhang M, Lee H and Zhang C Bridging imagination and reality for model-based deep reinforcement learning Proceedings of the 34th International Conference on Neural Information Processing Systems, (8993-9006)
  666. Lee S and Bareinboim E Characterizing optimal mixed policies Proceedings of the 34th International Conference on Neural Information Processing Systems, (8565-8576)
  667. Kwon M, Daptardar S, Schrater P and Pitkow X Inverse rational control with partially observable continuous nonlinear dynamics Proceedings of the 34th International Conference on Neural Information Processing Systems, (7898-7909)
  668. Tang Z, Feng Y, Zhang N, Peng J and Liu Q Off-policy interval estimation with Lipschitz value iteration Proceedings of the 34th International Conference on Neural Information Processing Systems, (7887-7897)
  669. Li A, Pinto L and Abbeel P Generalized hindsight for reinforcement learning Proceedings of the 34th International Conference on Neural Information Processing Systems, (7754-7767)
  670. Nabli A and Carvalho M Curriculum learning for multilevel budgeted combinatorial problems Proceedings of the 34th International Conference on Neural Information Processing Systems, (7044-7056)
  671. van Seijen H, Nekoei H, Racah E and Chandar S The LoCA regret Proceedings of the 34th International Conference on Neural Information Processing Systems, (6562-6572)
  672. Chang M, Gupta A and Gupta S Semantic visual navigation by watching YouTube videos Proceedings of the 34th International Conference on Neural Information Processing Systems, (4283-4294)
  673. Mazoure B, des Combes R, Doan T, Bachman P and Hjelm R Deep reinforcement and InfoMax learning Proceedings of the 34th International Conference on Neural Information Processing Systems, (3686-3698)
  674. Lee J, Lee B and Kim K Reinforcement learning for control with multiple frequencies Proceedings of the 34th International Conference on Neural Information Processing Systems, (3254-3264)
  675. Boutilier C, Hsu C, Kveton B, Mladenov M, Szepesvári C and Zaheer M Differentiable meta-learning of bandit policies Proceedings of the 34th International Conference on Neural Information Processing Systems, (2122-2134)
  676. ACM
    Angelopoulos G and Metafas D Q Learning applied on the Board Game Dominion Proceedings of the 24th Pan-Hellenic Conference on Informatics, (34-37)
  677. Cai J WD3-MPER: A Method to Alleviate Approximation Bias in Actor-Critic Neural Information Processing, (713-724)
  678. Tan R, Ikeda K and Vergara J Hindsight-Combined and Hindsight-Prioritized Experience Replay Neural Information Processing, (429-439)
  679. Saeedvand S, Aghdasi H and Baltes J (2020). Novel hybrid algorithm for Team Orienteering Problem with Time Windows for rescue applications, Applied Soft Computing, 96:C, Online publication date: 1-Nov-2020.
  680. Frazelle C, Rogers J, Karamouzas I and Walker I Optimizing a Continuum Manipulator’s Search Policy Through Model-Free Reinforcement Learning 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (5564-5571)
  681. ACM
    Pagalyte E, Mancini M and Climent L Go with the Flow Proceedings of the 20th ACM International Conference on Intelligent Virtual Agents, (1-8)
  682. ACM
    Feng Y, Fan M, Sun M and Li P A Reinforced Semi-supervised Neural Network for Helpful Review Identification Proceedings of the 29th ACM International Conference on Information & Knowledge Management, (2021-2024)
  683. ACM
    Xian Y, Fu Z, Zhao H, Ge Y, Chen X, Huang Q, Geng S, Qin Z, de Melo G, Muthukrishnan S and Zhang Y CAFE Proceedings of the 29th ACM International Conference on Information & Knowledge Management, (1645-1654)
  684. ACM
    Li Y, Zheng Y and Yang Q Cooperative Multi-Agent Reinforcement Learning in Express System Proceedings of the 29th ACM International Conference on Information & Knowledge Management, (805-814)
  685. Baldi S, Liu D and Zhang Z On recursive temporal difference and eligibility traces IECON 2020 The 46th Annual Conference of the IEEE Industrial Electronics Society, (501-506)
  686. ACM
    Bisi L, Liotet P, Sabbioni L, Reho G, Montali N, Restelli M and Corno C Foreign exchange trading Proceedings of the First ACM International Conference on AI in Finance, (1-8)
  687. ACM
    Yang H, Liu X, Zhong S and Walid A Deep reinforcement learning for automated stock trading Proceedings of the First ACM International Conference on AI in Finance, (1-8)
  688. ACM
    Vittori E, Trapletti M and Restelli M Option hedging with risk averse reinforcement learning Proceedings of the First ACM International Conference on AI in Finance, (1-8)
  689. Al-Mahbashi A, Schwartz H and Lambadaris I Machine Learning Approach for Multiple Coordinated Aerial Drones Pursuit-Evasion Games 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (642-647)
  690. Abouheaf M, Gueaieb W, Miah M and Spinello D Trajectory Tracking of Underactuated Sea Vessels With Uncertain Dynamics: An Integral Reinforcement Learning Approach 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (1866-1871)
  691. Kobayashi S and Shibuya T Reinforcement Learning Compensator Robust to the Time Constants of First Order Delay Elements 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (141-146)
  692. Perrusquía A, Yu W and Li X Robust Control in the Worst Case Using Continuous Time Reinforcement Learning 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (1951-1954)
  693. ACM
    Rahman A and Bhuiyan F A vision to mitigate bioinformatics software development challenges Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, (57-60)
  694. Kušić K, Dusparic I, Guériau M, Gregurić M and Ivanjko E Extended Variable Speed Limit control using Multi-agent Reinforcement Learning 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), (1-8)
  695. Garg D, Chli M and Vogiatzis G Multi-Agent Deep Reinforcement Learning for Traffic optimization through Multiple Road Intersections using Live Camera Feed 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), (1-8)
  696. Menéndez-Romero C, Winkler F, Dornhege C and Burgard W Maneuver Planning and Learning: a Lane Selection Approach for Highly Automated Vehicles in Highway Scenarios 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), (1-7)
  697. Shabestary S, Abdulhai B, Ma H and Huo Y Cycle-level vs. Second-by-Second Adaptive Traffic Signal Control using Deep Reinforcement Learning 2020 IEEE 23rd International Conference on Intelligent Transportation Systems (ITSC), (1-8)
  698. ACM
    Cui Y, Zhang G, Dong W, Sun X and Yang W Knowledge-based Deep Reinforcement Learning for Train Automatic Stop Control of High-Speed Railway Proceedings of the 2020 3rd International Conference on Machine Learning and Machine Intelligence, (31-36)
  699. Lin S and Beling P A Deep Reinforcement Learning Framework for Optimal Trade Execution Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track, (223-240)
  700. Aubret A, Matignon L and Hassas S ELSIM: End-to-End Learning of Reusable Skills Through Intrinsic Motivation Machine Learning and Knowledge Discovery in Databases, (541-556)
  701. Manoharan A, Ramesh R and Ravindran B Option Encoder: A Framework for Discovering a Policy Basis in Reinforcement Learning Machine Learning and Knowledge Discovery in Databases, (509-524)
  702. ACM
    Hu Z and Xing E Learning from All Types of Experiences Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (3531-3532)
  703. ACM
    Duan L, Zhan Y, Hu H, Gong Y, Wei J, Zhang X and Xu Y Efficiently Solving the Practical Vehicle Routing Problem Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (3054-3063)
  704. ACM
    Yancey K and Settles B A Sleeping, Recovering Bandit Algorithm for Optimizing Recurring Notifications Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (3008-3016)
  705. ACM
    Sachdeva N, Su Y and Joachims T Off-policy Bandits with Deficient Support Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (965-975)
  706. ACM
    Huai M, Sun J, Cai R, Yao L and Zhang A Malicious Attacks against Deep Reinforcement Learning Interpretations Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, (472-482)
  707. Jain U, Weihs L, Kolve E, Farhadi A, Lazebnik S, Kembhavi A and Schwing A A Cordial Sync: Going Beyond Marginal Policies for Multi-agent Embodied Tasks Computer Vision – ECCV 2020, (471-490)
  708. Zou M, Huang E, Vogel-Heuser B and Cherr C Efficiently Learning a Distributed Control Policy in Cyber-Physical Production Systems Via Simulation Optimization 2020 IEEE 16th International Conference on Automation Science and Engineering (CASE), (645-651)
  709. ACM
    Liu Z, Wang L and Quan G Deep Reinforcement Learning based Elasticity-compatible Heterogeneous Resource Management for Time-critical Computing Proceedings of the 49th International Conference on Parallel Processing, (1-11)
  710. Simester D, Timoshenko A and Zoumpoulis S (2020). Efficiently Evaluating Targeting Policies, Management Science, 66:8, (3412-3424), Online publication date: 1-Aug-2020.
  711. ACM
    Zhao K, Wang X, Zhang Y, Zhao L, Liu Z, Xing C and Xie X Leveraging Demonstrations for Reinforcement Recommendation Reasoning over Knowledge Graphs Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, (239-248)
  712. ACM
    Wang Y, Wang J, Huang H, Li H and Liu X Evolutionary Product Description Generation Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval, (119-128)
  713. Mitriakov A, Papadakis P, Mai Nguyen S and Garlatti S Staircase Traversal via Reinforcement Learning for Active Reconfiguration of Assistive Robots 2020 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), (1-8)
  714. Zhang H, Sun J and Xu Z Adaptive Structural Hyper-Parameter Configuration by Q-Learning 2020 IEEE Congress on Evolutionary Computation (CEC), (1-8)
  715. Zhang J and Bareinboim E Designing optimal dynamic treatment regimes Proceedings of the 37th International Conference on Machine Learning, (11012-11022)
  716. Wen J, Dai B, Li L and Schuurmans D Batch stationary distribution estimation Proceedings of the 37th International Conference on Machine Learning, (10203-10213)
  717. Tirinzoni A, Poiani R and Restelli M Sequential transfer in reinforcement learning with a generative model Proceedings of the 37th International Conference on Machine Learning, (9481-9492)
  718. Tangkaratt V, Han B, Khan M and Sugiyama M Variational imitation learning with diverse-quality demonstrations Proceedings of the 37th International Conference on Machine Learning, (9407-9417)
  719. Stooke A, Achiam J and Abbeel P Responsive safety in reinforcement learning by PID lagrangian methods Proceedings of the 37th International Conference on Machine Learning, (9133-9143)
  720. Nguyen V and Osborne M Knowing the what but not the where in Bayesian optimization Proceedings of the 37th International Conference on Machine Learning, (7317-7326)
  721. Khadka S, Majumdar S, Miret S, McAleer S and Tumer K Evolutionary reinforcement learning for sample-efficient multiagent coordination Proceedings of the 37th International Conference on Machine Learning, (6651-6660)
  722. Laskin M, Srinivas A and Abbeel P CURL Proceedings of the 37th International Conference on Machine Learning, (5639-5650)
  723. Grill J, Altché F, Tang Y, Hubert T, Valko M, Antonoglou I and Munos R Monte-Carlo tree search as regularized policy optimization Proceedings of the 37th International Conference on Machine Learning, (3769-3778)
  724. Feng Y, Ren T, Tang Z and Liu Q Accountable off-policy evaluation with kernel Bellman statistics Proceedings of the 37th International Conference on Machine Learning, (3102-3111)
  725. Edwards A, Sahni H, Liu R, Hung J, Jain A, Wang R, Ecoffet A, Miconi T, Isbell C and Yosinski J Estimating Q(s, s′) with deep deterministic dynamics gradients Proceedings of the 37th International Conference on Machine Learning, (2825-2835)
  726. Bourel H, Maillard O and Talebi M Tightening exploration in upper confidence reinforcement learning Proceedings of the 37th International Conference on Machine Learning, (1056-1066)
  727. Wang H, Kaplan Z, Niu D and Li B Optimizing Federated Learning on Non-IID Data with Reinforcement Learning IEEE INFOCOM 2020 - IEEE Conference on Computer Communications, (1698-1707)
  728. Emara S, Li B and Chen Y Eagle: Refining Congestion Control by Learning from the Experts IEEE INFOCOM 2020 - IEEE Conference on Computer Communications, (676-685)
  729. ACM
    Zhu C, Leung H, Hu S and Cai Y (2021). A Q-values Sharing Framework for Multi-agent Reinforcement Learning under Budget Constraint, ACM Transactions on Autonomous and Adaptive Systems, 15:2, (1-28), Online publication date: 30-Jun-2020.
  730. Moorthy S and Guan Z FlyTera: Echo State Learning for Joint Access and Flight Control in THz-enabled Drone Networks 2020 17th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON), (1-9)
  731. ACM
    Sikdar S and Jermaine C MONSOON: Multi-Step Optimization and Execution of Queries with Partially Obscured Predicates Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data, (225-240)
  732. Simester D, Timoshenko A and Zoumpoulis S (2020). Targeting Prospective Customers, Management Science, 66:6, (2495-2522), Online publication date: 1-Jun-2020.
  733. Lesort T, Lomonaco V, Stoian A, Maltoni D, Filliat D and Díaz-Rodríguez N (2020). Continual learning for robotics, Information Fusion, 58:C, (52-68), Online publication date: 1-Jun-2020.
  734. Shan G, Xu S, Yang L, Jia S and Xiang Y (2020). Learn#, Expert Systems with Applications: An International Journal, 147:C, Online publication date: 1-Jun-2020.
  735. ACM
    Yang L, Hajiesmaili M, Sitaraman R, Wierman A, Mallada E and Wong W (2020). Online Linear Optimization with Inventory Management Constraints, Proceedings of the ACM on Measurement and Analysis of Computing Systems, 4:1, (1-29), Online publication date: 27-May-2020.
  736. Lai Y, Wang W, Yang Y, Zhu J and Kuang M Hindsight Planner Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems, (690-698)
  737. Wang H, Hu X, Yu Q, Gu M, Zhao W, Yan J and Hong T (2020). Integrating reinforcement learning and skyline computing for adaptive service composition, Information Sciences: an International Journal, 519:C, (141-160), Online publication date: 1-May-2020.
  738. Uwano F and Takadama K (2020). Reward Value-Based Goal Selection for Agents’ Cooperative Route Learning Without Communication in Reward and Goal Dynamism, SN Computer Science, 1:3, Online publication date: 1-May-2020.
  739. Zhuang S and Zuccon G Counterfactual Online Learning to Rank Advances in Information Retrieval, (415-430)
  740. Iranfar A, Terraneo F, Csordas G, Zapater M, Fornaciari W and Atienza D Dynamic thermal management with proactive fan speed control through reinforcement learning Proceedings of the 23rd Conference on Design, Automation and Test in Europe, (418-423)
  741. Shin S, Kang Y and Kim Y (2020). Android-GAN, Expert Systems with Applications: An International Journal, 141:C, Online publication date: 1-Mar-2020.
  742. ACM
    Abdelzaher T, Hao Y, Jayarajah K, Misra A, Skarin P, Yao S, Weerakoon D and Årzén K (2020). Five Challenges in Cloud-enabled Intelligence and Control, ACM Transactions on Internet Technology, 20:1, (1-19), Online publication date: 29-Feb-2020.
  743. Yan S, Xie Y, Wu F, Smith J, Lu W and Zhang B (2020). Image captioning via hierarchical attention mechanism and policy gradient optimization, Signal Processing, 167:C, Online publication date: 1-Feb-2020.
  744. Liu X, Mou L, Cui H, Lu Z and Song S (2020). Finding decision jumps in text classification, Neurocomputing, 371:C, (177-187), Online publication date: 2-Jan-2020.
  745. Gao R and Shah C (2020). Toward creating a fairer ranking in search engine results, Information Processing and Management: an International Journal, 57:1, Online publication date: 1-Jan-2020.
  746. Zhao H and Zhang C (2020). An online-learning-based evolutionary many-objective algorithm, Information Sciences: an International Journal, 509:C, (1-21), Online publication date: 1-Jan-2020.
  747. Baioletti M, Milani A and Santucci V (2020). Variable neighborhood algebraic Differential Evolution, Information Sciences: an International Journal, 507:C, (37-52), Online publication date: 1-Jan-2020.
  748. Barbieri E, Capocchi L and Santucci J (2019). Discrete-Event Simulation-Based Q-Learning Algorithm Applied to Financial Leverage Effect, SN Computer Science, 1:1, Online publication date: 1-Jan-2020.
  749. Su J, Cheng H, Guo H and Peng Z (2019). Robust Quadratic Programming for MDPs with uncertain observation noise, Neurocomputing, 370:C, (28-38), Online publication date: 22-Dec-2019.
  750. Saija K, Nethi S, Chaudhuri S and Karthik R A Machine Learning Approach for SNR Prediction in 5G Systems 2019 IEEE International Conference on Advanced Networks and Telecommunications Systems (ANTS), (1-6)
  751. Li Q, Xia L and Song R (2019). Bipartite state synchronization of heterogeneous system with active leader on signed digraph under adversarial inputs, Neurocomputing, 369:C, (69-79), Online publication date: 5-Dec-2019.
  752. Mejdoubi A, Zytoune O, Fouchal H and Ouadou M A Learning Approach for Road Traffic Optimization in Urban Environments Machine Learning for Networking, (355-366)
  753. Sugimoto M, Yoshioka T, Ishii K, Nonaka S, Deguchi M, Tsuzuki S and Hiran M A study for Motion-Planning Method Resident-tracking Robot based on Reinforcement Learning 2019 International Symposium on Micro-NanoMechatronics and Human Science (MHS), (1-5)
  754. Lins R, Dória A and Melo J (2019). Deep reinforcement learning applied to the k-server problem, Expert Systems with Applications: An International Journal, 135:C, (212-218), Online publication date: 30-Nov-2019.
  755. ACM
    Kuremoto T, Matsusaka H, Obayashi M, Mabu S and Kobayashi K An Improved Fuzzy Neural Network for Reinforcement Learning Proceedings of the 3rd International Conference on Big Data Research, (88-93)
  756. Oakley L and Oprea A QFlip: An Adaptive Reinforcement Learning Strategy for the FlipIt Security Game Decision and Game Theory for Security, (364-384)
  757. Aydın H, Çilden E and Polat F Compact Frequency Memory for Reinforcement Learning with Hidden States PRIMA 2019: Principles and Practice of Multi-Agent Systems, (425-433)
  758. ACM
    Ye X and Fu L Deep Reinforcement Learning Based MAC Protocol for Underwater Acoustic Networks Proceedings of the 14th International Conference on Underwater Networks & Systems, (1-5)
  759. ACM
    Dugaev D and Peng Z RA-MAC Proceedings of the 14th International Conference on Underwater Networks & Systems, (1-8)
  760. Yildirim S, Aksakalli V and Alkaya A (2019). Canadian Traveler Problem with Neutralizations, Expert Systems with Applications: An International Journal, 132:C, (151-165), Online publication date: 15-Oct-2019.
  761. Zhang K, Zhang H, Mu Y and Sun S (2019). Tracking control optimization scheme for a class of partially unknown fuzzy systems by using integral reinforcement learning architecture, Applied Mathematics and Computation, 359:C, (344-356), Online publication date: 15-Oct-2019.
  762. ACM
    Rudovic O, Zhang M, Schuller B and Picard R Multi-modal Active Learning From Human Data: A Deep Reinforcement Learning Approach 2019 International Conference on Multimodal Interaction, (6-15)
  763. Wang L, Wang M and Yue T (2019). A fuzzy deterministic policy gradient algorithm for pursuit-evasion differential games, Neurocomputing, 362:C, (106-117), Online publication date: 14-Oct-2019.
  764. ACM
    Zhou M, Chen Y, Wen Y, Yang Y, Su Y, Zhang W, Zhang D and Wang J Factorized Q-learning for large-scale multi-agent systems Proceedings of the First International Conference on Distributed Artificial Intelligence, (1-7)
  765. ACM
    Zimmer M and Weng P An efficient reinforcement learning algorithm for learning deterministic policies in continuous domains Proceedings of the First International Conference on Distributed Artificial Intelligence, (1-7)
  766. Liu W, Wang L, Wang E, Yang Y, Zeghlache D and Zhang D (2019). Reinforcement learning-based cell selection in sparse mobile crowdsensing, Computer Networks: The International Journal of Computer and Telecommunications Networking, 161:C, (102-114), Online publication date: 9-Oct-2019.
  767. Wang S, Bi J, Wu J, Vasilakos A and Fan Q (2019). VNE-TD, Computer Networks: The International Journal of Computer and Telecommunications Networking, 161:C, (251-263), Online publication date: 9-Oct-2019.
  768. Ding D, Ding Z, Wei G and Han F (2019). An improved reinforcement learning algorithm based on knowledge transfer and applications in autonomous vehicles, Neurocomputing, 361:C, (243-255), Online publication date: 7-Oct-2019.
  769. Zhang J, Xia Y and Shen G (2019). A novel learning-based global path planning algorithm for planetary rovers, Neurocomputing, 361:C, (69-76), Online publication date: 7-Oct-2019.
  770. Jiang Y, Han D and Ko H (2019). Relay dueling network for visual tracking with broad field‐of‐view, IET Computer Vision, 13:7, (615-622), Online publication date: 1-Oct-2019.
  771. Ha M, Kwon S, Lee Y, Shim Y and Kim J (2019). Where WTS meets WTB, Pervasive and Mobile Computing, 59:C, Online publication date: 1-Oct-2019.
  772. Mozafari M, Ganjtabesh M, Nowzari-Dalini A, Thorpe S and Masquelier T (2019). Bio-inspired digit recognition using reward-modulated spike-timing-dependent plasticity in deep convolutional networks, Pattern Recognition, 94:C, (87-95), Online publication date: 1-Oct-2019.
  773. Bai C, Liu P, Zhao W and Tang X (2019). Guided goal generation for hindsight multi-goal reinforcement learning, Neurocomputing, 359:C, (353-367), Online publication date: 24-Sep-2019.
  774. Pourpanah F, Wang R, Lim C, Wang X, Seera M and Tan C (2019). An improved fuzzy ARTMAP and Q-learning agent model for pattern classification, Neurocomputing, 359:C, (139-152), Online publication date: 24-Sep-2019.
  775. Maisto D, Friston K and Pezzulo G (2019). Caching mechanisms for habit formation in Active Inference, Neurocomputing, 359:C, (298-314), Online publication date: 24-Sep-2019.
  776. Nguyen N, Nguyen T and Nahavandi S (2019). Multi-agent behavioral control system using deep reinforcement learning, Neurocomputing, 359:C, (58-68), Online publication date: 24-Sep-2019.
  777. Kobayashi T and Sugino T Continual Learning Exploiting Structure of Fractal Reservoir Computing Artificial Neural Networks and Machine Learning – ICANN 2019: Workshop and Special Sessions, (35-47)
  778. Polato M, Faggioli G, Lauriola I and Aiolli F Playing the Large Margin Preference Game Artificial Neural Networks and Machine Learning – ICANN 2019: Deep Learning, (792-804)
  779. Wijesuriya V and Abate A Bayes-Adaptive Planning for Data-Efficient Verification of Uncertain Markov Decision Processes Quantitative Evaluation of Systems, (91-108)
  780. Platzer A The Logical Path to Autonomous Cyber-Physical Systems Quantitative Evaluation of Systems, (25-33)
  781. Moridian B, Page B and Mahmoudian N Sample Efficient Reinforcement Learning for Navigation in Complex Environments 2019 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), (15-21)
  782. Mishra M, Mannaru P, Sidoti D, Bienkowski A, Zhang L and Pattipati K (2019). Context‐Driven Proactive Decision Support for Hybrid Teams, AI Magazine, 40:3, (41-57), Online publication date: 1-Sep-2019.
  783. Zhang T, Cheng Z and Li J (2019). Reinforcement learning-driven address mapping and caching for flash-based remote sensing image processing, Journal of Systems Architecture: the EUROMICRO Journal, 98:C, (374-387), Online publication date: 1-Sep-2019.
  784. Yuan W, Hang K, Kragic D, Wang M and Stork J (2019). End-to-end nonprehensile rearrangement with deep reinforcement learning and simulation-to-reality transfer, Robotics and Autonomous Systems, 119:C, (119-134), Online publication date: 1-Sep-2019.
  785. Butz M, Bilkey D, Humaidan D, Knott A and Otte S (2019). Learning, planning, and control in a monolithic neural event inference architecture, Neural Networks, 117:C, (135-144), Online publication date: 1-Sep-2019.
  786. Zhang F, Tang X, Li X, Khan S and Li Z (2019). Quantifying cloud elasticity with container-based autoscaling, Future Generation Computer Systems, 98:C, (672-681), Online publication date: 1-Sep-2019.
  787. García-Galicia M, Carsteanu A and Clempner J (2019). Continuous-time reinforcement learning approach for portfolio management with time penalization, Expert Systems with Applications: An International Journal, 129:C, (27-36), Online publication date: 1-Sep-2019.
  788. Davoodabadi Farahani M and Mozayani N (2019). Automatic construction and evaluation of macro-actions in reinforcement learning, Applied Soft Computing, 82:C, Online publication date: 1-Sep-2019.
  789. Han D, Böhmer W, Wooldridge M and Rogers A Multi-agent Hierarchical Reinforcement Learning with Dynamic Termination PRICAI 2019: Trends in Artificial Intelligence, (80-92)
  790. Mogavi R, Gujar S, Ma X and Hui P HRCR: Hidden Markov-Based Reinforcement to Reduce Churn in Question Answering Forums PRICAI 2019: Trends in Artificial Intelligence, (364-376)
  791. Chen T, Ma X, You S and Zhang X Soft Actor-Critic-Based Continuous Control Optimization for Moving Target Tracking Image and Graphics, (630-641)
  792. Konen W General Board Game Playing for Education and Research in Generic AI Game Learning 2019 IEEE Conference on Games (CoG), (1-8)
  793. Greenwood G and Ashlock D Monte Carlo Strategies for Exploiting Fairness in N-player Ultimatum Games 2019 IEEE Conference on Games (CoG), (1-7)
  794. Ognibene D, Fiore V and Gu X (2019). Addiction beyond pharmacological effects, Neural Networks, 116:C, (269-278), Online publication date: 1-Aug-2019.
  795. Wang X, Li J, Kuang X, Tan Y and Li J (2019). The security of machine learning in an adversarial setting, Journal of Parallel and Distributed Computing, 130:C, (12-23), Online publication date: 1-Aug-2019.
  796. Toroghi Haghighat A and Shajari M (2019). Block withholding game among bitcoin mining pools, Future Generation Computer Systems, 97:C, (482-491), Online publication date: 1-Aug-2019.
  797. Friedrich S, Schreibauer M and Buss M (2019). Least-squares policy iteration algorithms for robotics, Engineering Applications of Artificial Intelligence, 83:C, (72-84), Online publication date: 1-Aug-2019.
  798. Silva A, Obraczka K, Burleigh S, Nogueira J and Hirata C (2019). A congestion control framework for delay- and disruption tolerant networks, Ad Hoc Networks, 91:C, Online publication date: 1-Aug-2019.
  799. ACM
    Zhao X, Xia L, Tang J and Yin D (2019). "Deep reinforcement learning for search, recommendation, and online advertising: a survey" by Xiangyu Zhao, Long Xia, Jiliang Tang, and Dawei Yin with Martin Vesely as coordinator, ACM SIGWEB Newsletter, 2019:Spring, (1-15), Online publication date: 29-Jul-2019.
  800. Li D, Lei C, Jin Q and Han M Regularization in DQN for Parameter-Varying Control Learning Tasks Advances in Neural Networks – ISNN 2019, (35-44)
  801. Ni C and Wang M Maximum Likelihood Tensor Decomposition of Markov Decision Process 2019 IEEE International Symposium on Information Theory (ISIT), (3062-3066)
  802. Reyes M and Neuhoff D Monotonicity of Entropy in Positively Correlated Ising Trees 2019 IEEE International Symposium on Information Theory (ISIT), (707-711)
  803. Catacora Ocana J, Riccio F, Capobianco R and Nardi D Cooperative Multi-agent Deep Reinforcement Learning in a 2 Versus 2 Free-Kick Task RoboCup 2019: Robot World Cup XXIII, (44-57)
  804. Wang X, Li C, Yu L, Han L, Deng X, Yang E and Ren P (2019). UAV first view landmark localization with active reinforcement learning, Pattern Recognition Letters, 125:C, (549-555), Online publication date: 1-Jul-2019.
  805. Bossens D, Townsend N and Sobey A (2019). Learning to learn with active adaptive perception, Neural Networks, 115:C, (30-49), Online publication date: 1-Jul-2019.
  806. Zhang C and Zheng Z (2019). Task migration for mobile edge computing using deep reinforcement learning, Future Generation Computer Systems, 96:C, (111-118), Online publication date: 1-Jul-2019.
  807. Ben Amor N, El Khalfi Z, Fargier H and Sabbadin R (2019). Lexicographic refinements in possibilistic decision trees and finite-horizon Markov decision processes, Fuzzy Sets and Systems, 366:C, (85-109), Online publication date: 1-Jul-2019.
  808. ACM
    Shang Z, Zgraggen E, Buratti B, Kossmann F, Eichmann P, Chung Y, Binnig C, Upfal E and Kraska T Democratizing Data Science through Interactive Curation of ML Pipelines Proceedings of the 2019 International Conference on Management of Data, (1171-1188)
  809. ACM
    Russo G, Cardellini V and Presti F Reinforcement Learning Based Policies for Elastic Stream Processing on Heterogeneous Resources Proceedings of the 13th ACM International Conference on Distributed and Event-based Systems, (31-42)
  810. ACM
    Ke F, Zhao D, Sun G and Feng W Precise Evaluation for Continuous Action Control in Reinforcement Learning Proceedings of the 2019 3rd High Performance Computing and Cluster Technologies Conference, (67-70)
  811. Abouheaf M and Gueaieb W Model-Free Adaptive Control Approach Using Integral Reinforcement Learning 2019 IEEE International Symposium on Robotic and Sensors Environments (ROSE), (1-7)
  812. Abouheaf M, Mailhot N and Gueaieb W An Online Reinforcement Learning Wing-Tracking Mechanism for Flexible Wing Aircraft 2019 IEEE International Symposium on Robotic and Sensors Environments (ROSE), (1-7)
  813. Abouheaf M and Gueaieb W Neurofuzzy Reinforcement Learning Control Schemes for Optimized Dynamical Performance 2019 IEEE International Symposium on Robotic and Sensors Environments (ROSE), (1-7)
  814. Sciullo L, Trotta A, Gigli L and Di Felice M Deploying W3C Web of Things-Based Interoperable Mash-up Applications for Industry 4.0: A Testbed Wired/Wireless Internet Communications, (3-14)
  815. Takadama K, Yamazaki D, Nakata M and Sato H Complex-Valued-based Learning Classifier System for POMDP Environments 2019 IEEE Congress on Evolutionary Computation (CEC), (1852-1859)
  816. Boyalı A, Hashimoto N, John V and Acarman T Multi-Agent Reinforcement Learning for Autonomous On Demand Vehicles 2019 IEEE Intelligent Vehicles Symposium (IV), (1461-1468)
  817. Moore S and Stamper J Decision Support for an Adversarial Game Environment Using Automatic Hint Generation Intelligent Tutoring Systems, (82-88)
  818. Moran M and Gordon G (2019). Curious Feature Selection, Information Sciences: an International Journal, 485:C, (42-54), Online publication date: 1-Jun-2019.
  819. Feng L, Ali A, Liaqat H, Iftikhar M, Bashir A and Pack S (2019). Stochastic game-based dynamic information delivery system for wireless cooperative networks, Future Generation Computer Systems, 95:C, (277-291), Online publication date: 1-Jun-2019.
  820. Lawhead R and Gosavi A (2019). A bounded actor–critic reinforcement learning algorithm applied to airline revenue management, Engineering Applications of Artificial Intelligence, 82:C, (252-262), Online publication date: 1-Jun-2019.
  821. Wang B, Zhao D and Cheng J (2019). Adaptive cruise control via adaptive dynamic programming with experience replay, Soft Computing - A Fusion of Foundations, Methodologies and Applications, 23:12, (4131-4144), Online publication date: 1-Jun-2019.
  822. Dias Pais G, Dias T, Nascimento J and Miraldo P OmniDRL: Robust Pedestrian Detection using Deep Reinforcement Learning on Omnidirectional Cameras 2019 International Conference on Robotics and Automation (ICRA), (4782-4789)
  823. Abouheaf M and Gueaieb W Multi-Agent Synchronization Using Online Model-Free Action Dependent Dual Heuristic Dynamic Programming Approach 2019 International Conference on Robotics and Automation (ICRA), (2195-2201)
  824. Choi S and Kim J Trajectory-based Probabilistic Policy Gradient for Learning Locomotion Behaviors 2019 International Conference on Robotics and Automation (ICRA), (1-7)
  825. Bayiz Y, Hsu S, Aguiles A, Shade-Alexander Y and Cheng B Experimental Learning of a Lift-Maximizing Central Pattern Generator for a Flapping Robotic Wing 2019 International Conference on Robotics and Automation (ICRA), (1997-2003)
  826. Berscheid L, Rühr T and Kröger T Improving Data Efficiency of Self-supervised Learning for Robotic Grasping 2019 International Conference on Robotics and Automation (ICRA), (2125-2131)
  827. Hussein M, Begum M and Petrik M Inverse Reinforcement Learning of Interaction Dynamics from Demonstrations 2019 International Conference on Robotics and Automation (ICRA), (2267-2274)
  828. Kendall A, Hawke J, Janz D, Mazur P, Reda D, Allen J, Lam V, Bewley A and Shah A Learning to Drive in a Day 2019 International Conference on Robotics and Automation (ICRA), (8248-8254)
  829. Baar J, Sullivan A, Cordorel R, Jha D, Romeres D and Nikovski D Sim-to-Real Transfer Learning using Robustified Controllers in Robotic Tasks involving Complex Dynamics 2019 International Conference on Robotics and Automation (ICRA), (6001-6007)
  830. Luo J, Solowjow E, Wen C, Ojea J, Agogino A, Tamar A and Abbeel P Reinforcement Learning on Variable Impedance Controller for High-Precision Robotic Assembly 2019 International Conference on Robotics and Automation (ICRA), (3080-3087)
  831. Choi Y, Lee K and Oh S Distributional Deep Reinforcement Learning with a Mixture of Gaussians 2019 International Conference on Robotics and Automation (ICRA), (9791-9797)
  832. Parras J and Zazo S (2019). Learning attack mechanisms in Wireless Sensor Networks using Markov Decision Processes, Expert Systems with Applications: An International Journal, 122:C, (376-387), Online publication date: 15-May-2019.
  833. Wilhelmi F, Cano C, Neu G, Bellalta B, Jonsson A and Barrachina-Muñoz S (2019). Collaborative Spatial Reuse in wireless networks via selfish Multi-Armed Bandits, Ad Hoc Networks, 88:C, (129-141), Online publication date: 15-May-2019.
  834. Kano H, Honda J, Sakamaki K, Matsuura K, Nakamura A and Sugiyama M (2019). Good arm identification via bandit feedback, Machine Learning, 108:5, (721-745), Online publication date: 15-May-2019.
  835. ACM
    Wang H, Jenkins P, Wei H, Wu F and Li Z Learning Task-Specific City Region Partition The World Wide Web Conference, (3300-3306)
  836. ACM
    Yao Z, Peddamail J and Sun H CoaCor: Code Annotation for Code Retrieval with Reinforcement Learning The World Wide Web Conference, (2203-2214)
  837. ACM
    Pan F, Cai Q, Tang P, Zhuang F and He Q Policy Gradients for Contextual Recommendations The World Wide Web Conference, (1421-1431)
  838. ACM
    Li M, Qin Z, Jiao Y, Yang Y, Wang J, Wang C, Wu G and Ye J Efficient Ridesharing Order Dispatching with Mean Field Multi-Agent Reinforcement Learning The World Wide Web Conference, (983-994)
  839. ACM
    Pei C, Yang X, Cui Q, Lin X, Sun F, Jiang P, Ou W and Zhang Y Value-aware Recommendation based on Reinforcement Profit Maximization The World Wide Web Conference, (3123-3129)
  840. ACM
    He S and Shin K Spatio-Temporal Capsule-based Reinforcement Learning for Mobility-on-Demand Network Coordination The World Wide Web Conference, (2806-2813)
  841. Song R and Zhu L (2019). Stable value iteration for two-player zero-sum game of discrete-time nonlinear systems based on adaptive dynamic programming, Neurocomputing, 340:C, (180-195), Online publication date: 7-May-2019.
  842. Merdivan E, Singh D, Hanke S and Holzinger A (2019). Dialogue Systems for Intelligent Human Computer Interactions, Electronic Notes in Theoretical Computer Science (ENTCS), 343:C, (57-71), Online publication date: 4-May-2019.
  843. Sigaud O and Stulp F (2019). Policy search in continuous action domains, Neural Networks, 113:C, (28-40), Online publication date: 1-May-2019.
  844. Liu Z and Wu H (2019). New insight into the simultaneous policy update algorithms related to H∞ state feedback control, Information Sciences: an International Journal, 484:C, (84-94), Online publication date: 1-May-2019.
  845. Chen X, Wang W, Cao W and Wu M (2019). Gaussian-kernel-based adaptive critic design using two-phase value iteration, Information Sciences: an International Journal, 482:C, (139-155), Online publication date: 1-May-2019.
  846. Nouri S, Li H, Venugopal S, Guo W, He M and Tian W (2019). Autonomic decentralized elasticity based on a reinforcement learning controller for cloud applications, Future Generation Computer Systems, 94:C, (765-780), Online publication date: 1-May-2019.
  847. Yazdjerdi P, Meskin N, Al-Naemi M, Al Moustafa A and Kovács L (2019). Reinforcement learning-based control of tumor growth under anti-angiogenic therapy, Computer Methods and Programs in Biomedicine, 173:C, (15-26), Online publication date: 1-May-2019.
  848. El Chamie M, Janak D and Açıkmeşe B (2019). Markov decision processes with sequential sensor measurements, Automatica (Journal of IFAC), 103:C, (450-460), Online publication date: 1-May-2019.
  849. Massaro A, De Pellegrini F and Maggi L Optimal Trunk-Reservation by Policy Learning IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, (127-135)
  850. Liang Q and Modiano E Optimal Network Control in Partially-Controllable Networks IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, (397-405)
  851. Bao Y, Peng Y and Wu C Deep Learning-based Job Placement in Distributed Machine Learning Clusters IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, (505-513)
  852. Wang F, Zhang C, Wang F, Liu J, Zhu Y, Pang H and Sun L Intelligent Edge-Assisted Crowdcast with Deep Reinforcement Learning for Personalized QoE IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, (910-918)
  853. Wang H, Niu D and Li B Distributed Machine Learning with a Serverless Architecture IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, (1288-1296)
  854. Zhang Y, Zhao P, Bian K, Liu Y, Song L and Li X DRL360: 360-degree Video Streaming with Deep Reinforcement Learning IEEE INFOCOM 2019 - IEEE Conference on Computer Communications, (1252-1260)
  855. ACM
    Kim D and Ko Y Energy-aware medium access control for energy-harvesting machine-to-machine networks Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, (2399-2405)
  856. ACM
    Habet D and Terrioux C Conflict history based search for constraint satisfaction problem Proceedings of the 34th ACM/SIGAPP Symposium on Applied Computing, (1117-1122)
  857. Fulton N and Platzer A Verifiably Safe Off-Model Reinforcement Learning Tools and Algorithms for the Construction and Analysis of Systems, (413-430)
  858. ACM
    Alexopoulos C, Lachana Z, Androutsopoulou A, Diamantopoulou V, Charalabidis Y and Loutsaris M How Machine Learning is Changing e-Government Proceedings of the 12th International Conference on Theory and Practice of Electronic Governance, (354-363)
  859. Pravin Renold A and Balaji Ganesh A (2019). Energy efficient secure data collection with path-constrained mobile sink in duty-cycled unattended wireless sensor network, Pervasive and Mobile Computing, 55:C, (1-12), Online publication date: 1-Apr-2019.
  860. Fachechi A, Agliari E and Barra A (2019). Dreaming neural networks, Neural Networks, 112:C, (24-40), Online publication date: 1-Apr-2019.
  861. Li N, Zhang Y, Zhu L, Luo W and Kwong S (2019). Reinforcement learning based coding unit early termination algorithm for high efficiency video coding, Journal of Visual Communication and Image Representation, 60:C, (276-286), Online publication date: 1-Apr-2019.
  862. Hajian Heidary M and Aghaie A (2019). Risk averse sourcing in a stochastic supply chain, Computers and Industrial Engineering, 130:C, (62-74), Online publication date: 1-Apr-2019.
  863. Ren H, Zhang H, Wen Y and Liu C (2019). Integral reinforcement learning off-policy method for solving nonlinear multi-player nonzero-sum games with saturated actuator, Neurocomputing, 335:C, (96-104), Online publication date: 28-Mar-2019.
  864. Watanabe T Sampling Strategies for Fuzzy RANSAC Algorithm Based on Reinforcement Learning Integrated Uncertainty in Knowledge Modelling and Decision Making, (122-134)
  865. Li J and Tan Y (2019). A two-stage imitation learning framework for the multi-target search problem in swarm robotics, Neurocomputing, 334:C, (249-264), Online publication date: 21-Mar-2019.
  866. ACM
    Li M, Li A, Huang Y and Chu S Implementation of Deep Reinforcement Learning Proceedings of the 2nd International Conference on Information Science and Systems, (232-236)
  867. Fugate S and Ferguson‐Walter K (2019). Artificial Intelligence and Game Theory Models for Defending Critical Networks with Cyber Deception, AI Magazine, 40:1, (49-62), Online publication date: 1-Mar-2019.
  868. Barto A (2019). Reinforcement Learning, AI Magazine, 40:1, (3-15), Online publication date: 1-Mar-2019.
  869. Misra K, Schwartz E and Abernethy J (2019). Dynamic Online Pricing with Incomplete Information Using Multiarmed Bandit Experiments, Marketing Science, 38:2, (226-252), Online publication date: 1-Mar-2019.
  870. Gohari F, Aliee F and Haghighi H (2019). A Dynamic Local–Global Trust-aware Recommendation approach, Electronic Commerce Research and Applications, 34:C, Online publication date: 1-Mar-2019.
  871. Kazimipour B, Omidvar M, Qin A, Li X and Yao X (2019). Bandit-based cooperative coevolution for tackling contribution imbalance in large-scale optimization problems, Applied Soft Computing, 76:C, (265-281), Online publication date: 1-Mar-2019.
  872. Zhou Y, van Kampen E and Chu Q (2019). Hybrid Hierarchical Reinforcement Learning for online guidance and navigation with partial observability, Neurocomputing, 331:C, (443-457), Online publication date: 28-Feb-2019.
  873. ACM
    Chen M, Beutel A, Covington P, Jain S, Belletti F and Chi E Top-K Off-Policy Correction for a REINFORCE Recommender System Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, (456-464)
  874. ACM
    Qu C, Ji F, Qiu M, Yang L, Min Z, Chen H, Huang J and Croft W Learning to Selectively Transfer Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, (699-707)
  875. ACM
    Jagerman R, Markov I and de Rijke M When People Change their Mind Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining, (447-455)
  876. ACM
    Joshi R, Gupta V, Li X, Cui Y, Wang Z, Ravari Y, Klabjan D, Sifa R, Parsaeian A, Drachen A and Demediuk S A Team Based Player Versus Player Recommender Systems Framework For Player Improvement Proceedings of the Australasian Computer Science Week Multiconference, (1-7)
  877. ACM
    Demediuk S, Tamassia M, Li X and Raffe W Challenging AI Proceedings of the Australasian Computer Science Week Multiconference, (1-7)
  878. ACM
    Ayala A, Henríquez C and Cruz F Reinforcement learning using continuous states and interactive feedback Proceedings of the 2nd International Conference on Applications of Intelligent Systems, (1-5)
  879. Rizvi N and Ramesh D (2019). FBQ-LA, Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, 36:3, (2715-2728), Online publication date: 1-Jan-2019.
  880. Ghesu F, Georgescu B, Zheng Y, Grbic S, Maier A, Hornegger J and Comaniciu D (2018). Multi-Scale Deep Reinforcement Learning for Real-Time 3D-Landmark Detection in CT Scans, IEEE Transactions on Pattern Analysis and Machine Intelligence, 41:1, (176-189), Online publication date: 1-Jan-2019.
  881. Garrido Merchán E, Puente C and Olivas J (2019). Generating a Question Answering System from Text Causal Relations, Hybrid Artificial Intelligent Systems, (14-25)
  882. ACM
    Peng X, Kanazawa A, Malik J, Abbeel P and Levine S (2018). SFV, ACM Transactions on Graphics, 37:6, (1-14), Online publication date: 31-Dec-2018.
  883. ACM
    Meng Z and Zhao H A Smart Sliding Chinese Pinyin Input Method Editor for Touchscreen Devices Proceedings of the 2018 International Conference on Algorithms, Computing and Artificial Intelligence, (1-6)
  884. ACM
    Wan Q, Liu W, Xu L and Guo J Extending the BDI Model with Q-learning in Uncertain Environment Proceedings of the 2018 International Conference on Algorithms, Computing and Artificial Intelligence, (1-6)
  885. François-Lavet V, Henderson P, Islam R, Bellemare M and Pineau J (2018). An Introduction to Deep Reinforcement Learning, Foundations and Trends® in Machine Learning, 11:3-4, (219-354), Online publication date: 20-Dec-2018.
  886. ACM
    Mehta A, Subramanian A and Subramanian A Learning End-to-end Autonomous Driving using Guided Auxiliary Supervision Proceedings of the 11th Indian Conference on Computer Vision, Graphics and Image Processing, (1-8)
  887. Sangiovanni B, Incremona G, Ferrara A and Piastra M Deep Reinforcement Learning Based Self-Configuring Integral Sliding Mode Control Scheme for Robot Manipulators 2018 IEEE Conference on Decision and Control (CDC), (5969-5974)
  888. Lee D, Yoon H and Hovakimyan N Primal-Dual Algorithm for Distributed Reinforcement Learning: Distributed GTD 2018 IEEE Conference on Decision and Control (CDC), (1967-1972)
  889. Paternain S, Andrés Bazerque J, Small A and Ribeiro A Learning Policies for Markov Decision Processes in Continuous Spaces 2018 IEEE Conference on Decision and Control (CDC), (4751-4758)
  890. Faust A, Aimone J, James C and Tapia L Resilient Computing with Reinforcement Learning on a Dynamical System: Case Study in Sorting 2018 IEEE Conference on Decision and Control (CDC), (5999-6006)
  891. Gatsis K and Pappas G Sample Complexity of Networked Control Systems Over Unknown Channels 2018 IEEE Conference on Decision and Control (CDC), (6067-6072)
  892. Yang Z, Zhang K, Hong M and Başar T A Finite Sample Analysis of the Actor-Critic Algorithm 2018 IEEE Conference on Decision and Control (CDC), (2759-2764)
  893. Pang B, Bian T and Jiang Z Data-driven Finite-horizon Optimal Control for Linear Time-varying Discrete-time Systems 2018 IEEE Conference on Decision and Control (CDC), (861-866)
  894. Baumann D, Trimpe S, Zhu J and Martius G Deep Reinforcement Learning for Event-Triggered Control 2018 IEEE Conference on Decision and Control (CDC), (943-950)
  895. Beirigo R, Todorov M and S. Barreto A Online TD(λ) for discrete-time Markov jump linear systems 2018 IEEE Conference on Decision and Control (CDC), (2229-2234)
  896. Subramanian J and Mahajan A Renewal Monte Carlo: Renewal Theory Based Reinforcement Learning 2018 IEEE Conference on Decision and Control (CDC), (5759-5764)
  897. Larsson D, Kotsalis G and Tsiotras P Nash and Correlated Equilibria for Pursuit-Evasion Games Under Lack of Common Knowledge 2018 IEEE Conference on Decision and Control (CDC), (3579-3584)
  898. Gao B and Pavel L On Passivity and Reinforcement Learning in Finite Games 2018 IEEE Conference on Decision and Control (CDC), (340-345)
  899. Sahoo A and Narayanan V Event-based Near Optimal Sampling and Tracking Control of Nonlinear Systems 2018 IEEE Conference on Decision and Control (CDC), (55-60)
  900. Mukherjee S, Bai H and Chakrabortty A On Model-Free Reinforcement Learning of Reduced-Order Optimal Control for Singularly Perturbed Systems 2018 IEEE Conference on Decision and Control (CDC), (5288-5293)
  901. Abouheaf M, Lewis F and Mahmoud M Action Dependent Dual Heuristic Programming Solution for the Dynamic Graphical Games 2018 IEEE Conference on Decision and Control (CDC), (2741-2746)
  902. Jin M and Lavaei J Control-Theoretic Analysis of Smoothness for Stability-Certified Reinforcement Learning 2018 IEEE Conference on Decision and Control (CDC), (6840-6847)
  903. Ali Asad Rizvi S and Lin Z Model-Free Global Stabilization of Discrete-Time Linear Systems with Saturating Actuators Using Reinforcement Learning 2018 IEEE Conference on Decision and Control (CDC), (5276-5281)
  904. Narayanan V, Sahoo A and Jagannathan S Approximate Optimal Distributed Control of Nonlinear Interconnected Systems Using Nonzero-Sum Games 2018 IEEE Conference on Decision and Control (CDC), (2872-2877)
  905. Hu Z, Jiang Y, Ling X and Liu Q Accurate Q-Learning Neural Information Processing, (560-570)
  906. Zhu Y and Zhao D Driving Control with Deep and Reinforcement Learning in The Open Racing Car Simulator Neural Information Processing, (326-334)
  907. Goyal J, Madan A, Narayan A and Rao S ASD: A Framework for Generation of Task Hierarchies for Transfer in Reinforcement Learning Neural Information Processing, (313-325)
  908. Lin J, Peng Z and Cui D Deep Reinforcement Learning for Multi-resource Cloud Job Scheduling Neural Information Processing, (289-302)
  909. Chen S, Zhang X, Wu J and Liu D Averaged-A3C for Asynchronous Deep Reinforcement Learning Neural Information Processing, (277-288)
  910. Labao A, Raquel C and Naval P Induced Exploration on Policy Gradients by Increasing Actor Entropy Using Advantage Target Regions Neural Information Processing, (655-667)
  911. Yan Y and Liu Q Policy Space Noise in Deep Deterministic Policy Gradient Neural Information Processing, (624-634)
  912. Pan Z, Zhang Z and Chen Z Asynchronous Value Iteration Network Neural Information Processing, (169-180)
  913. Li T, Pan J, Zhu D and Meng M Learning to Interrupt: A Hierarchical Deep Reinforcement Learning Framework for Efficient Exploration 2018 IEEE International Conference on Robotics and Biomimetics (ROBIO), (648-653)
  914. Qian D, Ren D, Meng Y, Zhu Y, Ding S, Fu S, Wang Z and Xia H End-to-End Learning Driver Policy using Moments Deep Neural Network 2018 IEEE International Conference on Robotics and Biomimetics (ROBIO), (1533-1538)
  915. Lu Y, Lu H, Cao L, Wu F and Zhu D Learning Deterministic Policy with Target for Power Control in Wireless Networks 2018 IEEE Global Communications Conference (GLOBECOM), (1-7)
  916. Sun F, Cheng N, Zhang S, Zhou H, Gui L and Shen X Reinforcement Learning Based Computation Migration for Vehicular Cloud Computing 2018 IEEE Global Communications Conference (GLOBECOM), (1-6)
  917. Ikeuchi H, Watanabe A, Kawata T and Kawahara R Root-Cause Diagnosis Using Logs Generated by User Actions 2018 IEEE Global Communications Conference (GLOBECOM), (1-7)
  918. Scalabrin M, Michelusi N and Rossi M Beam Training and Data Transmission Optimization in Millimeter-Wave Vehicular Networks 2018 IEEE Global Communications Conference (GLOBECOM), (1-7)
  919. Huynh N, Hoang D, Nguyen D, Dutkiewicz E, Niyato D and Wang P Reinforcement Learning Approach for RF-Powered Cognitive Radio Network with Ambient Backscatter 2018 IEEE Global Communications Conference (GLOBECOM), (1-6)
  920. Liu D and Yang C A Learning-Based Approach to Joint Content Caching and Recommendation at Base Stations 2018 IEEE Global Communications Conference (GLOBECOM), (1-7)
  921. Zhou J, Feng G, Yum T and Qin S Actor-Critic Algorithm Based Discontinuous Reception (DRX) for Machine-Type Communications 2018 IEEE Global Communications Conference (GLOBECOM), (1-7)
  922. Wang J, Liu K, Ni M and Pan J Learning Based Mobility Management Under Uncertainties for Mobile Edge Computing 2018 IEEE Global Communications Conference (GLOBECOM), (1-6)
  923. Meng X, Inaltekin H and Krongold B Deep Reinforcement Learning-Based Power Control in Full-Duplex Cognitive Radio Networks 2018 IEEE Global Communications Conference (GLOBECOM), (1-7)
  924. Zhao N, Liang Y, Niyato D, Pei Y and Jiang Y Deep Reinforcement Learning for User Association and Resource Allocation in Heterogeneous Networks 2018 IEEE Global Communications Conference (GLOBECOM), (1-6)
  925. Shi D, Ding J, Errapotu S, Yue H, Xu W, Zhou X and Pan M Deep Q-Network Based Route Scheduling for Transportation Network Company Vehicles 2018 IEEE Global Communications Conference (GLOBECOM), (1-7)
  926. Ko E and Chen K Wireless Communications Meets Artificial Intelligence: An Illustration by Autonomous Vehicles on Manhattan Streets 2018 IEEE Global Communications Conference (GLOBECOM), (1-7)
  927. Li A, Panahi F, Ohtsuki T and Han G Learning-Based Optimal Channel Selection in the Presence of Jammer for Cognitive Radio Networks 2018 IEEE Global Communications Conference (GLOBECOM), (1-6)
  928. Phan K, Hong Y and Viterbo E (2018). Adaptive Resource Allocation for Secure Two-Hop Relaying Communication, IEEE Transactions on Wireless Communications, 17:12, (8457-8472), Online publication date: 1-Dec-2018.
  929. R. T. R, Das G and Sen D (2018). Energy Efficient Scheduling for Concurrent Transmission in Millimeter Wave WPANs, IEEE Transactions on Mobile Computing, 17:12, (2789-2803), Online publication date: 1-Dec-2018.
  930. Hung S, Hsu H, Cheng S, Cui Q and Chen K (2018). Delay Guaranteed Network Association for Mobile Machines in Heterogeneous Cloud Radio Access Network, IEEE Transactions on Mobile Computing, 17:12, (2744-2760), Online publication date: 1-Dec-2018.
  931. ACM
    Muztoba M, Voleti R, Karabacak F, Park J and Ogras U (2018). Instinctive Assistive Indoor Navigation using Distributed Intelligence, ACM Transactions on Design Automation of Electronic Systems, 23:6, (1-21), Online publication date: 30-Nov-2018.
  932. Mili R and Chikhi S Reinforcement Learning Based Routing Protocols Analysis for Mobile Ad-Hoc Networks Machine Learning for Networking, (247-256)
  933. Jiang H, Qian J, Xie J and Yang J Episode-Experience Replay Based Tree-Backup Method for Off-Policy Actor-Critic Algorithm Pattern Recognition and Computer Vision, (562-573)
  934. Huang Y, Gu C, Wu K and Guan X Parallel Search by Reinforcement Learning for Object Detection Pattern Recognition and Computer Vision, (272-283)
  935. Hsieh T, EL-Manzalawy Y, Sun Y and Honavar V Compositional Stochastic Average Gradient for Machine Learning and Related Applications Intelligent Data Engineering and Automated Learning – IDEAL 2018, (740-752)
  936. ACM
    Qayyum S and Qureshi A A Survey on Machine Learning Based Requirement Prioritization Techniques Proceedings of the 2018 International Conference on Computational Intelligence and Intelligent Systems, (51-55)
  937. Phaniteja S, Dewangan P, Guhan P, Madhava Krishna K and Sarkar A Learning Dual Arm Coordinated Reachability Tasks in a Humanoid Robot with Articulated Torso 2018 IEEE-RAS 18th International Conference on Humanoid Robots (Humanoids), (1-9)
  938. Zhang W, Huang H, Zhang J, Jiang M and Luo G Adaptive-Precision Framework for SGD Using Deep Q-Learning 2018 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), (1-8)
  939. Guériau M and Dusparic I SAMoD: Shared Autonomous Mobility-on-Demand using Decentralized Reinforcement Learning 2018 21st International Conference on Intelligent Transportation Systems (ITSC), (1558-1563)
  940. Hug R, Becker S, Hübner W and Arens M Particle-based Pedestrian Path Prediction using LSTM-MDL Models 2018 21st International Conference on Intelligent Transportation Systems (ITSC), (2684-2691)
  941. Alesiani F and Gkiotsalitis K Reinforcement Learning-Based Bus Holding for High-Frequency Services 2018 21st International Conference on Intelligent Transportation Systems (ITSC), (3162-3168)
  942. Mirchevska B, Pek C, Werling M, Althoff M and Boedecker J High-level Decision Making for Safe and Reasonable Autonomous Lane Changing using Reinforcement Learning 2018 21st International Conference on Intelligent Transportation Systems (ITSC), (2156-2162)
  943. Qiao Z, Muelling K, Dolan J, Palanisamy P and Mudalige P POMDP and Hierarchical Options MDP with Continuous Actions for Autonomous Driving at Intersections 2018 21st International Conference on Intelligent Transportation Systems (ITSC), (2377-2382)
  944. ACM
    Wu C, Shi J, Yang Y and Li W Enhancing Machine Learning Based Malware Detection Model by Reinforcement Learning Proceedings of the 8th International Conference on Communication and Network Security, (74-78)
  945. Prause M and Weigand J (2018). Market Model Benchmark Suite for Machine Learning Techniques, IEEE Computational Intelligence Magazine, 13:4, (14-24), Online publication date: 1-Nov-2018.
  946. Xu Y, Yu J, Headley W and Buehrer R Deep Reinforcement Learning for Dynamic Spectrum Access in Wireless Networks MILCOM 2018 - 2018 IEEE Military Communications Conference (MILCOM), (207-212)
  947. Xu Y, Yu J and Buehrer R Dealing with Partial Observations in Dynamic Spectrum Access: Deep Recurrent Q-Networks MILCOM 2018 - 2018 IEEE Military Communications Conference (MILCOM), (865-870)
  948. Uwano F and Takadama K Strategy for Learning Cooperative Behavior with Local Information for Multi-agent Systems PRIMA 2018: Principles and Practice of Multi-Agent Systems, (663-670)
  949. Bougie N and Ichise R Abstracting Reinforcement Learning Agents with Prior Knowledge PRIMA 2018: Principles and Practice of Multi-Agent Systems, (431-439)
  950. Schäfer D and Hüllermeier E Preference-Based Reinforcement Learning Using Dyad Ranking Discovery Science, (161-175)
  951. Nguyen H, Nguyen B, Dong T, Ngo D and Nguyen T Deep Q-Learning with Multiband Sensing for Dynamic Spectrum Access 2018 IEEE International Symposium on Dynamic Spectrum Access Networks (DySPAN), (1-5)
  952. ACM
    Krishna R, Lee D, Li F and Bernstein M Engagement Learning: Expanding Visual Knowledge by Engaging Online Participants Adjunct Proceedings of the 31st Annual ACM Symposium on User Interface Software and Technology, (87-89)
  953. Woo S, Yeon J, Ji M, Moon I and Park J Deep Reinforcement Learning with Fully Convolutional Neural Network to Solve an Earthwork Scheduling Problem 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (4236-4242)
  954. Shi W, Song S and Wu C High-Level Tracking of Autonomous Underwater Vehicles Based on Pseudo Averaged Q-Learning 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (4138-4143)
  955. Yorita A, Egerton S, Oakman J, Chan C and Kubota N A Robot Assisted Stress Management Framework: Using Conversation to Measure Occupational Stress 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (3761-3767)
  956. Notsu A, Yasuda K, Ubukata S and Honda K Optimization of Learning Cycles in Online Reinforcement Learning Systems 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (3530-3534)
  957. Nguyen D, Vuong T, Kieu H, Pham L and Le T Vision Memory for Target Object Navigation Using Deep Reinforcement Learning: An Empirical Study 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (3267-3273)
  958. Peer E, Menkovski V, Zhang Y and Lee W Shunting Trains with Deep Reinforcement Learning 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (3063-3068)
  959. Xin B, Tang K, Wang L and Chen C Knowledge Transfer between Multi-granularity Models for Reinforcement Learning 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (2881-2886)
  960. Scobee D, Rubies Royo V, Tomlin C and Sastry S Haptic Assistance via Inverse Reinforcement Learning 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (1510-1517)
  961. Park J and Lee S Solving the Memory-Based Memoryless Trade-off Problem for EEG Signal Classification 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (505-510)
  962. Sama K, Morales Y, Akai N, Takeuchi E and Takeda K Learning How to Drive in Blind Intersections from Human Data 2018 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (317-324)
  963. Kartik D, Sabir E, Mitra U and Natarajan P Policy Design for Active Sequential Hypothesis Testing using Deep Learning 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), (741-748)
  964. Savas Y, Ornik M, Cubuktepe M and Topcu U Entropy Maximization for Constrained Markov Decision Processes 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), (911-918)
  965. Li H, Chen H and Zhang W On Model-free Reinforcement Learning for Switched Linear Systems: A Subspace Clustering Approach 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton), (123-130)
  966. Rodriguez-Ramos A, Sampedro C, Bavle H, Moreno I and Campoy P A Deep Reinforcement Learning Technique for Vision-Based Autonomous Multirotor Landing on a Moving Platform 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (1010-1017)
  967. Huang R, Peng Z, Cheng H, Hu J, Qiu J, Zou C and Chen Q Learning-based Walking Assistance Control Strategy for a Lower Limb Exoskeleton with Hemiplegia Patients 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (2280-2285)
  968. Zhu S, Surovik D, Bekris K and Boularias A Efficient Model Identification for Tensegrity Locomotion 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (2985-2990)
  969. Luo J, Solowjow E, Wen C, Ojea J and Agogino A Deep Reinforcement Learning for Robotic Assembly of Mixed Deformable and Rigid Objects 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (2062-2069)
  970. Mehndiratta M, Camci E and Kayacan E Automated Tuning of Nonlinear Model Predictive Controller by Reinforcement Learning 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (3016-3021)
  971. Lathuilière S, Massé B, Mesejo P and Horaud R Deep Reinforcement Learning for Audio-Visual Gaze Control 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (1555-1562)
  972. Henderson P, Vertescher M, Meger D and Coates M Cost Adaptation for Robust Decentralized Swarm Behaviour 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (4099-4106)
  973. Sherstan C, Machado M and Pilarski P Accelerating Learning in Constructive Predictive Frameworks with the Successor Representation 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (2997-3003)
  974. Khan A, Kumar V and Ribeiro A Learning Sample-Efficient Target Reaching for Mobile Robots 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (3080-3087)
  975. Banerjee B Autonomous Acquisition of Behavior Trees for Robot Control 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (3460-3467)
  976. Jones D, Hollinger G, Kuhlman M, Sofge D and Gupta S Stochastic Optimization for Autonomous Vehicles with Limited Control Authority 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (2395-2401)
  977. Derner E, Kubalík J and Babuška R Reinforcement Learning with Symbolic Input-Output Models 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (3004-3009)
  978. Shen M, Habibi G and How J Transferable Pedestrian Motion Prediction Models at Intersections 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (4547-4553)
  979. Ding G, Aghli S, Heckman C and Chen L Game-Theoretic Cooperative Lane Changing Using Data-Driven Models 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (3640-3647)
  980. Huang S, Bhatia K, Abbeel P and Dragan A Establishing Appropriate Trust via Critical States 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (3929-3936)
  981. Berseth G, Kyriazis A, Zinin I, Choi W and van de Panne M Model-Based Action Exploration for Learning Dynamic Motion Skills 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (1540-1546)
  982. Lin J, Somani N, Hu B, Rickert M and Knoll A An Efficient and Time-Optimal Trajectory Generation Approach for Waypoints Under Kinematic Constraints and Error Bounds 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (5869-5876)
  983. Tabor S, Guilliard I and Kolobov A ArduSoar: An Open-Source Thermalling Controller for Resource-Constrained Autopilots 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (6255-6262)
  984. Primeau N, Falcon R, Abielmona R and Petriu E (2018). A Review of Computational Intelligence Techniques in Wireless Sensor and Actuator Networks, IEEE Communications Surveys & Tutorials, 20:4, (2822-2854), Online publication date: 1-Oct-2018.
  985. Mao Q, Hu F and Hao Q (2018). Deep Learning for Intelligent Wireless Networks: A Comprehensive Survey, IEEE Communications Surveys & Tutorials, 20:4, (2595-2621), Online publication date: 1-Oct-2018.
  986. Aslani M, Seipel S, Mesgari M and Wiering M (2018). Traffic signal optimization through discrete and continuous reinforcement learning with robustness analysis in downtown Tehran, Advanced Engineering Informatics, 38:C, (639-655), Online publication date: 1-Oct-2018.
  987. ACM
    Martinez-Gil F, Lozano M, García-Fernández I and Fernández F (2017). Modeling, Evaluation, and Scale on Artificial Pedestrians, ACM Computing Surveys, 50:5, (1-35), Online publication date: 30-Sep-2018.
  988. Schwung D, Reimann J, Schwung A and Ding S Self Learning in Flexible Manufacturing Units: A Reinforcement Learning Approach 2018 International Conference on Intelligent Systems (IS), (31-38)
  989. Dong N, Kampffmeyer M, Liang X, Wang Z, Dai W and Xing E Reinforced Auto-Zoom Net: Towards Accurate and Fast Breast Cancer Segmentation in Whole-Slide Images Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, (317-325)
  990. Liu D and Jiang T Deep Reinforcement Learning for Surgical Gesture Segmentation and Classification Medical Image Computing and Computer Assisted Intervention – MICCAI 2018, (247-255)
  991. Alansary A, Folgoc L, Vaillant G, Oktay O, Li Y, Bai W, Passerat-Palmbach J, Guerrero R, Kamnitsas K, Hou B, McDonagh S, Glocker B, Kainz B and Rueckert D Automatic View Planning with Multi-scale Deep Reinforcement Learning Agents Medical Image Computing and Computer Assisted Intervention – MICCAI 2018, (277-285)
  992. Schneckenreither M and Haeussler S Reinforcement Learning Methods for Operations Research Applications: The Order Release Problem Machine Learning, Optimization, and Data Science, (545-559)
  993. Zhang C, Gupta C, Farahat A, Ristovski K and Ghosh D Equipment Health Indicator Learning Using Deep Reinforcement Learning Machine Learning and Knowledge Discovery in Databases, (488-504)
  994. Temesgene D, Miozzo M and Dini P Dynamic Functional Split Selection in Energy Harvesting Virtual Small Cells Using Temporal Difference Learning 2018 IEEE 29th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), (1813-1819)
  995. Nisioti E and Thomos N Decentralized Reinforcement Learning Based MAC Optimization 2018 IEEE 29th Annual International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC), (1-5)
  996. Wagner M, Basevi H, Shetty R, Li W, Malinowski M, Fritz M and Leonardis A Answering Visual What-If Questions: From Actions to Predicted Scene Descriptions Computer Vision – ECCV 2018 Workshops, (521-537)
  997. Byeon W, Wang Q, Srivastava R and Koumoutsakos P ContextVP: Fully Context-Aware Video Prediction Computer Vision – ECCV 2018, (781-797)
  998. Luc P, Couprie C, LeCun Y and Verbeek J Predicting Future Instance Segmentation by Forecasting Convolutional Features Computer Vision – ECCV 2018, (593-608)
  999. Li Y, Wang L, Yang T and Gong B How Local Is the Local Diversity? Reinforcing Sequential Determinantal Point Processes with Dynamic Ground Sets for Supervised Video Summarization Computer Vision – ECCV 2018, (156-174)
  1000. Liang X, Wang T, Yang L and Xing E CIRL: Controllable Imitative Reinforcement Learning for Vision-Based Self-driving Computer Vision – ECCV 2018, (604-620)
  1001. Zhang J, Wu Q, Shen C, Zhang J, Lu J and van den Hengel A Goal-Oriented Visual Question Generation via Intermediate Rewards Computer Vision – ECCV 2018, (189-204)
  1002. Qu S, Wang J and Jasperneite J Dynamic scheduling in large-scale stochastic processing networks for demand-driven manufacturing using distributed reinforcement learning 2018 IEEE 23rd International Conference on Emerging Technologies and Factory Automation (ETFA), (433-440)
  1003. ACM
    Wan Y, Zhao Z, Yang M, Xu G, Ying H, Wu J and Yu P Improving automatic source code summarization via deep reinforcement learning Proceedings of the 33rd ACM/IEEE International Conference on Automated Software Engineering, (397-407)
  1004. Liu C, Chen Z, Tang J, Xu J and Piao C (2018). Energy-Efficient UAV Control for Effective and Fair Communication Coverage: A Deep Reinforcement Learning Approach, IEEE Journal on Selected Areas in Communications, 36:9, (2059-2070), Online publication date: 1-Sep-2018.
  1005. Zheng Y, Meng Z, Hao J and Zhang Z Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments PRICAI 2018: Trends in Artificial Intelligence, (421-429)
  1006. Li G, He B, Gomez R and Nakamura K Interactive Reinforcement Learning from Demonstration and Human Evaluative Feedback 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), (1156-1162)
  1007. Aykin C, Knopp M and Diepold K Deep Reinforcement Learning for Formation Control 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), (1-5)
  1008. Vasan G and Pilarski P Context-Aware Learning from Demonstration: Using Camera Data to Support the Synergistic Control of a Multi-Joint Prosthetic Arm 2018 7th IEEE International Conference on Biomedical Robotics and Biomechatronics (Biorob), (199-206)
  1009. Cui Y, Zhu L, Fujisaki M, Kanokogi H and Matsubara T Factorial Kernel Dynamic Policy Programming for Vinyl Acetate Monomer Plant Model Control 2018 IEEE 14th International Conference on Automation Science and Engineering (CASE), (304-309)
  1010. Chen X and Jin R Data Fusion Pipelines for Autonomous Smart Manufacturing 2018 IEEE 14th International Conference on Automation Science and Engineering (CASE), (1203-1208)
  1011. Sygulla F, Wittmann R, Seiwald P, Berninger T, Hildebrandt A, Wahrmann D and Rixen D An EtherCAT-Based Real-Time Control System Architecture for Humanoid Robots 2018 IEEE 14th International Conference on Automation Science and Engineering (CASE), (483-490)
  1012. Wang L, Jiang J and Liao L Sentence Compression with Reinforcement Learning Knowledge Science, Engineering and Management, (3-15)
  1013. Tavares A and Chaimowicz L Tabular Reinforcement Learning in Real-Time Strategy Games via Options 2018 IEEE Conference on Computational Intelligence and Games (CIG), (1-8)
  1014. Shao K, Zhao D, Li N and Zhu Y Learning Battles in ViZDoom via Deep Reinforcement Learning 2018 IEEE Conference on Computational Intelligence and Games (CIG), (1-4)
  1015. Torrado R, Bontrager P, Togelius J, Liu J and Perez-Liebana D Deep Reinforcement Learning for General Video Game AI 2018 IEEE Conference on Computational Intelligence and Games (CIG), (1-8)
  1016. Moridian B, Kamal A and Mahmoudian N Learning Navigation Tasks from Demonstration for Semi-Autonomous Remote Operation of Mobile Robots 2018 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), (1-8)
  1017. Pham H, La H, Feil-Seifer D and Van Nguyen L Reinforcement Learning for Autonomous UAV Navigation Using Function Approximation 2018 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), (1-6)
  1018. Goswami A, Zhai C and Mohapatra P Learning to Rank and Discover for E-Commerce Search Machine Learning and Data Mining in Pattern Recognition, (331-346)
  1019. Bernstein A, Burnaev E and Kachan O Reinforcement Learning for Computer Vision and Robot Navigation Machine Learning and Data Mining in Pattern Recognition, (258-272)
  1020. Hatano K Can Machine Learning Techniques Provide Better Learning Support for Elderly People? Distributed, Ambient and Pervasive Interactions: Technologies and Contexts, (178-187)
  1021. Raeissi M and Farinelli A Learning Queuing Strategies in Human-Multi-Robot Interaction Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (2207-2209)
  1022. Li M, Brys T and Kudenko D Introspective Reinforcement Learning and Learning from Demonstration Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (1992-1994)
  1023. Ritschel H Socially-Aware Reinforcement Learning for Personalized Human-Robot Interaction Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (1775-1777)
  1024. Hong Z, Su S, Shann T, Chang Y and Lee C A Deep Policy Inference Q-Network for Multi-Agent Systems Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (1388-1396)
  1025. Metcalf K, Theobald B and Apostoloff N Learning Sharing Behaviors with Arbitrary Numbers of Agents Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (1232-1240)
  1026. Sen S, Rahaman Z, Crawford C and Yücel O Agents for Social (Media) Change Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (1198-1202)
  1027. Silva F and Costa A Object-Oriented Curriculum Generation for Reinforcement Learning Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (1026-1034)
  1028. Jain A and Precup D Eligibility Traces for Options Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (1008-1016)
  1029. Barlier M, Laroche R and Pietquin O Training Dialogue Systems With Human Advice Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (999-1007)
  1030. Liebman E, Zavesky E and Stone P A Stitch in Time - Autonomous Model Management via Reinforcement Learning Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (990-998)
  1031. Vanzo A, Part J, Yu Y, Nardi D and Lemon O Incrementally Learning Semantic Attributes through Dialogue Interaction Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (865-873)
  1032. Phan T, Belzner L, Gabor T and Schmid K Leveraging Statistical Multi-Agent Online Planning with Emergent Value Function Approximation Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (730-738)
  1033. Toro Icarte R, Klassen T, Valenzano R and McIlraith S Teaching Multiple Tasks to an RL Agent using LTL Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (452-461)
  1034. Palmer G, Tuyls K, Bloembergen D and Savani R Lenient Multi-Agent Deep Reinforcement Learning Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (443-451)
  1035. Omidshafiei S, Kim D, Pazis J and How J Crossmodal Attentive Skill Learner Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (139-146)
  1036. Foerster J, Chen R, Al-Shedivat M, Whiteson S, Abbeel P and Mordatch I Learning with Opponent-Learning Awareness Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, (122-130)
  1037. Morley-Drabble C and Singh S One Soft Robot: A Complementary Design & Control Strategy for a Pneumatically Powered Soft Robot 2018 IEEE/ASME International Conference on Advanced Intelligent Mechatronics (AIM), (942-949)
  1038. Miranda Í, Ladeira M and de Castro Aranha C A Comparison Study Between Deep Learning and Genetic Programming Application in Cart Pole Balancing Problem 2018 IEEE Congress on Evolutionary Computation (CEC), (1-7)
  1039. Lemos L, Bazzan A and Pasin M Co-Adaptive Reinforcement Learning in Microscopic Traffic Systems 2018 IEEE Congress on Evolutionary Computation (CEC), (1-8)
  1040. Challita U, Dong L and Saad W (2018). Proactive Resource Management for LTE in Unlicensed Spectrum: A Deep Learning Perspective, IEEE Transactions on Wireless Communications, 17:7, (4674-4689), Online publication date: 1-Jul-2018.
  1041. Alimoradi M and Husseinzadeh Kashan A (2018). A league championship algorithm equipped with network structure and backward Q-learning for extracting stock trading rules, Applied Soft Computing, 68:C, (478-493), Online publication date: 1-Jul-2018.
  1042. Qiao Z, Muelling K, Dolan J, Palanisamy P and Mudalige P Automatically Generated Curriculum based Reinforcement Learning for Autonomous Vehicles in Urban Environment 2018 IEEE Intelligent Vehicles Symposium (IV), (1233-1238)
  1043. Yang S, Li J, Wang J, Liu Z and Yang F Learning Urban Navigation via Value Iteration Network 2018 IEEE Intelligent Vehicles Symposium (IV), (800-805)
  1044. Kang W and Yoo S Dynamic Management of Key States for Reinforcement Learning-assisted Garbage Collection to Reduce Long Tail Latency in SSD 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC), (1-6)
  1045. Sadasivam S, Lee J, Chen Z and Jain R Invited: Efficient Reinforcement Learning for Automating Human Decision-Making in SoC Design 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC), (1-6)
  1046. Watkinson W and Camp T Training a RoboCup Striker Agent via Transferred Reinforcement Learning RoboCup 2018: Robot World Cup XXII, (109-121)
  1047. Aşık O, Görer B and Akın H End-to-End Deep Imitation Learning: Robot Soccer Case Study RoboCup 2018: Robot World Cup XXII, (137-149)
  1048. Xu H, Cao Y, Shang Y, Liu Y, Tan J and Guo L Adversarial Reinforcement Learning for Chinese Text Summarization Computational Science – ICCS 2018, (519-532)
  1049. Aggarwal M, Arora A, Sodhani S and Krishnamurthy B Improving Search Through A3C Reinforcement Learning Based Conversational Agent Computational Science – ICCS 2018, (273-286)
  1050. Satyal S, Weber I, Paik H, Di Ciccio C and Mendling J AB Testing for Process Versions with Contextual Multi-armed Bandit Algorithms Advanced Information Systems Engineering, (19-34)
  1051. Zajdel R and Kusy M Application of Reinforcement Learning to Stacked Autoencoder Deep Network Architecture Optimization Artificial Intelligence and Soft Computing, (267-276)
  1052. Papiez P and Horzyk A Motivated Reinforcement Learning Using Self-Developed Knowledge in Autonomous Cognitive Agent Artificial Intelligence and Soft Computing, (170-182)
  1053. Nakaya Y and Osana Y Deep Q-Network Using Reward Distribution Artificial Intelligence and Soft Computing, (160-169)
  1054. Qi X (2018). Rotor resistance and excitation inductance estimation of an induction motor using deep-Q-learning algorithm, Engineering Applications of Artificial Intelligence, 72:C, (67-79), Online publication date: 1-Jun-2018.
  1055. Sisikoglu Sir E, Pariazar M and Sir M (2018). Capacitated inspection scheduling of multi-unit systems, Computers and Industrial Engineering, 120:C, (471-479), Online publication date: 1-Jun-2018.
  1056. Dunin-Barkowski W and Solovyeva K Pavlov principle and brain reverse engineering 2018 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology (CIBCB), (1-5)
  1057. Zhu D, Li T, Ho D, Wang C and Meng M Deep Reinforcement Learning Supervised Autonomous Exploration in Office Environments 2018 IEEE International Conference on Robotics and Automation (ICRA), (7548-7555)
  1058. Pautrat R, Chatzilygeroudis K and Mouret J Bayesian Optimization with Automatic Prior Selection for Data-Efficient Direct Policy Search 2018 IEEE International Conference on Robotics and Automation (ICRA), (7571-7578)
  1059. Joshi G and Chowdhary G Cross-Domain Transfer in Reinforcement Learning Using Target Apprentice 2018 IEEE International Conference on Robotics and Automation (ICRA), (7525-7532)
  1060. Yuan W, Stork J, Kragic D, Wang M and Hang K Rearrangement with Nonprehensile Manipulation Using Deep Reinforcement Learning 2018 IEEE International Conference on Robotics and Automation (ICRA), (270-277)
  1061. Derner E, Kubalík J and Babuška R Data-driven Construction of Symbolic Process Models for Reinforcement Learning 2018 IEEE International Conference on Robotics and Automation (ICRA), (1-8)
  1062. Brinkmann G, Bessa W, Duecker D, Kreuzer E and Solowjow E Reinforcement Learning of Depth Stabilization with a Micro Diving Agent 2018 IEEE International Conference on Robotics and Automation (ICRA), (1-7)
  1063. Chatzilygeroudis K and Mouret J Using Parameterized Black-Box Priors to Scale Up Model-Based Policy Search for Robotics 2018 IEEE International Conference on Robotics and Automation (ICRA), (1-9)
  1064. Quillen D, Jang E, Nachum O, Finn C, Ibarz J and Levine S Deep Reinforcement Learning for Vision-Based Robotic Grasping: A Simulated Comparative Evaluation of Off-Policy Methods 2018 IEEE International Conference on Robotics and Automation (ICRA), (6284-6291)
  1065. Heim S, Ruppert F, Sarvestani A and Spröwitz A Shaping in Practice: Training Wheels to Learn Fast Hopping Directly in Hardware 2018 IEEE International Conference on Robotics and Automation (ICRA), (1-6)
  1066. Guo M, Andersson S and Dimarogonas D Human-in-the-Loop Mixed-Initiative Control Under Temporal Tasks 2018 IEEE International Conference on Robotics and Automation (ICRA), (6395-6400)
  1067. Bayiz Y, Chen L, Hsu S, Liu P, Aguiles A and Cheng B Real-Time Learning of Efficient Lift Generation on a Dynamically Scaled Flapping Wing Using Policy Search 2018 IEEE International Conference on Robotics and Automation (ICRA), (1-5)
  1068. Thomas G, Chien M, Tamar A, Ojea J and Abbeel P Learning Robotic Assembly from CAD 2018 IEEE International Conference on Robotics and Automation (ICRA), (1-9)
  1069. Li K, Rath M and Burdick J Inverse Reinforcement Learning via Function Approximation for Clinical Motion Analysis 2018 IEEE International Conference on Robotics and Automation (ICRA), (610-617)
  1070. Schiatti L, Tessadori J, Deshpande N, Barresi G, King L and Mattos L Human in the Loop of Robot Learning: EEG-Based Reward Signal for Target Identification and Reaching Task 2018 IEEE International Conference on Robotics and Automation (ICRA), (4473-4480)
  1071. Kemna S, Kroemer O and Sukhatme G Pilot Surveys for Adaptive Informative Sampling 2018 IEEE International Conference on Robotics and Automation (ICRA), (6417-6424)
  1072. Liu J and Williams R Optimal Intermittent Deployment and Sensor Selection for Environmental Sensing with Multi-Robot Teams 2018 IEEE International Conference on Robotics and Automation (ICRA), (1078-1083)
  1073. Daher T, Jemaa S and Decreusefond L Softwarized and distributed learning for SON management systems NOMS 2018 - 2018 IEEE/IFIP Network Operations and Management Symposium, (1-7)
  1074. Budhdev N, Chan M and Mitra T PR3: Power Efficient and Low Latency Baseband Processing for LTE Femtocells IEEE INFOCOM 2018 - IEEE Conference on Computer Communications, (2357-2365)
  1075. Xu Z, Tang J, Meng J, Zhang W, Wang Y, Liu C and Yang D Experience-driven Networking: A Deep Reinforcement Learning based Approach IEEE INFOCOM 2018 - IEEE Conference on Computer Communications, (1871-1879)
  1076. Ceran E, Gündüz D and György A Average age of information with hybrid ARQ under a resource constraint 2018 IEEE Wireless Communications and Networking Conference (WCNC), (1-6)
  1077. D'Oro S, Zappone A, Palazzo S and Lops M A learning-based approach to energy efficiency maximization in wireless networks 2018 IEEE Wireless Communications and Networking Conference (WCNC), (1-6)
  1078. Sadeghi A, Sheikholeslami F, Marques A and Giannakis G Reinforcement Learning for 5G Caching with Dynamic Cost 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (6653-6657)
  1079. Jiang Y, Shin H and Ko H Precise Regression for Bounding Box Correction for Improved Tracking Based on Deep Reinforcement Learning 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (1643-1647)
  1080. Ortiz A, Weber T and Klein A A Two-Layer Reinforcement Learning Solution for Energy Harvesting Data Dissemination Scenarios 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (6648-6652)
  1081. Noshad M and Hero A Rate-Optimal Meta Learning of Classification Error 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (2481-2485)
  1082. Tjandra A, Sakti S and Nakamura S Sequence-to-Sequence Asr Optimization Via Reinforcement Learning 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (5829-5833)
  1083. Redhu S, Garg P and Hegde R Joint Mobile Sink Scheduling and Data Aggregation in Asynchronous Wireless Sensor Networks Using Q-Learning 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (6438-6442)
  1084. ACM
    McGough A and Forshaw M Evaluation of Energy Consumption of Replicated Tasks in a Volunteer Computing Environment Companion of the 2018 ACM/SPEC International Conference on Performance Engineering, (85-90)
  1085. Czibula G, Czibula I and Marian Z (2018). An effective approach for determining the class integration test order using reinforcement learning, Applied Soft Computing, 65:C, (517-530), Online publication date: 1-Apr-2018.
  1086. Bacon P and Precup D (2018). Constructing Temporal Abstractions Autonomously in Reinforcement Learning, AI Magazine, 39:1, (39-50), Online publication date: 1-Mar-2018.
  1087. Phatak S, Freigoun M, Martín C, Rivera D, Korinek E, Adams M, Buman M, Klasnja P and Hekler E (2018). Modeling individual differences, Journal of Biomedical Informatics, 79:C, (82-97), Online publication date: 1-Mar-2018.
  1088. Englert P and Toussaint M (2018). Learning manipulation skills from a single demonstration, International Journal of Robotics Research, 37:1, (137-154), Online publication date: 1-Jan-2018.
  1089. Matei I, Minhas R, de Kleer J and Ganguli A Improving state-action space exploration in reinforcement learning using geometric properties 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (6403-6408)
  1090. Vamvoudakis K and Safaei F Stochastic zero-sum nash games for uncertain nonlinear Markovian jump systems 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (5582-5589)
  1091. Marco A, Hennig P, Schaal S and Trimpe S On the design of LQR kernels for efficient controller learning 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (5193-5200)
  1092. Sutter T, Kamoutsi A, Esfahani P and Lygeros J Data-driven approximate dynamic programming: A linear programming approach 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (5174-5179)
  1093. Jiang B, Roozbehani M and Dahleh M Coalitional game with opinion exchange 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (5008-5013)
  1094. Arruda E, Fragoso M and Ourique F Multi-partition time aggregation for Markov Chains 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (4922-4927)
  1095. Guo M and Zavlanos M Temporal task planning in wirelessly connected environments with unknown channel quality 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (4161-4168)
  1096. Larsson D, Braun D and Tsiotras P Hierarchical state abstractions for decision-making problems with computational constraints 2017 IEEE 56th Annual Conference on Decision and Control (CDC), (1138-1143)
  1097. Chellaboina V Model-Free Optimal Control: A Critical Analysis Big Data Analytics, (215-222)
  1098. Hung S, Zhang X, Festag A, Chen K and Fettweis G Virtual Cells and Virtual Networks Enable Low-Latency Vehicle-to-Vehicle Communication GLOBECOM 2017 - 2017 IEEE Global Communications Conference, (1-7)
  1099. Xie J, Liang Y, Pei Y, Fang J and Wang L Intelligent Multi-Radio Access Based on Markov Decision Process GLOBECOM 2017 - 2017 IEEE Global Communications Conference, (1-6)
  1100. Lien S, Hung S, Deng D and Wang Y Efficient Ultra-Reliable and Low Latency Communications and Massive Machine-Type Communications in 5G New Radio GLOBECOM 2017 - 2017 IEEE Global Communications Conference, (1-7)
  1101. Dong L, Niyato D, Kim D and Hoang D A Joint Scheduling and Content Caching Scheme for Energy Harvesting Access Points with Multicast GLOBECOM 2017 - 2017 IEEE Global Communications Conference, (1-6)
  1102. ACM
    Valadarsky A, Schapira M, Shahaf D and Tamar A Learning to Route Proceedings of the 16th ACM Workshop on Hot Topics in Networks, (185-191)
  1103. Spitz J, Bouyarmane K, Ivaldi S and Mouret J Trial-and-error learning of repulsors for humanoid QP-based whole-body control 2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids), (468-475)
  1104. Xiong F, Liu Z, Yang X, Sun B, Chiu C and Qiao H A Bayesian Posterior Updating Algorithm in Reinforcement Learning Neural Information Processing, (418-426)
  1105. Tan F, Yan P and Guan X Deep Reinforcement Learning: From Q-Learning to Deep Q-Learning Neural Information Processing, (475-483)
  1106. Li F, Qin J, Kang Y and Zheng W Consensus Based Distributed Reinforcement Learning for Nonconvex Economic Power Dispatch in Microgrids Neural Information Processing, (831-839)
  1107. Reinke C, Uchibe E and Doya K Average Reward Optimization with Multiple Discounting Reinforcement Learners Neural Information Processing, (789-800)
  1108. Li H, Wei T, Ren A, Zhu Q and Wang Y Deep reinforcement learning: Framework, applications, and embedded implementations: Invited paper 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), (847-854)
  1109. Mera-Gómez C, Ramírez F, Bahsoon R and Buyya R A Debt-Aware Learning Approach for Resource Adaptations in Cloud Elasticity Management Service-Oriented Computing, (367-382)
  1110. Chae H, Kang C, Kim B, Kim J, Chung C and Choi J Autonomous braking system via deep reinforcement learning 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), (1-6)
  1111. Liu Y, Liu L and Chen W Intelligent traffic light control using distributed multi-agent Q learning 2017 IEEE 20th International Conference on Intelligent Transportation Systems (ITSC), (1-8)
  1112. Hajiaghajani F and Biswas S Towards scalable and privacy preserving commercial content dissemination in social wireless networks 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), (1-7)
  1113. Barrachina-Muñoz S and Bellalta B Learning optimal routing for the uplink in LPWANs using similarity-enhanced e-greedy 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), (1-5)
  1114. Daher T, Ben Jemaa S and Decreusefond L Cognitive management of self-organized radio networks based on multi-armed bandit 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), (1-5)
  1115. AlQerm I and Shihada B Enhanced machine learning scheme for energy efficient resource allocation in 5G heterogeneous cloud radio access networks 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), (1-7)
  1116. Roessingh J, Toubman A, van Oijen J, Poppinga G, Løvlid R, Hou M and Luotsinen L Machine learning techniques for autonomous agents in military simulations - Multum in parvo 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (3445-3450)
  1117. Toghiani-Rizi B, Kamrani F, Luotsinen L and Gisslén L Evaluating deep reinforcement learning for computer generated forces in ground combat simulation 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (3433-3438)
  1118. Huang Y, Huang S, Chen H, Chen Y, Liu C and Li T A 3D vision based object grasping posture learning system for home service robots 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (2690-2695)
  1119. Dunjko V, Taylor J and Briegel H Advances in quantum reinforcement learning 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (282-287)
  1120. Anwar A, Atia G and Guirguis M Dynamic game-theoretic defense approach against stealthy Jamming attacks in wireless networks 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton), (252-258)
  1121. Arin A and Rabadi G (2017). Integrating estimation of distribution algorithms versus Q-learning into Meta-RaPS for solving the 0-1 multidimensional knapsack problem, Computers and Industrial Engineering, 112:C, (706-720), Online publication date: 1-Oct-2017.
  1122. Morozkin P, Swynghedauw M and Trocan M Neural Network Based Eye Tracking Computational Collective Intelligence, (600-609)
  1123. Chohra A and Madani K Adaptive Motivation System Under Modular Reinforcement Learning for Agent Decision-Making Modeling of Biological Regulation Computational Collective Intelligence, (32-42)
  1124. Rosman G, Paull L and Rus D Hybrid control and learning with coresets for autonomous vehicles 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (6894-6901)
  1125. Zhang J, Springenberg J, Boedecker J and Burgard W Deep reinforcement learning with successor features for navigation across similar environments 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (2371-2378)
  1126. Ghesu F, Georgescu B, Grbic S, Maier A, Hornegger J and Comaniciu D Robust Multi-scale Anatomical Landmark Detection in Incomplete 3D-CT Data Medical Image Computing and Computer Assisted Intervention − MICCAI 2017, (194-202)
  1127. Oonishi H and Iima H Improving generalization ability in a puzzle game using reinforcement learning 2017 IEEE Conference on Computational Intelligence and Games (CIG), (232-239)
  1128. Isaksen A, Wallace D, Finkelstein A and Nealen A Simulating strategy and dexterity for puzzle games 2017 IEEE Conference on Computational Intelligence and Games (CIG), (142-149)
  1129. Demediuk S, Tamassia M, Raffe W, Zambetta F, Li X and Mueller F Monte Carlo tree search based algorithms for dynamic difficulty adjustment 2017 IEEE Conference on Computational Intelligence and Games (CIG), (53-59)
  1130. Li B, Xia L and Zhao Q Complexity analysis of reinforcement learning and its application to robotics 2017 13th IEEE Conference on Automation Science and Engineering (CASE), (1425-1426)
  1131. Wiley T, Bratko I and Sammut C A Machine Learning System for Controlling a Rescue Robot RoboCup 2017: Robot World Cup XXI, (108-119)
  1132. Lobos-Tsunekawa K, Leottau D and Ruiz-del-Solar J Toward Real-Time Decentralized Reinforcement Learning Using Finite Support Basis Functions RoboCup 2017: Robot World Cup XXI, (95-107)
  1133. Bai A, Russell S and Chen X Concurrent Hierarchical Reinforcement Learning for RoboCup Keepaway RoboCup 2017: Robot World Cup XXI, (190-203)
  1134. Vasan G and Pilarski P Learning from demonstration: Teaching a myoelectric prosthesis with an intact limb via reinforcement learning 2017 International Conference on Rehabilitation Robotics (ICORR), (1457-1464)
  1135. Travnik J and Pilarski P Representing high-dimensional data to intelligent prostheses and other wearable assistive robots: A first comparison of tile coding and selective Kanerva coding 2017 International Conference on Rehabilitation Robotics (ICORR), (1443-1450)
  1136. ACM
    Spieker H, Gotlieb A, Marijan D and Mossige M Reinforcement learning for automatic test case prioritization and selection in continuous integration Proceedings of the 26th ACM SIGSOFT International Symposium on Software Testing and Analysis, (12-22)
  1137. Gottwald M, Meyer D, Shen H and Diepold K Learning to walk with prior knowledge 2017 IEEE International Conference on Advanced Intelligent Mechatronics (AIM), (1369-1374)
  1138. Qi X, Luo Y, Wu G, Boriboonsomsin K and Barth M Deep reinforcement learning-based vehicle energy efficiency autonomous learning system 2017 IEEE Intelligent Vehicles Symposium (IV), (1228-1233)
  1139. Chen X, Zhai Y, Lu C, Gong J and Wang G A learning model for personalized adaptive cruise control 2017 IEEE Intelligent Vehicles Symposium (IV), (379-384)
  1140. Kuefler A, Morton J, Wheeler T and Kochenderfer M Imitating driver behavior with generative adversarial networks 2017 IEEE Intelligent Vehicles Symposium (IV), (204-211)
  1141. Antipov D and Buzdalova A Runtime Analysis of Random Local Search on JUMP function with Reinforcement Based Selection of Auxiliary Objectives 2017 IEEE Congress on Evolutionary Computation (CEC), (2169-2176)
  1142. Kunanusont K, Lucas S and Pérez-Liébana D General Video Game AI: Learning from screen capture 2017 IEEE Congress on Evolutionary Computation (CEC), (2078-2085)
  1143. Budhraja K and Oates T Neuroevolution-based Inverse Reinforcement Learning 2017 IEEE Congress on Evolutionary Computation (CEC), (67-76)
  1144. Poon J, Cui Y, Miro J, Matsubara T and Sugimoto K Local driving assistance from demonstration for mobility aids 2017 IEEE International Conference on Robotics and Automation (ICRA), (5935-5941)
  1145. Doerr A, Nguyen-Tuong D, Marco A, Schaal S and Trimpe S Model-based policy search for automatic tuning of multivariate PID controllers 2017 IEEE International Conference on Robotics and Automation (ICRA), (5295-5301)
  1146. Montgomery W, Ajay A, Finn C, Abbeel P and Levine S Reset-free guided policy search: Efficient deep reinforcement learning with stochastic initial states 2017 IEEE International Conference on Robotics and Automation (ICRA), (3373-3380)
  1147. Martinez-Cantin R Bayesian optimization with adaptive kernels for robot control 2017 IEEE International Conference on Robotics and Automation (ICRA), (3350-3356)
  1148. Bös J, Wahrburg A and Listmann K Iteratively Learned and Temporally Scaled Force Control with application to robotic assembly in unstructured environments 2017 IEEE International Conference on Robotics and Automation (ICRA), (3000-3007)
  1149. Sung J, Salisbury J and Saxena A Learning to represent haptic feedback for partially-observable tasks 2017 IEEE International Conference on Robotics and Automation (ICRA), (2802-2809)
  1150. Pace A and Burden S Decoupled limbs yield differentiable trajectory outcomes through intermittent contact in locomotion and manipulation 2017 IEEE International Conference on Robotics and Automation (ICRA), (2261-2266)
  1151. Marco A, Berkenkamp F, Hennig P, Schoellig A, Krause A, Schaal S and Trimpe S Virtual vs. real: Trading off simulations and physical experiments in reinforcement learning with Bayesian optimization 2017 IEEE International Conference on Robotics and Automation (ICRA), (1557-1563)
  1152. Chen Y, Liu M, Everett M and How J Decentralized non-communicating multiagent collision avoidance with deep reinforcement learning 2017 IEEE International Conference on Robotics and Automation (ICRA), (285-292)
  1153. Daher T, Jemaa S and Decreusefond L Q-Learning for Policy Based SON Management in wireless Access Networks 2017 IFIP/IEEE Symposium on Integrated Network and Service Management (IM), (1091-1096)
  1154. ACM
    Williams J, Rafferty A, Ang A, Tingley D, Lasecki W and Kim J Connecting Instructors and Learning Scientists via Collaborative Dynamic Experimentation Proceedings of the 2017 CHI Conference Extended Abstracts on Human Factors in Computing Systems, (3012-3018)
  1155. Kavalerov M, Shilova Y and Likhacheva Y Adaptive Q-routing with Random Echo and Route Memory Proceedings of the 20th Conference of Open Innovations Association FRUCT, (138-145)
  1156. Shen C and van der Schaar M A Learning Approach to Frequent Handover Mitigations in 3GPP Mobility Protocols 2017 IEEE Wireless Communications and Networking Conference (WCNC), (1-6)
  1157. Guo K, Yang C and Liu T Caching in Base Station with Recommendation via Q-Learning 2017 IEEE Wireless Communications and Networking Conference (WCNC), (1-6)
  1158. Aref M, Jayaweera S and Machuzak S Multi-Agent Reinforcement Learning Based Cognitive Anti-Jamming 2017 IEEE Wireless Communications and Networking Conference (WCNC), (1-6)
  1159. Wang X and Shen C Dynamic User Association in Enterprise Small Cell Networks with Hybrid Access 2017 IEEE Wireless Communications and Networking Conference (WCNC), (1-6)
  1160. ACM
    Senft E, Lemaignan S, Baxter P and Belpaeme T Leveraging Human Inputs in Interactive Machine Learning for Human Robot Interaction Proceedings of the Companion of the 2017 ACM/IEEE International Conference on Human-Robot Interaction, (281-282)
  1161. ACM
    Ramicic M and Bonarini A Attention-Based Experience Replay in Deep Q-Learning Proceedings of the 9th International Conference on Machine Learning and Computing, (476-481)
  1162. Wang Z, Tian Z, Xu J, Maeda R, Li H, Yang P, Wang Z, Duong L, Wang Z and Chen X Modular reinforcement learning for self-adaptive energy efficiency optimization in multicore system 2017 22nd Asia and South Pacific Design Automation Conference (ASP-DAC), (684-689)
  1163. Goharimanesh M, Abbasi Jannatabadi E, Moeinkhah H, Naghibi-Sistani M and Akbari A (2017). An intelligent controller for ionic polymer metal composites using optimized fuzzy reinforcement learning, Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, 33:1, (125-136), Online publication date: 1-Jan-2017.
  1164. Wu C, Song H, Yan C and Wang Y (2017). A fuzzy-based function approximation technique for reinforcement learning, Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, 32:6, (3909-3920), Online publication date: 1-Jan-2017.
  1165. Fernandez-Gauna B, Fernandez-Gamiz U and Graña M (2017). Variable speed wind turbine controller adaptation by reinforcement learning, Integrated Computer-Aided Engineering, 24:1, (27-39), Online publication date: 1-Jan-2017.
  1166. Xu J and Ren S Online Learning for Offloading and Autoscaling in Renewable-Powered Mobile Edge Computing 2016 IEEE Global Communications Conference (GLOBECOM), (1-6)
  1167. Begashaw S, Nguyen D and Dandekar K Enhancing Blind Interference Alignment with Reinforcement Learning 2016 IEEE Global Communications Conference (GLOBECOM), (1-7)
  1168. Ortiz A, Al-Shatri H, Li X, Weber T and Klein A A Learning Based Solution for Energy Harvesting Decode-and-Forward Two-Hop Communications 2016 IEEE Global Communications Conference (GLOBECOM), (1-7)
  1169. Chai J, Fang R, Liu C and She L (2016). Collaborative Language Grounding Toward Situated Human‐Robot Dialogue, AI Magazine, 37:4, (32-45), Online publication date: 1-Dec-2016.
  1170. Wang P, Rowe J, Mott B and Lester J Decomposing Drama Management in Educational Interactive Narrative: A Modular Reinforcement Learning Approach Interactive Storytelling, (270-282)
  1171. ACM
    Clapp L, Bastani O, Anand S and Aiken A Minimizing GUI event traces Proceedings of the 2016 24th ACM SIGSOFT International Symposium on Foundations of Software Engineering, (422-434)
  1172. Bouten N, Claeys M, Van Poecke B, Latré S and De Turck F Dynamic Server Selection Strategy for Multi-server HTTP Adaptive Streaming Services Proceedings of the 12th Conference on International Conference on Network and Service Management, (82-90)
  1173. Wender S and Watson I Combining Case-Based Reasoning and Reinforcement Learning for Tactical Unit Selection in Real-Time Strategy Game AI Case-Based Reasoning Research and Development, (413-429)
  1174. Harutyunyan A, Bellemare M, Stepleton T and Munos R Q(λ) with Off-Policy Corrections Algorithmic Learning Theory, (305-320)
  1175. Ghesu F, Georgescu B, Mansi T, Neumann D, Hornegger J and Comaniciu D An Artificial Agent for Anatomical Landmark Detection in Medical Images Medical Image Computing and Computer-Assisted Intervention - MICCAI 2016, (229-237)
  1176. ACM
    Demediuk S, Raffe W and Li X An Adaptive Training Framework for Increasing Player Proficiency in Games and Simulations Proceedings of the 2016 Annual Symposium on Computer-Human Interaction in Play Companion Extended Abstracts, (125-131)
  1177. Mourning R and Tang Y Virtual reality social training for adolescents with high-functioning autism 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (4848-4853)
  1178. Wang L, Brun O and Gelenbe E Adaptive workload distribution for local and remote Clouds 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (3984-3988)
  1179. Kamrani F, Luotsinen L and Løvlid R Learning objective agent behavior using a data-driven modeling approach 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (2175-2181)
  1180. Toubman A, Roessingh J, van Oijen J, Løvlid R, Hou M, Meyer C, Luotsinen L, Rijken R, Harris J and Turčaník M Modeling behavior of Computer Generated Forces with Machine Learning Techniques, the NATO Task Group approach 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC), (1906-1911)
  1181. Shaker N Intrinsically motivated reinforcement learning: A promising framework for procedural content generation 2016 IEEE Conference on Computational Intelligence and Games (CIG), (1-8)
  1182. Kurek M and Jaśkowski W Heterogeneous team deep q-learning in low-dimensional multi-agent environments 2016 IEEE Conference on Computational Intelligence and Games (CIG), (1-8)
  1183. Kiourt C and Kalles D Using opponent models to train inexperienced synthetic agents in social environments 2016 IEEE Conference on Computational Intelligence and Games (CIG), (1-4)
  1184. Kiourt C and Kalles D (2016). A platform for large-scale game-playing multi-agent systems on a high performance computing infrastructure, Multiagent and Grid Systems, 12:1, (35-54), Online publication date: 1-Jan-2016.
  1185. Wu G, Yuan C, Leng B and Wang X Finite-to-Infinite N-Best POMDP for Spoken Dialogue Management Chinese Computational Linguistics and Natural Language Processing Based on Naturally Annotated Big Data, (369-380)
  1186. Magyar G and Vircikova M Socially-Assistive Emotional Robot that Learns from the Wizard During the Interaction for Preventing Low Back Pain in Children Social Robotics, (411-420)
  1187. ACM
    Valerio V, Petrioli C, Pescosolido L and van der Schaar M A Reinforcement Learning-based Data-Link Protocol for Underwater Acoustic Communications Proceedings of the 10th International Conference on Underwater Networks & Systems, (1-5)
  1188. Yu H and Bertsekas D (2015). A Mixed Value and Policy Iteration Method for Stochastic Control with Universally Measurable Policies, Mathematics of Operations Research, 40:4, (926-968), Online publication date: 1-Oct-2015.
  1189. Apostolopoulos S, Leibold M and Buss M Settling time reduction for underactuated walking robots 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (6402-6408)
  1190. Masuyama G and Umeda K Apprenticeship learning based on inconsistent demonstrations 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (5273-5278)
  1191. Zhang C, Zhang H and Parker L Feature Space Decomposition for effective robot adaptation 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), (441-448)
  1192. ACM
    Amarilli A, Maniu S and Senellart P (2015). Intensional data on the web, ACM SIGWEB Newsletter, 2015:Summer, (1-12), Online publication date: 17-Aug-2015.
  1193. ACM
    El-Roby A and Aboulnaga A ALEX Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, (1839-1853)
  1194. Lacerda A, Santos R, Veloso A and Ziviani N (2015). Improving daily deals recommendation using explore-then-exploit strategies, Information Retrieval, 18:2, (95-122), Online publication date: 1-Apr-2015.
  1195. ACM
    de la Cruz G, Peng B, Lasecki W and Taylor M Towards Integrating Real-Time Crowd Advice with Reinforcement Learning Companion Proceedings of the 20th International Conference on Intelligent User Interfaces, (17-20)
  1196. Raju L, Sankar S and Milton R (2015). Distributed Optimization of Solar Micro-grid Using Multi Agent Reinforcement Learning, Procedia Computer Science, 46:C, (231-239), Online publication date: 1-Jan-2015.
  1197. Raju L, Milton R, Suresh S and Sankar S (2015). Reinforcement Learning in Adaptive Control of Power System Generation, Procedia Computer Science, 46:C, (202-209), Online publication date: 1-Jan-2015.
  1198. Yliniemi L, Agogino A and Tumer K (2014). Multirobot Coordination for Space Exploration, AI Magazine, 35:4, (61-74), Online publication date: 1-Dec-2014.
  1199. Robertson G and Watson I (2014). A Review of Real‐Time Strategy Game AI, AI Magazine, 35:4, (75-104), Online publication date: 1-Dec-2014.
  1200. Yang X, Liu D and Wei Q Reinforcement-Learning-Based Controller Design for Nonaffine Nonlinear Systems Advances in Neural Networks – ISNN 2014, (51-58)
  1201. ACM
    Luckow K, Păsăreanu C, Dwyer M, Filieri A and Visser W Exact and approximate probabilistic symbolic execution for nondeterministic programs Proceedings of the 29th ACM/IEEE International Conference on Automated Software Engineering, (575-586)
  1202. ACM
    Pejovic V and Musolesi M Anticipatory mobile computing for behaviour change interventions Proceedings of the 2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing: Adjunct Publication, (1025-1034)
  1203. Dimitrakakis C, Li G and Tziortziotis N (2014). The Reinforcement Learning Competition 2014, AI Magazine, 35:3, (61-65), Online publication date: 1-Sep-2014.
  1204. Powell W (2014). Energy and Uncertainty, AI Magazine, 35:3, (8-21), Online publication date: 1-Sep-2014.
  1205. Koedinger K, Brunskill E, Baker R, McLaughlin E and Stamper J (2013). New Potentials for Data‐Driven Intelligent Tutoring System Development and Optimization, AI Magazine, 34:3, (27-41), Online publication date: 1-Sep-2013.
  1206. ACM
    Helms T, Ewald R, Rybacki S and Uhrmacher A A generic adaptive simulation algorithm for component-based simulation systems Proceedings of the 1st ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, (11-22)
  1207. ACM
    Hofmann K, Schuth A, Whiteson S and de Rijke M Reusing historical interaction data for faster online learning to rank for IR Proceedings of the sixth ACM international conference on Web search and data mining, (183-192)
  1208. Rolón M and Martínez E (2012). Agent learning in autonomic manufacturing execution systems for enterprise networking, Computers and Industrial Engineering, 63:4, (901-925), Online publication date: 1-Dec-2012.
  1209. ACM
    El Mougy A and Ibnkahla M A cognitive WSN framework for highway safety based on weighted cognitive maps and Q-learning Proceedings of the second ACM international symposium on Design and analysis of intelligent vehicular networks and applications, (55-62)
  1210. ACM
    Moling O, Baltrunas L and Ricci F Optimal radio channel recommendations with explicit and implicit feedback Proceedings of the sixth ACM conference on Recommender systems, (75-82)
  1211. Carrera A, Ahmadzadeh S, Ajoudani A, Kormushev P, Carreras M and Caldwell D (2012). Towards Autonomous Robotic Valve Turning, Cybernetics and Information Technologies, 12:3, (17-26), Online publication date: 1-Sep-2012.
  1212. Groce A Coverage rewarded Proceedings of the 26th IEEE/ACM International Conference on Automated Software Engineering, (380-383)
  1213. Tenenbaum J and Shrager J (2011). Cancer, AI Magazine, 32:2, (14-26), Online publication date: 1-Jun-2011.
  1214. Fernández S, Aler R and Borrajo D (2011). Knowledge Transfer between Automated Planners, AI Magazine, 32:2, (79-94), Online publication date: 1-Jun-2011.
  1215. Mehta N, Ray S, Tadepalli P and Dietterich T (2011). Automatic Discovery and Transfer of Task Hierarchies in Reinforcement Learning, AI Magazine, 32:1, (35-50), Online publication date: 1-Mar-2011.
  1216. Klenk M, Aha D and Molineaux M (2011). The Case for Case‐Based Transfer Learning, AI Magazine, 32:1, (54-69), Online publication date: 1-Mar-2011.
  1217. Taylor M and Stone P (2011). An Introduction to Intertask Transfer for Reinforcement Learning, AI Magazine, 32:1, (15-34), Online publication date: 1-Mar-2011.
  1218. ACM
    Lee Y, Wampler K, Bernstein G, Popović J and Popović Z Motion fields for interactive character locomotion ACM SIGGRAPH Asia 2010 papers, (1-8)
  1219. Whiteson S, Tanner B and White A (2010). The Reinforcement Learning Competitions, AI Magazine, 31:2, (81-94), Online publication date: 1-Jun-2010.
  1220. Li H (2010). Multiagent Q-learning for aloha-like spectrum access in cognitive radio systems, EURASIP Journal on Wireless Communications and Networking, 2010, (1-13), Online publication date: 1-Apr-2010.
  1221. ACM
    Coros S, Beaudoin P and van de Panne M Robust task-based control policies for physics-based characters ACM SIGGRAPH Asia 2009 papers, (1-9)
  1222. Wawrzyński P A Cat-Like Robot Real-Time Learning to Run Proceedings of the 2009 conference on Adaptive and Natural Computing Algorithms - Volume 5495, (380-390)
  1223. Grześ M and Kudenko D Improving Optimistic Exploration in Model-Free Reinforcement Learning Proceedings of the 2009 conference on Adaptive and Natural Computing Algorithms - Volume 5495, (360-369)
  1224. Urbanowicz R and Moore J (2009). Learning classifier systems, Journal of Artificial Evolution and Applications, 2009, (1-25), Online publication date: 1-Jan-2009.
  1225. Tonmukayakul A and Weiss M (2008). A study of secondary spectrum use using agent-based computational economics, Netnomics, 9:2, (125-151), Online publication date: 1-Oct-2008.
  1226. Mattiussi C, Marbach D, Dürr P and Floreano D (2008). The Age of Analog Networks, AI Magazine, 29:3, (63-76), Online publication date: 1-Sep-2008.
  1227. Knoblock C, Ambite J, Carman M, Michelson M, Szekely P and Tuchinda R (2008). Beyond the Elves, AI Magazine, 29:2, (33-42), Online publication date: 1-Jun-2008.
  1228. Sproewitz A, Moeckel R, Maye J and Ijspeert A (2008). Learning to Move in Modular Robots using Central Pattern Generators and Online Optimization, International Journal of Robotics Research, 27:3-4, (423-443), Online publication date: 1-Mar-2008.
  1229. Fitch R and Butler Z (2008). Million Module March, International Journal of Robotics Research, 27:3-4, (331-343), Online publication date: 1-Mar-2008.
  1230. Varshavskaya P, Kaelbling L and Rus D (2008). Automated Design of Adaptive Controllers for Modular Robots using Reinforcement Learning, International Journal of Robotics Research, 27:3-4, (505-526), Online publication date: 1-Mar-2008.
  1231. Barbakh W and Fyfe C Clustering with Reinforcement Learning Intelligent Data Engineering and Automated Learning - IDEAL 2007, (507-516)
  1232. Li J, Zhang K and Chan L Independent Factor Reinforcement Learning for Portfolio Management Intelligent Data Engineering and Automated Learning - IDEAL 2007, (1020-1031)
  1233. Leng J, Jain L and Fyfe C Convergence Analysis on Approximate Reinforcement Learning Knowledge Science, Engineering and Management, (85-91)
  1234. Omori T, Yokoyama A, Okada H, Ishikawa S and Nagata Y Computational Modeling of Human-Robot Interaction Based on Active Intention Estimation Neural Information Processing, (185-192)
  1235. Sigaud O and Wilson S (2007). Learning classifier systems: a survey, Soft Computing - A Fusion of Foundations, Methodologies and Applications, 11:11, (1065-1078), Online publication date: 1-Sep-2007.
  1236. Goto T, Homma N, Yoshizawa M and Abe K (2007). A phased reinforcement learning algorithm for complex control problems, Artificial Life and Robotics, 11:2, (190-196), Online publication date: 1-Jul-2007.
  1237. Chen G, Low C and Yang Z Extremal search of decision policies for scalable distributed applications Proceedings of the 2nd international conference on Scalable information systems, (1-8)
  1238. Anastasio T and Gad Y (2007). Sparse cerebellar innervation can morph the dynamics of a model oculomotor neural integrator, Journal of Computational Neuroscience, 22:3, (239-254), Online publication date: 1-Jun-2007.
  1239. Verbeeck K, Nowé A, Parent J and Tuyls K (2007). Exploring selfish reinforcement learning in repeated games with stochastic rewards, Autonomous Agents and Multi-Agent Systems, 14:3, (239-269), Online publication date: 1-Jun-2007.
  1240. Jankowski A and Skowron A Toward Perception Based Computing: A Rough-Granular Perspective Web Intelligence Meets Brain Informatics, (122-142)
  1241. Hoshino Y, Sakakura A and Kamei K A proposal of the learning system using the recordable multi-layer type rule base and its application for the fire panic problem Proceedings of the 2006 international conference on Game research and development, (137-140)
  1242. Ponsen M, Muñoz‐Avila H, Spronck P and Aha D (2006). Automatically Generating Game Tactics through Evolutionary Learning, AI Magazine, 27:3, (75-84), Online publication date: 1-Sep-2006.
  1243. ACM
    Sato Y, Akatsuka Y and Nishizono T Reward allotment in an event-driven hybrid learning classifier system for online soccer games Proceedings of the 8th annual conference on Genetic and evolutionary computation, (1753-1760)
  1244. ACM
    Lee G and Bulitko V Genetic algorithms for action set selection across domains Proceedings of the 8th annual conference on Genetic and evolutionary computation, (1697-1704)
  1245. ACM
    Whiteson S and Stone P On-line evolutionary computation for reinforcement learning in stochastic domains Proceedings of the 8th annual conference on Genetic and evolutionary computation, (1577-1584)
  1246. ACM
    Aliprandi D, Mancastroppa A and Matteucci M A Bayesian approach to learning classifier systems in uncertain environments Proceedings of the 8th annual conference on Genetic and evolutionary computation, (1537-1544)
  1247. ACM
    Lanzi P, Loiacono D, Wilson S and Goldberg D Classifier prediction based on tile coding Proceedings of the 8th annual conference on Genetic and evolutionary computation, (1497-1504)
  1248. ACM
    Lanzi P and Loiacono D Standard and averaging reinforcement learning in XCS Proceedings of the 8th annual conference on Genetic and evolutionary computation, (1489-1496)
  1249. ACM
    Taylor M, Whiteson S and Stone P Comparing evolutionary and temporal difference methods in a reinforcement learning domain Proceedings of the 8th annual conference on Genetic and evolutionary computation, (1321-1328)
  1250. ACM
    Bosman P and de Jong E Combining gradient techniques for numerical multi-objective evolutionary optimization Proceedings of the 8th annual conference on Genetic and evolutionary computation, (627-634)
  1251. ACM
    McDowell J, Soto P, Dallery J and Kulubekova S A computational theory of adaptive behavior based on an evolutionary reinforcement mechanism Proceedings of the 8th annual conference on Genetic and evolutionary computation, (175-182)
  1252. Iwata K, Ikeda K and Sakai H (2006). A Statistical Property of Multiagent Learning Based on Markov Decision Process, IEEE Transactions on Neural Networks, 17:4, (829-842), Online publication date: 1-Jul-2006.
  1253. Wang G, Jiang P and Feng Z (2006). Extraction of robot primitive control rules from natural language instructions, International Journal of Automation and Computing, 3:3, (282-290), Online publication date: 1-Jul-2006.
  1254. Nilsson N (2005). Human‐Level Artificial Intelligence? Be Serious!, AI Magazine, 26:4, (68-75), Online publication date: 1-Dec-2005.
  1255. Jangmin O, Lee J, Lee J and Zhang B Dynamic asset allocation exploiting predictors in reinforcement learning framework Proceedings of the 15th European Conference on Machine Learning, (298-309)
  1256. Bayer-Zubek V Learning diagnostic policies from examples by systematic search Proceedings of the 20th conference on Uncertainty in artificial intelligence, (27-34)
  1257. Koenig S, Likhachev M, Liu Y and Furcy D (2004). Incremental Heuristic Search in AI, AI Magazine, 25:2, (99-112), Online publication date: 1-Jun-2004.
  1258. Zimmerman T and Kambhampati S (2003). Learning‐Assisted Automated Planning, AI Magazine, 24:2, (73-96), Online publication date: 1-Jun-2003.
  1259. Borkar V (2002). Q-Learning for Risk-Sensitive Control, Mathematics of Operations Research, 27:2, (294-311), Online publication date: 1-May-2002.
  1260. Koenig S (2001). Agent‐Centered Search, AI Magazine, 22:4, (109-131), Online publication date: 1-Dec-2001.
  1261. Thrun S (2000). Probabilistic Algorithms in Robotics, AI Magazine, 21:4, (93-109), Online publication date: 1-Dec-2000.
Contributors
  • DeepMind Technologies Limited
  • University of Massachusetts Amherst
