New approaches to the approximation of solutions in machine learning

Aleksandr S. Gorobtsov; Горобцов Александр Сергеевич; Evgeny N. Ryzhov; Рыжов Евгений Николаевич; Yulia A. Orlova; Орлова Юлия Александровна; Anastasia R. Donsckaia; Донская Анастасия Романовна

doi:10.14357/20718594240208

Authors: Gorobtsov A.S.¹,², Ryzhov E.N.¹, Orlova Y.A.¹, Donsckaia A.R.¹
Affiliations:
1. Volgograd State Technical University
2. A. A. Blagonravov Institute of Mechanical Engineering RAS
Issue: No 2 (2024)
Pages: 106-115
Section: Intelligent Planning and Control
URL: https://journals.rcsi.science/2071-8594/article/view/265523
DOI: https://doi.org/10.14357/20718594240208
EDN: https://elibrary.ru/WEVOWU
ID: 265523

Abstract

This paper considers machine learning problems aimed at determining control laws for robots with complex locomotion. Existing methods, in particular reinforcement learning, are shown to have exponential computational complexity on such problems. The paper substantiates the theoretical possibility of finding a multidimensional control function from the differential-algebraic equations of the dynamics of such systems by varying a selected subset of the coupling (constraint) equations. On this basis, the possibility of significantly reducing the dimension of the parameter space of the optimization problem is analyzed. Examples are given of applying the proposed method to problems in the dynamics of machines and of zoomorphic and anthropomorphic robots. The proposed mathematical method is shown to be compatible with neuromorphic dynamic systems used as the kernel in reservoir computing, and designing hardware that implements reservoir computing on this basis is shown to be admissible in principle.

Keywords

machine learning, variational methods, optimal control, robotics, walking robots, reservoir computing

About the authors

Aleksandr S. Gorobtsov

Volgograd State Technical University; A. A. Blagonravov Institute of Mechanical Engineering RAS

Author for correspondence.
Email: vm@vstu.ru

Doctor of technical sciences, professor, Head of the Department of Higher Mathematics, Chief Researcher

Russian Federation, Volgograd; Moscow

Evgeny N. Ryzhov

Volgograd State Technical University

Email: vm@vstu.ru

Candidate of physical and mathematical sciences, Assistant Professor

Russian Federation, Volgograd

Yulia A. Orlova

Volgograd State Technical University

Email: yulia.orlova@gmail.com

Doctor of technical sciences, docent, Head of the Department

Russian Federation, Volgograd

Anastasia R. Donsckaia

Volgograd State Technical University

Email: donsckaia.anastasiya@yandex.ru

Senior Lecturer

Russian Federation, Volgograd

