ENHANCEMENTS OF FUZZY Q-LEARNING ALGORITHM
Keywords: fuzzy models, reinforcement learning, Q-Learning, automatic generation of fuzzy models
Abstract: The Fuzzy Q-Learning algorithm combines reinforcement learning techniques with fuzzy modelling. It provides a flexible way to discover rules for fuzzy systems automatically through reinforcement learning. In this paper we propose several enhancements to the original algorithm that improve its performance and make it more suitable for problems with continuous input and continuous output spaces. The improvements generalize the set of possible rule conclusions: the aim is not only to discover an appropriate assignment of conclusions to rules automatically, but also to define the actual set of conclusions automatically, given all possible rule conclusions. To improve the algorithm's performance in environments with inertia, a special rule selection policy is proposed.