M. Guériau, N. Cardozo, I. Dusparic
IEEE 13th International Conference on Self-Adaptive and Self-Organizing Systems (SASO), 2019
Reinforcement learning (RL) is increasingly used to achieve adaptive behaviours in Internet of Things systems that rely on large amounts of sensor data. To address the need for self-adaptation in such environments, techniques have been proposed for detecting environment changes and re-learning behaviours appropriate to those changes. However, given the heterogeneity of sensor inputs, the problem of self-adaptation runs one level deeper: before the learnt behaviour can adapt, the underlying environment representation must adapt first. The granularity of the RL state space may need to change to support more efficient learning, or to match a new granularity of input data. This paper proposes an implementation of Constructivist RL (Con-RL) that enables RL to learn and continuously adapt its state space representations. We propose Multi-Layer Growing Neural Gas (ML-GNG), an extension of the Growing Neural Gas (GNG) clustering algorithm, which autonomously learns suitable state spaces at runtime from sensor data and learnt actions. We also create and continuously update a repository of state spaces, selecting the most appropriate one to use at each time step. We evaluate Con-RL in two scenarios: the canonical single-agent mountain car problem, and a large-scale multi-agent car- and ride-sharing scenario. We demonstrate its ability to adapt to new sensor inputs, to speed up learning through state space optimisation, and to maintain stable long-term performance.
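Since ML-GNG builds on Growing Neural Gas, a minimal sketch of the standard GNG update (Fritzke, 1995) may help situate the approach. This is a generic, illustrative implementation of plain GNG, not the paper's ML-GNG; all hyperparameter values below are assumed defaults, not figures from the paper.

```python
import numpy as np

class GrowingNeuralGas:
    """Standard GNG (Fritzke, 1995): online clustering that grows
    prototype nodes where quantisation error accumulates.
    Illustrative sketch; hyperparameter defaults are assumptions."""

    def __init__(self, dim, eps_b=0.05, eps_n=0.006, max_age=50,
                 grow_every=100, alpha=0.5, decay=0.995, max_nodes=50):
        rng = np.random.default_rng(0)
        self.nodes = [rng.standard_normal(dim) for _ in range(2)]
        self.error = [0.0, 0.0]
        self.edges = {}  # (i, j) with i < j  ->  edge age
        self.eps_b, self.eps_n, self.max_age = eps_b, eps_n, max_age
        self.grow_every, self.alpha = grow_every, alpha
        self.decay, self.max_nodes = decay, max_nodes
        self.step = 0

    def _key(self, i, j):
        return (min(i, j), max(i, j))

    def _neighbours(self, i):
        return [b if a == i else a for (a, b) in self.edges if i in (a, b)]

    def update(self, x):
        self.step += 1
        # Find the winner s1 and runner-up s2 for sample x.
        d = [float(np.sum((w - x) ** 2)) for w in self.nodes]
        s1, s2 = (int(i) for i in np.argsort(d)[:2])
        # Accumulate error on the winner; pull it and its topological
        # neighbours towards the sample.
        self.error[s1] += d[s1]
        self.nodes[s1] += self.eps_b * (x - self.nodes[s1])
        for n in self._neighbours(s1):
            self.nodes[n] += self.eps_n * (x - self.nodes[n])
        # Age the winner's edges, then refresh (or create) the s1-s2 edge.
        for e in self.edges:
            if s1 in e:
                self.edges[e] += 1
        self.edges[self._key(s1, s2)] = 0
        # Drop edges that have grown too old (node removal omitted).
        self.edges = {e: a for e, a in self.edges.items() if a <= self.max_age}
        # Periodically insert a node halfway between the highest-error
        # node q and q's highest-error neighbour f.
        if self.step % self.grow_every == 0 and len(self.nodes) < self.max_nodes:
            q = max(range(len(self.nodes)), key=lambda i: self.error[i])
            nbrs = self._neighbours(q)
            if nbrs:
                f = max(nbrs, key=lambda i: self.error[i])
                self.nodes.append(0.5 * (self.nodes[q] + self.nodes[f]))
                self.error[q] *= self.alpha
                self.error[f] *= self.alpha
                self.error.append(self.error[q])
                r = len(self.nodes) - 1
                self.edges.pop(self._key(q, f), None)
                self.edges[self._key(q, r)] = 0
                self.edges[self._key(f, r)] = 0
        # Decay all accumulated errors.
        self.error = [e * self.decay for e in self.error]

# Example: cluster a 2-D sensor stream into a set of prototype states.
rng = np.random.default_rng(1)
gng = GrowingNeuralGas(dim=2)
for _ in range(5000):
    gng.update(rng.uniform(-1.0, 1.0, size=2))
print(f"{len(gng.nodes)} prototype states learned")
```

Each learned node can serve as one discrete RL state via nearest-node quantisation of the sensor input, which is the role GNG-derived clustering plays in Con-RL as described in the abstract.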