Abstract
Reinforcement learning (RL) is essential for computing game equilibria and estimating payoffs under incomplete information. However, applying RL-based algorithms to the energy trading game among smart microgrids remains a challenge, because no information about the distribution of payoffs is available a priori and the strategy chosen by each microgrid is hidden from its opponents, and even from its trading partners. This paper proposes a new energy trading framework based on the repeated game, in which each microgrid independently selects a strategy at random according to a probability distribution and trades energy in an independent market so as to maximize its average revenue. By establishing the relationship between average utility maximization and the best strategy, two learning-automaton-based algorithms are developed that seek the Nash equilibria in a variety of situations. The novelty of the proposed algorithms lies in incorporating a normalization procedure into the classical linear reward-inaction scheme, which allows the scheme to operate on any bounded stochastic utility. Finally, a numerical example is given to demonstrate the effectiveness of the algorithms.
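The idea of pairing a normalization step with the classical linear reward-inaction update can be illustrated with a minimal Python sketch. This is not the paper's algorithm: it models a single automaton facing a stationary noisy environment rather than the multi-microgrid game, it assumes the utility bounds `u_min` and `u_max` are known in advance, and all function names, the step size, and the noise model are hypothetical choices made for illustration.

```python
import random


def normalize(utility, u_min, u_max):
    """Map a bounded utility into [0, 1]. Assumes known bounds with u_min < u_max."""
    return (utility - u_min) / (u_max - u_min)


def lri_update(probs, chosen, reward, step=0.05):
    """One classical linear reward-inaction step; reward must lie in [0, 1].

    The chosen action gains probability in proportion to the reward, and all
    other actions shrink by the same factor, so the vector still sums to 1.
    A reward of 0 leaves the probabilities unchanged (the "inaction" part).
    """
    return [
        p + step * reward * (1.0 - p) if i == chosen else p - step * reward * p
        for i, p in enumerate(probs)
    ]


def run_automaton(mean_utils, u_min, u_max, rounds=5000, step=0.05, seed=0):
    """Repeatedly sample an action, observe a noisy utility, and update.

    mean_utils holds a hypothetical mean utility per action; the uniform
    noise and the clipping to [u_min, u_max] are illustrative assumptions.
    """
    rng = random.Random(seed)
    probs = [1.0 / len(mean_utils)] * len(mean_utils)
    for _ in range(rounds):
        chosen = rng.choices(range(len(probs)), weights=probs)[0]
        noisy = mean_utils[chosen] + rng.uniform(-1.0, 1.0)
        noisy = min(max(noisy, u_min), u_max)  # keep the utility bounded
        reward = normalize(noisy, u_min, u_max)
        probs = lri_update(probs, chosen, reward, step)
    return probs


if __name__ == "__main__":
    # Three hypothetical trading strategies with different mean revenues.
    print(run_automaton([2.0, 5.0, 8.0], u_min=0.0, u_max=10.0))
```

Because the update only ever sees a reward in [0, 1], the same rule can be reused for any utility whose range is bounded, which is the role the normalization plays in this sketch.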
| Field | Value |
|---|---|
| Original language | English |
| Article number | 7460097 |
| Pages (from-to) | 5109-5119 |
| Number of pages | 11 |
| Journal | IEEE Transactions on Industrial Electronics |
| Volume | 63 |
| Issue number | 8 |
| DOIs | |
| Publication status | Published - Aug 2016 |
Keywords
- Energy trading game
- incomplete information
- reinforcement learning (RL)
- smart microgrids (MGs)