Training Efficiency of DDQN-Based Multilevel Inverter Control: The Influence of Reward Function Penalty Terms

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Reinforcement Learning (RL)-based controllers have recently gained attention as AI-driven, model-free methods for controlling power electronic converters by learning optimal control actions through continuous interaction with the environment. Their learning process is governed by a reward function, which guides the agent’s behavior. This paper investigates the influence of incorporating penalty terms into the reward function on the training efficiency and performance of an RL-based controller for a 7-level grid-tied Packed-U-Cell (PUC7) multilevel inverter. The controller is developed using the Double Deep Q-Network (DDQN) algorithm, selected for its balanced combination of strong performance and ease of implementation. The control objectives include sinusoidal current injection into the grid and capacitor voltage regulation around the desired value. The reward function is designed based on current and voltage tracking errors, with two penalty terms introduced to limit deviations beyond predefined thresholds. The study evaluates the impact of varying these penalty magnitudes on learning speed, convergence behavior, and tracking quality. Simulations are conducted in MATLAB/Simulink, demonstrating that the appropriate selection and application of penalties improve training efficiency without compromising control performance.
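The reward structure described in the abstract (tracking-error terms plus two threshold-triggered penalties) could be sketched roughly as follows. This is an illustration only, not the authors' MATLAB/Simulink implementation: the function name, the absolute-error form, the weights, the thresholds, and the penalty magnitudes are all assumed values chosen for the example, since the abstract does not specify them.

```python
def tracking_reward(
    i_err: float,
    v_err: float,
    w_i: float = 1.0,        # assumed weight on current tracking error
    w_v: float = 1.0,        # assumed weight on capacitor voltage error
    i_thresh: float = 1.0,   # assumed current-error threshold (A)
    v_thresh: float = 5.0,   # assumed voltage-error threshold (V)
    penalty_i: float = 10.0, # assumed penalty for exceeding i_thresh
    penalty_v: float = 10.0, # assumed penalty for exceeding v_thresh
) -> float:
    """Hypothetical reward: negative weighted tracking errors, with an
    extra fixed penalty whenever an error leaves its allowed band."""
    # Base term: smaller tracking errors give a reward closer to zero.
    reward = -(w_i * abs(i_err) + w_v * abs(v_err))

    # Penalty terms: applied only when an error exceeds its threshold.
    if abs(i_err) > i_thresh:
        reward -= penalty_i
    if abs(v_err) > v_thresh:
        reward -= penalty_v
    return reward


if __name__ == "__main__":
    # Both errors inside their bands: only the base term applies.
    print(tracking_reward(0.5, 2.0))
    # Current error outside its band: the current penalty is added.
    print(tracking_reward(2.0, 2.0))
```

Varying `penalty_i` and `penalty_v` in a sketch like this corresponds to the kind of penalty-magnitude sweep the study evaluates, although the actual functional form used in the paper may differ.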

Original language: English
Title of host publication: Proceedings of the 2nd Symposium on Smart, Sustainable, and Secure Internet of Things - Proceedings of S4IoT 2025
Editors: Mohamed Trabelsi, Zied Bouida, M. Murugappan, Murad Khan
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 151-162
Number of pages: 12
ISBN (Print): 9789819551354
Publication status: Published - 2026
Event: 2nd Symposium on Smart, Sustainable, and Secure Internet of Things, S4IoT 2025 - Doha, Kuwait
Duration: 6 May 2025 → 7 May 2025

Publication series

Name: Lecture Notes in Electrical Engineering
Volume: 1513 LNEE
ISSN (Print): 1876-1100
ISSN (Electronic): 1876-1119

Conference

Conference: 2nd Symposium on Smart, Sustainable, and Secure Internet of Things, S4IoT 2025
Country/Territory: Kuwait
City: Doha
Period: 6/05/25 → 7/05/25

Keywords

  • Reinforcement learning
  • Reward function
