Efficient Data Labeling and Optimal Device Scheduling in HWNs Using Clustered Federated Semi-Supervised Learning

Research output: Contribution to journal › Article › peer-review


Abstract

Clustered Federated Multi-task Learning (CFL) has emerged as a promising technique for addressing statistical challenges, particularly non-independent and identically distributed (non-IID) data across users. However, existing CFL studies rely entirely on the impractical assumption that devices have access to accurate ground-truth labels. This assumption is especially problematic in hierarchical wireless networks (HWNs), where vast amounts of unlabeled data and dual-level model aggregation not only slow convergence and extend processing times but also increase resource consumption. To this end, we propose Clustered Federated Semi-Supervised Learning (CFSL), a novel framework tailored to more realistic HWN scenarios. We leverage the specialized models produced by device clustering and present two prediction model schemes for correctly labeling unlabeled, unseen data: the best-performing specialized model and the weighted-averaging ensemble model. In the first scheme, the specialized model that excels at label prediction for a given device is assigned to label the unlabeled data, even when that data originates from other environments; in the second, all specialized models are combined into a unified model that captures broader data distributions across edge networks. CFSL also introduces two novel prediction time schemes, split-based and stopping-based, for accurately timing the labeling process, along with two strategic device selection schemes, greedy and round-robin, applied once each cluster reaches its stopping point. Extensive testing validates CFSL's superiority over existing models in labeling accuracy, testing accuracy, and resource efficiency, achieving energy savings of up to 51%.
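As a rough illustration of the weighted-averaging ensemble idea described in the abstract, a minimal sketch is given below. The cluster weights, confidence threshold, and model interface here are illustrative assumptions, not details taken from the paper: each cluster's specialized model outputs class probabilities for the unlabeled samples, the probabilities are averaged with per-cluster weights, and only samples whose ensemble confidence clears a threshold receive pseudo-labels.

```python
import numpy as np

def ensemble_pseudo_label(cluster_probs, cluster_weights, conf_threshold=0.9):
    """Weighted-averaging ensemble pseudo-labeling (illustrative sketch).

    cluster_probs: list of (n_samples, n_classes) arrays, one per
        specialized model (one per cluster).
    cluster_weights: per-cluster weights (e.g., proportional to cluster
        data size); normalized internally.
    Returns (labels, mask): predicted labels for all samples, and a
    boolean mask marking samples confident enough to be pseudo-labeled.
    """
    w = np.asarray(cluster_weights, dtype=float)
    w = w / w.sum()                              # normalize cluster weights
    stacked = np.stack(cluster_probs)            # (n_clusters, n_samples, n_classes)
    ensemble = np.tensordot(w, stacked, axes=1)  # weighted average over clusters
    labels = ensemble.argmax(axis=1)             # predicted class per sample
    mask = ensemble.max(axis=1) >= conf_threshold  # keep only confident samples
    return labels, mask

# Example: two specialized models, three unlabeled samples, two classes.
p1 = np.array([[0.95, 0.05], [0.40, 0.60], [0.10, 0.90]])
p2 = np.array([[0.90, 0.10], [0.55, 0.45], [0.20, 0.80]])
labels, mask = ensemble_pseudo_label([p1, p2], cluster_weights=[0.5, 0.5])
```

Under equal weights, only the first sample's ensemble confidence (0.925) exceeds the 0.9 threshold, so only it would be pseudo-labeled; the best-performing specialized model scheme would instead pick a single model per device rather than averaging.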

Original language: English
Pages (from-to): 4941-4957
Number of pages: 17
Journal: IEEE Transactions on Communications
Volume: 73
Issue number: 7
DOIs
Publication status: Published - Jul 2025

Keywords

  • Analytical models
  • Clustered federated learning (CFL)
  • Computational modeling
  • Convergence
  • Data models
  • Energy consumption
  • Ensemble models
  • Hierarchical wireless networks
  • Labeling
  • Predictive models
  • Semi-supervised learning (SSL)
  • Servers
  • Specialized models
  • Training
  • Worker selection

