TY - GEN
T1 - Exploring Manifold Embedding Techniques for Enhanced Clustering Efficiency
AU - Allaoui, Mebarka
AU - Kherfi, Mohammed Lamine
AU - Hedjam, Rachid
AU - Belhaouari, Samir Brahim
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Manifold Embedding (ME) is a fundamental field in machine learning, enabling the effective analysis and visualization of high-dimensional datasets. It plays a crucial role in tasks such as clustering and classification by transforming data into more manageable representations. Recently, deep learning-based and ME methods have shown promise in enhancing clustering performance by preserving local structures while uncovering global relationships within the data. This paper investigates the impact of ME techniques on clustering performance, examining six approaches: Principal Component Analysis (PCA), Denoising Autoencoder (DAE), Convolutional Autoencoder (CAE), Uniform Manifold Approximation and Projection (UMAP), Isometric Mapping (ISOMAP), and t-Distributed Stochastic Neighbor Embedding (t-SNE). We hypothesize that these techniques enable the discovery of lower-dimensional representations that improve clustering effectiveness. To validate this, we incorporate ME as a preprocessing step before applying clustering algorithms, including k-means, Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN), Gaussian Mixture Models (GMM), and Agglomerative Hierarchical Clustering. Our experiments, conducted on benchmark datasets, analyze the clustering performance across varying numbers of dimensions. The results demonstrate that UMAP and t-SNE consistently enhance clustering performance across all datasets, while other ME techniques exhibit fluctuating effectiveness depending on the chosen representation space. These findings highlight the importance of selecting an appropriate ME method for optimizing clustering outcomes.
KW - Clustering
KW - Deep Embedding
KW - Dimensionality Reduction
KW - Manifold Embedding
UR - https://www.scopus.com/pages/publications/105018474558
U2 - 10.1109/ACDSA65407.2025.11166497
DO - 10.1109/ACDSA65407.2025.11166497
M3 - Conference contribution
AN - SCOPUS:105018474558
T3 - International Conference on Artificial Intelligence, Computer, Data Sciences, and Applications, ACDSA 2025
BT - International Conference on Artificial Intelligence, Computer, Data Sciences, and Applications, ACDSA 2025
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2nd International Conference on Artificial Intelligence, Computer, Data Sciences, and Applications, ACDSA 2025
Y2 - 7 August 2025 through 9 August 2025
ER -