Optimization of Image Compression Using K-Means Clustering for Digital Heritage Archives
DOI:
https://doi.org/10.26877/asset.v8i1.2772
Keywords:
Image compression, vector quantization, PSNR, SSIM, digital archiving
Abstract
Preserving digital cultural assets requires efficient compression to minimize storage and bandwidth costs. However, existing studies rarely evaluate K-Means clustering on structurally complex subjects such as the Prambanan Temple, leaving a gap in assessing its performance against standard codecs. This study introduces an optimized K-Means pipeline with adaptive cluster selection and improved centroid initialization for compressing high-detail temple imagery. The method groups pixels by color proximity, reducing redundancy while preserving key structural patterns. In the experiments, K-Means achieves a PSNR of 28.08–30.65 dB and an SSIM of 0.86–0.92, outperforming baseline JPEG at similar file sizes (PSNR 26–28 dB, SSIM 0.80–0.87). This comparison indicates better perceptual retention in textured stone regions. The methodological contribution lies in combining spatial–chromatic feature weighting with iterative centroid refinement, which increases cluster stability and reduces quantization artifacts. The findings support K-Means as a viable alternative for controlled-distortion compression. In practical terms, the proposed approach enables reduced storage footprints, predictable reconstruction quality, and integration into hybrid compression pipelines for large-scale digital imaging systems.
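To make the core idea concrete, the following is a minimal sketch of color quantization by K-Means together with the PSNR metric quoted in the abstract. It is not the authors' pipeline: it uses plain Lloyd iterations with random pixel initialization, and it omits the adaptive cluster selection, improved initialization, and spatial–chromatic weighting described above. The function names (kmeans_quantize, psnr, indexed_size_bits) and the defaults k=16 and iters=10 are assumptions chosen for illustration, and the size estimate assumes a simple palette-plus-index layout with no entropy coding.

```python
import math

import numpy as np


def kmeans_quantize(image, k=16, iters=10, seed=0):
    """Quantize an H x W x 3 uint8 image to at most k colors.

    Plain Lloyd iterations on RGB values (assumption: no spatial features).
    Returns (reconstructed image, palette, per-pixel label map).
    """
    rng = np.random.default_rng(seed)
    pixels = image.reshape(-1, 3).astype(np.float64)

    # Simple initialization: k distinct pixel positions chosen at random.
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()

    def assign(points, centers):
        # Squared Euclidean distance from every pixel to every centroid: (N, k).
        # This is memory-hungry for large images; chunk the pixels if needed.
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    for _ in range(iters):
        labels = assign(pixels, centroids)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)

    labels = assign(pixels, centroids)
    palette = np.clip(np.rint(centroids), 0, 255).astype(np.uint8)
    reconstructed = palette[labels].reshape(image.shape)
    return reconstructed, palette, labels.reshape(image.shape[:2])


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    diff = original.astype(np.float64) - reconstructed.astype(np.float64)
    mse = np.mean(diff ** 2)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def indexed_size_bits(height, width, k):
    """Raw size of a palette-indexed image: 24-bit palette plus fixed-width indices."""
    bits_per_index = max(1, math.ceil(math.log2(k)))
    return k * 24 + height * width * bits_per_index
```

Under these assumptions, a 16-color palette needs 4 bits per pixel instead of 24, so the index map alone is one sixth of the raw RGB size before any further coding. Increasing k trades that saving for higher PSNR, which is the trade-off an adaptive cluster selection step would have to manage.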