Clustering and Classification Techniques in Machine Learning-based Intrusion Detection Systems: A Review
DOI:
https://doi.org/10.14741/ijcet/v.16.2.3
Keywords:
Intrusion Detection System (IDS), Machine Learning (ML), Clustering, Classification
Abstract
The rapid proliferation of networked digital systems has substantially expanded the threat landscape for cyber intrusions, necessitating robust and adaptive security mechanisms. Intrusion Detection Systems (IDS) constitute a critical component of network security infrastructure by identifying unauthorized access attempts and malicious behavior in real time. Traditional signature-based detection techniques, while effective against known threats, are limited in their ability to identify novel and polymorphic attack patterns. This limitation has driven extensive research into machine learning (ML)-based IDS frameworks that leverage statistical and structural properties of network traffic for automated threat detection. Among ML paradigms, clustering and classification methods have emerged as particularly effective strategies. Classification algorithms employ supervised learning to discriminate between normal and attack traffic based on labeled training data, while clustering techniques utilize unsupervised learning to discover anomalous groupings in unlabeled network data. This paper presents a systematic and comprehensive review of clustering and classification techniques applied in ML-based IDS, examining foundational algorithms including Logistic Regression, K-Nearest Neighbor, Decision Tree, Random Forest, Support Vector Machine, K-Means, and Hierarchical Clustering. The study further analyzes widely adopted benchmark datasets, evaluates performance metrics, and discusses current research challenges. Identified gaps include the handling of high-dimensional feature spaces, computational constraints in IoT environments, and the development of lightweight hybrid detection models. Directions for future research are outlined to guide the design of more intelligent, scalable, and efficient intrusion detection frameworks.
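The distinction the abstract draws between supervised classification and unsupervised clustering can be illustrated with a small sketch. It is not taken from the paper. The two traffic features (packets per second, mean packet size in bytes), the toy data values, and the helper names `knn_predict` and `kmeans` are all assumptions made for illustration. K-Nearest Neighbor and K-Means are used only because they are among the algorithms the review lists.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_X, train_y, x, k=3):
    """Supervised: return the majority label among the k nearest labeled flows."""
    nearest = sorted(zip(train_X, train_y), key=lambda p: euclidean(p[0], x))[:k]
    labels = [label for _, label in nearest]
    return max(set(labels), key=labels.count)

def kmeans(points, k=2, iters=10):
    """Unsupervised: Lloyd's algorithm; returns (centroids, assignments).

    Uses a deterministic farthest-point initialisation, a simplification
    chosen here so the sketch gives the same result on every run.
    """
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(euclidean(p, c) for c in centroids))
        )
    assign = []
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        assign = [min(range(k), key=lambda j: euclidean(p, centroids[j]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assign

# Hypothetical flows: (packets per second, mean packet size in bytes).
normal = [(10, 500), (12, 480), (9, 520), (11, 510), (8, 495), (13, 505)]
attack = [(200, 60), (210, 64), (190, 58), (205, 70), (195, 62), (220, 66)]
X = normal + attack
y = ["normal"] * len(normal) + ["attack"] * len(attack)

# Classification needs the labels y to make a prediction.
print(knn_predict(X, y, (198, 65)))   # attack
print(knn_predict(X, y, (10, 490)))   # normal

# Clustering sees only X and recovers the two groups without labels.
centroids, assign = kmeans(X, k=2)
print(assign)
```

In this toy setting the clustering step separates the two groups but does not name them; deciding which cluster is anomalous would still need a rule or an analyst. Real traffic features would also typically need scaling before any distance-based method is applied.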
