Anomaly-based intrusion detection leveraging optimized firewall log analysis: a real-time machine learning solution
10.11591/ijece.v15i5.pp4785-4802
Tran Cong Hung, Dam Minh Linh, Han Minh Chau, Ngo Xuan Thoai, Thai Duc Phuong, Huynh De Thu
Firewall logs play a vital role in cybersecurity by recording network traffic and flagging potential threats. This study evaluates five machine learning algorithms, namely decision tree (DT), random forest (RF), extra trees (ET), CatBoost (CB), and AdaBoost (AB), on a dataset of 65,532 firewall log entries. Features were selected using Pearson correlation, and the models were assessed on accuracy, precision, recall, and training and prediction time across multiple train-test splits. The DT model achieved the best performance, reaching 99.45% test accuracy, 97.457% precision, and 93.389% recall at a 7:3 split, along with the fastest training time (0.20642 s). We propose real-time flow-level intrusion detection (RT-FLID), a novel, lightweight, real-time intrusion detection system that uses multithreaded processing and flow-level analysis to improve detection speed and scalability. Unlike existing approaches that rely heavily on deep packet inspection or computationally intensive processing, RT-FLID requires minimal resources while maintaining high detection accuracy. The architecture efficiently handles large traffic volumes and dynamically identifies anomalies such as distributed denial-of-service (DDoS) attacks and port scans. Validated on real-world logs, the system maintained high accuracy in critical classes such as “deny” and “reset-both.” These findings highlight RT-FLID’s novelty and practical advantages, demonstrating its potential for deployment in high-throughput, low-latency network environments.
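As a rough illustration of the Pearson-correlation feature selection mentioned in the abstract, the sketch below ranks numeric log features by the absolute value of their correlation with a binary label and keeps those above a cutoff. The abstract does not state how the correlation is applied, so correlating each feature with the label is an assumption, as are the feature names (`bytes_sent`, `elapsed_time`, `constant_flag`), the toy values, and the 0.3 threshold. It is not the authors' implementation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        # A constant column has no linear relationship with anything.
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, target, threshold=0.3):
    """Return feature names with |r| >= threshold, strongest first.

    `columns` maps a feature name to its list of values; `target` is the
    numeric label. The threshold is an illustrative assumption.
    """
    scores = {name: pearson(values, target) for name, values in columns.items()}
    kept = [name for name, r in scores.items() if abs(r) >= threshold]
    return sorted(kept, key=lambda name: -abs(scores[name]))


# Hypothetical toy data: six log entries, label 1 = anomalous.
columns = {
    "bytes_sent": [10, 20, 30, 40, 50, 60],
    "elapsed_time": [5, 3, 6, 2, 7, 4],
    "constant_flag": [1, 1, 1, 1, 1, 1],
}
target = [0, 0, 0, 1, 1, 1]

selected = select_features(columns, target)
print(selected)
```

On this toy data only `bytes_sent` clears the cutoff. A real pipeline would apply the same filter to the encoded firewall log features before training and comparing the classifiers.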