IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 15, No. 1, February 2026, pp. 481-492
ISSN: 2252-8938, DOI: 10.11591/ijai.v15.i1.pp481-492

An efficient ensemble tree-based framework for intrusion detection in industrial internet of things networks

Mouad Choukhairi¹, Oumaima Chentou², Ouail Choukhairi¹, Youssef Fakhri¹
¹ LARI Laboratory, Department of Computer Science, Faculty of Sciences, Ibn Tofail University, Kenitra, Morocco
² Engineering Science Laboratory, National School of Applied Sciences (ENSA), Ibn Tofail University, Kenitra, Morocco

Article Info
Article history: Received Apr 27, 2025; Revised Oct 31, 2025; Accepted Nov 8, 2025
Keywords: Cybersecurity, Ensemble learning, IIoT security, Intrusion detection, Machine learning, Multiclass, ToN-IoT

ABSTRACT
The increasing complexity of cyber threats in industrial internet of things (IIoT) environments necessitates robust, scalable, and efficient intrusion detection systems (IDS). This study presents a novel ensemble tree-based framework that integrates gradient boosting-based machine learning models, including XGBoost, LightGBM, AdaBoost, and CatBoost, with mutual information (MI) feature selection and synthetic minority over-sampling technique (SMOTE) to enhance multiclass intrusion detection performance. The framework is designed to handle large-scale, imbalanced datasets efficiently while maintaining high classification accuracy. Performance evaluation using the telemetry of network (ToN)-IoT benchmark dataset demonstrates that the proposed models achieve a high accuracy of 99.43%, with a strong balance of precision, recall, and F1-score and a false positive rate of only 0.08%. By leveraging MI for optimal feature selection and SMOTE for data balancing, this approach effectively enhances detection capabilities in highly dynamic network environments.
The lightweight architecture and reduced execution time make the framework well-suited for deployment in edge or fog nodes within smart industrial environments. The proposed solution provides a scalable and adaptable methodology for securing IIoT networks, making it applicable for real-time intrusion monitoring and further cybersecurity advancements in industrial systems.

This is an open access article under the CC BY-SA license.

Corresponding Author:
Mouad Choukhairi
LARI Laboratory, Department of Computer Science, Faculty of Sciences, Ibn Tofail University
B.P 133, University Campus, Kenitra, Morocco
Email: mouad.choukhairi@uit.ac.ma

Journal homepage: http://ijai.iaescore.com

1. INTRODUCTION
The industrial internet of things (IIoT) is transforming modern industries by seamlessly integrating sensors, actuators, and control systems, thereby facilitating real-time data exchange and enabling unprecedented levels of operational automation [1]. This interconnected ecosystem allows for enhanced monitoring, predictive maintenance, and optimized resource allocation, leading to increased efficiency and productivity across various sectors [2]. However, this increased connectivity inherently introduces new vulnerabilities, making critical infrastructures more susceptible to sophisticated cyberattacks [3]. Traditional intrusion detection systems (IDS) often fall short in effectively safeguarding IIoT networks due to the dynamic nature of IIoT data and the continuous emergence of novel, zero-day exploits [4], [5]. Signature-based IDS, which rely on predefined attack patterns, struggle to detect anomalies and deviations from established norms
in these complex environments. The inadequacy stems from their inability to adapt to the evolving threat landscape and the unique characteristics of IIoT traffic patterns.

IIoT datasets present unique challenges for machine learning (ML) based IDS, including high dimensionality, class imbalance, and inherent noise, which significantly complicate the training and deployment of effective detection models [6]. The high dimensionality of IIoT data, characterized by a large number of features extracted from network traffic and sensor readings, can lead to the curse of dimensionality, where the performance of ML algorithms degrades as the number of features increases. Class imbalance, where the number of instances belonging to different attack classes varies significantly, further exacerbates the problem, as ML models tend to be biased towards the majority class, resulting in poor detection rates for minority classes, which often represent critical security threats. Furthermore, the presence of noise in IIoT data, arising from sensor inaccuracies, communication errors, and environmental factors, can further degrade the performance of ML models, leading to increased false positive rates (FPR) and reduced detection accuracy.

To address the challenge of detecting anomalies and unknown attacks in real-time within IoT devices, ML techniques can be leveraged [7]. The application of ML algorithms offers the potential for automating anomaly detection in industrial machinery by analyzing the vast amounts of data generated by IoT devices [8]. ML models have demonstrated significant promise in the realm of IDS for IIoT, yet they also present certain limitations that need to be carefully addressed. IDS have grown in popularity recently, and their main objective is to identify unauthorized individuals [9].
Ensemble ML models have shown remarkable performance in intrusion detection tasks due to their ability to combine multiple base learners and capture complex relationships within the data [10]. However, even with tree-based algorithms, attackers can introduce small changes in IoT network traffic that can mislead these algorithms. Despite the increasing research efforts, anomaly detection using ML is still evolving [7]. The current ML models lack robustness when facing previously unseen types of attacks [11]. The models' ability to generalize across diverse IIoT environments and adapt to evolving attack strategies remains a concern. Thus, new attack detection methods are needed for risk mitigation [12]. Advanced methods are needed because traditional approaches for detecting cyber-attacks have low efficiency [13]. Therefore, there is still the opportunity to develop effective intrusion detection for large-scale IIoT systems.

Gradient boosting ML algorithms like XGBoost, LightGBM, AdaBoost, and CatBoost have gained attention due to their ability to capture non-linear relationships and scale to large datasets with high-performance learning. However, their performance in IIoT scenarios is constrained by challenges such as high feature dimensionality and class imbalance, which can lead to biased models or increased false alarms. To address these limitations and challenges, this paper proposes a comprehensive framework that integrates an ensemble tree-based architecture consisting of XGBoost, LightGBM, AdaBoost, and CatBoost as state-of-the-art gradient boosting classifiers with mutual information (MI) for feature selection and synthetic minority over-sampling technique (SMOTE) for class balancing. The novelty of this research lies in combining MI and SMOTE with four popular gradient boosting classifiers in a unified IDS pipeline.
Unlike previous studies that evaluate only individual components or models, we systematically benchmark multiple model scenarios, analyze the interaction of pre-processing strategies, and provide execution time analysis to determine real-time feasibility. Evaluation on the telemetry of network (ToN)-IoT dataset demonstrates that our approach attains high classification accuracy while preserving low FPR and efficient runtimes, making it viable for real-time IIoT intrusion monitoring.

2. RELATED WORK
In the realm of IIoT intrusion detection, feature engineering and class balancing strategies are pivotal in addressing the challenges posed by high-dimensional and imbalanced datasets. Feature engineering strategies have been extensively explored to enhance detection accuracy, reduce false positives, and manage high-dimensional data. MI is a prominent technique used for feature selection, which helps reduce redundancy and select the most relevant features, thereby improving classification accuracy and detection performance in IIoT networks [14], [15]. Principal component analysis (PCA) is another widely used method for feature extraction, which has been shown to significantly improve detection accuracy, achieving up to 100% in some cases by transforming high-dimensional data into a lower-dimensional space while retaining essential information [16], [17]. Relief is a known feature selection method that evaluates the importance of features
based on their ability to distinguish between different classes [18]. The light feature engineering based on the mean decrease in accuracy (LEMDA) method, a novel feature engineering approach, has demonstrated a substantial improvement in F1-scores by an average of 34% across various models, indicating its effectiveness in enhancing model performance while reducing training and detection times [19]. Additionally, bio-inspired feature selection methods like gray wolf optimization (GWO) have been shown to outperform other techniques, achieving high accuracy and F1-scores with reduced execution time when combined with classifiers like k-nearest neighbors (KNN) [20]. The integration of feature selection and reduction techniques, such as minimum redundancy maximum relevance and PCA, has been effective in balancing model complexity and performance, achieving high accuracy rates of up to 99.9% in binary classification tasks [21].

In addressing class imbalance in the same context, various studies have explored the effectiveness of different class-balancing strategies, such as SMOTE, adaptive synthetic sampling (ADASYN), and other oversampling and undersampling techniques. SMOTE is frequently highlighted for its ability to enhance classification performance by producing synthetic samples for the minority class, thereby improving metrics like F1-score, precision, and recall. For instance, in one study, SMOTE achieved a precision of 99.19%, a recall of 72.45%, and an F1-score of 79.13% when applied to the IoT-23 dataset, indicating a balanced improvement in detection performance [22]. ADASYN, which adapts the number of synthetic samples generated for different minority class examples based on their difficulty, has also been shown to improve classification metrics, although specific performance metrics were not detailed in the provided contexts [23].
Other studies have compared these techniques with ensemble models, finding that methods like SMOTE, when combined with ensemble learners, can significantly boost accuracy by 1% to 4% and achieve precision, recall, and F1-scores between 95% and 100% [24]. Additionally, the integration of these techniques with advanced models like XGBoost has demonstrated remarkable effectiveness, achieving F1-scores as high as 99.9% on imbalanced IIoT datasets [25]. Despite these improvements, challenges remain, as oversampling and undersampling can sometimes lead to high false-positive rates or reduced performance in majority classes, necessitating further refinement and hybrid approaches [25], [26].

3. METHOD
This section details the workflow adopted to build, design, and assess the proposed intrusion detection framework. The pipeline is arranged in five sequential blocks: data preparation and pre-processing, feature engineering, class balancing, model development, and evaluation. All steps were executed in the sequence presented and were designed to enable full reproducibility. Figure 1 gives a high-level overview, while Algorithm 1 lists the exact steps mirrored in our approach.

Figure 1. Framework of the proposed workflow for cyberattack classification in IIoT networks
3.1. Data preparation and pre-processing
The ToN-IoT dataset is a next-generation benchmark expressly crafted for IoT and IIoT cybersecurity research [27]. Built in an Industry 4.0 cyber-range at UNSW Canberra, it fuses time-aligned telemetry from more than ten industrial sensors, yielding millions of records that are individually labelled as benign or as one of nine representative attack families (i.e., denial of service, ransomware, man in the middle, password/brute-force, distributed denial of service, backdoor, injection, cross-site scripting, and scanning). This multimodal design mirrors the cloud-fog-edge hierarchy typical of modern factories, letting researchers test AI-driven intrusion-detection and threat-intelligence models under realistic IIoT traffic and class-imbalance conditions. Consequently, ToN-IoT has become a de facto reference corpus for evaluating security analytics in Industry 4.0 environments.

To enable the classification of IoT-based cyberattacks, a total of forty-three features are extracted to characterize each flow, categorized into six subsets based on the nature of the information they convey (e.g., connection activity features, violation activity features, and statistical activity features). The training and testing data used in this work are drawn from an officially released subset of the ToN-IoT dataset, which includes 300,000 normal traffic flows and 20,000 flows for each attack category, except for the XSS attack class, which contains only 1,043 recorded flows. We rely on the IoT-telemetry splits of the ToN-IoT corpus, where each physical sensor is provided as an independent CSV file containing pre-divided train/test records across the security classes (i.e., benign or attack type), which we have merged into a single dataset for comprehensive analysis. The same dataset was processed in several steps to prepare it for ML model training.
This section describes each pre-processing step, including handling missing data, feature normalization, and categorical data encoding.

Algorithm 1. MI-SMOTE-Boost intrusion detection
Require: Dataset D, top-k, k_smote
Ensure: Trained models M
 1: Encode, impute, scale D
 2: Compute MI; select top-k features X_k
 3: Split D → D_train, D_test
 4: Apply SMOTE(k_smote) on D_train[X_k]
 5: for each booster ∈ {XGBoost, LightGBM, AdaBoost, CatBoost} do
 6:     Train booster on balanced D_train[X_k]
 7:     Evaluate on D_test[X_k]
 8:     Store metrics M
 9: end for
10: return M

3.1.1. Handling missing data
Handling missing values is an important step in data cleaning, and it is crucial for ensuring the integrity and completeness of the dataset. For numeric features with missing values, mean imputation was applied. This involves replacing missing values in a feature $x_j$ with the mean of that feature computed over the numerical data subset. This approach ensures that all data records can be used for training without introducing significant bias. The imputation formula used is:

$$ x_{ij} \leftarrow \frac{1}{|D_{\text{numeric}}|} \sum_{k \in D_{\text{numeric}}} x_{kj} \quad \text{if } x_{ij} \text{ is missing} \tag{1} $$

Where $D_{\text{numeric}}$ represents the numerical data subset, $x_{ij}$ is the missing value, and the mean of the feature $x_j$ is computed over the entire numerical data subset. For categorical features, missing values were handled separately. In the ToN-IoT dataset, any missing categories were imputed using the most frequent value (i.e., mode) within the data, ensuring that the categories are consistent across the dataset.

3.1.2. Feature normalization
To guarantee that every numerical feature plays an active role in the model training, standardization was applied using the StandardScaler from scikit-learn. This transformation ensures that each feature has
a mean of zero and a standard deviation of one, which prevents features with larger numeric ranges from disproportionately influencing the model. The standardization formula used is:

$$ x'_{ij} = \frac{x_{ij} - \mu_j}{\sigma_j} \tag{2} $$

Where $x'_{ij}$ is the standardized value of feature $x_j$ for the $i$-th instance, $\mu_j$ is the mean, and $\sigma_j$ is the standard deviation for feature $x_j$ across the data. This scaling was applied to training and testing sets, ensuring that all features are treated consistently across both training and testing phases.

3.1.3. Categorical data encoding
Categorical features, such as traffic type or device identifiers, were transformed into numerical representations using one-hot encoding. This process creates binary columns for each unique category in a feature. For example, if a feature protocol has three unique values (e.g., TCP, UDP, and ICMP), one-hot encoding would generate three binary columns: protocol_TCP, protocol_UDP, protocol_ICMP. Each instance of the dataset is represented by 1 in the corresponding category column and 0 in the others. This transformation prevents the model from assuming any ordinal relationship between categories and ensures that categorical variables are processed appropriately by tree-based ML models.

3.2. Feature engineering
MI quantifies the amount of information one variable provides about another [28]. In classification tasks, MI is used to select features that have the highest dependency on the class labels. To mitigate the issue of dimensionality, MI was used to evaluate the relevance of each feature $X_j$ with respect to the multiclass label $Y$.
The MI score quantifies the reduction in label uncertainty due to knowledge of the feature, with the formula:

$$ I(X_j; Y) = \sum_{x_j} \sum_{y} p(x_j, y) \log \frac{p(x_j, y)}{p(x_j)\, p(y)} \tag{3} $$

Where $p(x_j, y)$ is the joint probability of feature $X_j$ and label $Y$, and $p(x_j)$, $p(y)$ are the marginal probabilities of $X_j$ and $Y$, respectively. The MI score measures the amount of information shared between a feature and the target, with higher values indicating stronger relevance. In this study, we applied the SelectKBest method from scikit-learn, using mutual_info_classif() as the scoring function to select the top $k = 10$ features that exhibit the highest MI with the target label. This selection helps to eliminate irrelevant, redundant, or noisy features, improving the performance of the model by reducing overfitting and making the learning process more efficient. The selected features were used for model training, ensuring that only the most informative predictors were considered.

3.3. Class balancing
Imbalanced class distributions constitute a critical challenge in intrusion detection, particularly in IIoT environments where normal traffic vastly outweighs malicious instances. This imbalance leads to biased decision boundaries that favor the majority class, resulting in high false-negative rates for minority class predictions. SMOTE was applied to address this issue by generating synthetic instances for the minority class through interpolation [29]. SMOTE synthesizes new instances by sampling a minority-class instance $x$ and selecting one of its $k_{nn}$ nearest neighbors, $x_{nn}$. A synthetic example is created by adding a scaled difference between the minority sample and its neighbor:

$$ \tilde{x} = x + \lambda\,(x_{nn} - x), \quad \lambda \sim U(0, 1) \tag{4} $$

Where $\lambda$ is a randomly chosen value between 0 and 1, ensuring that the synthetic instance lies somewhere between $x$ and $x_{nn}$ in the feature space.
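The interpolation in (4) can be illustrated with a short sketch. This is a minimal, standard-library-only illustration and not the imbalanced-learn implementation used in the experiments; the helper names `k_nearest` and `smote_sample`, the toy points, and the fixed seed are assumptions made for the example.

```python
import math
import random

def k_nearest(x, minority, k):
    """Return the k minority points closest to x (Euclidean distance), excluding x itself."""
    others = [p for p in minority if p is not x]
    return sorted(others, key=lambda p: math.dist(x, p))[:k]

def smote_sample(x, neighbors, rng):
    """Create one synthetic point on the segment between x and a randomly chosen neighbor."""
    x_nn = rng.choice(neighbors)
    lam = rng.random()  # lambda drawn uniformly from [0, 1)
    return [xi + lam * (ni - xi) for xi, ni in zip(x, x_nn)]

# Toy minority class (hypothetical values, not taken from ToN-IoT).
minority = [[1.0, 2.0], [3.0, 2.0], [1.0, 4.0], [9.0, 9.0]]
rng = random.Random(0)
x = minority[0]
neighbors = k_nearest(x, minority, k=2)
synthetic = smote_sample(x, neighbors, rng)
```

Because the two nearest neighbors of the first point each differ from it in a single coordinate, the synthetic point stays on one of the two axis-aligned segments leaving that point, which makes the interpolation easy to check by eye.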
This process is repeated for all minority instances until class distribution approaches balance, effectively enlarging the minority class manifold and promoting wider decision margins. This re-balancing approach helps improve the model's ability to learn from both minority and majority classes equally, reducing the occurrence of false negatives and false positives during model prediction.
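Returning briefly to the feature-selection step of Section 3.2, the MI score in (3) can be estimated directly from counts when a feature takes discrete values. The sketch below is a plug-in estimator written for illustration only; scikit-learn's mutual_info_classif uses a different, nearest-neighbor-based estimator for continuous features, and the function names, toy columns, and labels here are assumptions.

```python
import math
from collections import Counter

def mutual_information(feature, labels):
    """Plug-in estimate of I(X; Y) in nats for a discrete feature and discrete labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    p_x = Counter(feature)
    p_y = Counter(labels)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log(p_xy / ((p_x[x] / n) * (p_y[y] / n)))
    return mi

def top_k_features(columns, labels, k):
    """Rank feature names by estimated MI with the labels and keep the k best."""
    ranked = sorted(columns, key=lambda name: mutual_information(columns[name], labels), reverse=True)
    return ranked[:k]

# Hypothetical toy data: one perfectly informative column and one uninformative column.
labels = ["normal", "normal", "attack", "attack"]
columns = {
    "informative": [0, 0, 1, 1],
    "uninformative": [0, 1, 0, 1],
}
selected = top_k_features(columns, labels, k=1)
```

The informative column reproduces the label exactly, so its score equals the label entropy (log 2 for two balanced classes), while the uninformative column scores zero.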
3.4. Gradient-boosted model development
3.4.1. Data partitioning
The data partitioning stage is essential in any ML pipeline. After pre-processing, the dataset was randomly split into 80% for training and 20% for testing, which is a standard approach in ML for model evaluation. Additionally, 10-fold cross-validation was employed to assess each model's performance and generalizability. This technique involves splitting the dataset into ten equal partitions. Each fold serves as a test set once, while the remaining nine folds are used for training. This process ensures that every data point is utilized for both training and testing. By employing this strategy, overfitting is mitigated, and a more accurate estimate of the model's performance is obtained compared to a single train-test split.

3.4.2. Model fitting and validation
In this study, four powerful ensemble learners, namely XGBoost, LightGBM, AdaBoost, and CatBoost, were independently trained to evaluate their effectiveness in detecting intrusions in IIoT environments. These models were chosen for their capacity to efficiently process large-scale datasets, capture complex feature patterns, and provide high accuracy with relatively fast training times [30]. The models were trained using the balanced, ten-feature design matrix. Each model's objective function and working mechanism are described in detail, focusing on how they iteratively improve their performance during the training process.
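The fold rotation described in Section 3.4.1 can be sketched as follows. This is a simplified, unstratified illustration using only the standard library; the function name `k_fold_indices`, the sample count, and the seed are assumptions, and the stratification used in the experiments is not reproduced here.

```python
import random

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs so that each fold is the test set exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal partitions
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test

# Hypothetical example: 20 samples rotated through 10 folds.
splits = list(k_fold_indices(n_samples=20, k=10))
tested = sorted(j for _, test in splits for j in test)
```

Every index appears in exactly one test fold, which is the property the text relies on when it says that each data point is used for both training and testing.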
XGBoost: it provides a highly efficient, scalable form of gradient boosting by sequentially constructing decision trees, each trained to rectify the residual errors of the preceding ensemble, and minimizes a regularized additive loss function:

$$ \mathcal{L}^{(t)} = \sum_{i=1}^{N} \ell\left(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t) \tag{5} $$

Where $N$ is the number of samples, $\ell$ is the loss function, usually multinomial logistic loss for classification tasks, $y_i$ is the true label of the $i$-th sample, $\hat{y}_i^{(t-1)}$ is the prediction from the previous iteration, $f_t(x_i)$ is the decision function of the tree at iteration $t$ for sample $x_i$, and $\Omega(f_t)$ is the regularization term that penalizes the complexity of the decision tree $f_t$. The regularization term $\Omega(f_t)$ helps prevent overfitting by controlling the complexity of the model. It is defined as:

$$ \Omega(f_t) = \gamma T + \frac{1}{2} \lambda \sum_{j=1}^{T} w_j^2 \tag{6} $$

Where $T$ is the number of leaves in the tree, $w_j$ represents the weight of the $j$-th leaf, and $\gamma$ and $\lambda$ are hyperparameters controlling the complexity of the tree. The goal of XGBoost is to minimize this objective function by balancing model fit to data while preventing overfitting by penalizing large trees.

LightGBM: it is a gradient boosting framework designed to handle large datasets with higher efficiency than traditional gradient boosting methods like XGBoost. Similar to XGBoost, LightGBM builds an ensemble of trees sequentially, with each tree focusing on the residuals of the previous one. Its objective function is defined similarly to that of XGBoost, with an additional focus on efficiency and speed. The regularization term in LightGBM is given by:

$$ \Omega(f_t) = \lambda \sum_{j=1}^{T} w_j^2 \tag{7} $$

Where $\lambda$ is a regularization parameter, and $w_j$ represents the weight of the $j$-th leaf.
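To make the two penalty terms concrete, the sketch below evaluates (6) and (7) for a single tree given its leaf weights. The function names and the numeric values are illustrative assumptions; the libraries compute these penalties internally during split finding, so this only mirrors the formulas as written above.

```python
def omega_with_leaf_penalty(leaf_weights, gamma, lam):
    """Equation (6): gamma * T + 0.5 * lambda * sum of squared leaf weights."""
    t = len(leaf_weights)
    return gamma * t + 0.5 * lam * sum(w * w for w in leaf_weights)

def omega_l2_only(leaf_weights, lam):
    """Equation (7): lambda * sum of squared leaf weights."""
    return lam * sum(w * w for w in leaf_weights)

# Hypothetical tree with three leaves.
weights = [0.5, -1.0, 2.0]
penalty_eq6 = omega_with_leaf_penalty(weights, gamma=1.0, lam=2.0)
penalty_eq7 = omega_l2_only(weights, lam=2.0)
```

With these values the squared weights sum to 5.25, so (6) gives 3 + 5.25 = 8.25 and (7) gives 10.5. Adding a leaf raises the penalty in (6) by at least gamma even when the new leaf weight is small, which is how the term discourages large trees.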
Additionally, LightGBM employs a histogram-based approach for training, which speeds up computation by approximating the feature values into discrete bins, reducing the computational cost of finding the best split for each feature.

AdaBoost: it is an ensemble technique that forms a robust model by aggregating weak learners and iteratively up-weighting the misclassified samples, forcing the model to focus more on hard-to-classify examples in subsequent iterations. AdaBoost iteratively adjusts the weight of each weak classifier, and the final prediction is the weighted sum of all weak classifiers. It minimizes the weighted error by adjusting
the weights of the training instances after each iteration. The weight assigned to the weak learner at iteration $t$ is:

$$ \alpha_t = \frac{1}{2} \log \frac{1 - \epsilon_t}{\epsilon_t} \tag{8} $$

Where $\epsilon_t$ is the weighted error of the weak learner in the $t$-th iteration. The final prediction is obtained by combining the weak learners using their weights, where the weak learners with lower errors are given higher weights:

$$ f(x) = \operatorname{sign}\left( \sum_{t=1}^{T} \alpha_t h_t(x) \right) \tag{9} $$

Where $h_t(x)$ is the weak classifier at iteration $t$, $\alpha_t$ is the weight assigned to the weak classifier at iteration $t$, and $f(x)$ is the final prediction, given by the sign of the weighted sum of the weak classifiers' predictions.

CatBoost: it is specifically developed to process categorical variables with high efficiency by converting them into numerical representations via an efficient algorithm that accounts for the order of categories and prevents target leakage, which is beneficial in IIoT environments where categorical data, such as device types or protocols, are common. CatBoost applies an 'ordered boosting' approach, which mitigates target leakage and prevents overfitting. The objective function for CatBoost is similar to that of XGBoost and LightGBM, with an additional emphasis on categorical feature handling. The regularization term controls the complexity of the trees and prevents overfitting.

3.5. Evaluation strategy
This work employs a suite of evaluation metrics (F1-score, accuracy, precision, FPR, and recall) to rigorously quantify the effectiveness of IIoT-oriented IDS. These metrics collectively provide a nuanced view of classification performance. This perspective is especially critical when addressing the class imbalance characteristic of intrusion detection datasets.

4. RESULTS AND DISCUSSION
This section presents the experimental findings, comparative analysis, and a comprehensive discussion regarding the performance improvements achieved through the proposed technique.

4.1. Experimental setup
All experiments were conducted on the current version of Google Colab, operating in a cloud environment equipped with dual Intel® Xeon® virtual CPUs and approximately 12 GB of system memory. Programming was performed in Python 3.10, utilizing standard ML and data processing libraries, including scikit-learn, imbalanced-learn, XGBoost, LightGBM, and CatBoost, alongside visualization tools such as matplotlib and seaborn. Each classifier (AdaBoost, CatBoost, LightGBM, and XGBoost) was trained initially on the raw feature set (i.e., baseline) and subsequently on a feature subset selected via MI, with class imbalance addressed through SMOTE. Model evaluation employed 10-fold stratified cross-validation and captured key performance indicators: F1-score, precision, accuracy, recall, FPR, training time, and prediction time, enabling a comprehensive comparison of model behavior before and after feature engineering and class balancing.

4.2. Global performance comparison
The global performance comparison across all models is summarized in Table 1. Each model was evaluated in two scenarios: baseline (i.e., before MI-SMOTE) and enhanced (i.e., after MI-SMOTE). As illustrated in Table 1, every model matched or improved on its baseline across the key performance indicators after the application of MI-SMOTE. Accuracy converged to approximately 99.43% across all models. Notably, AdaBoost, which initially had the lowest performance, exhibited the greatest relative improvement in both classification metrics and computational efficiency. The F1-score, which balances precision and recall, is a key indicator of classification quality.
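For reference, the metrics named in Section 3.5 and reported in Table 1 can all be derived from a multiclass confusion matrix. The sketch below computes them per class in a one-vs-rest fashion. The source does not state which averaging scheme was used, so the per-class treatment, the function names, and the small confusion matrix are assumptions for illustration and do not reproduce the paper's numbers.

```python
def one_vs_rest_metrics(cm, c):
    """Precision, recall, F1, and FPR for class index c, with cm[true][predicted]."""
    total = sum(sum(row) for row in cm)
    tp = cm[c][c]
    fn = sum(cm[c]) - tp                 # class c predicted as something else
    fp = sum(row[c] for row in cm) - tp  # other classes predicted as c
    tn = total - tp - fn - fp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "fpr": fpr}

def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class).
cm = [
    [90, 5, 5],
    [2, 45, 3],
    [1, 4, 45],
]
class0 = one_vs_rest_metrics(cm, 0)
overall_accuracy = accuracy(cm)
```

In this toy matrix, class 0 has 3 false positives against 97 true negatives, so its FPR is 0.03, while overall accuracy is 180/200 = 0.9.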
Table 1. Performance comparison of ensemble models before and after MI-SMOTE

Metric                        LightGBM   XGBoost    CatBoost   AdaBoost
Accuracy (before)             0.9930     0.9943     0.9916     0.9859
Accuracy (after)              0.9943     0.9943     0.9943     0.9943
Precision (before)            0.9931     0.9940     0.9920     0.9866
Precision (after)             0.9945     0.9945     0.9945     0.9945
Recall (before)               0.9930     0.9943     0.9916     0.9859
Recall (after)                0.9943     0.9943     0.9943     0.9943
F1-score (before)             0.9930     0.9937     0.9907     0.9851
F1-score (after)              0.9937     0.9939     0.9937     0.9937
FPR (before)                  0.00094    0.00082    0.00118    0.00198
FPR (after)                   0.00080    0.00080    0.00080    0.00080
Training time (before) (s)    7.5607     12.1173    10.5279    91.8730
Training time (after) (s)     4.0427     3.6500     7.2592     15.2537
Prediction time (before) (s)  0.9968     0.3159     0.4913     6.9909
Prediction time (after) (s)   0.3893     0.1745     0.1430     2.7225
Total time (before) (s)       8.5575     12.4332    11.0192    98.8639
Total time (after) (s)        4.4320     3.8245     7.4022     17.9762

As shown in Figure 2, all models demonstrated an increase in F1-score after the application of MI-SMOTE. AdaBoost experienced the most significant improvement, increasing from 98.51% to 99.37%, while CatBoost improved from 99.07% to 99.37%. LightGBM and XGBoost also showed slight but consistent gains, stabilizing near 99.37% and 99.39%, respectively.

Minimizing FPR is crucial, especially in applications where false alarms carry high costs. The FPR evolution is presented in Figure 3, which shows a consistent reduction in FPR across all models. Initially, AdaBoost and CatBoost exhibited higher FPR values of 0.00198 and 0.00118, respectively. After applying MI-SMOTE, all models achieved a reduced and unified FPR of 0.00080.

Efficiency in terms of computational resources is another critical aspect for real-world deployment.
The impact of MI-SMOTE on training and prediction times is depicted in Figure 4, illustrating that training and prediction times were generally reduced after pre-processing, feature engineering, and class balancing phases. AdaBoost benefited significantly, reducing its total execution time from approximately 99 seconds to 18 seconds. XGBoost and LightGBM also achieved considerable reductions in training and prediction times, confirming the efficiency gain of the approach's steps. CatBoost, after MI-SMOTE, managed to decrease its total time to approximately 7.4 seconds. Overall, the integration of MI-based feature selection and SMOTE-based balancing substantially enhanced classification robustness, minimized false alarms, and optimized computational efficiency across all evaluated models.

Figure 2. Comparison of F1-scores for all models before and after MI-SMOTE application
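The time savings discussed above can be re-derived from the total-time rows of Table 1 with simple arithmetic. The sketch below copies those eight values from the table; the helper name `reduction_percent` is an assumption made for the example.

```python
# Total execution times in seconds, copied from the "Total time" rows of Table 1.
total_before = {"LightGBM": 8.5575, "XGBoost": 12.4332, "CatBoost": 11.0192, "AdaBoost": 98.8639}
total_after = {"LightGBM": 4.4320, "XGBoost": 3.8245, "CatBoost": 7.4022, "AdaBoost": 17.9762}

def reduction_percent(before, after):
    """Relative reduction in execution time, expressed as a percentage of the baseline."""
    return 100.0 * (before - after) / before

reductions = {
    model: reduction_percent(total_before[model], total_after[model])
    for model in total_before
}

# Print models from largest to smallest relative saving.
for model, pct in sorted(reductions.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{model}: {pct:.1f}% less total time")
```

Under these figures the relative reduction is roughly 82% for AdaBoost, 69% for XGBoost, 48% for LightGBM, and 33% for CatBoost, which matches the ordering described in the text.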
Figure 3. FPR for all models before and after MI-SMOTE integration decreases consistently to 0.080%

Figure 4. Execution time analysis showing training and prediction durations before and after MI-SMOTE

5. CONCLUSION
This study aimed to investigate the impact of integrating MI feature selection and SMOTE class balancing techniques on the performance of ensemble learning models for classification tasks. As initially stated, the objective was to enhance predictive accuracy, reduce FPR, and optimize computational efficiency. The experimental results confirm that these objectives were successfully achieved. All evaluated models, namely LightGBM, XGBoost, CatBoost, and AdaBoost, showed consistent improvements in classification metrics after the application of MI-SMOTE. Notably, F1-scores reached at least 99.37% across all models, while FPR was uniformly reduced to 0.080%. Additionally, significant reductions in training and prediction times were observed for several models, further validating the effectiveness of the framework's stages. These findings not only demonstrate the compatibility between the research objectives and outcomes but also highlight the
490 ISSN: 2252-8938 practicality of the proposed approach for real-w orld lar ge-scale deplo yments where both performance and ef cienc y are critical. Prospects for future w ork include the e xtension of this methodology to more di v erse and imbalanced IIoT datasets (e.g., NF-T oN-IoT -v2, UNSW -NB15) to assess generalizability across dif ferent en vironments. Additionall y , we plan to conduct an ablation study to isolate and analyze the indi vidual impacts of MI-based feature selection and SMO TE balancing techniques on classicat ion performance. The e xploration of adapti v e or dynamic feature selection strate gies be yond MI, such as h ybrid lter -wrapper methods, and the inte gration of balancing approaches that are dynamically t ailored to the nature of specic attack cate gories represent promising enhancements. Furtherm o r e, we inte n d to e xplore e xplainable articial intelligence (XAI) tools such as Shaple y additi v e e xplanations (SHAP) and local interpretable model-agnostic e xplanations (LIME) to impro v e the interpretability of model decisions and support transparenc y in real-w orld deplo yments. Finally , in v estig ating zero-day threat detection capabilities using anomaly-based l earning or fe w-shot learning models will also be considered to bolster resilience ag ainst unkno wn attacks. FUNDING INFORMA TION Authors state no funding in v olv ed. A UTHOR CONTRIB UTIONS ST A TEMENT This journal uses the Contrib utor Roles T axonomy (CRediT) to recognize indi vidual author contrib utions, reduce authorship disputes, and f acilitate collaboration. 
Name of Author      C  M  So  Va  Fo  I  R  D  O  E  Vi  Su  P  Fu
Mouad Choukhairi
Oumaima Chentou
Ouail Choukhairi
Youssef Fakhri

C: Conceptualization
M: Methodology
So: Software
Va: Validation
Fo: Formal Analysis
I: Investigation
R: Resources
D: Data Curation
O: Writing - Original Draft
E: Writing - Review & Editing
Vi: Visualization
Su: Supervision
P: Project Administration
Fu: Funding Acquisition

CONFLICT OF INTEREST STATEMENT

Authors state no conflict of interest.

INFORMED CONSENT

This study did not involve human participants, and informed consent was therefore not required.

ETHICAL APPROVAL

This research did not involve human or animal subjects and did not require ethical approval.

DATA AVAILABILITY

The data that supports the findings of this study is openly available in the ToN-IoT dataset at: https://research.unsw.edu.au/projects/toniot-datasets.

REFERENCES

[1] S. H. Jafier, “Utilizing feature selection techniques in intrusion detection system for internet of things,” in Proceedings of the 2nd International Conference on Future Networks and Distributed Systems, 2018, pp. 1–3, doi: 10.1145/3231053.3234323.
[2] A. M. Vulfin, V. I. Vasilyev, V. E. Gvozdev, K. V. Mironov, and O. E. Churkin, “Network traffic analysis based on machine learning methods,” Journal of Physics: Conference Series, vol. 2001, no. 1, 2021, doi: 10.1088/1742-6596/2001/1/012017.