IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 15, No. 1, February 2026, pp. 481∼492
ISSN: 2252-8938, DOI: 10.11591/ijai.v15.i1.pp481-492
An efficient ensemble tree-based framework for intrusion detection in industrial internet of things networks
Mouad Choukhairi1, Oumaima Chentou2, Ouail Choukhairi1, Youssef Fakhri1
1LARI Laboratory, Department of Computer Science, Faculty of Sciences, Ibn Tofail University, Kenitra, Morocco
2Engineering Science Laboratory, National School of Applied Sciences (ENSA), Ibn Tofail University, Kenitra, Morocco
Article Info

Article history:
Received Apr 27, 2025
Revised Oct 31, 2025
Accepted Nov 8, 2025

Keywords:
Cybersecurity
Ensemble learning
IIoT security
Intrusion detection
Machine learning
Multiclass
ToN-IoT
ABSTRACT

The increasing complexity of cyber threats in industrial internet of things (IIoT) environments necessitates robust, scalable, and efficient intrusion detection systems (IDS). This study presents a novel ensemble tree-based framework that integrates gradient boosting-based machine learning models, including XGBoost, LightGBM, AdaBoost, and CatBoost, with mutual information (MI) feature selection and the synthetic minority over-sampling technique (SMOTE) to enhance multiclass intrusion detection performance. The framework is designed to handle large-scale, imbalanced datasets efficiently while maintaining high classification accuracy. Performance evaluation using the telemetry of network (ToN)-IoT benchmark dataset demonstrates that the proposed models achieve a high accuracy of 99.43%, with a strong precision-recall balance and F1-score, and a minimal false positive rate of 0.08%. By leveraging MI for optimal feature selection and SMOTE for data balancing, this approach effectively enhances detection capabilities in highly dynamic network environments. The lightweight architecture and reduced execution time make the framework well-suited for deployment in edge or fog nodes within smart industrial environments. The proposed solution provides a scalable and adaptable methodology for securing IIoT networks, making it applicable for real-time intrusion monitoring and further cybersecurity advancements in industrial systems.

This is an open access article under the CC BY-SA license.
Corresponding Author:
Mouad Choukhairi
LARI Laboratory, Department of Computer Science, Faculty of Sciences, Ibn Tofail University
B.P 133, University Campus, Kenitra, Morocco
Email: mouad.choukhairi@uit.ac.ma
1. INTRODUCTION
The industrial internet of things (IIoT) is transforming modern industries by seamlessly integrating sensors, actuators, and control systems, thereby facilitating real-time data exchange and enabling unprecedented levels of operational automation [1]. This interconnected ecosystem allows for enhanced monitoring, predictive maintenance, and optimized resource allocation, leading to increased efficiency and productivity across various sectors [2]. However, this increased connectivity inherently introduces new vulnerabilities, making critical infrastructures more susceptible to sophisticated cyberattacks [3]. Traditional intrusion detection systems (IDS) often fall short in effectively safeguarding IIoT networks due to the dynamic nature of IIoT data and the continuous emergence of novel, zero-day exploits [4], [5]. Signature-based IDS, which rely on predefined attack patterns, struggle to detect anomalies and deviations from established norms
Journal homepage: http://ijai.iaescore.com
in these complex environments. The inadequacy stems from their inability to adapt to the evolving threat landscape and the unique characteristics of IIoT traffic patterns.
IIoT datasets present unique challenges for machine learning (ML) based IDS, including high dimensionality, class imbalance, and inherent noise, which significantly complicate the training and deployment of effective detection models [6]. The high dimensionality of IIoT data, characterized by a large number of features extracted from network traffic and sensor readings, can lead to the curse of dimensionality, where the performance of ML algorithms degrades as the number of features increases. Class imbalance, where the number of instances belonging to different attack classes varies significantly, further exacerbates the problem, as ML models tend to be biased towards the majority class, resulting in poor detection rates for minority classes, which often represent critical security threats. Furthermore, the presence of noise in IIoT data, arising from sensor inaccuracies, communication errors, and environmental factors, can further degrade the performance of ML models, leading to increased false positive rates (FPR) and reduced detection accuracy.
To address the challenge of detecting anomalies and unknown attacks in real time within IoT devices, ML techniques can be leveraged [7]. The application of ML algorithms offers the potential for automating anomaly detection in industrial machinery by analyzing the vast amounts of data generated by IoT devices [8]. ML models have demonstrated significant promise in the realm of IDS for IIoT, yet they also present certain limitations that need to be carefully addressed. Interest in effective IDS has grown recently, and identifying unauthorized actors is their main objective [9]. Ensemble ML models have shown remarkable performance in intrusion detection tasks due to their ability to combine multiple base learners and capture complex relationships within the data [10]. However, even with tree-based algorithms, attackers can introduce small changes in IoT network traffic that can mislead these algorithms. Despite the increasing research efforts, anomaly detection using ML is still evolving [7]. Current ML models lack robustness when facing previously unseen types of attacks [11]. The models' ability to generalize across diverse IIoT environments and adapt to evolving attack strategies remains a concern. Thus, new attack detection methods are needed for risk mitigation [12]. Advanced methods are needed because traditional approaches for detecting cyber-attacks have low efficiency [13]. Therefore, there is still the opportunity to develop effective intrusion detection for large-scale IIoT systems.
Gradient boosting ML algorithms like XGBoost, LightGBM, AdaBoost, and CatBoost have gained attention due to their ability to capture non-linear relationships and scale to large datasets with high-performance learning. However, their performance in IIoT scenarios is constrained by challenges such as high feature dimensionality and class imbalance, which can lead to biased models or increased false alarms.
To address these limitations and challenges, this paper proposes a comprehensive framework that integrates an ensemble tree-based architecture consisting of XGBoost, LightGBM, AdaBoost, and CatBoost as state-of-the-art gradient boosting classifiers with mutual information (MI) for feature selection and the synthetic minority over-sampling technique (SMOTE) for class balancing. The novelty of this research lies in combining MI and SMOTE with four popular gradient boosting classifiers in a unified IDS pipeline. Unlike previous studies that evaluate only individual components or models, we systematically benchmark multiple model scenarios, analyze the interaction of pre-processing strategies, and provide execution time analysis to determine real-time feasibility. Evaluation on the telemetry of network (ToN)-IoT dataset demonstrates that our approach attains high classification accuracy while preserving low FPR and efficient runtimes, making it viable for real-time IIoT intrusion monitoring.
2. RELATED WORK
In the realm of IIoT intrusion detection, feature engineering and class balancing strategies are pivotal in addressing the challenges posed by high-dimensional and imbalanced datasets. Feature engineering strategies have been extensively explored to enhance detection accuracy, reduce false positives, and manage high-dimensional data. MI is a prominent technique used for feature selection, which helps reduce redundancy and select the most relevant features, thereby improving classification accuracy and detection performance in IIoT networks [14], [15]. Principal component analysis (PCA) is another widely used method for feature extraction, which has been shown to significantly improve detection accuracy, achieving up to 100% in some cases by transforming high-dimensional data into a lower-dimensional space while retaining essential information [16], [17]. Relief is a known feature selection method that evaluates the importance of features
based on their ability to distinguish between different classes [18]. The light feature engineering based on the mean decrease in accuracy (LEMDA) method, a novel feature engineering approach, has demonstrated a substantial improvement in F1-scores by an average of 34% across various models, indicating its effectiveness in enhancing model performance while reducing training and detection times [19]. Additionally, bio-inspired feature selection methods like gray wolf optimization (GWO) have been shown to outperform other techniques, achieving high accuracy and F1-scores with reduced execution time when combined with classifiers like k-nearest neighbors (KNN) [20]. The integration of feature selection and reduction techniques, such as minimum redundancy maximum relevance and PCA, has been effective in balancing model complexity and performance, achieving high accuracy rates of up to 99.9% in binary classification tasks [21].
In addressing class imbalance in the same context, various studies have explored the effectiveness of different class-balancing strategies, such as SMOTE, adaptive synthetic sampling (ADASYN), and other oversampling and undersampling techniques. SMOTE is frequently highlighted for its ability to enhance classification performance by producing synthetic samples for the minority class, thereby improving metrics like F1-score, precision, and recall. For instance, in one study, SMOTE achieved a precision of 99.19%, a recall of 72.45%, and an F1-score of 79.13% when applied to the IoT-23 dataset, indicating a balanced improvement in detection performance [22]. ADASYN, which adapts the number of synthetic samples generated for different minority class examples based on their difficulty, has also been shown to improve classification metrics, although specific performance figures were not detailed in the provided contexts [23]. Other studies have compared these techniques with ensemble models, finding that methods like SMOTE, when combined with ensemble learners, can significantly boost accuracy by 1% to 4% and achieve precision, recall, and F1-scores between 95% and 100% [24]. Additionally, the integration of these techniques with advanced models like XGBoost has demonstrated remarkable effectiveness, achieving F1-scores as high as 99.9% on imbalanced IIoT datasets [25]. Despite these improvements, challenges remain, as oversampling and undersampling can sometimes lead to high false-positive rates or reduced performance in majority classes, necessitating further refinement and hybrid approaches [25], [26].
3. METHOD
This section details the workflow adopted to design, build, and assess the proposed intrusion detection framework. The pipeline is arranged in five sequential blocks: data preparation and pre-processing, feature engineering, class balancing, model development, and evaluation. All steps were executed in the sequence presented and were designed to enable full reproducibility. Figure 1 gives a high-level overview, while Algorithm 1 lists the exact steps mirrored in our approach.
Figure 1. Framework of the proposed workflow for cyberattack classification in IIoT networks
3.1. Data preparation and pre-processing
The ToN-IoT dataset is a next-generation benchmark expressly crafted for IoT and IIoT cybersecurity research [27]. Built in an Industry 4.0 cyber-range at UNSW Canberra, it fuses time-aligned telemetry from more than ten industrial sensors, yielding millions of records that are individually labelled as benign or as one of nine representative attack families (i.e., denial of service, ransomware, man-in-the-middle, password/brute-force, distributed denial of service, backdoor, injection, cross-site scripting, and scanning). This multimodal design mirrors the cloud-fog-edge hierarchy typical of modern factories, letting researchers test AI-driven intrusion-detection and threat-intelligence models under realistic IIoT traffic and class-imbalance conditions. Consequently, ToN-IoT has become a de facto reference corpus for evaluating security analytics in Industry 4.0 environments.
To enable the classification of IoT-based cyberattacks, a total of forty-three features are extracted to characterize each flow, categorized into six subsets based on the nature of the information they convey (e.g., connection activity features, violation activity features, and statistical activity features). The training and testing data used in this work are drawn from an officially released subset of the ToN-IoT dataset, which includes 300,000 normal traffic flows and 20,000 flows for each attack category, except for the XSS attack class, which contains only 1,043 recorded flows. We rely on the IoT-telemetry splits of the ToN-IoT corpus, where each physical sensor is provided as an independent CSV file containing pre-divided train/test records across the security classes (i.e., benign or attack type), which we have merged into a single dataset for comprehensive analysis.
The same dataset was processed in several steps to prepare it for ML model training. This section describes each pre-processing step, including handling missing data, feature normalization, and categorical data encoding.
Algorithm 1. MI-SMOTE-Boost intrusion detection
Require: Dataset D, top-k, k_smote
Ensure: Trained models M
1: Encode, impute, scale D
2: Compute MI; select top-k features X_k
3: Split D → D_train, D_test
4: Apply SMOTE(k_smote) on D_train[X_k]
5: for each booster ∈ {XGBoost, LightGBM, AdaBoost, CatBoost} do
6:   Train booster on balanced D_train[X_k]
7:   Evaluate on D_test[X_k]
8:   Store metrics → M
9: end for
10: return M
3.1.1. Handling missing data
Handling missing values is an important step in data cleaning, and it is crucial for ensuring the integrity and completeness of the dataset. For numeric features with missing values, mean imputation was applied. This involves replacing missing values in a feature $x_j$ with the mean of that feature computed over the numerical data subset. This approach ensures that all data records can be used for training without introducing significant bias. The imputation formula used is:

$x_{ij} \leftarrow \dfrac{1}{|D_{\text{numeric}}|} \sum_{k \in D_{\text{numeric}}} x_{kj} \quad \text{if } x_{ij} \text{ is missing}$ (1)

Where $D_{\text{numeric}}$ represents the numerical data subset, $x_{ij}$ is the missing value, and the mean of the feature $x_j$ is computed over the entire numerical data subset. For categorical features, missing values were handled separately. In the ToN-IoT dataset, any missing categories were imputed using the most frequent value (i.e., mode) within the data, ensuring that the categories are consistent across the dataset.
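The two imputation rules above can be sketched in plain Python; the column values here are illustrative toy data, not drawn from ToN-IoT:

```python
from statistics import mean, mode

def impute_numeric(column):
    """Eq. (1): replace missing entries (None) with the mean of the
    observed values in a numeric column."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]

def impute_categorical(column):
    """Replace missing categories with the most frequent value (mode)."""
    observed = [v for v in column if v is not None]
    fill = mode(observed)
    return [fill if v is None else v for v in column]

print(impute_numeric([1.0, None, 3.0, None, 5.0]))      # gaps filled with mean 3.0
print(impute_categorical(["tcp", "udp", None, "tcp"]))  # gap filled with mode "tcp"
```

In practice the same effect is obtained with a library imputer fitted on the training split only, so that test-set statistics never leak into the fill values.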
3.1.2. Feature normalization
To guarantee that every numerical feature plays an active role in the model training, standardization was applied using the StandardScaler from scikit-learn. This transformation ensures that each feature has a mean of zero and a standard deviation of one, which prevents features with larger numeric ranges from disproportionately influencing the model. The standardization formula used is:

$x'_{ij} = \dfrac{x_{ij} - \mu_j}{\sigma_j}$ (2)

Where $x'_{ij}$ is the standardized value of feature $x_j$ for the $i$-th instance, $\mu_j$ is the mean, and $\sigma_j$ is the standard deviation of feature $x_j$ across the data. This scaling was applied to the training and testing sets, ensuring that all features are treated consistently across both training and testing phases.
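A minimal sketch of Eq. (2), assuming, as a fitted StandardScaler does, that the mean and standard deviation are estimated on the training column and then reused unchanged on the test column; the numbers are illustrative:

```python
from statistics import mean, pstdev

def standardize(train_col, test_col):
    """Eq. (2): z = (x - mu_j) / sigma_j, with mu_j and sigma_j estimated
    on the training column and reused for the test column."""
    mu = mean(train_col)
    sigma = pstdev(train_col)  # population standard deviation
    scale = lambda xs: [(x - mu) / sigma for x in xs]
    return scale(train_col), scale(test_col)

train_z, test_z = standardize([2.0, 4.0, 6.0], [4.0, 8.0])
print(train_z)  # standardized training column: zero mean, unit deviation
print(test_z)   # test column scaled with the training statistics
```

Fitting the scaler on the training split alone is what keeps the two phases consistent: a test value equal to the training mean maps exactly to zero.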
3.1.3. Categorical data encoding
Categorical features, such as traffic type or device identifiers, were transformed into numerical representations using one-hot encoding. This process creates binary columns for each unique category in a feature. For example, if a feature protocol has three unique values (e.g., TCP, UDP, and ICMP), one-hot encoding would generate three binary columns: protocol_TCP, protocol_UDP, and protocol_ICMP. Each instance of the dataset is represented by 1 in the corresponding category column and 0 in the others. This transformation prevents the model from assuming any ordinal relationship between categories and ensures that categorical variables are processed appropriately by tree-based ML models.
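The protocol example above can be sketched as follows; the helper one_hot() is an illustrative stand-in for a library encoder such as pandas.get_dummies or scikit-learn's OneHotEncoder:

```python
def one_hot(column, prefix):
    """Expand a categorical column into binary indicator columns,
    one per unique category, with prefix_category names."""
    categories = sorted(set(column))
    header = [f"{prefix}_{c}" for c in categories]
    rows = [[1 if v == c else 0 for c in categories] for v in column]
    return header, rows

header, rows = one_hot(["TCP", "UDP", "ICMP", "TCP"], "protocol")
print(header)  # ['protocol_ICMP', 'protocol_TCP', 'protocol_UDP']
print(rows)    # exactly one 1 per row, in the matching category column
```

Because each row carries a single 1, no spurious ordering (e.g., TCP < UDP) is implied to the tree-based learners.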
3.2. Feature engineering
MI quantifies the amount of information one variable provides about another [28]. In classification tasks, MI is used to select features that have the highest dependency on the class labels. To mitigate the issue of dimensionality, MI was used to evaluate the relevance of each feature $X_j$ with respect to the multiclass label $Y$. The MI score quantifies the reduction in label uncertainty due to knowledge of the feature, with the formula:

$I(X_j; Y) = \sum_{x_j} \sum_{y} p(x_j, y) \log \dfrac{p(x_j, y)}{p(x_j)\,p(y)}$ (3)

Where $p(x_j, y)$ is the joint probability of feature $X_j$ and label $Y$, and $p(x_j)$, $p(y)$ are the marginal probabilities of $X_j$ and $Y$, respectively. The MI score measures the amount of information shared between a feature and the target, with higher values indicating stronger relevance. In this study, we applied the SelectKBest method from scikit-learn, using mutual_info_classif() as the scoring function to select the top $k = 10$ features that exhibit the highest MI with the target label. This selection helps to eliminate irrelevant, redundant, or noisy features, improving the performance of the model by reducing overfitting and making the learning process more efficient. The selected features were used for model training, ensuring that only the most informative predictors were considered.
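Eq. (3) can be computed directly from empirical frequencies. This discrete plug-in estimate is a simplification of mutual_info_classif, which uses a nearest-neighbor estimator for continuous features; the toy feature and label columns are illustrative:

```python
from collections import Counter
from math import log

def mutual_information(xs, ys):
    """Eq. (3): I(X; Y) = sum over (x, y) of p(x, y) * log(p(x, y) / (p(x) p(y))),
    estimated from empirical frequencies (natural log, i.e. nats)."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    p_x = Counter(xs)
    p_y = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        pxy = c / n
        mi += pxy * log(pxy / ((p_x[x] / n) * (p_y[y] / n)))
    return mi

# A feature that perfectly determines the label carries log(2) nats ...
print(mutual_information([0, 0, 1, 1], ["benign", "benign", "attack", "attack"]))
# ... while an independent feature carries none.
print(mutual_information([0, 1, 0, 1], ["benign", "benign", "attack", "attack"]))  # 0.0
```

Ranking every feature by this score and keeping the ten largest is the essence of the SelectKBest step described above.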
3.3. Class balancing
Imbalanced class distributions constitute a critical challenge in intrusion detection, particularly in IIoT environments where normal traffic vastly outweighs malicious instances. This imbalance leads to biased decision boundaries that favor the majority class, resulting in high false-negative rates for minority class predictions. SMOTE was applied to address this issue by generating synthetic instances for the minority class through interpolation [29]. SMOTE synthesizes new instances by sampling an example $x$ from the minority class and selecting one of its $k_{nn}$ nearest neighbors, $x_{nn}$. A synthetic example is created by adding a scaled difference between the minority sample and its neighbor:

$\tilde{x} = x + \lambda (x_{nn} - x), \quad \lambda \sim U(0, 1)$ (4)

Where $\lambda$ is a randomly chosen value between 0 and 1, ensuring that the synthetic instance lies somewhere between $x$ and $x_{nn}$ in the feature space. This process is repeated for minority instances until the class distribution approaches balance, effectively enlarging the minority class manifold and promoting wider decision margins. This re-balancing approach helps improve the model's ability to learn from both minority and majority classes equally, reducing the occurrence of false negatives and false positives during model prediction.
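Eq. (4) reduces to a one-line interpolation once a neighbor is chosen. The sketch below assumes the nearest neighbors of the minority sample have already been found and uses a fixed seed for reproducibility; it is a simplification, not the full imbalanced-learn implementation:

```python
import random

def smote_sample(x, neighbors, rng):
    """Eq. (4): pick one nearest neighbor x_nn of minority sample x and
    interpolate x~ = x + lambda * (x_nn - x), with lambda ~ U(0, 1)."""
    x_nn = rng.choice(neighbors)
    lam = rng.random()
    return [xi + lam * (ni - xi) for xi, ni in zip(x, x_nn)]

rng = random.Random(0)          # fixed seed for a reproducible sketch
x = [1.0, 1.0]                  # illustrative minority-class sample
neighbors = [[2.0, 1.0], [1.0, 2.0]]  # assumed k-NN of x within the minority class
synthetic = smote_sample(x, neighbors, rng)
print(synthetic)  # lies on the segment between x and the chosen neighbor
```

Repeating this draw for each minority sample until the class counts match is what "approaches balance" means in the paragraph above.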
3.4. Gradient-boosted model development
3.4.1. Data partitioning
The data partitioning stage is essential in the ML pipeline. After pre-processing, the dataset was randomly split into 80% for training and 20% for testing, which is a standard approach in ML for model evaluation. Additionally, 10-fold cross-validation was employed to assess each model's performance and generalizability. This technique involves splitting the dataset into ten equal partitions. Each fold serves as a test set once, while the remaining nine folds are used for training. This process ensures that every data point is utilized for both training and testing. By employing this strategy, overfitting is minimized, and a more accurate estimate of the model's performance is obtained compared to a single train-test split.
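The fold mechanics can be sketched as index arithmetic. Unlike the shuffled, stratified splits used in the experiments (e.g., scikit-learn's StratifiedKFold), this minimal version assigns contiguous, unshuffled folds:

```python
def k_fold_indices(n, k=10):
    """Split indices 0..n-1 into k near-equal folds; each fold serves as the
    test set once while the remaining k-1 folds form the training set."""
    folds = [list(range(i * n // k, (i + 1) * n // k)) for i in range(k)]
    splits = []
    for i in range(k):
        test = folds[i]
        train = [idx for j, f in enumerate(folds) if j != i for idx in f]
        splits.append((train, test))
    return splits

splits = k_fold_indices(20, k=10)
print(len(splits), len(splits[0][0]), len(splits[0][1]))  # 10 2-item test folds
```

Every index lands in exactly one test fold, which is why the averaged score over the ten rounds uses each data point for both roles.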
3.4.2. Model fitting and validation
In this study, four powerful ensemble learners, XGBoost, LightGBM, AdaBoost, and CatBoost, were independently trained to evaluate their effectiveness in detecting intrusions in IIoT environments. These models were chosen for their capacity to efficiently process large-scale datasets, capture complex feature patterns, and provide high accuracy with relatively fast training times [30]. The models were trained using the balanced, ten-feature design matrix. Each model's objective function and working mechanism are described in detail, focusing on how they iteratively improve their performance during the training process.
− XGBoost: it provides a highly efficient, scalable form of gradient boosting by sequentially constructing decision trees, each trained to rectify the residual errors of the preceding ensemble, and minimizes a regularized additive loss function:

$\mathcal{L}^{(t)} = \sum_{i=1}^{N} \ell\left(y_i,\ \hat{y}_i^{(t-1)} + f_t(x_i)\right) + \Omega(f_t)$ (5)

Where $N$ is the number of samples, $\ell$ is the loss function, usually the multinomial logistic loss for classification tasks, $y_i$ is the true label of the $i$-th sample, $\hat{y}_i^{(t-1)}$ is the prediction from the previous iteration, $f_t(x_i)$ is the decision function of the tree at iteration $t$ for sample $x_i$, and $\Omega(f_t)$ is the regularization term that penalizes the complexity of the decision tree $f_t$. The regularization term $\Omega(f_t)$ helps prevent overfitting by controlling the complexity of the model. It is defined as:

$\Omega(f_t) = \gamma T + \dfrac{1}{2} \lambda \sum_{j=1}^{T} w_j^2$ (6)

Where $T$ is the number of leaves in the tree, $w_j$ represents the weight of the $j$-th leaf, and $\gamma$ and $\lambda$ are hyperparameters controlling the complexity of the tree. The goal of XGBoost is to minimize this objective function, balancing model fit to the data while preventing overfitting by penalizing large trees.
− LightGBM: it is a gradient boosting framework designed to handle large datasets with higher efficiency than traditional gradient boosting methods like XGBoost. Similar to XGBoost, LightGBM builds an ensemble of trees sequentially, with each tree focusing on the residuals of the previous one. Its objective function is defined similarly to that of XGBoost, with an additional focus on efficiency and speed. The regularization term in LightGBM is given by:

$\Omega(f_t) = \lambda \sum_{j=1}^{T} w_j^2$ (7)

Where $\lambda$ is a regularization parameter, and $w_j$ represents the weight of the $j$-th leaf. Additionally, LightGBM employs a histogram-based approach for training, which speeds up computation by approximating the feature values into discrete bins, reducing the computational cost of finding the best split for each feature.
− AdaBoost: it is an ensemble technique that forms a robust model by aggregating weak learners and iteratively up-weighting the misclassified samples, forcing the model to focus more on hard-to-classify examples in subsequent iterations. AdaBoost iteratively adjusts the weight of each weak classifier, and the final prediction is the weighted sum of all weak classifiers. It minimizes the weighted error by adjusting the weights of the training instances after each iteration. The weight assigned to the weak learner at the $t$-th iteration is:

$\alpha_t = \dfrac{1}{2} \log \dfrac{1 - \epsilon_t}{\epsilon_t}$ (8)

Where $\epsilon_t$ is the weighted error of the weak learner in the $t$-th iteration. The final prediction is obtained by combining the weak learners using their weights, where the weak learners with lower errors are given higher weights:

$f(x) = \operatorname{sign}\left( \sum_{t=1}^{T} \alpha_t h_t(x) \right)$ (9)

Where $h_t(x)$ is the weak classifier at iteration $t$, $\alpha_t$ is the weight assigned to the weak classifier at iteration $t$, and $f(x)$ is the final prediction, which is the weighted sum of the weak classifiers' predictions.
− CatBoost: it is specifically developed to process categorical variables with high efficiency by converting them into numerical representations via an efficient algorithm that accounts for the order of categories and prevents target leakage, which is beneficial in IIoT environments where categorical data, such as device types or protocols, are common. CatBoost applies an 'ordered boosting' approach, which mitigates target leakage and prevents overfitting. The objective function for CatBoost is similar to that of XGBoost and LightGBM, with an additional emphasis on categorical feature handling. The regularization term controls the complexity of the trees and prevents overfitting.
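Of the four boosters, AdaBoost's update is compact enough to sketch end-to-end in pure Python. The toy 1-D dataset and exhaustive threshold stumps below are illustrative assumptions, not the scikit-learn implementation; Eqs. (8) and (9) appear at the commented lines:

```python
import math

def train_adaboost(X, y, n_rounds=3):
    """Minimal AdaBoost with 1-D threshold stumps; labels in {-1, +1}."""
    n = len(X)
    w = [1.0 / n] * n  # uniform initial sample weights
    ensemble = []      # list of (alpha, threshold, polarity)
    for _ in range(n_rounds):
        best = None
        # Exhaustively search stumps h(x) = polarity * sign(x - thr)
        for thr in X:
            for polarity in (+1, -1):
                preds = [polarity * (1 if x > thr else -1) for x in X]
                eps = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or eps < best[0]:
                    best = (eps, thr, polarity, preds)
        eps, thr, polarity, preds = best
        eps = min(max(eps, 1e-10), 1 - 1e-10)    # numerical safety
        alpha = 0.5 * math.log((1 - eps) / eps)  # Eq. (8)
        ensemble.append((alpha, thr, polarity))
        # Up-weight misclassified samples, then renormalize
        w = [wi * math.exp(-alpha * yi * p) for wi, yi, p in zip(w, y, preds)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, x):
    """Eq. (9): sign of the alpha-weighted vote of the stumps."""
    score = sum(alpha * polarity * (1 if x > thr else -1)
                for alpha, thr, polarity in ensemble)
    return 1 if score >= 0 else -1

X = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
y = [-1, -1, -1, 1, 1, 1]
model = train_adaboost(X, y)
print([predict(model, x) for x in X])  # → [-1, -1, -1, 1, 1, 1]
```

The gradient boosters (XGBoost, LightGBM, CatBoost) follow the same additive scheme but fit each new tree to loss gradients under Eqs. (5)-(7) instead of reweighting samples.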
3.5. Evaluation strategy
This work employs a suite of evaluation metrics, namely F1-score, accuracy, precision, FPR, and recall, to rigorously quantify the effectiveness of IIoT-oriented IDS. These metrics collectively provide a nuanced view of classification performance. This perspective is especially critical when addressing the class imbalance characteristic of intrusion detection datasets.
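Under a binary benign/attack framing (a simplification of the paper's multiclass setting), these metrics derive from the confusion counts as follows; the example label vectors are illustrative:

```python
def binary_metrics(y_true, y_pred):
    """Precision, recall, F1, and FPR from a binary confusion matrix
    (1 = attack, 0 = benign)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    fpr = fp / (fp + tn) if fp + tn else 0.0  # false alarms among benign flows
    return precision, recall, f1, fpr

y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
print(binary_metrics(y_true, y_pred))
```

FPR is the metric most sensitive to class imbalance here: with benign traffic dominating, even a small fraction of false alarms among benign flows can swamp the true detections.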
4. RESULTS AND DISCUSSION
This section presents the experimental findings, comparative analysis, and a comprehensive discussion regarding the performance improvements achieved through the proposed technique.
4.1. Experimental setup
All experiments were conducted on the current version of Google Colab, operating in a cloud environment equipped with dual Intel® Xeon® virtual CPUs and approximately 12 GB of system memory. Programming was performed in Python 3.10, utilizing standard ML and data processing libraries, including scikit-learn, imbalanced-learn, XGBoost, LightGBM, and CatBoost, alongside visualization tools such as matplotlib and seaborn. Each classifier (AdaBoost, CatBoost, LightGBM, and XGBoost) was trained initially on the raw feature set (i.e., baseline) and subsequently on a feature subset selected via MI, with class imbalance addressed through SMOTE. Model evaluation employed 10-fold stratified cross-validation and captured key performance indicators: F1-score, precision, accuracy, recall, FPR, training time, and prediction time, enabling a comprehensive comparison of model behavior before and after feature engineering and class balancing.
4.2. Global performance comparison
The global performance comparison across all models is summarized in Table 1. Each model was evaluated in two scenarios: baseline (i.e., before MI-SMOTE) and enhanced (i.e., after MI-SMOTE). As illustrated in Table 1, all models achieved substantial improvements across key performance indicators after the application of MI-SMOTE. Accuracy converged to approximately 99.43% across all models. Notably, AdaBoost, which initially had the lowest performance, exhibited the greatest relative improvement in both classification metrics and computational efficiency. The F1-score, which balances precision and recall, is a key indicator of classification performance.
Table 1. Performance comparison of ensemble models before and after MI-SMOTE
Metric                        LightGBM   XGBoost    CatBoost   AdaBoost
Accuracy (before)             0.9930     0.9943     0.9916     0.9859
Accuracy (after)              0.9943     0.9943     0.9943     0.9943
Precision (before)            0.9931     0.9940     0.9920     0.9866
Precision (after)             0.9945     0.9945     0.9945     0.9945
Recall (before)               0.9930     0.9943     0.9916     0.9859
Recall (after)                0.9943     0.9943     0.9943     0.9943
F1-score (before)             0.9930     0.9937     0.9907     0.9851
F1-score (after)              0.9937     0.9939     0.9937     0.9937
FPR (before)                  0.00094    0.00082    0.00118    0.00198
FPR (after)                   0.00080    0.00080    0.00080    0.00080
Training time (before) (s)    7.5607     12.1173    10.5279    91.8730
Training time (after) (s)     4.0427     3.6500     7.2592     15.2537
Prediction time (before) (s)  0.9968     0.3159     0.4913     6.9909
Prediction time (after) (s)   0.3893     0.1745     0.1430     2.7225
Total time (before) (s)       8.5575     12.4332    11.0192    98.8639
Total time (after) (s)        4.4320     3.8245     7.4022     17.9762
As shown in Figure 2, all models demonstrated an increase in F1-score after the application of MI-SMOTE. AdaBoost experienced the most significant improvement, increasing from 98.51% to 99.37%, while CatBoost improved from 99.07% to 99.37%. LightGBM and XGBoost also showed slight but consistent gains, stabilizing near 99.37% and 99.39%, respectively.
Minimizing FPR is crucial, especially in applications where false alarms carry high costs. The FPR evolution is presented in Figure 3, which shows a consistent reduction in FPR across all models. Initially, AdaBoost and CatBoost exhibited higher FPR values of 0.00198 and 0.00118, respectively. After applying MI-SMOTE, all models achieved a reduced and unified FPR of 0.00080.
Efficiency in terms of computational resources is another critical aspect for real-world deployment. The impact of MI-SMOTE on training and prediction times is depicted in Figure 4, illustrating that training and prediction times were generally reduced after the pre-processing, feature engineering, and class balancing phases. AdaBoost benefited significantly, reducing its total execution time from approximately 99 seconds to 18 seconds. XGBoost and LightGBM also achieved considerable reductions in training and prediction times, confirming the efficiency gain of the approach's steps. CatBoost, after MI-SMOTE, managed to decrease its total time to approximately 7.4 seconds. Overall, the integration of MI-based feature selection and SMOTE-based balancing substantially enhanced classification robustness, minimized false alarms, and optimized computational efficiency across all evaluated models.
Figure 2. Comparison of F1-scores for all models before and after MI-SMOTE application
Figure 3. FPR for all models before and after MI-SMOTE integration, decreasing consistently to 0.080%
Figure 4. Execution time analysis showing training and prediction durations before and after MI-SMOTE
5. CONCLUSION
This study investigated the impact of integrating MI feature selection and SMOTE class balancing techniques on the performance of ensemble learning models for classification tasks.
As initially stated, the objective was to enhance predictive accuracy, reduce the FPR, and optimize computational efficiency. The experimental results confirm that these objectives were successfully achieved.
All evaluated models, namely LightGBM, XGBoost, CatBoost, and AdaBoost, showed consistent improvements in classification metrics after the application of MI-SMOTE. Notably, F1-scores exceeded 99.37% across all models, while the FPR was uniformly reduced to 0.080%. Additionally, significant reductions in training and prediction times were observed for several models, further validating the effectiveness of the framework's stages.
These findings not only demonstrate the alignment between the research objectives and the outcomes but also highlight the practicality of the proposed approach for real-world large-scale deployments where both performance and efficiency are critical.
Prospects for future work include extending this methodology to more diverse and imbalanced IIoT datasets (e.g., NF-ToN-IoT-v2, UNSW-NB15) to assess generalizability across different environments.
Additionally, we plan to conduct an ablation study to isolate and analyze the individual impacts of the MI-based feature selection and SMOTE balancing techniques on classification performance.
The exploration of adaptive or dynamic feature selection strategies beyond MI, such as hybrid filter-wrapper methods, and the integration of balancing approaches dynamically tailored to the nature of specific attack categories represent promising enhancements.
Furthermore, we intend to explore explainable artificial intelligence (XAI) tools such as Shapley additive explanations (SHAP) and local interpretable model-agnostic explanations (LIME) to improve the interpretability of model decisions and support transparency in real-world deployments.
Finally, investigating zero-day threat detection capabilities using anomaly-based learning or few-shot learning models will also be considered to bolster resilience against unknown attacks.
FUNDING INFORMATION
Authors state no funding involved.
AUTHOR CONTRIBUTIONS STATEMENT
This journal uses the Contributor Roles Taxonomy (CRediT) to recognize individual author contributions, reduce authorship disputes, and facilitate collaboration.
Name of Author: C M So Va Fo I R D O E Vi Su P Fu
Mouad Choukhairi: ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Oumaima Chentou: ✓ ✓ ✓ ✓ ✓ ✓
Ouail Choukhairi: ✓ ✓ ✓ ✓ ✓ ✓
Youssef Fakhri: ✓ ✓ ✓ ✓ ✓ ✓ ✓

C: Conceptualization, M: Methodology, So: Software, Va: Validation, Fo: Formal Analysis, I: Investigation, R: Resources, D: Data Curation, O: Writing - Original Draft, E: Writing - Review & Editing, Vi: Visualization, Su: Supervision, P: Project Administration, Fu: Funding Acquisition
CONFLICT OF INTEREST STATEMENT
Authors state no conflict of interest.
INFORMED CONSENT
This study did not involve human participants, and informed consent was therefore not required.
ETHICAL APPROVAL
This research did not involve human or animal subjects and did not require ethical approval.
DATA AVAILABILITY
The data that support the findings of this study are openly available in the ToN-IoT dataset at: https://research.unsw.edu.au/projects/toniot-datasets.
REFERENCES
[1]
S.
H.
Jaer
,
“Utilizing
feature
selection
techniques
in
intrusion
detection
system
for
internet
of
things,
”
in
Pr
oceedings
of
the
2nd
International
Confer
ence
on
Futur
e
Networks
and
Distrib
uted
Systems
,
2018,
pp.
1–3,
doi:
10.1145/3231053.3234323.
[2]
A.
M.
V
uln,
V
.
I.
V
asilye
v
,
V
.
E.
Gv
ozde
v
,
K.
V
.
Mirono
v
,
and
O.
E.
Churkin,
“Netw
ork
traf
c
analysis
based
on
machine
learning
methods,
”
J
ournal
of
Physics:
Confer
ence
Series
,
v
ol.
2001,
no.
1,
2021,
doi:
10.1088/1742-6596/2001/1/012017.