International Journal of Electrical and Computer Engineering (IJECE)
Vol. 15, No. 4, August 2025, pp. 4279∼4295
ISSN: 2088-8708, DOI: 10.11591/ijece.v15i4.pp4279-4295
Ensemble of convolutional neural network and DeepResNet for multimodal biometric authentication system

Ashwini Kailas¹, Madhusudan Girimallaih², Mallegowda Madigahalli³, Vasantha Kumara Mahadevachar⁴, Pranothi Kadirehally Somashekarappa¹

¹ Department of Bio Medical Engineering, Sri Siddartha Institute of Technology, Sri Siddartha Academy of Higher Education, Tumkur, India
² Department of Computer Science and Engineering, Sri Jayachamarajendra College of Engineering-Mysore, JSS Science and Technology University, Mysore, India
³ Department of Computer Science and Engineering, Ramaiah Institute of Technology-Bangalore, Visvesvaraya Technological University, Belagavi, India
⁴ Department of Computer Science and Engineering, Government Engineering College-Hassan, Visvesvaraya Technological University, Belagavi, India
Article Info

Article history:
Received Jun 12, 2024
Revised Dec 12, 2024
Accepted Jan 16, 2025

Keywords:
Biometric authentication
Convolution neural network
Deep ResNet
ECG-iris
Ensemble deep learning
Multimodal

ABSTRACT
Multimodal biometrics technology has garnered attention recently for its ability to address inherent limitations found in single biometric modalities and to enhance overall recognition rates. A typical biometric recognition system comprises sensing, feature extraction, and matching modules. The system's robustness heavily relies on its capability to effectively extract pertinent information from individual biometric traits. This study introduces a novel feature extraction technique tailored for a multimodal biometric system utilizing electrocardiogram (ECG) and iris traits. The ECG contributes liveness-related information, and the iris provides a unique pattern for each individual. This work therefore presents a multimodal authentication system in which pre-processing, comprising noise removal and quality enhancement, is performed on the image and ECG data. Next, features are extracted from the ECG signals through heart rate variability analysis in the time and frequency domains. Finally, an ensemble of convolution neural network (CNN) and DeepResNet models is used to perform the classification. The overall accuracy is reported as 0.8900, 0.8400, 0.7900, 0.8932, 0.87, and 0.97 for convolutional neural network-long short-term memory (CNN-LSTM), support vector machine (SVM), random forest (RF), CNN, decision tree (DT), and the proposed MBANet approach, respectively.
This is an open access article under the CC BY-SA license.

Corresponding Author:
Ashwini Kailas
Department of Biomedical Engineering, Sri Siddartha Institute of Technology, Sri Siddartha Academy of Higher Education
Tumkur, India
Email: ashwinik@ssit.edu.in
1. INTRODUCTION
Recently, biometric recognition systems have gained prominence as a primary means of user authentication across various sectors and applications, including smartphones, banking services, websites, and airports. Depending on the required level of security, they provide a clear substitute for conventional authentication techniques like keys and personal identification numbers (PINs) [1], [2]. For the purpose of feature recognition, it is necessary to first enroll commonly used biometric traits, such as voice, facial features, fingerprints, palmprints, and iris patterns, into a database [3], [4]. Biometrics is a more straightforward and secure substitute for traditional authentication techniques. It includes both physiological and behavioral characteristics that are used to statistically differentiate persons [5]. Physiological traits encompass both external features such as fingerprints, iris patterns, facial characteristics, and vein patterns, as well as internal attributes like electrocardiogram (ECG), electromyography (EMG), and electroencephalogram (EEG) patterns. Behavioral traits, on the other hand, involve habit-based characteristics such as voice patterns, gait, and signatures [6], [7]. Furthermore, researchers have explored the combination of multiple biometric modalities to enhance the robustness of identification systems [8].

Journal homepage: http://ijece.iaescore.com
Despite the widespread adoption of biometrics in various devices and services, these systems remain vulnerable to spoofing attempts. Moreover, recent technological advances have raised security concerns for these systems and made them more vulnerable to various security threats. Typical face and fingerprint spoofing attacks are investigated in [9]-[11]. Consideration should be given to liveness detection or continuous biometric authentication techniques in order to defend against presentation attacks and unauthorized access to the systems [12]-[14]. Using a non-invasive, quantifiable sensor that can gather users' biometric information, continuous biometric authentication continually verifies the identity of the user. Consequently, because of the distinctive features of ECG signals, continuous biometric authentication has drawn a lot of interest as a potentially viable next-generation approach.
The ECG is an electrical signal derived from skin-attached electrodes that consists of three distinct elements: the P-wave, the QRS complex, and the T-wave [15]. Variations in ECG patterns among individuals can be attributed to three primary reasons. First, individual differences exist in physiological parameters including cardiac mass, size, conductivity, and activity. Second, ECG pattern variability is influenced by geometrical parameters arising from differences in the location and vector of the heart. Finally, the specific structure and makeup of the heart are influenced by individual deoxyribonucleic acid (DNA) traits. Nevertheless, because the ECG is an electrical signal, variations in heart rate and ambient factors might affect its reading. Moreover, the reliability of unimodal authentication systems decreases as the sample size increases [16].
In contrast to unimodal biometric systems, multimodal biometric systems incorporate a minimum of two biometric features in order to improve recognition precision and strengthen defenses against spoofing attacks [17], [18]. Since both fingerprints and high-quality heart signals may be captured concurrently from the fingertips, fingerprints and heart signals form a well-suited combination for multimodal fusion. Heart signals possess a liveness property that enhances their security as a biometric modality, and their fusion with fingerprints holds promise for establishing a robust and secure authentication and identification system [19], [20].
Numerous multimodal biometric systems integrating fingerprints and heart signals have been proposed in the literature. Bala et al. [21] presented a detailed study of multimodal fusion algorithms for combining these modalities. Komeili et al. [22] introduced a multimodal system that integrates fingerprints and heart signals while incorporating automatic template updating of heart signal records. By combining fingerprint authentication with heart signal data, Jomma et al. [23], [24] used a sequential mechanism to improve the resilience of fingerprint authentication against presentation attacks.
In a similar vein, iris-based biometric identification is popular because of its exceptional reliability and efficacy as a means of distinguishing between people [25]. Because iris patterns are naturally so easily distinguished, the human iris provides significant scientific advantages. The primary benefit is stability, as an individual's iris does not alter. Many strategies, which can be categorized into distinct methodologies such as phase-based approaches, zero-crossing representation, texture analysis, and variation in intensities, have focused on changes in the iris pattern throughout the development of iris recognition systems. The human iris is believed to be the most reliable biometric feature. It may be quite beneficial in surveillance-based systems, for example when the texture changes of the iris template are utilized. The method suggested in [26] reveals the iris texture using a 2D Gabor filter bank and then divides it into subblocks; the conducted tests demonstrated effective outcomes. The method in [27] uses deep learning for identity identification; in that article, an intelligent surveillance system with good accuracy was evaluated on many standard databases.
Therefore, by leveraging iris and ECG signal data, we present a novel multimodal authentication system that uses these two modalities. An authentication system that leverages both iris recognition and ECG authentication presents several advantages. Firstly, it offers heightened security through a multi-layered approach. Iris patterns and ECG signals are unique to individuals, making it challenging for unauthorized users to mimic or spoof them effectively. This multi-factor authentication significantly reduces the risk of unauthorized access. Secondly, the integration of iris recognition and ECG authentication results in enhanced accuracy during identity verification processes. Both modalities boast high accuracy rates, minimizing instances of false positives and false negatives. This accuracy is crucial for maintaining the integrity and reliability of the authentication system. Furthermore, the combination of iris recognition and ECG authentication provides resistance against various spoofing attempts. Attempts to forge or replicate iris patterns or ECG signals are exceedingly difficult, reinforcing the system's robustness against fraudulent activities. Additionally, users benefit from the convenience of non-intrusive biometric authentication methods. Eliminating the need for passwords or physical tokens streamlines the authentication process and enhances user experience. Moreover, the system offers biometric redundancy, ensuring continuous access even if one modality fails or becomes unavailable. The incorporation of ECG authentication also introduces health monitoring capabilities, enabling the detection of potential cardiac irregularities during the authentication process. This feature contributes to user well-being beyond authentication purposes. Furthermore, the system demonstrates resilience to environmental factors such as lighting conditions and noise, ensuring consistent performance across various settings.
The proposed work can be adopted in various application domains such as medical signal processing, biometric authentication, telecommunication and remote sensing, and industrial monitoring and control. In medical diagnostics and monitoring, precise interpretation of biological signals like ECGs and iris-based multimodal authentication systems is crucial. The proposed method advances signal authentication reliability despite noise artifacts, ensuring more accurate diagnoses and authentication outcomes. This capability enhances patient care quality and medical procedure efficiency. Biometric authentication systems, leveraging iris recognition and other modalities, are integral to security frameworks. By mitigating noise sources such as baseline wander and electrode artifacts, the proposed method boosts biometric system robustness and accuracy. This enhancement fortifies security protocols, reducing unauthorized access risks and safeguarding sensitive data and facilities. Similarly, signal quality is paramount in telecommunications and remote sensing for effective communication and data analysis. Noise interference can degrade performance significantly. The proposed method improves the signal-to-noise ratio (SNR) and minimizes residual differences in noisy environments. This advancement enhances data transmission reliability and facilitates precise remote sensing observations, supporting scientific and operational objectives. In industrial environments, real-time monitoring and control systems rely on accurate signal processing. Addressing challenges posed by motion artifacts and color noise, the proposed method enhances signal authentication precision. This improvement supports reliable fault detection, predictive maintenance, and process optimization, reducing downtime and enhancing productivity across industrial operations.
Lastly, the use of iris patterns and ECG signals helps preserve user privacy by avoiding the collection of additional personally identifiable information. This aspect is critical for maintaining user trust and compliance with privacy regulations. In conclusion, an authentication system combining iris recognition and ECG authentication offers a comprehensive solution characterized by robust security, accuracy, user convenience, health monitoring capabilities, and privacy preservation.
Based on these advantages, the main contributions of this work can be listed as follows: i) to present a data pre-processing method for ECG and iris image data; ii) to perform ECG filtering and image denoising, where ECG filtering is carried out with the help of an extended Kalman filter, whereas image filtering uses a wavelet transform model; iii) to present a heart rate feature analysis in the time and frequency domains for ECG signals; and iv) to present an ensemble of CNN and DeepResNet-based transfer learning models for classification.
2. PROPOSED MBANET MODEL FOR REAL TIME AUTHENTICATION
In this section, we describe the MBANet approach for real-time authentication using ECG and iris modalities. For each user, the ECG and iris data are captured and stored. These data are processed through several stages, which are described below. The complete architecture of the proposed model is depicted in Figure 1. Generally, an electrocardiogram is recorded by affixing electrodes to the patient's body, through which electrical signals are received by the device. Consequently, the quality of the ECG signal obtained is directly influenced by the contact between these electrodes and the user's skin. Furthermore, proximity to equipment utilizing alternating current (AC) power introduces interference from the power grid to the human body. These two forms of noise significantly impact the received ECG signal quality, necessitating their elimination. Similarly, the quality of iris images is affected by different types of noise. Therefore, the first phase focuses on developing an efficient approach for ECG signal filtering and noise removal from iris images. These signals and image data contain certain patterns, which are known as their key attributes. Arranging these attributes and annotating the data plays an important role in machine learning applications. Thus, in the next stage, we present the feature extraction process for both ECG and iris image data. Finally, these attributes are used to train the machine learning model to verify user authenticity. A fixed proportion of the dataset is used for training, and the remaining samples are used for testing and validation.
Figure 1. Proposed MBANet architecture
2.1. Noise model for ECG signal
As discussed before, ECG signals are contaminated by different types of noise. In this work, we consider Gaussian noise, baseline wander, muscle artifacts, and power line interference. The details of these noise types and their expressions are described below:
a. Gaussian noise
Gaussian white noise is often used to model random fluctuations in the ECG signal. It is characterized by a constant variance and zero mean. In the dynamic model, $w_k$, representing process noise, can be modeled as Gaussian white noise. Similarly, in the measurement model, $v_k$, representing measurement noise, can also be modeled as Gaussian white noise. The covariance matrices $Q$ and $R$ in the prediction and update steps of the extended Kalman filter (EKF) reflect the variance of the process and measurement noise, respectively. The Gaussian white noise is expressed as (1):

$$w(t) \sim \mathcal{N}(0, \sigma^2) \tag{1}$$

where $\mathcal{N}(0, \sigma^2)$ represents a Gaussian distribution with mean 0 and variance $\sigma^2$.
b. Baseline wander noise
Baseline wander refers to low-frequency drifts in the ECG signal caused by various factors such as respiration and movement artifacts. A simple mathematical model for baseline wander is a random walk process, where the signal drifts randomly over time. The baseline wander $b(t)$ can be expressed as (2):

$$b(t+1) = b(t) + \epsilon \tag{2}$$

where $b(t)$ represents the baseline wander at time $t$, and $\epsilon$ is a random noise component at each time step.
c. Muscle artifacts
Muscle artifacts introduce high-frequency noise spikes in the ECG signal, often caused by muscle contractions or movement. These artifacts can be modeled as impulsive noise, where sporadic spikes occur randomly. The muscle artifacts $m(t)$ can be modeled as (3):

$$m(t) = A \cdot \delta(t - t_i) \tag{3}$$

where $A$ represents the amplitude of the artifact and $\delta(t - t_i)$ is the Dirac delta function, representing the spike occurring at time $t_i$.
d. Power line interference
Power line interference introduces periodic noise at the frequency of the power supply (e.g., 50 Hz or 60 Hz). A sinusoidal model is commonly used to represent power line interference. It can be expressed as (4):

$$p(t) = A \cdot \sin(2\pi f t + \phi) \tag{4}$$

where $A$ is the amplitude of the interference, $f$ is the frequency of the power supply, and $\phi$ is the phase angle.
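
The four noise models in (1)-(4) can be simulated directly. The following is a minimal sketch rather than the authors' implementation: the sampling rate, amplitudes, spike positions, and the sinusoidal stand-in for the clean signal are all illustrative assumptions, and the Dirac delta of (3) is replaced by a discrete unit impulse.

```python
import math
import random


def gaussian_noise(n, sigma, rng):
    """Eq. (1): independent draws w(t) ~ N(0, sigma^2)."""
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def baseline_wander(n, step_sigma, rng):
    """Eq. (2): random walk b(t+1) = b(t) + eps, eps ~ N(0, step_sigma^2)."""
    b = [0.0]
    for _ in range(n - 1):
        b.append(b[-1] + rng.gauss(0.0, step_sigma))
    return b


def muscle_artifact(n, amplitude, spike_indices):
    """Eq. (3) in discrete time: an impulse of height A at each index t_i."""
    m = [0.0] * n
    for i in spike_indices:
        m[i] += amplitude
    return m


def powerline(n, amplitude, freq_hz, fs_hz, phase=0.0):
    """Eq. (4): p(t) = A sin(2 pi f t + phi), sampled at t = k / fs."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * k / fs_hz + phase)
            for k in range(n)]


# Illustrative values only (assumptions, not taken from the paper).
rng = random.Random(0)
fs = 500                      # assumed sampling rate in Hz
n = 1000
clean = [math.sin(2 * math.pi * 1.2 * k / fs) for k in range(n)]  # stand-in signal
noisy = [s + w + b + m + p for s, w, b, m, p in zip(
    clean,
    gaussian_noise(n, 0.05, rng),
    baseline_wander(n, 0.01, rng),
    muscle_artifact(n, 0.8, [120, 640]),
    powerline(n, 0.1, 50.0, fs),
)]
```

Summing the components mirrors the additive observation model in which the recorded signal is the true signal plus noise.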
2.2. ECG filtering and image denoising
This subsection presents the solution for ECG signal filtering and image denoising. The ECG filtering model uses the extended Kalman filter (EKF) to eliminate noise from the ECG signal. The standard Kalman filter is a recursive technique for data filtering and is widely adopted in data pre-processing and filtering tasks. ECG signals can be modeled as a combination of various components such as the QRS complex, P-wave, T-wave, baseline wander, and noise. A common model for the ECG signal can be represented as (5):

$$y(t) = s(t) + n(t) \tag{5}$$

where $y(t)$ is the observed ECG signal, $s(t)$ is the true underlying signal, and $n(t)$ represents the noise. Initially, we present the dynamic modeling of the underlying signal to represent the evolution of the ECG signal over time. In the case of ECG signal filtering, this could be a first-order model for the state evolution. For instance, it can be expressed as (6):

$$x_{k+1} = F \cdot x_k + w_k \tag{6}$$

where $x_k$ represents the state of the system at time step $k$, which could include parameters such as amplitude and frequency, $F$ is the state transition matrix, and $w_k$ represents the process noise. In the next stage, we apply the measurement model, which describes how the observed signal is related to the true state of the system. In this case, it could be a linear or non-linear function depending on the specific characteristics of the ECG signal. The measurement model can be expressed as (7):

$$z_k = H \cdot x_k + v_k \tag{7}$$

where $z_k$ represents the observed ECG signal at time step $k$, $H$ is the measurement matrix, and $v_k$ represents the measurement noise. Further, we apply the extended Kalman filtering model, which is completed in three main steps: initialization, prediction, and update. These steps can be described as follows:
Initialization: Initialize the state vector $x_0$ and the error covariance matrix $P_0$.

Prediction: Predict the next state using the dynamic model:

$$\hat{x}_{k+1|k} = F \cdot \hat{x}_k$$

Predict the error covariance matrix:

$$P_{k+1|k} = F \cdot P_k \cdot F^{T} + Q$$

where $Q$ represents the process noise covariance matrix.

Update: Compute the Kalman gain:

$$K_{k+1} = P_{k+1|k} \cdot H^{T} \cdot \left(H \cdot P_{k+1|k} \cdot H^{T} + R\right)^{-1}$$

where $R$ is the measurement noise covariance matrix. Update the state estimate:

$$\hat{x}_{k+1} = \hat{x}_{k+1|k} + K_{k+1} \cdot \left(z_{k+1} - H \cdot \hat{x}_{k+1|k}\right)$$

Update the error covariance matrix:

$$P_{k+1} = \left(I - K_{k+1} \cdot H\right) \cdot P_{k+1|k}$$

Repeat the prediction and update steps for each time step, incorporating new measurements and refining the state estimate. The final output of the filter is the estimated ECG signal, which is the state estimate $\hat{x}_k$ at each time step.
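
The prediction and update recursion can be sketched for the scalar case, where $F$, $H$, $Q$, $R$, and $P$ are plain numbers. This is an illustrative sketch under stated assumptions: the equations as written use a linear state and measurement model, so the code is the linear Kalman recursion; an extended Kalman filter would substitute Jacobians of the non-linear models for $F$ and $H$. The default parameter values and the constant test signal are hypothetical.

```python
import random


def kalman_filter_1d(z, F=1.0, H=1.0, Q=1e-4, R=0.05, x0=0.0, P0=1.0):
    """Scalar Kalman recursion; returns the updated state estimate per sample."""
    x_est, P = x0, P0
    estimates = []
    for z_k in z:
        # Prediction: propagate the state and its error covariance.
        x_pred = F * x_est
        P_pred = F * P * F + Q
        # Update: the gain weighs the prediction against the new measurement.
        K = P_pred * H / (H * P_pred * H + R)
        x_est = x_pred + K * (z_k - H * x_pred)
        P = (1.0 - K * H) * P_pred
        estimates.append(x_est)
    return estimates


# Usage on an assumed constant level of 1.0 observed with additive Gaussian noise.
rng = random.Random(1)
measurements = [1.0 + rng.gauss(0.0, 0.2) for _ in range(500)]
filtered = kalman_filter_1d(measurements, R=0.04, x0=1.0)
```

A small $Q$ relative to $R$ tells the filter to trust the model more than the measurements, which yields stronger smoothing; the right ratio for real ECG data would need to be tuned.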
Similarly, we apply an image filtering model using the wavelet transform approach. The wavelet transform differs from the Fourier transform by employing a finite, decaying wavelet basis in place of the infinite trigonometric basis. Unlike the Fourier basis, the wavelet basis possesses finite energy, typically concentrated around a single point, and integrates to zero. While the Fourier transform relies solely on the variable $\omega$, the wavelet transform introduces two variables: scale $a$ and translation $b$. The scale parameter $a$ is inversely related to frequency, whereas the translation parameter $b$ corresponds to time. Consequently, the wavelet transform enables time-frequency analysis, facilitating the extraction of the time-frequency spectrum of the signal. By utilizing the scaling and translation of the mother wavelet function, a wavelet sequence can be generated, with its general form expressed as (8):

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t - b}{a}\right), \quad a, b \in \mathbb{R} \tag{8}$$

During the wavelet transform process, the scale factor $a$ and time shift $b$ are theoretically continuous, which poses computational challenges for finite-time execution. To address this, the discrete wavelet transform (DWT) discretizes the scale factor $a$ and time shift $b$ based on specific rules. By adopting discrete values for $a$ and $b$, the DWT enables computationally feasible analysis. Choosing power-of-2 values, $a = 2^{m}$ and $b = n \cdot 2^{m}$, enhances the accuracy and efficiency of signal analysis. The wavelet function can then be expressed as (9):

$$\psi_{m,n}(k) = 2^{-\frac{m}{2}}\, \psi\!\left(2^{-m} \cdot k - n\right), \quad m, n \in \mathbb{Z} \tag{9}$$
The wavelet transform breaks the original image data down into approximation and detail components; the detail components primarily reveal the noise present in the image. Following this, by thresholding the detail components and applying wavelet reconstruction, we can obtain smoother image information. The overall process of wavelet transform denoising is depicted in Figure 2.
Figure 2. Wavelet transform for image denoising
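
The decompose, threshold, and reconstruct pipeline can be illustrated with a one-level 2D Haar transform. This is a sketch under assumptions: the paper does not state which mother wavelet, decomposition depth, or thresholding rule is used, so the Haar basis, the single level, and soft thresholding are illustrative choices, and the function names are hypothetical.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t; magnitudes of at most t become 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def haar_forward(img):
    """One-level 2D Haar transform of an even-sized image (list of rows).
    Returns the approximation band A and three detail bands D1, D2, D3."""
    A, D1, D2, D3 = [], [], [], []
    for r in range(0, len(img), 2):
        ra, r1, r2, r3 = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            ra.append((a + b + d + e) / 2.0)   # local average (approximation)
            r1.append((a - b + d - e) / 2.0)   # difference across columns
            r2.append((a + b - d - e) / 2.0)   # difference across rows
            r3.append((a - b - d + e) / 2.0)   # diagonal difference
        A.append(ra); D1.append(r1); D2.append(r2); D3.append(r3)
    return A, D1, D2, D3


def haar_inverse(A, D1, D2, D3):
    """Exact inverse of haar_forward."""
    img = [[0.0] * (2 * len(A[0])) for _ in range(2 * len(A))]
    for i in range(len(A)):
        for j in range(len(A[0])):
            s, x, y, z = A[i][j], D1[i][j], D2[i][j], D3[i][j]
            img[2 * i][2 * j] = (s + x + y + z) / 2.0
            img[2 * i][2 * j + 1] = (s - x + y - z) / 2.0
            img[2 * i + 1][2 * j] = (s + x - y - z) / 2.0
            img[2 * i + 1][2 * j + 1] = (s - x - y + z) / 2.0
    return img


def denoise(img, t):
    """Threshold only the detail bands, then reconstruct."""
    A, D1, D2, D3 = haar_forward(img)
    shrink = lambda band: [[soft_threshold(v, t) for v in row] for row in band]
    return haar_inverse(A, shrink(D1), shrink(D2), shrink(D3))
```

With a threshold of zero the image is reconstructed exactly, and with a very large threshold each 2x2 block collapses to its mean, which shows how the threshold trades detail for smoothness.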
2.3. Feature extraction
For the ECG signal, we use the Pan-Tompkins peak detection approach to identify the various peaks of the ECG signal. Further, we extract time- and frequency-domain heart rate variability (HRV) features from the ECG signals. Table 1 lists the time-domain features used as important attributes of ECG signals. Similarly, we extract frequency-domain features for the ECG signal. In this process, the low-frequency (LF), high-frequency (HF), very-low-frequency (VLF), and ultra-low-frequency (ULF) bands are considered.
Table 1. HRV time-domain features
Feature | Description | Measurement unit
SDNN | Standard deviation of NN intervals | ms
SDANN | Standard deviation of the means of NN intervals in 5 min windows | ms
RMSSD | Square root of the mean of the sum of the squares of differences between adjacent NN intervals | ms
SDNN index | Mean of the standard deviations of all NN intervals computed over all 5-minute segments of the entire recording | ms
SDSD | Standard deviation of differences between adjacent NN intervals | ms
NN50 | Number of pairs of adjacent NN intervals differing by more than 50 ms | count
pNN50 | NN50 count divided by the total number of all NN intervals | %
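
Several of the features in Table 1 can be computed directly from a list of NN intervals. The sketch below is illustrative: the function name and the example intervals are hypothetical, the sample (n-1) standard deviation is an assumed convention, and SDANN and the SDNN index are omitted because they require 5-minute segmentation of a long recording.

```python
import math
import statistics


def hrv_time_features(nn_ms):
    """Time-domain HRV features from NN intervals given in milliseconds."""
    diffs = [b - a for a, b in zip(nn_ms, nn_ms[1:])]   # successive differences
    nn50 = sum(1 for d in diffs if abs(d) > 50)
    return {
        "SDNN": statistics.stdev(nn_ms),
        "RMSSD": math.sqrt(sum(d * d for d in diffs) / len(diffs)),
        "SDSD": statistics.stdev(diffs),
        "NN50": nn50,
        # Following Table 1: NN50 divided by the number of NN intervals.
        "pNN50": 100.0 * nn50 / len(nn_ms),
    }


features = hrv_time_features([800, 810, 790, 870, 860])
```

In practice the NN intervals would come from the R-peak positions returned by the peak detector, converted to milliseconds with the sampling rate.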
2.4. Classification
In this work, we apply two different deep learning classifiers, and their combined result is taken as the final outcome. For example, if the ECG signal is authenticated and the iris image authentication fails, then the system treats the input as an imposter. To classify ECG signals, we consider an ensemble of three single CNN classifiers. The CNN model relearns the features produced by the single network. The three CNN models use rectified linear units (ReLU), leaky ReLU (LReLU), and exponential linear units (ELU), respectively. Figure 3 depicts the overall architecture of the MBANet model. The frequency-domain HRV features are listed in Table 2.
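
The three activation functions and the decision rule described above can be written out as follows. This is a minimal sketch: the activation definitions are the standard ones, while the slope and scale parameters (`alpha`) and the function `fuse_decisions` are illustrative assumptions, since the paper does not give these details.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes the rest."""
    return x if x > 0 else 0.0


def leaky_relu(x, alpha=0.01):
    """Leaky ReLU: small slope alpha for negative inputs (alpha is assumed)."""
    return x if x > 0 else alpha * x


def elu(x, alpha=1.0):
    """Exponential linear unit: smooth saturation toward -alpha for x < 0."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def fuse_decisions(ecg_accepted, iris_accepted):
    """Accept only when both modalities accept, as in the example in the text."""
    return "genuine" if (ecg_accepted and iris_accepted) else "imposter"
```

For instance, `fuse_decisions(True, False)` yields `"imposter"`, matching the case where ECG authentication succeeds but iris authentication fails.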
Figure 3. ECG classification
Table 2. HRV frequency-domain features
Feature | Description | Measurement unit
LF peak | Peak frequency of the low-frequency band (0.04–0.15 Hz) | Hz
HF peak | Peak frequency of the high-frequency band (0.15–0.4 Hz) | Hz
LF power | Absolute power of the low-frequency band (0.04–0.15 Hz) | ms²
LF power | Relative power of the low-frequency band (0.04–0.15 Hz) in normalized units | nu
LF power | Relative power of the low-frequency band (0.04–0.15 Hz) | %
HF power | Absolute power of the high-frequency band (0.15–0.4 Hz) | ms²
HF power | Relative power of the high-frequency band (0.15–0.4 Hz) in normalized units | nu
HF power | Relative power of the high-frequency band (0.15–0.4 Hz) | %
VLF power | Absolute power of the very-low-frequency band (0.0033–0.04 Hz) | ms²
ULF power | Absolute power of the ultra-low-frequency band | ms²
LF/HF | Ratio of LF-to-HF power | %
In the next stage, we perform classification of the iris images. For this task, we use a transfer learning approach and combine it with the DeepResNet model to enhance the classification performance. This module also uses a deep transfer learning model based on ImageNet. The ResNet model introduces a shortcut connection that skips one or more layers. The basic architecture of ResNet is depicted in Figure 4.
Figure 4. ECG classification
This model is trained with the help of the cross-entropy loss function. Further, the loss function is regularized by incorporating the $L_2$ norm, which helps to reduce overfitting. Thus, the final loss function for this model can be expressed as (10):

$$L_{\text{final}} = L_{\text{class}} + \lambda_1 \left\lVert W_{fc} \right\rVert_F^2 \tag{10}$$

where $L_{\text{class}}$ represents the classification loss (cross-entropy loss), $\lambda_1$ is the regularization weight, and $\lVert W_{fc} \rVert_F^2$ denotes the squared Frobenius norm of the weight matrix $W_{fc}$ in the last layer. Finally, the Adam optimizer is used to minimize the overall loss.
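
The loss in (10) can be evaluated for a single sample as follows. This is a small numerical sketch, not the training code: the function names, the example logits, and the value of the regularization weight are illustrative assumptions.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample, computed with a stable log-sum-exp."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[target]


def frobenius_sq(W):
    """Squared Frobenius norm: the sum of squared entries of W."""
    return sum(w * w for row in W for w in row)


def final_loss(logits, target, W_fc, lam=1e-4):
    """Eq. (10): classification loss plus lam times ||W_fc||_F^2."""
    return cross_entropy(logits, target) + lam * frobenius_sq(W_fc)


# Two equal logits give a loss of ln 2; the penalty adds 0.1 * 30 = 3.
example = final_loss([0.0, 0.0], 0, [[1.0, 2.0], [3.0, 4.0]], lam=0.1)
```

In a deep learning framework the same penalty is often applied through the optimizer's weight-decay setting rather than added to the loss explicitly.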
3. RESULTS AND DISCUSSION
This section presents the outcome of the MBANet model along with a comparative analysis against existing classification and authentication approaches. The first subsection gives brief details about the dataset used in this work, the next subsection describes the performance measurement parameters, and finally the outcome of the MBANet approach is demonstrated and compared with existing models. No publicly available dataset contains this combination of modalities; therefore, we use a synthetically created dataset assembled from different sources.
3.1. Performance measurement
The performance of ECG signal denoising is measured using various parameters. These parameters are as follows:

− Mean squared error (MSE)

$$\text{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left(X(i) - Y(i)\right)^2 \tag{11}$$

− Root mean square error (RMSE)

$$\text{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left(X(i) - Y(i)\right)^2} \tag{12}$$

− Peak signal-to-noise ratio (PSNR)

$$\text{PSNR} = 10 \cdot \log_{10}\!\left(\frac{\sum_{i=1}^{N} \text{MAX}^2}{\text{MSE}}\right) \tag{13}$$

− Percent root mean square difference (PRD)

$$\text{PRD} = 100 \times \sqrt{\frac{\sum_{i=1}^{N} \left(Y(i) - X(i)\right)^2}{\sum_{i=1}^{N} X(i)^2}} \tag{14}$$
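
These denoising metrics can be computed as sketched below. Two assumptions are made and should be checked against the authors' code: `x` is taken to be the reference signal and `y` the filtered signal, and PSNR is implemented in its conventional form $10 \log_{10}(\text{MAX}^2 / \text{MSE})$, which may differ from (13) as printed.

```python
import math


def mse(x, y):
    """Eq. (11): mean of the squared sample-wise differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def rmse(x, y):
    """Eq. (12): square root of the MSE."""
    return math.sqrt(mse(x, y))


def psnr(x, y, max_val):
    """Conventional PSNR in dB; max_val is the peak value of the reference."""
    return 10.0 * math.log10(max_val ** 2 / mse(x, y))


def prd(x, y):
    """Eq. (14): residual energy relative to the energy of x, in percent."""
    num = sum((b - a) ** 2 for a, b in zip(x, y))
    den = sum(a * a for a in x)
    return 100.0 * math.sqrt(num / den)


reference = [1.0, 2.0, 3.0, 4.0]
filtered = [1.0, 2.0, 3.0, 6.0]
scores = (mse(reference, filtered), rmse(reference, filtered),
          psnr(reference, filtered, 4.0), prd(reference, filtered))
```

Lower MSE, RMSE, and PRD and higher PSNR indicate that the filtered signal is closer to the reference.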
The performance of the MBANet approach is evaluated using confusion matrix calculations. The confusion matrix is generated based on true positive, false positive, false negative, and true negative values. Table 3 provides a sample representation of the confusion matrix.
Table 3. Confusion matrix
Actual class | Predicted class: Genuine user | Predicted class: Imposter user
Genuine user | True positive | False negative
Imposter user | False positive | True negative
Based on this confusion matrix, we compute several statistical performance measures for the suggested technique, including accuracy, precision, and F1-score. Accuracy is the proportion of correctly classified instances relative to the total number of instances. It is calculated as (15):

$$\text{Acc} = \frac{TP + TN}{TP + TN + FP + FN} \tag{15}$$

Next, we calculate the precision of the suggested approach. It is the ratio of true positives to all (true and false) positives (16):

$$P = \frac{TP}{TP + FP} \tag{16}$$

Lastly, we use the sensitivity and precision parameters to calculate the F-measure, which may be written as (17):

$$F = \frac{2 \cdot P \cdot \text{Sensitivity}}{P + \text{Sensitivity}} \tag{17}$$
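
The three measures follow directly from the four confusion matrix counts. The sketch below assumes the standard definition of sensitivity, TP / (TP + FN), which the text uses but does not write out; the counts in the example are made up for illustration.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, sensitivity and F-measure per Eqs. (15)-(17)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)   # assumed standard definition (recall)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_measure": f_measure,
    }


# Hypothetical counts: 50 genuine and 50 imposter attempts.
result = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

Here a false positive means an imposter accepted as genuine, so precision is the measure most directly tied to the security of the system.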
3.2. Parameters and hyperparameters
This section presents the different parameters and hyperparameters used in this work to train the deep learning model for ECG and iris authentication. The model uses an image size of 224x224; thus, the input shape is (4, 3, 224, 224), where 4 is the batch size and 3 is the number of image channels. Similarly, the ECG signal is represented as (4, 1, 100) with batch size 4. The image model produces output of a similar size to its input, and the ECG processing module likewise generates data of a similar size. In this work, 100 samples are considered, split equally between ECG and iris samples, with 50 samples each. The dataset is divided using a 70%-30% train-test split. This split leaves most of the data for training while retaining a separate portion for evaluating model performance. To account for noise, two levels of noise intensity are examined: 5 dB and 10 dB. These levels simulate different degrees of noise interference commonly encountered in real-world scenarios. Finally, different deep learning training parameters are used to train the proposed MBANet model. Table 4 presents the considered parameters.
Table 4. Simulation parameters
Parameters | Considered value
Total samples | 100
ECG sample | 50
Iris sample | 50
Train test ratio | 70%-30%
Noise type | White noise, color noise, motion artifact, electrode artifact, baseline wander
Noise levels | 5 dB, 10 dB
Learning rate | 0.001
Batch size | 4
Optimizer | Adam
Scheduler | ReduceLROnPlateau
Epochs | 100
Loss | CrossEntropyLoss
Cross validation | 10 fold
Simulation tool | Python 3.8
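
One way to generate test signals at the noise levels in Table 4 is to scale a noise sequence so that the result has a prescribed signal-to-noise ratio. This sketch assumes that "5 dB" and "10 dB" denote SNR values, which the paper does not state explicitly; the helper names and the sinusoidal test signal are hypothetical.

```python
import math
import random


def mean_power(v):
    """Average of the squared samples."""
    return sum(a * a for a in v) / len(v)


def add_noise_at_snr(signal, noise, snr_db):
    """Scale noise so that 10*log10(P_signal / P_noise) equals snr_db."""
    target_noise_power = mean_power(signal) / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / mean_power(noise))
    return [s + scale * n for s, n in zip(signal, noise)]


def measured_snr_db(clean, noisy):
    """SNR of noisy relative to clean, computed from the residual."""
    residual = [b - a for a, b in zip(clean, noisy)]
    return 10.0 * math.log10(mean_power(clean) / mean_power(residual))


rng = random.Random(0)
clean = [math.sin(2 * math.pi * k / 50.0) for k in range(1000)]
white = [rng.gauss(0.0, 1.0) for _ in range(1000)]
noisy_5db = add_noise_at_snr(clean, white, 5.0)
noisy_10db = add_noise_at_snr(clean, white, 10.0)
```

The same scaling applies to any of the listed noise types once a raw noise sequence of that type has been generated.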
3.3. Comparative analysis
First, we process the iris image data, performing image annotation, labelling, boundary identification, mask extraction, and normalization. Figure 5 depicts a sample outcome of these steps. The normalized image is then used for feature extraction and classification. Similarly, we perform several tasks on the ECG signals, such as filtering, because these signals are prone to various types of noise. Figure 6 depicts the original signal, the noisy signal, and the corresponding filtered signals.
Figure 5. ECG classification
To measure the filtering performance, we consider different types of noise, namely white noise, color noise, motion artifact, electrode artifact, and baseline wander, and vary the noise level between 5 dB and 10 dB. Table 5 shows the performance obtained in terms of PSNR, MSE, mean absolute error (MAE), RMSE, PRD, and correlation coefficient (CC). For 5 dB noise, the maximum PSNR of 46.10 dB is attained for baseline wander; similarly, for 10 dB noise, the maximum PSNR of 44.82 dB is attained for white noise.

Under 5 dB noise conditions, the proposed MBANet model achieved an average improvement of approximately 15% in PSNR, indicating better preservation of signal quality compared to existing models. This improvement translates to a noticeable reduction in noise distortion, as evidenced by a 10% decrease in MSE and RMSE values, signifying closer agreement between predicted and actual values. Moreover, the MBANet model exhibited a 12% decrease in MAE, indicating more accurate predictions, and a 3% improvement in PRD, reflecting a reduction in residual differences relative to reference signals.

Under 10 dB noise conditions, the improvements were even more pronounced, with the MBANet model achieving around 20% higher PSNR values compared to existing models. This enhancement highlights the model's capability to maintain superior signal quality despite higher noise levels. The model also demonstrated a 15% reduction in MSE and RMSE, underscoring its ability to minimize prediction errors. Furthermore, a 5% improvement in PRD and a 2% increase in CC were observed, indicating enhanced accuracy and stronger linear relationships between predicted and actual values.
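The filtering metrics discussed above can be sketched as follows. The definitions used here are common textbook forms and are assumptions: in particular, PSNR takes the peak as the maximum absolute amplitude of the reference signal, and PRD is normalized by the reference energy without mean removal, which may differ from the normalization used in this work. The function name `filtering_metrics` and the four-sample example are hypothetical.

```python
import math


def filtering_metrics(reference, estimate):
    """Compute MSE, MAE, RMSE, PRD (%), PSNR (dB) and Pearson CC for two equal-length signals."""
    n = len(reference)
    errors = [r - e for r, e in zip(reference, estimate)]
    squared_error = sum(d * d for d in errors)

    mse = squared_error / n
    mae = sum(abs(d) for d in errors) / n
    rmse = math.sqrt(mse)

    # PRD: root of the error energy relative to the reference energy, in percent.
    prd = 100.0 * math.sqrt(squared_error / sum(r * r for r in reference))

    # PSNR: peak assumed to be the largest absolute reference amplitude.
    peak = max(abs(r) for r in reference)
    psnr = float("inf") if mse == 0 else 10.0 * math.log10(peak * peak / mse)

    # Pearson correlation coefficient between reference and estimate.
    mean_r = sum(reference) / n
    mean_e = sum(estimate) / n
    cov = sum((r - mean_r) * (e - mean_e) for r, e in zip(reference, estimate))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_e = sum((e - mean_e) ** 2 for e in estimate)
    cc = cov / math.sqrt(var_r * var_e)

    return {"MSE": mse, "MAE": mae, "RMSE": rmse, "PRD": prd, "PSNR": psnr, "CC": cc}


# Toy example: a reference signal and a slightly perturbed "filtered" estimate.
reference = [0.0, 1.0, 0.0, -1.0]
estimate = [0.1, 0.9, -0.1, -0.9]
for name, value in filtering_metrics(reference, estimate).items():
    print(name, round(value, 4))
```

Higher PSNR and CC indicate better filtering, whereas lower MSE, MAE, RMSE, and PRD indicate better filtering, so the improvements reported for these two groups of metrics point in opposite numerical directions.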
Finally, we measured the classification accuracy. In this work, we considered 100 user cases, divided into 50% for training and 50% for testing. In the testing phase, 25 users belong to the genuine category and the remaining 25 users belong to the imposter category. This section presents the classification accuracy for real-time cases using the MBANet model. The attained results are then compared with standard classification approaches and are depicted in Figure 7. According to this experiment, the random forest (RF) classifier misclassified 21 entities, which degrades its performance, and the SVM misclassified 16 entities, whereas the proposed approach misclassified only 3 entities, resulting in higher accuracy. Table 6 shows the performance obtained using different classifiers. According to this experiment, the overall accuracy is 0.8900, 0.8400, 0.7900, 0.8932, 0.87, and 0.97 for the convolutional neural network-long short-term memory (CNN-LSTM), support vector machine (SVM), random forest (RF), convolutional neural network (CNN), decision tree (DT), and MBANet approaches, respectively.
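The relation between the misclassification counts and the accuracies quoted above can be checked with a short sketch. It assumes that the counts are taken over 100 evaluated entities; this denominator is an assumption chosen because it reproduces the reported accuracies, and it is not stated explicitly in the text. The function name `accuracy_from_errors` is hypothetical.

```python
def accuracy_from_errors(total: int, misclassified: int) -> float:
    """Accuracy as the fraction of correctly classified entities."""
    if total <= 0 or not 0 <= misclassified <= total:
        raise ValueError("counts must satisfy 0 <= misclassified <= total and total > 0")
    return (total - misclassified) / total


TOTAL_ENTITIES = 100  # assumed denominator, not stated in the text

# Misclassification counts quoted in the text for three of the classifiers.
for name, errors in (("RF", 21), ("SVM", 16), ("MBANet", 3)):
    print(name, accuracy_from_errors(TOTAL_ENTITIES, errors))
```

With this denominator, the counts of 21, 16, and 3 correspond to the accuracies of 0.79, 0.84, and 0.97 listed for RF, SVM, and MBANet.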
Existing models rely on single modalities. Some recent methods have focused on developing multimodal authentication, but these methods do not consider the noise in ECG signals and iris images, whereas the proposed model introduces a multimodal authentication system with a comprehensive filtering model. In addition, the proposed model uses pre-trained deep learning models to improve the training speed and accuracy. The training speed performance is shown in Table 7.