IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 14, No. 6, December 2025, pp. 5131~5139
ISSN: 2252-8938, DOI: 10.11591/ijai.v14.i6.pp5131-5139
Journal homepage: http://ijai.iaescore.com

A blended ensemble approach for accurate human activity recognition

Rezwana Karim1, Afsana Begum1, Miskatul Jannat2, Abu Kowshir Bitto1
1Department of Software Engineering, Daffodil International University, Dhaka, Bangladesh
2Department of Computer Science and Engineering, International Islamic University Chittagong, Chittagong, Bangladesh

Article Info

Article history:
Received May 28, 2025
Revised Oct 10, 2025
Accepted Nov 8, 2025

ABSTRACT
Human activity recognition (HAR) is a novel computer vision area with applications in fashion, entertainment, healthcare, and urban planning. Previously, convolutional neural networks (CNNs) were used in HAR due to their ability to extract spatial features from images. However, CNNs are not effective at handling varying input sizes and long-range dependencies in complex human motions. This work examines another approach using vision transformers (ViT) and swin transformers (SwinT), which process images as patch sequences and apply self-attention. These models excel at learning global relationships and subtle changes in body motion and are therefore well suited to varied and subtle activity detection. To further enhance recognition performance, we propose a hybrid ensemble method that combines ViT and SwinT models of different scales (small, base, and large). Experimental outcomes show that while single transformer models are competitive, the hybrid ensemble beats them across the board with the highest accuracy and balanced precision, recall, and F1-score. These findings confirm that the proposed ensemble model provides a more scalable and robust solution than either single-model or CNN-based approaches, supporting accurate human activity recognition.

Keywords:
Ensemble model
Human activity recognition
Recognition applications
Scalable vision models
Transformer architectures

This is an open access article under the CC BY-SA license.

Corresponding Author:
Abu Kowshir Bitto
Department of Software Engineering, Daffodil International University
Dhaka-1216, Bangladesh
Email: abu.kowshir777@gmail.com

1. INTRODUCTION
In recent years, the potential of machines to automatically detect and classify human activity from visual information has been a topic of immense interest in research and practical applications [1]. Human activity recognition (HAR) is a key technology for numerous applications such as healthcare monitoring, surveillance systems, human-computer interaction, sports analytics, and smart environments. The advancement of video surveillance systems, wearable sensors, and intelligent cameras has brought about an explosive increase in the amount of activity-related data, thereby creating new challenges in the correct interpretation and classification of intricate human activities [2]. Early HAR systems relied on handcrafted feature engineering and conventional machine learning methods [3]. Yet, such methods tend to fail in adequately capturing the complex spatial and temporal processes of human activities, especially in heterogeneous and unstructured settings. The advancement of deep learning techniques, specifically convolutional neural networks (CNNs), was a breakthrough in the sense that it enabled automatic feature extraction and representation learning directly from raw visual data [4]. Even though they are successful, CNN-based models are at times inadequate in capturing long-range dependencies and intricate spatial relationships, which are essential for comprehending sophisticated human activities. Most recently, ensemble-based methods, such as CNN-based transfer learning ensembles and early-exit networks, have shown that combining heterogeneous models greatly increases robustness as well as recognition accuracy [5].
In response to these constraints, computer vision has progressively shifted towards transformer-based models, which were initially popular in natural language processing [6]. Vision transformers (ViT) and their variations have proven exceptionally effective in numerous image recognition applications by efficiently learning global dependencies with self-attention mechanisms. Among them, the swin transformer (SwinT), a hierarchical ViT model, has shown a promising architecture by combining the merits of both transformers and CNNs, offering improved efficiency and scalability for dense vision tasks [7].
There have been several studies on sensor-based HAR using accelerometer, gyroscope, and radar sensor data. Huan et al. [8] presented a lightweight hybrid ViT network for radar-based HAR, which combined convolutional operations and self-attention to process micro-Doppler maps efficiently. Ullah and Munir [9] proposed a cascade dual-attention CNN with a bi-directional gated recurrent unit (GRU) to learn both spatial and temporal features, leading to improved recognition accuracy in HAR tasks. These studies indicate the potential for combining transformer models with traditional deep learning approaches to improve the performance of sensor-based HAR systems. Vaghela et al. [10] used feature fusion approaches to enhance activity recognition from multimodal data. Morshed et al. [11] improved recognition through data fusion with feature engineering, showcasing how carefully chosen handcrafted features combined with machine learning can increase accuracy. A comprehensive survey [12] traces the evolution from handcrafted features to deep networks and discusses opportunities and challenges in combining handcrafted and learned features. Ulhaq et al. [13] surveyed how action recognition has advanced through the integration of handcrafted and learned features.
Recent years have seen rapid development towards transformer-based models for HAR. The initial ViT, introduced by Dosovitskiy et al. [14], demonstrated impressive image recognition ability at scale by representing global context information. Liu et al. [15] improved on this with SwinT, which introduces hierarchical attention mechanisms with shifted windows to improve computational efficiency. Wensel et al. [16] and Reda et al. [17] presented architectures such as the ViT-recurrent transformer (ReT), which integrates recurrent and ViT modules for more accurate video activity recognition, and ConViViT, which combines convolutional layers with factorized self-attention; both continued to advance spatiotemporal modeling. Han et al. [18] presented a novel ViT-based approach for human activity recognition that takes advantage of the model's strength in capturing large-scale contextual information, thereby achieving higher recognition performance. Additionally, Wang et al. [19] improved gait recognition by introducing global-local feature fusion using SwinT and a 3D CNN, which successfully extracts spatial and temporal features from gait sequences. The works in [20], [21], which present a hybrid ViT for efficient HAR and a ViT model for action recognition from still images, were significant improvements.
The research in [22], [23] investigates the capability of SwinT and ViT models in human activity recognition through their incorporation into a hybrid ensemble learning model. The ensemble approach is applied to leverage the complementary strengths of different models so as to improve the robustness and generalizability of HAR systems. Through this process, this paper adds to the body of work in activity recognition by assessing transformer models and showing how ensemble methods can be used to improve performance in real-world and challenging conditions.

2. METHOD
This section is organized into five key sub-sections: methodology, dataset, data preprocessing, data visualization, and model description. Each part provides a detailed explanation to give the reader a clear understanding of the overall approach taken in this study.
As Figure 1 shows, this study proposes a robust methodology for HAR using deep learning and efficient data processing. Image data representing various activities is collected and preprocessed by resizing, normalizing pixel values, and one-hot encoding class labels. The dataset is split into training (80%), validation (10%), and testing (10%) sets with balanced class distribution. Transfer learning is applied using pre-trained ViT and SwinT models in small, base, and large variants. These models are fine-tuned on the activity dataset to capture essential spatial-temporal patterns. To leverage the strengths of the models, an ensemble fusion strategy is adopted: the probability estimates from individual models are blended using a stacked learning paradigm in which traditional machine learning classifiers, such as support vector classifier (SVC), logistic regression (LR), random forest (RF), gradient boosting (GB), k-nearest neighbor (KNN), and XGBoost (XGB), are employed as meta-learners. The technique ensures that both the deep feature representations and various decision-making approaches are utilized to generate the final prediction. To validate effectiveness, ablation studies compare single-model and ensemble performance, while statistical tests over multiple experimental runs validate the stability and robustness of the proposed method. Evaluation metrics include accuracy, precision, recall, and F1-score.

Figure 1. Step-by-step procedure diagram

2.1. Dataset
To carry out the experiments presented in this study, a publicly accessible human action detection dataset obtained from Kaggle [24] was utilized. The dataset has 18,000 labeled images, arranged into 15 different human action classes. The actions within the dataset are calling, clapping, cycling, dancing, drinking, eating, fighting, hugging, laughing, listening to music, running, sitting, sleeping, texting, and using laptop. Every class contains exactly 1,000 training images and 200 test images, giving a balanced class distribution for robust model testing. In total, 15,000 images were assigned for training and the remaining 3,000 images for testing. Further, while training the model, 10% of each class's training images were split off as the validation set. The validation set was used to monitor model performance and tune hyperparameters to prevent overfitting and enhance the generalizability of the proposed method.
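
The split sizes implied by the figures above can be worked out directly. The sketch below is only illustrative arithmetic; it assumes the 10% validation share is taken per class from the 1,000 training images, and the function and variable names are hypothetical.

```python
NUM_CLASSES = 15
TRAIN_PER_CLASS = 1000   # training images per class, as stated in the text
TEST_PER_CLASS = 200     # test images per class, as stated in the text
VAL_FRACTION = 0.10      # share of each class's training images held out


def split_sizes(num_classes, train_per_class, test_per_class, val_fraction):
    """Return total image counts for the train, validation and test sets."""
    val_per_class = round(train_per_class * val_fraction)
    fit_per_class = train_per_class - val_per_class
    return {
        "train": num_classes * fit_per_class,
        "validation": num_classes * val_per_class,
        "test": num_classes * test_per_class,
    }


sizes = split_sizes(NUM_CLASSES, TRAIN_PER_CLASS, TEST_PER_CLASS, VAL_FRACTION)
total = sum(sizes.values())
shares = {name: round(100 * count / total, 1) for name, count in sizes.items()}

print(sizes)
print(shares)
```

Under these assumptions the 18,000 images divide into 13,500 training, 1,500 validation, and 3,000 test images, which is roughly a 75/8/17 split rather than the 80/10/10 proportions quoted in the method overview.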

2.2. Data preprocessing
Our dataset was well-balanced and free of missing values or noise. The images were, however, of varying sizes, which posed an issue for batch processing. We employed a sequence of preprocessing methods to standardize them and make them suitable for deep learning. First, we resized all images to 224×224 pixels for consistency across the dataset. Resizing was necessary to facilitate mini-batch training, where images must be the same size. Next, we performed pixel normalization, scaling pixel values to the [0, 1] range by dividing by 255. This normalization step is important to stabilize and speed up the training process, especially in transfer learning settings. We also one-hot encoded the class labels. The class labels (e.g., "calling," "clapping," "cycling") were first converted to numerical form (e.g., 0, 1, 2), then converted to binary vectors. For example, for 15 classes, the label "calling" is [1, 0, 0, ..., 0], and so on. These preprocessing methods ensure our data was clean, normalized, and ready for effective model training.
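
The three preprocessing steps can be sketched in plain Python. This is a minimal illustration, not the authors' pipeline: it assumes a single-channel image stored as nested lists, uses nearest-neighbour interpolation (the paper does not state the resizing method), and assumes labels are indexed in the order the classes are listed in section 2.1. All names are hypothetical.

```python
# Class order is an assumption: the order in which section 2.1 lists the classes.
CLASS_NAMES = [
    "calling", "clapping", "cycling", "dancing", "drinking",
    "eating", "fighting", "hugging", "laughing", "listening_to_music",
    "running", "sitting", "sleeping", "texting", "using_laptop",
]


def resize_nearest(image, out_h=224, out_w=224):
    """Nearest-neighbour resize of a 2-D grid of pixel values."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def normalize(image):
    """Scale 8-bit pixel values into the [0, 1] range by dividing by 255."""
    return [[value / 255 for value in row] for row in image]


def one_hot(label, class_names=CLASS_NAMES):
    """Map a class name to its integer index, then to a binary vector."""
    index = class_names.index(label)
    return [1 if i == index else 0 for i in range(len(class_names))]


# Synthetic 180x300 grayscale image standing in for a real photograph.
raw = [[(r + c) % 256 for c in range(300)] for r in range(180)]
x = normalize(resize_nearest(raw))
y = one_hot("calling")
```

A real pipeline would handle three colour channels and would normally rely on an image library for resizing; the grid version here only makes the shape and value-range changes explicit.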

2.3. Proposed ensemble model
Our proposed ensemble model aggregates several pre-trained transformer models to improve the accuracy and stability of the classification [25]. The pipeline begins with a standard data preprocessing phase in which all input images are resized to 224×224 pixels for uniformity and pixel values are normalized to the range [0, 1]. Class labels are also one-hot encoded to prepare them for classification. Following preprocessing, the images are passed through six pre-trained models of two kinds: ViT (ViT-B/16, ViT-L/16, and ViT-S/16) and SwinT (swin-B, swin-L, and swin-S). Each model produces feature vectors of varying lengths, which are passed through a dense layer and an activation layer to produce per-category predictions. These predictions are then combined using a stacking ensemble method, in which the probability outputs of all transformers are fed as input features to an ensemble of different machine learning classifiers. We used SVC, LR, RF, GB, KNN, and XGB as meta-learners. These classifiers impart different inductive biases, thereby enabling the meta-layer to learn linear as well as non-linear decision boundaries and enhance generalization. The final predicted class is obtained by consolidating the outputs of the meta-learners, thus gaining from the complementary strengths of both deep transformer models and traditional ensemble classifiers. This hybrid stacking ensemble successfully enhances the performance of the overall classification system. Figure 2 shows our proposed model workflow.

Figure 2. Our proposed ensemble model workflow diagram
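
The data flow of the stacking step in section 2.3 can be sketched as follows. This is a structural illustration under stated assumptions, not the authors' implementation: the probability vectors are invented, the two rule-based functions are hypothetical stand-ins for trained meta-learners (SVC, LR, RF, GB, KNN, XGB), and majority voting is only one possible way to consolidate meta-learner outputs, since the paper does not specify the consolidation rule.

```python
from collections import Counter

NUM_MODELS = 6     # ViT-S/B/L and Swin-S/B/L, as listed in the text
NUM_CLASSES = 15


def build_meta_features(prob_outputs):
    """Concatenate the per-model class-probability vectors for one image."""
    features = []
    for probs in prob_outputs:
        features.extend(probs)
    return features


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def toy_probs(top_class, confidence=0.8):
    """Invented probability vector that favours one class."""
    rest = (1.0 - confidence) / (NUM_CLASSES - 1)
    return [confidence if k == top_class else rest for k in range(NUM_CLASSES)]


def mean_rule(features):
    """Stand-in meta-learner: class with the largest summed probability."""
    sums = [
        sum(features[m * NUM_CLASSES + k] for m in range(NUM_MODELS))
        for k in range(NUM_CLASSES)
    ]
    return sums.index(max(sums))


def max_rule(features):
    """Stand-in meta-learner: class of the single most confident output."""
    best = max(range(len(features)), key=lambda i: features[i])
    return best % NUM_CLASSES


# Five backbones favour class 2 and one favours class 7 (invented values).
prob_outputs = [toy_probs(2) for _ in range(5)] + [toy_probs(7)]
meta_x = build_meta_features(prob_outputs)          # 6 x 15 = 90 features
votes = [rule(meta_x) for rule in (mean_rule, max_rule)]
final_class = majority_vote(votes)
```

In a trained system, each stand-in rule would be replaced by a classifier fitted on meta-features computed from held-out predictions, so that the meta-learners do not simply memorise the backbones' training-set outputs.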

2.4. Model performance calculation
To assess the performance of the proposed HAR model, a comprehensive set of evaluation metrics is employed. As the problem is multi-class in nature, accuracy, precision, recall, and F1-score are used to quantify the classification quality of the model on each of the activity types. These metrics capture not only overall correctness but also the proportion of correctly predicted activities relative to incorrect predictions. Moreover, a confusion matrix can be used to display the model predictions on a per-class basis, providing rich information about which activities are correctly recognized and where misclassifications occur. The following performance metrics were calculated, and from them we identified the best classifier for HAR. The metrics are reported in percent (%) and are calculated using (1)-(4) from the confusion matrix obtained from each classifier, where TP, TN, FP, and FN denote the true positives, true negatives, false positives, and false negatives of a class.
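
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \times 100\% \tag{1}$$

$$\text{Precision} = \frac{TP}{TP + FP} \times 100\% \tag{2}$$

$$\text{Recall} = \frac{TP}{TP + FN} \times 100\% \tag{3}$$

$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}} \times 100\% \tag{4}$$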
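
As a worked check of (1)-(4), the sketch below computes the four metrics from one row of counts. The counts used are those reported for the "calling" class of the ViT base model in Table 1; the function name is hypothetical, and precision and recall are kept as fractions until the final conversion to percent.

```python
def per_class_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1-score in percent, per (1)-(4)."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {
        "precision": round(100 * precision, 2),
        "recall": round(100 * recall, 2),
        "f1": round(100 * f1, 2),
        "accuracy": round(100 * accuracy, 2),
    }


# "calling" row of Table 1 (ViT base): TP=122, FP=51, FN=78, TN=2749.
calling = per_class_metrics(tp=122, fp=51, fn=78, tn=2749)
print(calling)
```

The result reproduces the tabulated values for that row (precision 70.52, recall 61.00, F1-score 65.42, accuracy 95.70). The function does not guard against zero denominators, which would need handling for a class that is never predicted.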

3. RESULTS AND DISCUSSION
Table 1 shows that the ViT base model classifies the 15 activity classes reasonably well, with per-class accuracy between 93.67% and 99.67% depending on the activity. It does exceedingly well on distinctive activities like cycling, with near-perfect precision (96.57%) and recall (98.5%), and it also handles running well (89% recall). However, the model struggles with calling and clapping, where precision (70.52% and 74.73%) and recall (61% and 69.5%) are notably lower, implying misclassifications that may be due to similar visual features or subtle differences between actions. Furthermore, listening to music reaches a precision of only 58.37% despite a higher recall of 71.5%, showing that the model has difficulty separating this class from others, as it may share features with texting or using a laptop. Sitting also has low precision at 51.62%, showing confusion with other still or low-motion activities. This model, although efficient, shows the limits of a plain ViT architecture in fully capturing subtle activity patterns.
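
The TP, FP, FN, and TN columns in the result tables are consistent with one-vs-rest counts taken from a 15-class confusion matrix: in each row the four counts sum to the 3,000 test images, and TP + FN equals the 200 test images of that class. A minimal sketch of that derivation follows; it assumes rows are true classes and columns are predicted classes, and the 3-class matrix is invented for illustration.

```python
def one_vs_rest_counts(confusion):
    """Derive per-class TP/FP/FN/TN from a square confusion matrix.

    Assumes confusion[i][j] counts samples of true class i predicted as j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    counts = []
    for k in range(n):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                       # class k, predicted as other
        fp = sum(confusion[r][k] for r in range(n)) - tp  # other, predicted as k
        tn = total - tp - fn - fp
        counts.append({"TP": tp, "FP": fp, "FN": fn, "TN": tn})
    return counts


# Invented 3-class example with 10 samples per class.
toy_confusion = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 3, 7],
]
toy_counts = one_vs_rest_counts(toy_confusion)
print(toy_counts)
```

Because TN counts every image of the other classes that was not assigned to class k, it dominates the per-class accuracy in (1) when there are many classes, which is why per-class accuracy stays high even where precision and recall are modest.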
Table 2 shows that scaling to ViT large yields performance gains on most activities, with precision and recall improving especially for difficult classes like calling (precision 76.74%, recall 66%), clapping (precision 77.96%, recall 72.5%), and music (precision 63.9%, recall 77%). The increased depth and model capacity allow more discriminative feature learning, which is evident in the nearly perfect cycling performance (99.5% precision and 99% recall) and better handling of dynamic activities such as running and drinking. Nevertheless, certain classes such as fighting see a decrease in precision to 74.38% even with good recall (90%), which may be caused by false positives arising from overlapping action features. Moderate precision on sitting (70.86%) and laptop use (74.25%) also suggests some residual difficulty in distinguishing sedentary behavior. Overall, ViT large strikes a balance between higher accuracy and better class separation, yet still struggles with visually similar or subtle actions.
Table 3 shows that the ViT small model, with its smaller parameter count, generally performs worse across the board, especially on more ambiguous activities. Precision and recall for calling and clapping drop below 70%, with calling precision at 61.82%, which suggests a reduced ability to differentiate subtle motions. Despite this, the model remains highly effective on clear activities like cycling (98.48% precision) and eating (94.86% precision), demonstrating that easier classes remain well identified. Interestingly, classes with dynamic interaction such as fighting suffer in precision (58.96%) while recall remains good (90.5%), implying false positives most likely due to limited model capacity. The low precision for laughing (58.03%) and low recall for music (64.5%) also point to difficulty with subtle emotional or background activity. This model may be better deployed in contexts where resources are limited and errors on subtle classes can be tolerated.
Table 4 shows that the SwinT base model consistently improves classification scores by employing hierarchical representation and local-global attention. Precision and recall are greatly enhanced for most classes; for example, drinking has 92.82% precision and 84% recall, whereas hugging has 90.86% precision and 84.5% recall. Fine-grained spatial-temporal subtleties are effectively captured by the architecture, as evidenced by high F1-scores for eating (90.4%) and running (86.46%). Laughing and texting remain difficult, with precision and recall at around 70%, which indicates lingering class confusion, but the overall balance across classes is better than with ViT alone. The model also has good recall on dynamic actions such as fighting (85.5%) and cycling (99.5%), so it can be a good option for applications where subtle recognition is needed with a tolerable computational load.

Table 1. Performance of ViT base models

| Class | TP | FP | FN | TN | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|---|---|---|---|
| calling | 122 | 51 | 78 | 2749 | 70.52 | 61.00 | 65.42 | 95.70 |
| clapping | 139 | 47 | 61 | 2753 | 74.73 | 69.50 | 72.02 | 96.40 |
| cycling | 197 | 7 | 3 | 2793 | 96.57 | 98.50 | 97.52 | 99.67 |
| dancing | 168 | 29 | 32 | 2771 | 85.28 | 84.00 | 84.63 | 97.97 |
| drinking | 162 | 32 | 38 | 2768 | 83.51 | 81.00 | 82.23 | 97.67 |
| eating | 167 | 21 | 33 | 2779 | 88.83 | 83.50 | 86.08 | 98.20 |
| fighting | 156 | 22 | 44 | 2778 | 87.64 | 78.00 | 82.54 | 97.80 |
| hugging | 173 | 39 | 27 | 2761 | 81.60 | 86.50 | 83.98 | 97.80 |
| laughing | 144 | 41 | 56 | 2759 | 77.84 | 72.00 | 74.81 | 96.77 |
| music | 143 | 102 | 57 | 2698 | 58.37 | 71.50 | 64.27 | 94.70 |
| running | 178 | 34 | 22 | 2766 | 83.96 | 89.00 | 86.41 | 98.13 |
| sitting | 159 | 149 | 41 | 2651 | 51.62 | 79.50 | 62.60 | 93.67 |
| sleeping | 160 | 11 | 40 | 2789 | 93.57 | 80.00 | 86.25 | 98.30 |
| texting | 137 | 46 | 63 | 2754 | 74.86 | 68.50 | 71.54 | 96.37 |
| using_laptop | 138 | 26 | 62 | 2774 | 84.15 | 69.00 | 75.82 | 97.07 |
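
The per-class accuracy column is a one-vs-rest figure, so it does not directly give the share of test images labelled correctly. A rough way to recover that share from Table 1 is to sum the TP column and divide by the number of test images. The sketch below does this; it assumes each of the 3,000 test images receives exactly one predicted label, and the variable names are hypothetical.

```python
# TP column of Table 1 (ViT base), one entry per class.
VIT_BASE_TP = {
    "calling": 122, "clapping": 139, "cycling": 197, "dancing": 168,
    "drinking": 162, "eating": 167, "fighting": 156, "hugging": 173,
    "laughing": 144, "music": 143, "running": 178, "sitting": 159,
    "sleeping": 160, "texting": 137, "using_laptop": 138,
}
TEST_IMAGES = 3000  # 15 classes x 200 test images, as stated in section 2.1

correct = sum(VIT_BASE_TP.values())
overall_accuracy = round(100 * correct / TEST_IMAGES, 2)
print(correct, overall_accuracy)
```

Under this assumption, 2,343 of the 3,000 test images are labelled correctly, an overall accuracy of about 78.1%, which is noticeably lower than the per-class accuracies of 93.67% to 99.67% because those figures also count true negatives.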

Table 2. Performance of ViT large models

| Class | TP | FP | FN | TN | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|---|---|---|---|
| calling | 132 | 40 | 68 | 2760 | 76.74 | 66.00 | 70.97 | 96.40 |
| clapping | 145 | 41 | 55 | 2759 | 77.96 | 72.50 | 75.13 | 96.80 |
| cycling | 198 | 1 | 2 | 2799 | 99.50 | 99.00 | 99.25 | 99.90 |
| dancing | 162 | 34 | 38 | 2766 | 82.65 | 81.00 | 81.82 | 97.60 |
| drinking | 173 | 22 | 27 | 2778 | 88.72 | 86.50 | 87.59 | 98.37 |
| eating | 178 | 37 | 22 | 2763 | 82.79 | 89.00 | 85.78 | 98.03 |
| fighting | 180 | 62 | 20 | 2738 | 74.38 | 90.00 | 81.45 | 97.27 |
| hugging | 174 | 23 | 26 | 2777 | 88.32 | 87.00 | 87.66 | 98.37 |
| laughing | 152 | 41 | 48 | 2759 | 78.76 | 76.00 | 77.35 | 97.03 |
| music | 154 | 87 | 46 | 2713 | 63.90 | 77.00 | 69.84 | 95.57 |
| running | 166 | 10 | 34 | 2790 | 94.32 | 83.00 | 88.30 | 98.53 |
| sitting | 124 | 51 | 76 | 2749 | 70.86 | 62.00 | 66.13 | 95.77 |
| sleeping | 167 | 25 | 33 | 2775 | 86.98 | 83.50 | 85.20 | 98.07 |
| texting | 145 | 43 | 55 | 2757 | 77.13 | 72.50 | 74.74 | 96.73 |
| using_laptop | 173 | 60 | 27 | 2740 | 74.25 | 86.50 | 79.91 | 97.10 |

Table 5 shows that SwinT large achieves some of the best individual-model performance, with very high precision and recall on most activities. Cycling, for instance, achieves 97.56% precision with perfect recall (100%), and eating achieves 93.19% precision with 89% recall. The larger model size does improve the learning of fine features, as seen in the improved performance on calling (75.26% precision) and fighting (85.51% precision, 88.5% recall). Some classes, like clapping, still show low precision (66.41%) alongside high recall, which suggests a false-positive bias.
Table 6 shows that SwinT small, while improved relative to ViT small, lags behind the larger models in precision and recall for most classes. Precision for calling (55.56%) and clapping (57.74%) is low, reflecting challenges with subtle action discrimination. Yet cycling (97.04% precision) and eating (91.81% precision) are accurately identified. Low recall for texting (55%) and laughing (66.5%) indicates that many instances of these classes are missed. This model might be a good fit for resource-limited settings, but it compromises precision on highly complex or visually ambiguous actions, illustrating the trade-off between performance and model size.

Table 3. Performance of ViT small models

| Class | TP | FP | FN | TN | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|---|---|---|---|
| calling | 136 | 84 | 64 | 2716 | 61.82 | 68.00 | 64.76 | 95.07 |
| clapping | 133 | 71 | 67 | 2729 | 65.20 | 66.50 | 65.84 | 95.40 |
| cycling | 194 | 3 | 6 | 2797 | 98.48 | 97.00 | 97.73 | 99.70 |
| dancing | 154 | 43 | 46 | 2757 | 78.17 | 77.00 | 77.58 | 97.03 |
| drinking | 150 | 12 | 50 | 2788 | 92.59 | 75.00 | 82.87 | 97.93 |
| eating | 166 | 9 | 34 | 2791 | 94.86 | 83.00 | 88.53 | 98.57 |
| fighting | 181 | 126 | 19 | 2674 | 58.96 | 90.50 | 71.40 | 95.17 |
| hugging | 181 | 54 | 19 | 2746 | 77.02 | 90.50 | 83.22 | 97.57 |
| laughing | 159 | 115 | 41 | 2685 | 58.03 | 79.50 | 67.09 | 94.80 |
| music | 129 | 65 | 71 | 2735 | 66.49 | 64.50 | 65.48 | 95.47 |
| running | 151 | 15 | 49 | 2785 | 90.96 | 75.50 | 82.51 | 97.87 |
| sitting | 118 | 54 | 82 | 2746 | 68.60 | 59.00 | 63.44 | 95.47 |
| sleeping | 129 | 16 | 71 | 2784 | 88.97 | 64.50 | 74.78 | 97.10 |
| texting | 122 | 45 | 78 | 2755 | 73.05 | 61.00 | 66.49 | 95.90 |
| using_laptop | 143 | 42 | 57 | 2758 | 77.30 | 71.50 | 74.29 | 96.70 |

Table 4. Performance of SwinT base models

| Class | TP | FP | FN | TN | Precision | Recall | F1-score | Accuracy |
|---|---|---|---|---|---|---|---|---|
| calling | 136 | 68 | 64 | 2732 | 66.67 | 68.00 | 67.33 | 95.60 |
| clapping | 164 | 67 | 36 | 2733 | 71.00 | 82.00 | 76.10 | 96.57 |
| cycling | 199 | 2 | 1 | 2798 | 99.00 | 99.50 | 99.25 | 99.90 |
| dancing | 165 | 32 | 35 | 2768 | 83.76 | 82.50 | 83.12 | 97.77 |
| drinking | 168 | 13 | 32 | 2787 | 92.82 | 84.00 | 88.19 | 98.50 |
| eating | 179 | 17 | 21 | 2783 | 91.33 | 89.50 | 90.40 | 98.73 |
| fighting | 171 | 34 | 29 | 2766 | 83.41 | 85.50 | 84.44 | 97.90 |
| hugging | 169 | 17 | 31 | 2783 | 90.86 | 84.50 | 87.56 | 98.40 |
| laughing | 145 | 41 | 55 | 2759 | 77.96 | 72.50 | 75.13 | 96.80 |
| music | 140 | 50 | 60 | 2750 | 73.68 | 70.00 | 71.79 | 96.33 |
| running | 182 | 39 | 18 | 2761 | 82.35 | 91.00 | 86.46 | 98.10 |
| sitting | 139 | 53 | 61 | 2747 | 72.40 | 69.50 | 70.92 | 96.20 |
| sleeping | 153 | 22 | 47 | 2778 | 87.43 | 76.50 | 81.60 | 97.70 |
| texting | 131 | 58 | 69 | 2742 | 69.31 | 65.50 | 67.35 | 95.77 |
| using_laptop | 171 | 75 | 29 | 2725 | 69.51 | 85.50 | 76.68 | 96.53 |
Table 5. Performance of SwinT large models

| Class | TP | FP | FN | TN | Precision (%) | Recall (%) | F1-score (%) | Accuracy (%) |
|---|---|---|---|---|---|---|---|---|
| calling | 146 | 48 | 54 | 2752 | 75.26 | 73.00 | 74.11 | 96.60 |
| clapping | 170 | 86 | 30 | 2714 | 66.41 | 85.00 | 74.56 | 96.13 |
| cycling | 200 | 5 | 0 | 2795 | 97.56 | 100.00 | 98.77 | 99.83 |
| dancing | 174 | 42 | 26 | 2758 | 80.56 | 87.00 | 83.65 | 97.73 |
| drinking | 181 | 27 | 19 | 2773 | 87.02 | 90.50 | 88.73 | 98.47 |
| eating | 178 | 13 | 22 | 2787 | 93.19 | 89.00 | 91.05 | 98.83 |
| fighting | 177 | 30 | 23 | 2770 | 85.51 | 88.50 | 86.98 | 98.23 |
| hugging | 171 | 11 | 29 | 2789 | 93.96 | 85.50 | 89.53 | 98.67 |
| laughing | 157 | 44 | 43 | 2756 | 78.11 | 78.50 | 78.30 | 97.10 |
| music | 139 | 42 | 61 | 2758 | 76.80 | 69.50 | 72.97 | 96.57 |
| running | 170 | 22 | 30 | 2778 | 88.54 | 85.00 | 86.73 | 98.27 |
| sitting | 141 | 80 | 59 | 2720 | 63.80 | 70.50 | 66.98 | 95.37 |
| sleeping | 150 | 20 | 50 | 2780 | 88.24 | 75.00 | 81.08 | 97.67 |
| texting | 128 | 29 | 72 | 2771 | 81.53 | 64.00 | 71.71 | 96.63 |
| using_laptop | 164 | 55 | 36 | 2745 | 74.89 | 82.00 | 78.28 | 96.97 |
Table 6. Performance of SwinT small model

| Class | TP | FP | FN | TN | Precision (%) | Recall (%) | F1-score (%) | Accuracy (%) |
|---|---|---|---|---|---|---|---|---|
| calling | 125 | 100 | 75 | 2700 | 55.56 | 62.50 | 58.82 | 94.17 |
| clapping | 138 | 101 | 62 | 2699 | 57.74 | 69.00 | 62.87 | 94.57 |
| cycling | 197 | 6 | 3 | 2794 | 97.04 | 98.50 | 97.77 | 99.70 |
| dancing | 158 | 55 | 42 | 2745 | 74.18 | 79.00 | 76.51 | 96.77 |
| drinking | 145 | 19 | 55 | 2781 | 88.41 | 72.50 | 79.67 | 97.53 |
| eating | 157 | 14 | 43 | 2786 | 91.81 | 78.50 | 84.64 | 98.10 |
| fighting | 164 | 52 | 36 | 2748 | 75.93 | 82.00 | 78.85 | 97.07 |
| hugging | 159 | 93 | 41 | 2707 | 63.10 | 79.50 | 70.35 | 95.53 |
| laughing | 133 | 55 | 67 | 2745 | 70.74 | 66.50 | 68.56 | 95.93 |
| music | 120 | 63 | 80 | 2737 | 65.57 | 60.00 | 62.66 | 95.23 |
| running | 156 | 30 | 44 | 2770 | 83.87 | 78.00 | 80.83 | 97.53 |
| sitting | 122 | 92 | 78 | 2708 | 57.01 | 61.00 | 58.94 | 94.33 |
| sleeping | 139 | 26 | 61 | 2774 | 84.24 | 69.50 | 76.16 | 97.10 |
| texting | 110 | 64 | 90 | 2736 | 63.22 | 55.00 | 58.82 | 94.87 |
| using_laptop | 151 | 56 | 49 | 2744 | 72.95 | 75.50 | 74.20 | 96.50 |
Table 7 shows that the proposed ensemble model clearly outperforms every individual model, with high precision and recall on almost all classes. Clapping, for instance, reaches 100% precision and recall, and drinking reaches 100% precision with 98.50% recall. Even the most challenging classes, texting and using_laptop, improve markedly, with precision above 87% and recall above 81%. The ensemble reduces false positives and false negatives considerably, yielding F1-scores above 90% in every class except using_laptop (84.38%) and per-class accuracy above 97%. This indicates that combining models with different strengths counteracts their individual weaknesses and yields strong, stable human activity recognition performance, which makes the approach a candidate for safety-critical real-world applications.
Table 7. Performance of proposed ensemble model

| Class | TP | FP | FN | TN | Precision (%) | Recall (%) | F1-score (%) | Accuracy (%) |
|---|---|---|---|---|---|---|---|---|
| calling | 61 | 4 | 5 | 920 | 93.85 | 92.40 | 93.13 | 99.09 |
| clapping | 66 | 0 | 0 | 924 | 100.00 | 100.00 | 100.00 | 100.00 |
| cycling | 64 | 2 | 2 | 922 | 96.97 | 97.00 | 96.97 | 99.60 |
| dancing | 62 | 4 | 4 | 920 | 93.94 | 93.90 | 93.94 | 99.19 |
| drinking | 65 | 0 | 1 | 924 | 100.00 | 98.50 | 99.24 | 99.90 |
| eating | 59 | 3 | 7 | 921 | 95.16 | 89.40 | 92.19 | 98.99 |
| fighting | 66 | 3 | 0 | 921 | 95.65 | 100.00 | 97.78 | 99.70 |
| hugging | 64 | 2 | 2 | 922 | 96.97 | 97.00 | 96.97 | 99.60 |
| laughing | 64 | 2 | 2 | 922 | 96.97 | 97.00 | 96.97 | 99.60 |
| music | 64 | 4 | 2 | 920 | 94.12 | 97.00 | 95.52 | 99.39 |
| running | 64 | 2 | 2 | 922 | 96.97 | 97.00 | 96.97 | 99.60 |
| sitting | 62 | 3 | 4 | 921 | 95.38 | 93.90 | 94.66 | 99.29 |
| sleeping | 63 | 7 | 3 | 917 | 90.00 | 95.50 | 92.65 | 98.99 |
| texting | 61 | 7 | 5 | 917 | 89.71 | 92.40 | 91.04 | 98.79 |
| using_laptop | 54 | 8 | 12 | 916 | 87.10 | 81.80 | 84.38 | 97.98 |
4. CONCLUSION
We proposed a new approach to HAR on static images that leverages the complementary strengths of SwinT and ViT in an ensemble architecture. Because different transformer variants capture different spatial and contextual information, we combined six models (Swin small, base, and large, and ViT small, base, and large) in a stacking ensemble framework. This improved the handling of the inherent complexity and diversity of human activity in still images. The ensemble performed strongly across activity classes, effectively compensating for the weaknesses of the individual models. In our evaluation, we examined model behavior, optimization techniques, activation functions, and misclassification trends. Misclassifications were more frequent among visually ambiguous classes such as sitting, dancing, calling, and using a laptop, which points to a limitation of single-label classification. Although the ensemble worked well, it still had difficulty distinguishing overlapping or visually confounded activities, because still images carry no temporal context and each image receives only one label. The computational cost of the ensemble could also hinder deployment on resource-constrained or real-time platforms. Future work will study video-based datasets to incorporate motion dynamics, apply multi-label classification to better capture overlapping real-world activities, and use attention-based fusion to improve feature discrimination. We will also optimize the ensemble for efficiency and expand the dataset to cover a wider range of environments and activities, in order to improve generalizability and practical applicability in areas such as smart surveillance, healthcare monitoring, and human-computer interaction.
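The stacking idea described above can be sketched in a few lines: each base model outputs a class-probability vector, the vectors are concatenated, and a small meta-learner is trained on the concatenated features. The sketch below is a toy illustration under stated assumptions, not the authors' implementation: it uses two stand-in base models instead of six, three classes instead of fifteen, hand-written probabilities instead of real SwinT/ViT outputs, and a softmax-regression meta-learner (the paper does not specify its meta-learner here). All function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stack_features(per_model_probs):
    """Concatenate the class-probability vectors of all base models for one image."""
    return [p for probs in per_model_probs for p in probs]


def class_scores(W, b, x):
    """Linear scores of the meta-learner, one per class."""
    return [sum(w * v for w, v in zip(row, x)) + bias for row, bias in zip(W, b)]


def train_meta_learner(X, y, n_classes, epochs=500, lr=0.5):
    """Fit a softmax-regression meta-learner with batch gradient descent."""
    n_features = len(X[0])
    W = [[0.0] * n_features for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        grad_W = [[0.0] * n_features for _ in range(n_classes)]
        grad_b = [0.0] * n_classes
        for x, label in zip(X, y):
            probs = softmax(class_scores(W, b, x))
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == label else 0.0)
                grad_b[k] += err
                for j in range(n_features):
                    grad_W[k][j] += err * x[j]
        step = lr / len(X)
        for k in range(n_classes):
            b[k] -= step * grad_b[k]
            for j in range(n_features):
                W[k][j] -= step * grad_W[k][j]
    return W, b


def predict(W, b, x):
    """Return the index of the highest-scoring class."""
    scores = class_scores(W, b, x)
    return max(range(len(scores)), key=lambda k: scores[k])


# Toy held-out outputs of two stand-in base models (assumed values).
# Model A mislabels the class-2 image; model B mislabels the class-0 image.
base_outputs = [
    ([0.8, 0.1, 0.1], [0.4, 0.5, 0.1]),  # true class 0
    ([0.1, 0.8, 0.1], [0.3, 0.6, 0.1]),  # true class 1
    ([0.1, 0.5, 0.4], [0.1, 0.1, 0.8]),  # true class 2
]
y = [0, 1, 2]
X = [stack_features(pair) for pair in base_outputs]

W, b = train_meta_learner(X, y, n_classes=3)
print([predict(W, b, x) for x in X])  # -> [0, 1, 2]
```

In practice the meta-learner should be trained on base-model predictions for data the base models did not see during training (for example, out-of-fold predictions), otherwise it learns to trust overfitted outputs.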
FUNDING INFORMATION
The authors state that no funding was involved in supporting this research work.
AUTHOR CONTRIBUTIONS STATEMENT
This journal uses the Contributor Roles Taxonomy (CRediT) to recognize individual author contributions, reduce authorship disputes, and facilitate collaboration.
Name of Author (roles: C, M, So, Va, Fo, I, R, D, O, E, Vi, Su, P, Fu)
Rezwana Karim: ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Afsana Begum: ✓ ✓ ✓ ✓ ✓ ✓
Miskatul Jannat: ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Abu Kowshir Bitto: ✓ ✓ ✓ ✓ ✓ ✓
C: Conceptualization
M: Methodology
So: Software
Va: Validation
Fo: Formal analysis
I: Investigation
R: Resources
D: Data Curation
O: Writing - Original Draft
E: Writing - Review & Editing
Vi: Visualization
Su: Supervision
P: Project administration
Fu: Funding acquisition
CONFLICT OF INTEREST STATEMENT
The authors state no conflict of interest.
DATA AVAILABILITY
The dataset used in this study is publicly available on Kaggle at: https://www.kaggle.com/datasets/meetnagadia/human-action-recognition-har-dataset
BIOGRAPHIES OF AUTHORS

Rezwana Karim earned her B.Sc. in Computer Science and Engineering from the International Islamic University Chittagong (IIUC) and is currently pursuing an M.Sc. in Software Engineering (Data Science) at Daffodil International University. Her research interests include machine learning, computer vision, and AI applications in healthcare and agriculture. With experience in Python, PyTorch, and TensorFlow, she focuses on developing intelligent systems for disease detection using image data. She can be contacted at email: rezwanaiiuc@gmail.com.

Afsana Begum is an Assistant Professor in the Department of Software Engineering at Daffodil International University, Bangladesh, and is pursuing her Ph.D. at Universiti Malaysia Perlis. She completed her M.Sc. in IIT from the University of Dhaka (1st position) and her B.Sc. from Hajee Mohammad Danesh Science and Technology University (4th position). Her research interests include data science, machine learning, networking, and cybersecurity. She can be contacted at email: afsana.swe@diu.edu.bd.

Miskatul Jannat is currently a Lecturer in the Department of Computer Science and Engineering at the International Islamic University Chittagong (IIUC), Bangladesh, and is pursuing her M.Sc. in the same field. She previously served as faculty at Daffodil International University (2024-2025). Her research interests include machine learning, data science, and large language models (LLMs), with a focus on AI applications in natural language processing and predictive analytics. She can be contacted at email: miskat@iiuc.ac.bd.

Abu Kowshir Bitto is currently working as an AI Solution Specialist at BRAC, which is the world's largest NGO. He previously worked as a Data Scientist at the Centre for Data Science and Research, where he led and contributed to several impactful initiatives, including projects funded by the Government of Bangladesh and UNESCO. Before that, he served as a Research and Development Engineer at MediprospectsAI Limited, where he led a prestigious Innovate UK-funded research project. He holds both a Bachelor of Science (B.Sc.) and a Master of Science (M.Sc.) degree in Software Engineering with a major in Data Science from Daffodil International University (DIU), Dhaka, Bangladesh. His research affiliations include the Computational Intelligence Lab at Southeast University, the Data Science Lab at DIU, and the Virtual Multidisciplinary Research Lab. He serves as a sessional reviewer for several Scopus-indexed journals and has published multiple papers in Scopus and Web of Science-indexed journals and conferences. His primary research interest is in computer vision. He can be contacted at email: abu.kowshir777@gmail.com.