Indonesian Journal of Electrical Engineering and Computer Science
Vol. 38, No. 2, May 2025, pp. 821~829
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v38.i2.pp821-829

Journal homepage: http://ijeecs.iaescore.com

RNN-driven integration of spatial, temporal, features for Indian sign language recognition and video captioning

Ajay Manohar Pol1, Shrinivas A. Patil2
1Department of Electronics and Telecommunication, KIT’s College of Engineering (Autonomous), Shivaji University, Kolhapur, India
2Department of Electronics and Telecommunication, DKTE Societies’ Textile and Engineering Institute (Autonomous), Kolhapur, India

Article Info
Article history: Received Apr 13, 2024; Revised Oct 30, 2024; Accepted Nov 11, 2024

ABSTRACT
This paper presents a novel model that integrates spatial features from residual blocks and temporal features from FFT, alongside a sophisticated RNN architecture comprising BiLSTM, gated recurrent unit (GRU) layers, and multi-head attention. Achieving nearly 99% accuracy on both the WLASL and INCLUDE datasets, this model outperforms standard CNN pretrained models in feature extraction. Notably, the BiLSTM and GRU combination proves superior to other combinations such as LSTM and GRU. The BLEU score analysis further validates the model's efficacy, with scores of 0.51 and 0.54 on the WLASL and INCLUDE datasets, respectively. These results affirm the model's proficiency in capturing intricate spatial and temporal nuances inherent in sign language gestures, enhancing accessibility and communication for the deaf and hard-of-hearing communities. The comparison highlights the superiority of this paper's proposed model over standard approaches, emphasizing the significance of the integrated architecture. Continued refinement and optimization hold promise for further augmenting the model's performance and applicability in real-world scenarios, contributing to inclusive communication environments.

Keywords: BiLSTM; BLEU score evaluation; FFT-based feature extraction; Residual blocks; Sign language recognition

This is an open access article under the CC BY-SA license.

Corresponding Author:
Ajay Manohar Pol
Department of Electronics and Telecommunication, KIT’s College of Engineering (Autonomous), Shivaji University
Kolhapur, India
Email: kayajay2004@gmail.com

1. INTRODUCTION
Effective communication between hearing and deaf people is often hindered because many hearing people are unfamiliar with sign language and cannot interpret it. To narrow this communication gap, sign language recognition (SLR) from video plays an important role. Because deep learning has shown significant improvements and opened new avenues, the development of efficient SLR systems has become possible [1]. This paper proposes gesture-based SLR from video sequences and focuses primarily on recognizing sign language gestures in such sequences. Video data is inherently dynamic and captures the temporal evolution of signs, presenting unique challenges compared to static image recognition. The objective is to create a resilient system that can precisely identify a broad spectrum of sign gestures and interpret them as text. This endeavor is important because it has the potential to enhance the accessibility of communication tools for the Deaf community. By pushing the boundaries of SLR in videos, this research strives to foster inclusivity, empower those with hearing impairments, and advocate for equal engagement in all facets of life [2]. The suggested system uses deep learning techniques to address the intricacies of SLR.

SLR, a crucial component of behavior identification, often relies on machine learning, particularly deep learning, which requires large datasets for effective training. The process involves gesture detection, tracking, and finally recognition, and these stages pose challenges that demand efficient feature extraction techniques.

Within SLR, this study explores the generation of video-to-text descriptions, considering various approaches for processing both video and text data.
Yousif and Al-Jammas [3] contribute a significant advancement with their state-of-the-art video captioning technique. Their method incorporates a deep reinforcement polishing network alongside word denoising and grammar checking networks, resulting in optimized performance, especially when handling lengthy video sequences. Rigorous evaluations underscore the efficacy of this approach, affirming its ability to accurately interpret and describe complex sign language gestures.
Another notable effort, by Yasin et al. [4], integrates word embedding techniques to enhance the focus on scene objects in sign language videos. By identifying similarities among words in the descriptive text, this approach captures the essence of the scenes depicted in sign language. The results obtained from this strategy are commendable and emphasize the impact of contextual information.

Effective video captioning in the context of SLR also depends on sophisticated model architectures, because SLR involves complex language representation. Xu et al.'s deep reinforcement learning and grammar checking networks advance video captioning, optimizing performance for sign language gestures. Yasin et al.'s use of word embedding techniques adds a linguistic perspective, enhancing scene object identification by considering context. These architectures reflect evolving machine learning trends and address multifaceted challenges.

Nabati and Behrad [5] introduced a state-of-the-art architecture characterized by parallel processing, LSTM networks, and iterative training methods reminiscent of AdaBoost. Through rigorous experimental tests, their approach showcases exceptional prowess, offering promising scalability, versatility, and enhanced text-image linkage facilitated by encoder-decoder models. The proposed model holds potential for broader applications, as it stands at the forefront of advancements in video captioning technology.
Chohan et al. [6] contributed to the field by exploring image captioning techniques employing encoder-decoder models and attention mechanisms. Their work includes a comprehensive analysis, suggesting diverse applications across various domains such as medical, industry, agriculture, and entertainment. The exploration of encoder-decoder models and attention mechanisms not only enhances the understanding of image captioning processes but also opens avenues for innovative applications in different sectors, underlining the versatility of their proposed methodology.

Mun et al. [7] proposed a novel video captioning method that places emphasis on temporal features and coherent feature matching. Leveraging reinforcement learning with event-oriented sequences for training, their approach achieves better performance on the ActivityNet Captions dataset. By prioritizing temporal dynamics and coherent feature alignment, Mun et al.'s method contributes significantly to the improvement of video captioning precision. The incorporation of reinforcement learning further underscores the adaptability of their model to diverse video datasets and scenarios.
Xiao and Shi [8] delve into the realm of video captioning through the lens of deep learning, specifically exploring the integration of CNN models using a generative adversarial approach. Their innovative approach suggests a significant departure from traditional methodologies, indicating the potential of adversarial techniques in enhancing the generation of descriptive video captions. The utilization of deep learning techniques, particularly in the context of generative adversarial networks, reflects a commitment to pushing the boundaries of video captioning technology.

Guo et al. [9] contribute to the field by incorporating a semantic guidance network, focusing on key frames during the training of target text descriptions. The inclusion of semantic guidance adds a layer of precision to the captioning process, demonstrating an understanding of the importance of context and semantic relevance in generating accurate and meaningful video captions. By emphasizing key frames, their approach aligns with the selective attention mechanisms critical for effective video description generation.
Fujii et al. [10] proposed a method that updates the model with new data vectors to improve its performance. The strategy is evaluated within the framework of encoder-decoder-based models in supervised learning. This forward-thinking approach allows the model to adapt and evolve with the inclusion of new information, ensuring a continuous improvement in the accuracy and relevance of video captions. The emphasis on supervised learning further underscores the commitment to refining models through meticulous training and data utilization.

For video captioning, Zhang et al. [11] introduced the cross-modal commonsense reasoning (CMCR) model. This model incorporates a cross-modal module, commonsense reasoning, and an innovative event refactoring mechanism. By combining these elements, the CMCR model offers a comprehensive approach to video captioning, addressing not only the modality challenges but also infusing commonsense reasoning for a more nuanced understanding of video content. The introduction of event refactoring further distinguishes their approach, showcasing a holistic strategy for improved performance.
Tateno et al. [12] devised a reference pyramid network integrated with ResNet152 for precise word recognition in sign language videos. This model places particular emphasis on harnessing the characteristics of object motion vectors, enhancing the accuracy of sign language recognition. By fusing the robust capabilities of ResNet152 with the architecture of a reference pyramid network, the authors create a synergistic framework that captures and interprets the intricate dynamics of sign language gestures. This approach not only contributes to heightened recognition accuracy but also signifies a noteworthy advancement in the field of sign language video processing.

One notable contribution comes from Li et al. [13], who employed a deep learning model based on temporal features. Their method extracts graph-based temporal features to process sign language videos, providing a more nuanced understanding of the temporal dynamics inherent in sign language communication. The approach of Li et al. [13] recognizes the importance of temporal information in sign language, where the sequencing and duration of gestures play a crucial role in conveying meaning. By incorporating graph-based temporal features into their deep learning model, they enhance the system's ability to capture and interpret the subtle intricacies of sign language movements over time. This not only improves the accuracy of sign language recognition but also enables the model to better handle variations in signing speed and rhythm.
Building upon this foundation, Mahyoub et al. [14] adopted a hybrid model that combines VGG16 with gated recurrent units (GRU) for both transformer encoder and decoder modules. This integration of convolutional neural networks (CNNs) with recurrent neural networks (RNNs) allows the model to effectively capture spatial features from video frames using VGG16 while leveraging the temporal dependencies encoded by GRU. The transformer architecture further refines the representation of sign language gestures, enhancing the overall performance of the recognition system.
Additionally, Xu et al. [15] introduced the use of ResNet as a kernel choice in their research on sign language recognition. ResNet, known for its deep residual learning capabilities [16], [17], offers a robust framework for capturing hierarchical features in complex datasets. By incorporating ResNet as a kernel, Xu et al. [15] enhance the model's ability to learn and represent intricate patterns in sign language videos, contributing to improved recognition accuracy.

This paper evaluates sign language recognition by extracting local and global (spatial and temporal) features using transfer learning. Various CNN models, including VGGNets, ResNets, Inception, DenseNet, and MobileNet, are compared. The main objective is to analyze how well temporal dynamics can be interpreted for sign language words. The significance of motion vector features in video processing is highlighted. The proposed model integrates spatial features from residual blocks with temporal features from the fast Fourier transform (FFT), capturing both local and global patterns to enhance accuracy and robustness in Indian Sign Language recognition.
− Enhanced RNN architecture: The paper introduces a sophisticated recurrent neural network (RNN) architecture comprising bidirectional long short-term memory (BiLSTM), GRU layers, and a multi-head attention mechanism. This architecture enables the model to effectively capture temporal dependencies and contextual information from video frame sequences, facilitating more accurate and contextually relevant sign language interpretation.
− Superior performance and evaluation metrics: Through extensive experimentation on benchmark datasets such as WLASL and INCLUDE, the proposed model achieves exceptional accuracy, surpassing standard CNN pretrained models in feature extraction. Additionally, the model's performance is rigorously evaluated using BLEU score analysis, providing quantitative insights into the alignment between generated captions and human references in video captioning tasks.
These contributions advance the state-of-the-art in sign language recognition and video captioning, fostering improved accessibility and communication for the deaf and hard-of-hearing communities. The remainder of the article is organized as follows: Section 2 presents the proposed methodology, Section 3 covers the results and analysis, and Section 4 concludes the paper.

2. PROPOSED METHOD
The proposed model combines FFT and a residual block-based CNN for video feature extraction with BiLSTM and GRU models for sign language interpretation. In the block diagram of Figure 1, input videos with paired sign language interpretations undergo processing. Videos are fed into the CNN model to extract local and global features from frames. Simultaneously, the text data is tokenized for preprocessing. The extracted features and tokenized text vectors serve as inputs for the BiLSTM and GRU models [18]. During testing, video features are extracted in the same way, and the model predicts sign language interpretations. Predictions are compared with the ground truth for evaluation. This approach leverages both visual and textual cues for accurate sign language recognition, promising robust performance across various scenarios.

Figure 1. Stages involved in the proposed system framework
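
As a rough illustration of the text tokenization step mentioned above, the following sketch maps caption words to integer ids and pads each caption to a fixed length. It assumes simple whitespace tokenization; the helper names `build_vocab` and `encode`, the special tokens, and the example captions are hypothetical and are not taken from the paper.

```python
def build_vocab(captions):
    """Map each distinct lowercase word to an integer id (0 = pad, 1 = unknown)."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for caption in captions:
        for word in caption.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(caption, vocab, max_len):
    """Convert a caption to a fixed-length list of ids, truncating or padding."""
    ids = [vocab.get(word, vocab["<unk>"]) for word in caption.lower().split()]
    ids = ids[:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


# Hypothetical captions paired with sign language videos.
captions = ["thank you", "good morning", "thank you very much"]
vocab = build_vocab(captions)
encoded = [encode(c, vocab, max_len=4) for c in captions]
```

Fixed-length id vectors of this kind are a common input form for recurrent layers; the actual tokenizer and vocabulary used by the authors are not specified in the text.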

2.1. Video feature extraction
The proposed CNN architecture for extracting features from video frames is shown in Figure 2. The model consists of 4 residual blocks in parallel with an FFT branch [19]. The combined features capture both local and global information. Let $R_i$ be the output of the residual block. In generalized form, it can be represented as (1):

$$R_i = \mathrm{ResBlock}(v_i) \tag{1}$$

where $v_i$ is the vector of pixels from the $i$th frame. Similarly, the FFT features are extracted in the parallel branch of the model. The generalized equation for the FFT features can be represented as (2):

$$F(v_i) = \mathrm{FFT}(v_i) \tag{2}$$

The outputs of the residual branch and the FFT branch are then concatenated, which combines both spatial and temporal features. The combined features can be expressed as (3):

$$\mathrm{Combined\_Features} = \mathrm{Concat}\big(R_4, F(v_i)\big) \tag{3}$$

Figure 2. Video features extraction model
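
The concatenation of the two branches described in this subsection can be sketched in a few lines. This is only an illustrative sketch under stated assumptions: the spatial vector is a placeholder standing in for the residual-branch output, the FFT is applied to the flattened pixel vector of a single frame, and a naive discrete Fourier transform replaces an optimized FFT routine (in practice a library routine such as `numpy.fft.fft` would be used). The function names are hypothetical.

```python
import cmath


def dft_magnitudes(signal):
    """Magnitude spectrum of a real 1-D signal via a naive O(n^2) DFT."""
    n = len(signal)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(signal)))
        for k in range(n)
    ]


def fuse_features(spatial, frame_pixels):
    """Concatenate spatial features with frequency-domain features.

    `spatial` stands in for the residual-branch output; `frame_pixels`
    is the flattened pixel vector of one frame (both are assumptions).
    """
    return list(spatial) + dft_magnitudes(frame_pixels)


frame = [1.0, 0.0, -1.0, 0.0]   # toy 4-pixel frame
spatial = [0.5, 0.25]           # placeholder CNN features
combined = fuse_features(spatial, frame)
```

The combined vector has one entry per spatial feature plus one per frequency bin, so both kinds of information are available to the recurrent layers that follow.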

2.2. Training of model using video features and text
The training process involves a systematic approach to leverage video features and corresponding textual labels for SLR. This section outlines the main components and methodologies applied in training the model, which consists of three critical phases: feature extraction, temporal dependency modeling, and label prediction.
- Input video features: A video sequence of sign language gestures is processed using a hybrid model combining CNN and FFT. This hybrid approach, denoted CNN-FFT, extracts meaningful spatial and frequency-domain features from the video, resulting in a feature matrix $X$. The combination of CNN and FFT allows the model to capture both spatial patterns and spectral information, which are essential for recognizing dynamic gestures in sign language.
- BiLSTM and GRU models: The extracted feature matrix is fed into BiLSTM and GRU architectures to capture temporal dependencies in the video sequence. These architectures process the features bidirectionally to account for both past and future context, resulting in output representations denoted as $H_{\mathrm{BiLSTM}}$ and $H_{\mathrm{GRU}}$, respectively:

$$H_{\mathrm{BiLSTM}} = \mathrm{BiLSTM}(X) \tag{4}$$

$$H_{\mathrm{GRU}} = \mathrm{GRU}(X) \tag{5}$$

These outputs serve as high-level temporal representations of the video data.
- SLR labels: The extracted features and temporal representations are mapped to the corresponding sign language labels, denoted as $Y$. These labels represent the ground truth for sign gestures in the video dataset.
- Training: The training phase involves optimizing the model parameters of the CNN-FFT hybrid, BiLSTM, and GRU architectures jointly. A combined loss function, such as categorical cross-entropy, is minimized over the training dataset using stochastic gradient descent (SGD) or the Adam optimizer. The SLR predictions, obtained from $(X, H_{\mathrm{BiLSTM}}, H_{\mathrm{GRU}})$, are compared against the ground truth labels to calculate the loss:

$$L = \sum_{i=1}^{N} \mathrm{loss}(\mathrm{Predictions}_i, \mathrm{GroundTruth}_i) \tag{7}$$

where $N$ is the number of samples in the training dataset and loss is the chosen loss function, such as categorical cross-entropy loss.

The hybrid CNN-FFT model ensures robust feature extraction, while the BiLSTM and GRU layers effectively model temporal dependencies. The combined architecture is designed to maximize recognition accuracy by leveraging the complementary strengths of spatial, spectral, and temporal modeling. Figure 3 shows the proposed model architecture.
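
The summed loss over the training samples can be illustrated with a small worked example. This is a minimal sketch assuming that predictions are per-class probabilities (for instance softmax outputs) and that labels are class indices; the function names and the toy values are hypothetical and not from the paper.

```python
import math


def categorical_cross_entropy(probs, target_index, eps=1e-12):
    """Loss for one sample: negative log-probability of the true class."""
    return -math.log(max(probs[target_index], eps))


def total_loss(predictions, targets):
    """Sum the per-sample losses over all N training samples."""
    return sum(categorical_cross_entropy(p, t)
               for p, t in zip(predictions, targets))


predictions = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]   # toy class probabilities, N = 2
targets = [0, 1]                                   # true class indices
loss = total_loss(predictions, targets)            # -ln(0.7) - ln(0.8)
```

A confident, correct prediction contributes a loss close to zero, while a low probability on the true class is penalized heavily, which is why this loss is a common choice for gesture classification.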

The model was trained for 200 epochs using the Adam optimizer with a learning rate of 0.001. Each epoch involved processing video sequences to extract features using the CNN-FFT hybrid model, followed by temporal modelling through the BiLSTM and GRU layers. Training minimized a combined categorical cross-entropy loss function over the predictions and ground truth labels. The batch size was set to 32, and early stopping monitored the validation loss to prevent overfitting. Training achieved robust feature learning and effective temporal dependency modelling.

Figure 3. Proposed system architecture
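
The early stopping rule mentioned in the training setup can be sketched as follows. This is an illustrative sketch only: the paper states that validation loss is monitored but does not give a patience value, so `patience`, `min_delta`, and the loss history below are assumptions.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return the 1-based epoch at which training would stop.

    Training stops once validation loss has failed to improve by more than
    `min_delta` for `patience` consecutive epochs; otherwise it runs to the end.
    """
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)


# Hypothetical validation-loss history: improvement stalls after epoch 3.
history = [0.90, 0.70, 0.60, 0.61, 0.62, 0.63]
stop_epoch = early_stopping_epoch(history, patience=3)
```

With a rule of this kind, the 200-epoch budget acts as an upper bound, and training ends earlier if the validation loss stops improving.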

3. RESULTS AND DISCUSSION
The evaluation of SLR performance involves preparing the dataset and employing various standard CNN models for extracting video features. The proposed model, integrating BiLSTM and GRU layers, is assessed using appropriate performance metrics.

3.1. Dataset preparation
The two-phase evaluation compares datasets for American sign language (ASL) and Indian sign language (ISL). The word-level American sign language (WLASL) dataset [20] contains 2000+ ASL words performed by 100+ signers. The INCLUDE dataset [21], previously known as the Indian lexicon sign language dataset, is tailored for ISL tasks, with 0.27 million frames across 4,287 videos. It covers 263 unique ISL signs categorized into 15 word groups, reflecting diverse linguistic concepts. This facilitates research in ISL recognition, offering a broad spectrum of ISL expressions and vocabulary.

3.2. Performance analysis
For a single n-gram order, the BLEU score is calculated using (8):

$$p_n = \frac{\min(c, r)}{\max(c, r)} \tag{8}$$

where $c$ and $r$ are the n-gram counts from the inference output and the reference (ground truth) words, respectively. If there are multiple n-gram orders, the weighted geometric mean is calculated as in (9), with maximum n-gram size $N$ and weight $w_n$ for order $n$:

$$\mathrm{BLEU} = BP \times \exp\left(\sum_{n=1}^{N} w_n \log p_n\right) \tag{9}$$

BP is the brevity penalty, which penalizes candidate outputs that are shorter than the reference. It is given by (10):

$$BP = e^{\left(1 - r/c\right)} \tag{10}$$
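
To make the BLEU computation concrete, the sketch below implements a sentence-level score. Note the assumptions: it uses the standard clipped n-gram precision rather than the min/max ratio written in (8), uniform weights, a single reference, and a brevity penalty that is applied only when the candidate is not longer than the reference. The function names and example sentences are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: candidate counts capped by reference counts."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0
    clipped = sum(min(count, ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / total


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU with uniform weights and a brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0  # the geometric mean is zero if any precision is zero
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c, r = len(candidate), len(reference)
    bp = 1.0 if c > r else math.exp(1.0 - r / c)
    return bp * math.exp(log_mean)


reference = "i am going to school".split()
candidate = "i am going to".split()
score = bleu(candidate, reference, max_n=2)
```

In this example every candidate unigram and bigram appears in the reference, so the score is determined by the brevity penalty alone, which shows how a shorter candidate is penalized even when all of its words are correct.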

Classification performance is analyzed using the accuracy, specificity, sensitivity, and F1 score parameters detailed in Table 1. Video frame feature extraction employs standard CNN models. Figure 4 illustrates the comparative performance analysis of these models. These metrics are crucial for identifying areas needing improvement: for example, low sensitivity for a gesture indicates that its recognition needs to be enhanced. Balancing these metrics ensures accuracy and effectiveness in capturing sign language nuances. Figures 5 and 6 show the comparative analysis of the proposed model against feature extraction based on other standard CNN models and against combinations of different types of RNN layers. Figures 5(a) and 5(b) and Figures 6(a) and 6(b) depict comparisons of pretrained models and attention-based models on the WLASL and INCLUDE datasets, respectively. This work demonstrates the improvement in BLEU score across the model combinations, among which the proposed model shows the highest performance. The comparison with other existing methods in Table 2 shows that the proposed model outperforms them.
Table 1. Classification performance parameters
Parameter | Formula
Accuracy | (TP + TN) / (TP + TN + FP + FN)
Specificity | TN / (TN + FP)
Sensitivity/Recall | TP / (TP + FN)
Precision | TP / (TP + FP)
F1 Score | 2 * (Recall * Precision) / (Recall + Precision)
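The parameters in Table 1 can be computed directly from confusion-matrix counts. The sketch below is illustrative only: the function name `classification_metrics`, the zero-denominator convention, and the example counts are assumptions, not values or code from the paper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute the Table 1 parameters from confusion-matrix counts.

    tp, tn, fp, fn: true positives, true negatives, false positives,
    false negatives for one class (one-vs-rest in a multi-class setting).
    """
    def safe_div(num, den):
        # Assumed convention: report 0.0 when a denominator is zero.
        return num / den if den else 0.0

    accuracy = safe_div(tp + tn, tp + tn + fp + fn)
    specificity = safe_div(tn, tn + fp)
    sensitivity = safe_div(tp, tp + fn)  # also called recall
    precision = safe_div(tp, tp + fp)
    f1 = safe_div(2 * sensitivity * precision, sensitivity + precision)
    return {
        "accuracy": accuracy,
        "specificity": specificity,
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts for a single gesture class.
for name, value in classification_metrics(tp=90, tn=95, fp=5, fn=10).items():
    print(f"{name}: {value:.4f}")
```

Computing the metrics per gesture class makes it easy to spot classes with low sensitivity, which is the use case described in the text.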
Figure 4. Average BLEU analysis for 10 videos
Figure 5. Comparative analysis on WLASL dataset: (a) different feature extraction models and (b) combinations of LSTM and GRU
Figure 6. Comparative analysis on INCLUDE dataset: (a) different feature extraction models and (b) combinations of LSTM and GRU
Table 2. Comparative analysis with other existing methods
Method | Dataset | Performance
Customized design of CNN model [22] | ISL ROBITA Dataset | Accuracy 87.5%
Modifications in standard CNN model [23] | Baby Sign Language | Accuracy 89%
A module that extracts global and local features (GLR) [9] | LSA64, INCLUDE and WLASL | Accuracy 91%
Sign language translation network (SLTN) [24] | CSL-daily, SLR-100, RWTH | Accuracy 92%
Proposed | WLASL | Accuracy 99%
Proposed | INCLUDE | Accuracy 99.1%
The proposed model, with 4 residual blocks and FFT for feature extraction plus BiLSTM and GRU for text processing, excels on WLASL and INCLUDE datasets. Residual blocks capture spatial features, FFT adds frequency data, and BiLSTM with GRU handles temporal dependencies, optimizing sign recognition. Dataset-specific tuning enhances accuracy, achieving 99% on WLASL and 99.1% on INCLUDE, significantly outperforming previous methods with accuracies between 87.5% and 92%, marking a notable improvement across datasets.
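To make the idea of frequency-domain temporal features concrete, the following minimal sketch computes the magnitude spectrum of a per-frame feature sequence. It is an assumption-laden illustration: how the paper applies FFT to its frame features is not specified here, the function name `dft_magnitudes` and the example signal are hypothetical, and a naive O(n^2) discrete Fourier transform from the standard library stands in for an optimized FFT routine.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Return normalized DFT magnitudes for frequency bins 0..n//2.

    signal: one scalar feature sampled once per video frame (assumed input).
    A naive DFT is used for clarity; an FFT gives the same values faster.
    """
    n = len(signal)
    if n == 0:
        return []
    magnitudes = []
    for k in range(n // 2 + 1):
        # Correlate the sequence with a complex sinusoid of frequency k.
        acc = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        magnitudes.append(abs(acc) / n)
    return magnitudes


# Hypothetical feature that oscillates twice over an 8-frame clip.
frames = [math.cos(2 * math.pi * 2 * t / 8) for t in range(8)]
print([round(m, 3) for m in dft_magnitudes(frames)])
```

The dominant bin indicates how fast the feature varies over the clip, which is the kind of temporal cue a recurrent stage could consume alongside spatial features.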
The proposed model significantly outperforms existing approaches, such as the customized CNN design [22], which achieved 87.5% accuracy on the ISL ROBITA dataset, and the modified CNN model [23], which reached 89% accuracy on the Baby Sign Language dataset. The proposed model's advanced feature extraction and attention mechanisms lead to a substantial performance boost, surpassing the global and local feature extraction (GLR) module [9], which attained 91% accuracy on datasets such as LSA64, INCLUDE, and WLASL. Moreover, the model outperforms the state-of-the-art sign language translation network (SLTN) [24], which achieved 92% accuracy, by effectively capturing and translating sign language, as evidenced by the proposed model's superior accuracy of 99% and 99.1% on the WLASL and INCLUDE datasets, respectively.
While the proposed model demonstrates exceptional performance on the WLASL and INCLUDE datasets, further testing across additional datasets would help validate its robustness and generalizability. Additionally, assessing the model's computational efficiency and inference speed is crucial for ensuring its suitability for real-time applications, particularly in resource-constrained environments.
4. CONCLUSION
The proposed model, integrating spatial features from residual blocks and temporal features from FFT, along with a sophisticated RNN architecture comprising BiLSTM, GRU layers, and multi-head attention, demonstrates exceptional performance. Achieving nearly 99% accuracy on both the WLASL and INCLUDE datasets, this model outperforms standard CNN pretrained models in feature extraction. Notably, the BiLSTM and GRU combination proves superior to other combinations such as LSTM and GRU. The BLEU score analysis further validates the model's efficacy, with scores of 0.51 and 0.54 on the WLASL and INCLUDE datasets, respectively. These results affirm the model's proficiency in capturing intricate spatial and temporal nuances inherent in sign language gestures, thereby enhancing accessibility and communication for the deaf and hard-of-hearing communities. The comparison highlights the superiority of our model over standard approaches, emphasizing the significance of the integrated architecture. Moving forward, continued refinement and optimization hold promise for further augmenting the model's performance and applicability in real-world scenarios. The findings underscore the potential of our approach to advance SLR and video captioning technologies, contributing to inclusive communication environments.
REFERENCES
[1] M. Li, Y. Jiang, Y. Zhang, and H. Zhu, “Medical image analysis using deep learning algorithms,” Front. Public Heal., vol. 11, 2023, doi: 10.3389/FPUBH.2023.1273253.
[2] M. Papatsimouli, P. Sarigiannidis, and G. F. Fragulis, “A Survey of Advancements in Real-Time Sign Language Translators: Integration with IoT Technology,” Technologies, vol. 11, no. 4, p. 83, Jun. 2023, doi: 10.3390/TECHNOLOGIES11040083.
[3] A. J. Yousif and M. H. Al-Jammas, “Exploring deep learning approaches for video captioning: A comprehensive review,” e-Prime - Adv. Electr. Eng. Electron. Energy, vol. 6, p. 100372, Dec. 2023, doi: 10.1016/J.PRIME.2023.100372.
[4] D. Yasin, A. Sohail, and I. Siddiqi, “Semantic Video Retrieval using Deep Learning Techniques,” Proc. 2020 17th Int. Bhurban Conf. Appl. Sci. Technol. IBCAST 2020, pp. 338–343, Jan. 2020, doi: 10.1109/IBCAST47879.2020.9044601.
[5] M. Nabati and A. Behrad, “Video captioning using boosted and parallel Long Short-Term Memory networks,” Comput. Vis. Image Underst., vol. 190, p. 102840, Jan. 2020, doi: 10.1016/J.CVIU.2019.102840.
[6] M. Chohan, A. Khan, M. S. Mahar, S. Hassan, A. Ghafoor, and M. Khan, “Image Captioning using Deep Learning: A Systematic Literature Review,” Int. J. Adv. Comput. Sci. Appl., vol. 11, no. 5, pp. 278–286, 2020, doi: 10.14569/IJACSA.2020.0110537.
[7] J. Mun, L. Yang, Z. Ren, N. Xu, and B. Han, “Streamlined Dense Video Captioning,” Proc. IEEE Comput. Soc. Conf. Comput. Vis. Pattern Recognit., vol. 2019-June, pp. 6581–6590, Apr. 2019, doi: 10.1109/CVPR.2019.00675.
[8] H. Xiao and J. Shi, “Video captioning with text-based dynamic attention and step-by-step learning,” Pattern Recognit. Lett., vol. 133, pp. 305–312, May 2020, doi: 10.1016/J.PATREC.2020.03.001.
[9] Z. Guo, Y. Hou, and W. Li, “Sign language recognition via dimensional global–local shift and cross-scale aggregation,” Neural Comput. Appl., pp. 1–13, Mar. 2023, doi: 10.1007/S00521-023-08380-9.
[10] T. Fujii, Y. Sei, Y. Tahara, R. Orihara, and A. Ohsuga, “‘Never fry carrots without cutting.’ Cooking Recipe Generation from Videos Using Deep Learning Considering Previous Process,” Proc. - 2019 IEEE/ACIS 4th Int. Conf. Big Data, Cloud Comput. Data Sci. BCD 2019, pp. 124–129, May 2019, doi: 10.1109/BCD.2019.8885222.
[11] X. Zhang, F. Zhang, and C. Xu, “Explicit Cross-Modal Representation Learning for Visual Commonsense Reasoning,” IEEE Trans. Multimed., vol. 24, pp. 2986–2997, 2022, doi: 10.1109/TMM.2021.3091882.
[12] S. Tateno, H. Liu, and J. Ou, “Development of Sign Language Motion Recognition System for Hearing-Impaired People Using Electromyography Signal,” Sensors (Basel)., vol. 20, no. 20, pp. 1–22, Oct. 2020, doi: 10.3390/S20205807.
[13] D. Li, C. R. Opazo, X. Yu, and H. Li, “Word-level Deep Sign Language Recognition from Video: A New Large-scale Dataset and Methods Comparison,” Proc. - 2020 IEEE Winter Conf. Appl. Comput. Vision, WACV 2020, pp. 1448–1458, Oct. 2019, doi: 10.1109/WACV45572.2020.9093512.
[14] M. Mahyoub, F. Natalia, S. Sudirman, and J. Mustafina, “Sign Language Recognition using Deep Learning,” Proc. - Int. Conf. Dev. eSystems Eng. DeSE, vol. 2023-January, pp. 184–189, 2023, doi: 10.1109/DESE58274.2023.10100055.
[15] X. Xu, K. Meng, C. Chen, and L. Lu, “Isolated Word Sign Language Recognition Based on Improved SKResNet-TCN Network,” J. Sensors, vol. 2023, pp. 1–10, Jul. 2023, doi: 10.1155/2023/9503961.
[16] J. Rastus Shane and V. Vanitha, “Sign Language Detection Using Faster RCNN Resnet,” 2nd Int. Conf. Adv. Electr. Electron. Commun. Comput. Autom. ICAECA 2023, 2023, doi: 10.1109/ICAECA56562.2023.10200987.
[17] S. Wang, K. Wang, T. Yang, Y. Li, and D. Fan, “Improved 3D-ResNet sign language recognition algorithm with enhanced hand features,” Sci. Reports, vol. 12, no. 1, pp. 1–19, Oct. 2022, doi: 10.1038/s41598-022-21636-z.
[18] S. Vashisht, P. Kumar, and M. C. Trivedi, “Enhanced GRU-BiLSTM Technique for Crop Yield Prediction,” Multimed. Tools Appl., pp. 1–26, Mar. 2024, doi: 10.1007/S11042-024-18898-2.
[19] N. N. H. Van, P. H. Do, V. N. Hoang, A. Borodko, and T. D. Le, “Leveraging FFT and Hybrid EfficientNet for Enhanced Action Recognition in Video Sequences,” ACM Int. Conf. Proceeding Ser., pp. 32–39, Dec. 2023, doi: 10.1145/3628797.3628827.
[20] D. Li, C. R. Opazo, X. Yu, and H. Li, “Word-level deep sign language recognition from video: A new large-scale dataset and methods comparison,” Proc. - 2020 IEEE Winter Conf. Appl. Comput. Vision, WACV 2020, pp. 1448–1458, Mar. 2020, doi: 10.1109/WACV45572.2020.9093512.
[21] A. Sridhar, R. G. Ganesan, P. Kumar, and M. Khapra, “INCLUDE: A Large Scale Dataset for Indian Sign Language Recognition,” MM 2020 - Proc. 28th ACM Int. Conf. Multimed., pp. 1366–1375, Oct. 2020, doi: 10.1145/3394171.3413528.
[22] G. Arun Prasath and K. Annapurani, “Prediction of sign language recognition based on multi layered CNN,” Multimed. Tools Appl., pp. 1–21, Mar. 2023, doi: 10.1007/S11042-023-14548-1.
[23] V. Enireddy, J. Anitha, N. Mahendra, and G. Kishore, “An optimized automated recognition of infant sign language using enhanced convolution neural network and deep LSTM,” Multimed. Tools Appl., pp. 1–23, Feb. 2023, doi: 10.1007/S11042-023-14428-8.
[24] R. Li and L. Meng, “Sign language recognition and translation network based on multi-view data,” Appl. Intell., vol. 52, no. 13, pp. 14624–14638, Jul. 2022, doi: 10.1007/S10489-022-03407-5.
[25] H. Chao, W. Fenhua, and Z. Ran, “Sign language recognition based on CBAM-RESNET,” ACM Int. Conf. Proceeding Ser., Oct. 2019, doi: 10.1145/3358331.3358379.
BIOGRAPHIES OF AUTHORS
Ajay Manohar Pol received the Bachelor of Engineering in Electronics Engineering from Shivaji University, Kolhapur, India, and holds an M.Tech. degree in Electronics Engineering with specialization in digital systems from Savitribai Phule Pune University, Pune, India. He is currently working as an Assistant Professor at Kolhapur Institute of Technology's (KIT's) College of Engineering, Kolhapur, Shivaji University, Kolhapur, India. He has 17 years of teaching experience. His research areas are image/signal processing, image analysis, pattern recognition, and artificial intelligence and machine learning. He has held the administrative post of Deputy Registrar of Examination and Evaluation at KIT's College of Engineering, Shivaji University, Kolhapur, India, from 2022 to present. He can be contacted at email: kayajay2004@gmail.com, pol.ajay@kitcoek.in.
Shrinivas A. Patil received the B.E. degree in Electronics Engineering from KIT's College of Engineering, Kolhapur, the M.Tech. in Bio-Medical Engineering from IIT Bombay, and the PhD in Electronics from Shivaji University, Kolhapur. He is currently working as a Professor and Head of Electronics and Telecommunication Engineering at DKTE's Textile & Engineering Institute, Ichalkaranji, Maharashtra State, India. He has 33 years of teaching experience and one year of industrial experience. He has supervised and co-supervised more than 15 M.Tech. and 8 Ph.D. students. He has authored, coauthored, and presented more than 80 publications in peer-reviewed journals and international conferences. His research interests include image processing, artificial intelligence and machine learning, and embedded and VLSI system design. He can be contacted at email: sapatil@dkte.ac.in.