IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 14, No. 4, August 2025, pp. 2864~2875
ISSN: 2252-8938, DOI: 10.11591/ijai.v14.i4.pp2864-2875
Journal homepage: http://ijai.iaescore.com
Enhanced pre-broadcast video codec validation using hybrid CNN-LSTM with attention and autoencoder-based anomaly detection
Khalid El Fayq, Said Tkatek, Lahcen Idouglid
Laboratory for Computer Science Research, Faculty of Science, Ibn Tofail University, Kenitra, Morocco
Article Info

Article history:
Received Aug 16, 2024
Revised Mar 24, 2025
Accepted Jun 8, 2025

ABSTRACT
This study presents a machine learning-based approach for proactive video codec error detection, ensuring uninterrupted television broadcasting for TV Laayoune, part of Morocco’s SNRT network. Building upon previous approaches, our method introduces autoencoders for improved anomaly detection and integrates data augmentation to enhance model resilience to rare codec configurations. By combining convolutional neural networks (CNNs) and long short-term memory (LSTM) networks with an attention mechanism, the system effectively captures spatial and temporal video features. This architecture emphasizes critical metadata attributes that influence video playback quality. Embedded within the broadcasting pipeline, the model enables real-time error detection and alerts, minimizing manual intervention and reducing transmission disruptions. Experimental results demonstrate a 97% accuracy in detecting codec errors, outperforming traditional machine learning models. This study highlights the transformative role of machine learning in broadcasting, enabling scalable deployment across diverse television networks.
Keywords:
Autoencoder-anomaly detection
Data augmentation
Machine learning
Video codec errors
Video codec validation
Video metadata analysis
This is an open access article under the CC BY-SA license.
Corresponding Author:
Khalid El Fayq
Laboratory for Computer Science Research, Faculty of Science, Ibn Tofail University
Kenitra, Morocco
Email: khalidelfayq@gmail.com
1. INTRODUCTION
Television broadcasting is crucial in delivering content to large audiences, making uninterrupted, high-quality video transmission essential. TV Laayoune, part of Morocco’s national broadcasting network, often faces disruptions caused by video codec incompatibilities, particularly during live broadcasts. TV Laayoune relies on the proprietary 'Origo' server as one of ten channels operating on a shared infrastructure. Although the server adheres to international broadcasting standards, codec inconsistencies persist, leading to playback issues, interruptions, or complete signal loss. These errors frequently originate from various cameras and video formats used across the network. Additionally, post-production export processes can introduce further codec mismatches. To address this challenge, we propose an automated, machine learning-based pre-broadcast validation system that utilizes autoencoders for anomaly detection, synthetic data generation, and a hybrid convolutional neural network (CNN) and long short-term memory (LSTM) model with attention mechanisms. By analyzing spatial and temporal metadata extracted through FFmpeg, the system provides real-time error detection and automated alerts, enhancing overall broadcasting reliability. The primary objective is to ensure uninterrupted broadcasting by proactively detecting incompatible video codecs, reducing disruptions, and improving operational efficiency across all channels using the shared infrastructure. The proposed solution enhances critical performance metrics, including accuracy, precision, recall, and F1-score, strengthening the resilience of video transmission systems. Designed to be scalable, this approach not only addresses codec errors for TV Laayoune but extends its benefits to other SNRT channels operating under the same infrastructure.
Origo, the proprietary broadcast server, is central to video ingestion, processing, and transmission as shown in Figures 1 and 2. This system forms the backbone of SNRT’s broadcasting network, but codec inconsistencies frequently disrupt its operation. This paper delves into the integration of the proposed machine learning system with Origo, aiming to preempt video codec errors that may disrupt live broadcasts, ensuring seamless television transmission.
Figure 1. Origo server broadcasting interface for TV Laayoune
Figure 2. Origo server workflow: ingestion, processing, storage, and transmission
The increasing complexity of video formats and the expansion of digital broadcasting networks have amplified the challenges posed by video codec errors. Researchers have proposed various solutions, ranging from traditional heuristic-based techniques to advanced machine learning models designed to automate and enhance error detection. Conventional methods for detecting codec errors primarily rely on manual inspections or predefined rule-based systems, often resulting in inefficiencies and inaccuracies. Heuristic algorithms apply predefined patterns to identify common errors, but their static nature limits adaptability to evolving video formats. Signal processing techniques, while effective for detecting artifacts and synchronization errors, frequently fall short in real-time scenarios, leaving broadcasters vulnerable to unexpected disruptions during live transmissions.
Machine learning models offer a transformative alternative by learning from large datasets and dynamically adapting to new error patterns. This adaptability is crucial in evolving broadcasting environments where video formats and codecs frequently change. Klink et al. [1] reviewed machine learning frameworks for video quality prediction, illustrating the limitations of traditional methods in handling diverse video inputs. Oprea et al. [2] highlighted how deep video compression techniques leverage neural networks to enhance processing efficiency, showcasing the potential of machine learning in video broadcasting workflows. Muskaan et al. [3] demonstrated the effectiveness of CNN-LSTM architectures in identifying anomalies, such as deepfakes, further emphasizing the utility of machine learning in video integrity checks.
Hybrid models that combine CNNs and LSTMs are increasingly popular for video processing tasks. CNNs excel at extracting critical spatial features from video frames, while LSTMs capture temporal dependencies across sequences, making them ideal for analyzing video streams. Kaur and Mishra [4] employed LSTM to generate concise video summaries from lengthy sequences, highlighting the significance of temporal analysis in video data. Bidwe et al. [5] demonstrated the success of this architecture for video compression, while Benoughidene and Titouna [6] applied CNN-LSTM models for video shot boundary detection. Panneerselvam et al. [7] explored efficient video compression using deep learning techniques, underlining the benefits of combining CNNs and LSTMs for complex data patterns. Despite their effectiveness, traditional CNN-LSTM models may overlook subtle or rare anomalies.
Attention mechanisms further improve performance by enabling the network to prioritize the most relevant features during processing [8]. Autoencoders provide a robust, unsupervised learning method for anomaly detection in video streams. These models reconstruct video frames or metadata and flag discrepancies by analyzing reconstruction errors. Gashnikov [9] successfully applied autoencoders to video codec validation, achieving superior accuracy over traditional systems. By learning to reconstruct video inputs, autoencoders effectively detect anomalies indicative of codec mismatches or data corruption. Integrating autoencoders with CNN-LSTM models further enhances the overall detection pipeline, particularly for rare or subtle codec errors that might otherwise escape traditional detection methods.
Augmenting datasets through simulated video configurations, bitrate alterations, and synthetic video generation has proven essential for enhancing model generalizability. Wang et al. [10] underscored the importance of synthetic data in training machine learning models for video enhancement. This approach diversifies the training set, exposing the model to a wide range of broadcasting scenarios, ultimately enhancing detection performance.
While prior studies address aspects of video processing and error detection, limited research focuses on proactive, pre-broadcast codec validation in live television environments. This study bridges this gap by integrating autoencoders, CNN-LSTM models with attention mechanisms [11], and extensive data augmentation to create a comprehensive pre-broadcast video codec validation system [12]. The proposed model leverages spatial and temporal metadata features extracted through FFmpeg, ensuring high detection accuracy while operating in real time.
CNNs have been widely adopted for video and image analysis due to their ability to effectively capture spatial hierarchies in data [13]. Yan et al. [14] leveraged CNNs for fractional-pixel motion compensation, demonstrating their utility in video processing. Cui et al. [15] applied CNN-based post-filtering to compressed images and videos, achieving notable improvements compared to traditional techniques. Additionally, El Fayq et al. [16] employed machine learning to detect and extract faces and text from audiovisual archives, highlighting the versatility of CNNs in various applications.
Machine learning applications in broadcasting environments have been widely explored. Darwich and Bayoumi [17] integrated CNN and recurrent neural network (RNN) models for video quality adaptation, reducing live broadcast disruptions. Sharrab et al. [18] demonstrated the role of machine learning in real-time video communication, highlighting automated error detection as a critical component. Bouaafia et al. [19] applied deep learning-based video quality enhancement techniques, achieving significant improvements in video processing. Chen et al. [20] developed RL-AFEC, a reinforcement learning-based adaptive forward error correction system, demonstrating the potential of advanced machine learning techniques for error detection.
Despite significant progress, challenges persist in ensuring real-time performance and scalability. Achieving low-latency detection without compromising accuracy requires optimizing model architectures and expanding datasets to cover diverse codec configurations. Liu et al. [21] emphasized dataset diversity as essential for video coding models, while Ma et al. [22] addressed the need for real-time optimizations in neural networks for video compression. Zhang et al. [23] reviewed machine learning-based approaches to video coding optimizations, highlighting the importance of developing robust and scalable solutions.
The integration of machine learning techniques (such as autoencoders, CNN-LSTM models with attention mechanisms, and data augmentation) represents a significant advancement in video codec error detection. These advancements enhance reliability and efficiency across television broadcasting systems. Our proposed system builds upon this foundation, integrating these methods to proactively detect and mitigate video codec errors, ensuring seamless broadcasting for TV Laayoune and other SNRT channels.
Research has consistently highlighted the importance of accuracy, precision, recall, and F1-score in evaluating the effectiveness of these models. Studies report improvements in these metrics when using machine learning-based approaches compared to traditional methods. Steinert and Stabernack [24] designed a low latency H.264/AVC video codec for robust machine learning-based image classification, demonstrating significant improvements in precision and recall. Additionally, Putri et al. [25] conducted a comparative analysis of video quality using codecs VP8 and H.265, highlighting how codec performance impacts video communication quality. Their findings emphasize the importance of selecting and validating codecs to minimize packet loss and optimize throughput, further reinforcing the need for machine learning-driven approaches to codec error detection.
The rest of this paper is organized as follows: section 2 outlines the methodology. Section 3 presents experimental results. Finally, section 4 concludes the study with directions for future research.
2. METHOD
2.1. Datasets
The effectiveness of any machine learning model, particularly for video codec error detection, depends on the quality, diversity, and representativeness of the datasets used during training and evaluation. In this study, an extensive dataset was compiled, comprising over 10,000 video clips sourced from TV Laayoune’s internal archives and publicly available repositories. This dataset reflects real-world broadcasting conditions commonly encountered by TV Laayoune and other channels within the SNRT network.
Metadata extraction, a critical component of dataset preparation, was conducted using FFmpeg, a widely used multimedia processing framework. Through this automated process, essential metadata attributes were extracted from each video clip, including codec type, resolution, frame rate, audio codec, bitrate, container format, and aspect ratio. These attributes serve as input features for the machine learning model, providing crucial insights into the technical specifications of each clip. Table 1 summarizes the key metadata fields extracted, while Figure 3 illustrates the metadata extraction process, highlighting the key stages from video ingestion to feature storage.
Table 1. Metadata extracted using FFmpeg
Metadata type         Description
Codec type            H.264 and MPEG-4
Resolution            720p, 1080p
Frame rate            24 fps, 30 fps, 60 fps
Audio codec           AAC, MP3
Bitrate               1 Mbps, 5 Mbps
Container format      MP4, MKV, MXF, MOV
Aspect ratio          Display aspect ratio (16:9, 4:3)
Duration (sec)        Length of the video in seconds
File size (MB)        Size of the video file in megabytes
Chroma subsampling    Color compression format (4:2:0, 4:4:4)
Color depth           Color bit depth (8-bit, 10-bit)
Figure 3. Metadata extraction process using FFmpeg
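The extraction step described above can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes FFmpeg's `ffprobe` tool is available on the PATH, and the function names `probe` and `extract_features`, as well as the output key names, are hypothetical. Chroma subsampling and bit depth are encoded in ffprobe's pixel format string (for example `yuv420p`), so the sketch stores that string instead of parsing it.

```python
import json
import subprocess


def probe(path):
    """Run ffprobe (ships with FFmpeg) and return its JSON report as a dict."""
    cmd = ["ffprobe", "-v", "error", "-print_format", "json",
           "-show_format", "-show_streams", path]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return json.loads(out)


def extract_features(report):
    """Flatten an ffprobe report into fields like those listed in Table 1.

    Key names are illustrative assumptions, not taken from the paper.
    """
    streams = report.get("streams", [])
    video = next((s for s in streams if s.get("codec_type") == "video"), {})
    audio = next((s for s in streams if s.get("codec_type") == "audio"), {})
    fmt = report.get("format", {})

    # ffprobe reports the frame rate as a fraction string such as "30000/1001".
    num, _, den = video.get("r_frame_rate", "0/1").partition("/")
    den = den or "1"
    fps = float(num) / float(den) if float(den) else 0.0

    return {
        "codec_type": video.get("codec_name"),
        "width": video.get("width"),
        "height": video.get("height"),
        "frame_rate": round(fps, 3),
        "audio_codec": audio.get("codec_name"),
        "bitrate_bps": int(fmt.get("bit_rate", 0)),
        "container": fmt.get("format_name"),
        "aspect_ratio": video.get("display_aspect_ratio"),
        "duration_sec": float(fmt.get("duration", 0.0)),
        # ffprobe reports size in bytes; decimal megabytes are used here.
        "file_size_mb": int(fmt.get("size", 0)) / 1_000_000,
        "pix_fmt": video.get("pix_fmt"),
    }
```

A clip would be processed with `extract_features(probe("clip.mp4"))`, where the file name is a placeholder. Keeping the parsing separate from the subprocess call lets the feature mapping be checked against stored ffprobe reports without re-reading the media files.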
To ensure relevance to TV Laayoune’s broadcasting environment, each video clip was manually annotated and labeled as compatible or incompatible with the station’s broadcasting server. Annotation and labeling were conducted by experienced broadcast technicians, using historical playback logs and performance data to validate each clip. Compatible clips align with the server’s codec and format requirements, ensuring smooth playback. Conversely, incompatible clips exhibit playback errors, disruptions, or incompatibility issues during live transmissions. This labeled dataset enables the machine learning model to distinguish between error-free clips and problematic files.
To ensure a balanced and unbiased evaluation of the model, the dataset was split into three subsets. The training set comprises 70% of the data and contains an even distribution of compatible and incompatible clips to prevent model bias during learning. The validation set represents 15% of the data and is used to fine-tune the model’s hyperparameters, ensuring optimal performance while mitigating overfitting. The remaining 15% constitutes the test set, exclusively reserved for final model evaluation to assess accuracy, precision, recall, and generalizability. This isolated test set ensures an objective measure of performance on unseen data.
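One way to realize such a 70/15/15 split is a per-class (stratified) shuffle, sketched below. The paper does not state how the split was implemented, so the function name `stratified_split`, the fixed seed, and the use of stratification are assumptions. Stratification preserves the class proportions of the full dataset, so the training set is evenly distributed only if the labeled dataset is itself balanced.

```python
import random


def stratified_split(samples, label_of, ratios=(0.70, 0.15, 0.15), seed=42):
    """Split samples into train/validation/test while keeping class proportions.

    label_of maps a sample to its class label (e.g. compatible / incompatible).
    """
    rng = random.Random(seed)

    # Group samples by label so each class is split independently.
    by_label = {}
    for sample in samples:
        by_label.setdefault(label_of(sample), []).append(sample)

    train, val, test = [], [], []
    for group in by_label.values():
        rng.shuffle(group)
        n_train = round(len(group) * ratios[0])
        n_val = round(len(group) * ratios[1])
        train += group[:n_train]
        val += group[n_train:n_train + n_val]
        test += group[n_train + n_val:]  # remainder goes to the test set
    return train, val, test
```

The test portion is produced once and should not be touched during hyperparameter tuning, which matches the isolated test set described above.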
Before model training, extensive preprocessing was applied to the dataset. Numerical features such as bitrate, frame rate, and file size were normalized to ensure uniform scaling across different feature ranges. Categorical attributes, including codec type, audio codec, and container format, were one-hot encoded to facilitate seamless integration into the machine learning pipeline. Missing or incomplete metadata entries were addressed through mean imputation for numerical data or removed if they represented less than 1% of the total dataset. This preprocessing ensured that the dataset was clean, consistent, and optimized for the machine learning model.
By assembling a diverse, representative, and high-quality dataset, this study aims to enhance the accuracy and robustness of codec error detection, ensuring seamless and uninterrupted television broadcasting for TV Laayoune and other channels in the SNRT network.
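The preprocessing steps described in this section (mean imputation, normalization, and one-hot encoding) can be illustrated with a few small helpers. This is a sketch under assumptions: the paper says numerical features were normalized without naming the scheme, so min-max scaling to [0, 1] is used here as one common choice, and the helper names are hypothetical.

```python
def mean_impute(values):
    """Replace missing entries (None) with the mean of the observed values."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


def min_max(values):
    """Scale values linearly to [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    span = hi - lo
    return [0.0 if span == 0 else (v - lo) / span for v in values]


def one_hot(values):
    """One-hot encode a categorical column.

    Returns the encoded rows and the sorted category list that defines
    the column order.
    """
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories
```

For example, a bitrate column with one missing entry would be passed through `mean_impute` and then `min_max`, while the codec type column would go through `one_hot`. In practice the scaling bounds and category list should be computed on the training set only and reused for validation and test data.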
2.2. Proposed model
2.2.1. Model architecture
The proposed model adopts a hybrid architecture combining CNNs and LSTM networks to detect video codec errors effectively. This integration leverages the strengths of CNNs for spatial feature extraction and LSTM networks for modeling temporal dependencies within the metadata of video clips. By capturing both static metadata attributes and sequential patterns, the model ensures comprehensive analysis of potential codec incompatibilities across video frames.
The CNN component of the model is responsible for extracting spatial features from key metadata fields, including resolution, codec type, and bitrate. Convolutional layers apply multiple filters to emphasize critical attributes that influence codec compatibility, enabling the model to identify subtle spatial patterns in the metadata. This layer plays a crucial role in detecting anomalies related to static video attributes, such as improper resolutions or unsupported codecs.
Following the CNN layers, the extracted feature maps are passed to the LSTM component, which processes sequential metadata over time. LSTM networks excel at capturing temporal dependencies, allowing the model to detect irregularities such as fluctuating frame rates or inconsistent bitrates that may signal potential codec errors. This sequential modeling is essential for recognizing patterns that manifest over multiple video segments, contributing to improved detection accuracy.
To further enhance the model’s performance, an attention mechanism is incorporated into the LSTM layer outputs. The attention layer selectively prioritizes the most relevant features by dynamically weighting the importance of different metadata attributes. This focus on critical features not only boosts detection accuracy but also reduces false positives (FP) by minimizing the influence of less significant metadata patterns.
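The weighting idea can be illustrated with a small dot-product attention pooling routine over a sequence of LSTM output vectors. The paper does not specify which attention formulation is used, so the scoring function, the `query` vector (learned in a real model, fixed here), and the function names are assumptions made for illustration only.

```python
import math


def softmax(scores):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Dot-product attention over a sequence of LSTM output vectors.

    hidden_states: list of equal-length vectors, one per time step.
    query: vector of the same length (a trained parameter in a real model).
    Returns the attention-weighted context vector and the weights.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return context, weights
```

Time steps whose hidden state aligns with the query receive larger weights, so they dominate the context vector passed to the classifier, while weakly aligned steps contribute little.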
A visual representation of the model architecture is illustrated in Figure 4, demonstrating the flow of data through the hybrid CNN-LSTM structure. The diagram outlines the sequence of operations from metadata ingestion, convolutional filtering, LSTM processing, and attention-based feature selection, culminating in the final output layer responsible for codec compatibility classification. This hybrid design ensures a robust, scalable solution for identifying and mitigating video codec errors in real-time broadcasting environments.
Figure 4. Machine learning model architecture for video codec error detection
Table 2 outlines the hyperparameters governing the hybrid CNN-LSTM architecture integrated with attention mechanisms and autoencoder-based anomaly detection. These hyperparameters shape the architecture and training process, optimizing the CNN for spatial feature extraction, the LSTM network for modeling temporal dependencies, and the autoencoder for identifying anomalies. The model is fine-tuned to achieve efficient learning and robust performance. Key hyperparameters include a learning rate of 0.001, batch size of 32, and 35 training epochs to account for the additional complexity introduced by the attention mechanism. The dropout rate is set to 0.4 to mitigate overfitting, while convolutional layers utilize kernel sizes of 3×3 and filters at increasing depths (64, 128, 256, 256). The attention mechanism operates alongside the LSTM layer, enhancing feature prioritization, while the autoencoder anomaly detection unit is trained in parallel using the same dataset. Rectified linear unit (ReLU) remains the primary activation function, with He initialization applied for efficient weight distribution.
Table 2. Hyperparameters for CNN-LSTM with attention and autoencoder architecture
Hyperparameter               Value
Learning rate                0.001
Batch size                   32
Number of epochs             35
Dropout rate                 0.4
Kernel size                  3×3
Filters                      64, 128, 256, 256
LSTM units                   128
Attention mechanism units    64
Autoencoder hidden size      128
Activation function          ReLU
Weight initialization        He initialization
2.2.2. Integration of autoencoders for anomaly detection
Autoencoders play a crucial role in detecting rare or subtle video codec anomalies that standard classification models might miss. As unsupervised learning models, autoencoders reconstruct input data by encoding it into a lower-dimensional latent space and decoding it back. By comparing the reconstructed metadata with the original input, discrepancies indicating potential codec errors are identified, enhancing system robustness against anomalies that could disrupt broadcasts.
The autoencoder is trained on metadata from compatible video codecs, learning to reconstruct this data with minimal error. When presented with new data, clips deviating from the learned distribution produce higher reconstruction errors, signaling potential codec inconsistencies. This prompts further review before integration into the broadcast pipeline.
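The decision rule implied here (flag a clip when its reconstruction error is unusually high) can be sketched as below. The paper does not say how the cut-off is chosen, so the threshold of mean plus `k` standard deviations over errors measured on compatible clips, the default `k=3.0`, and the function names are all assumptions. The reconstruction `x_hat` would come from the trained autoencoder, which is outside the scope of this sketch.

```python
import statistics


def mse(x, x_hat):
    """Mean squared error between an input vector and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def fit_threshold(errors_on_compatible, k=3.0):
    """Set the cut-off from reconstruction errors of known-compatible clips.

    Assumed rule: mean + k * population standard deviation.
    """
    mu = statistics.mean(errors_on_compatible)
    sigma = statistics.pstdev(errors_on_compatible)
    return mu + k * sigma


def flag_for_review(x, x_hat, threshold):
    """Return True when the clip's reconstruction error exceeds the cut-off."""
    return mse(x, x_hat) > threshold
```

A lower `k` sends more clips to manual review and catches more anomalies; a higher `k` reduces review load at the risk of missing subtle inconsistencies.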
The training uses a mean squared error (MSE) loss function over 100 epochs, with early stopping to prevent overfitting and improve generalization. The architecture includes three encoder layers for compression, mirrored by decoder layers for reconstruction. Dropout regularization is applied to ensure resilience. By complementing the CNN-LSTM architecture, the autoencoder adds an extra validation layer, improving the accuracy and reliability of codec error detection. This hybrid approach enhances seamless broadcasting for TV Laayoune and other channels on the network.
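Early stopping, as mentioned above, halts training once the validation loss stops improving. The sketch below shows the usual patience-based rule; the patience value and the `min_delta` tolerance are not given in the paper and are assumptions here, as are the class and function names. Only the 100-epoch cap comes from the text.

```python
class EarlyStopping:
    """Stop when validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's validation loss; return True to stop training."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def epochs_run(val_losses, max_epochs=100, patience=5):
    """Return how many epochs would run for a given validation-loss history."""
    stopper = EarlyStopping(patience)
    for epoch, loss in enumerate(val_losses[:max_epochs], start=1):
        if stopper.step(loss):
            return epoch
    return min(len(val_losses), max_epochs)
```

In a real training loop, `step` would be called once per epoch with the measured validation MSE, and the weights from the best epoch would typically be restored after stopping.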
2.2.3. Data augmentation and synthetic data generation
Data augmentation and synthetic data generation are essential for enhancing the robustness and generalization of the proposed machine learning model. These techniques expand the training dataset by introducing controlled variations, simulating diverse broadcasting scenarios the model may encounter during live broadcasts. This process improves the model's adaptability, reducing overfitting and ensuring reliable performance across different video conditions.
Augmentation involves modifying existing video metadata to reflect varying codec configurations, frame rates, and bitrates. For example, video clips are adjusted to simulate frame rates of 24 fps, 30 fps, and 60 fps, with bitrates ranging from 1 Mbps to 5 Mbps. Controlled noise is also introduced to replicate common transmission errors, enabling the model to detect subtle discrepancies that signal codec issues.
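A metadata-level version of this augmentation might look like the following sketch. The frame rates and the 1 to 5 Mbps range come from the text; everything else is an assumption, including the record layout, the function name `augment`, the Gaussian form of the noise, and its 5% scale. How labels are assigned to augmented records is not described in the paper and is left out here.

```python
import random

FRAME_RATES = (24, 30, 60)        # fps values named in the text
BITRATE_RANGE_MBPS = (1.0, 5.0)   # bitrate range named in the text


def augment(record, rng, noise_std=0.05):
    """Return a perturbed copy of a metadata record (input is not mutated).

    noise_std is the relative standard deviation of the bitrate jitter
    (an assumed value).
    """
    lo, hi = BITRATE_RANGE_MBPS
    out = dict(record)
    out["frame_rate"] = rng.choice(FRAME_RATES)

    # Sample a target bitrate, then add multiplicative Gaussian noise and
    # clamp so the result stays inside the stated range.
    bitrate = rng.uniform(lo, hi)
    bitrate += rng.gauss(0.0, noise_std) * bitrate
    out["bitrate_mbps"] = min(max(bitrate, lo), hi)
    return out
```

Passing an explicit `random.Random(seed)` instance as `rng` keeps the augmented dataset reproducible across runs.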
Synthetic data generation further diversifies the dataset by altering codec attributes, container formats (MP4, MXF, MOV), and aspect ratios (16:9, 4:3). This process creates new metadata samples that represent rare or edge-case errors, allowing the model to familiarize itself with unusual configurations that could disrupt broadcasts.
B
y
e
xp
os
i
ng
t
he
m
ode
l
to
a
br
oa
de
r
a
r
r
a
y
o
f
da
ta
,
a
u
gm
e
n
ta
ti
on
,
a
n
d
s
yn
th
e
t
ic
ge
ne
r
a
ti
on
s
t
r
e
ng
th
e
n
i
ts
r
e
s
il
ie
nc
e
a
g
a
i
ns
t
u
ne
xp
e
c
ted
a
no
ma
li
e
s
,
e
nh
a
nc
in
g
i
ts
a
bi
l
it
y
to
d
e
tec
t
c
o
de
c
e
r
r
o
r
s
a
c
r
os
s
v
a
r
ie
d
b
r
o
a
dc
a
s
t
in
g
c
o
nd
it
ion
s
.
T
h
is
c
om
p
r
e
he
ns
i
ve
a
p
p
r
o
a
c
h
s
ig
ni
f
ica
nt
l
y
r
e
du
c
e
s
bi
a
s
e
s
a
nd
im
p
r
o
ve
s
r
e
l
ia
bi
l
it
y
,
s
u
pp
or
t
in
g
s
e
a
ml
e
s
s
a
nd
u
n
in
te
r
r
u
pt
e
d
te
lev
is
io
n
b
r
oa
d
c
a
s
ts
f
o
r
T
V
L
a
a
y
ou
ne
a
n
d
o
th
e
r
c
h
a
n
ne
ls
wi
t
hi
n
t
he
ne
two
r
k
.
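The augmentation grid described above can be sketched as follows. This is a minimal illustration rather than the authors' pipeline: the record fields, the 1 Mbps bitrate step, the Gaussian noise model, and the function name `augment_metadata` are assumptions. Only the frame rates, the bitrate range, the container formats, and the aspect ratios come from the text.

```python
import itertools
import random

FRAME_RATES = (24, 30, 60)           # fps values named in the text
BITRATES_MBPS = (1, 2, 3, 4, 5)      # 1-5 Mbps range from the text; the 1 Mbps step is assumed
CONTAINERS = ("MP4", "MXF", "MOV")
ASPECT_RATIOS = ("16:9", "4:3")


def augment_metadata(record, noise_std=0.02, seed=0):
    """Return metadata variants of `record` covering the whole parameter grid.

    `record` is a dict describing one clip (hypothetical schema). Fields that
    are not varied, such as the codec name, are copied unchanged.
    """
    rng = random.Random(seed)  # seeded so that the augmentation is reproducible
    variants = []
    grid = itertools.product(FRAME_RATES, BITRATES_MBPS, CONTAINERS, ASPECT_RATIOS)
    for fps, mbps, container, aspect in grid:
        variant = dict(record)  # copy, so the original record is left untouched
        variant.update(frame_rate=fps, container=container, aspect_ratio=aspect)
        # Small multiplicative Gaussian noise stands in for the controlled
        # noise mentioned in the text (the noise model is an assumption).
        variant["bitrate_mbps"] = round(mbps * (1.0 + rng.gauss(0.0, noise_std)), 3)
        variants.append(variant)
    return variants
```

With the grid above, one source record yields 3 x 5 x 3 x 2 = 90 variants; a coarser or finer bitrate step changes that count accordingly.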
2.2.4. Training and validation
The training process follows a structured approach designed to improve the model's generalization and robustness, as illustrated in Figure 5. The model is trained using a supervised learning method with labeled video clips that indicate codec compatibility. To increase dataset diversity and resilience, various data augmentation techniques are applied, including modifications to resolutions, bitrates, and frame rates, as well as the introduction of controlled noise. These augmentations expose the model to a wide array of codec configurations, enhancing its ability to generalize effectively to unseen data in real-world broadcasting scenarios.
Figure 5. Flowchart of the video codec error detection methodology
Binary cross-entropy (BCE) is selected as the loss function due to its suitability for binary classification tasks, measuring the divergence between predicted probabilities and actual labels. The Adam optimizer is employed for its efficiency in handling large datasets and its adaptive learning rate, contributing to stable convergence throughout the training process.
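For reference, the BCE loss over N labeled clips takes the standard form BCE = -(1/N) Σ [y log p + (1 - y) log(1 - p)], where y is the true label and p the predicted probability. The sketch below computes it directly; the function name and the clipping constant `eps` are illustrative choices and are not taken from the paper.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean binary cross-entropy between 0/1 labels and predicted probabilities."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        # Clip probabilities away from 0 and 1 so that log() stays finite.
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)
```

An uninformative prediction of 0.5 for every clip gives a loss of ln 2 (about 0.693), and the loss falls towards zero as the predicted probabilities approach the true labels.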
To ensure model reliability, a k-fold cross-validation strategy is implemented, dividing the dataset into k subsets. The model undergoes training and validation k times, using a different subset for validation during each iteration while the remaining subsets are used for training. This evaluation method checks that the model generalizes across diverse data distributions rather than to a single split. Additionally, early stopping based on validation loss limits overfitting and avoids excessive training epochs.
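The k-fold procedure described above can be sketched in a few lines. This is a minimal, dependency-free illustration; the function name is hypothetical, shuffling is omitted for brevity, and the value of k is left open because the text does not state it.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation.

    Each sample appears in exactly one validation fold. When n_samples is not
    divisible by k, the first folds receive one extra sample each.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, val
        start += size
```

In practice a library splitter such as scikit-learn's `KFold` would typically be used; the sketch only makes the train/validation rotation explicit.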
Considering the real-time constraints of broadcasting environments, mixed-precision training is explored to accelerate computations and minimize model size. This makes the model suitable for deployment in resource-constrained environments. Future work will focus on further optimizations, including model pruning and quantization, to enhance inference speeds during live broadcasts.
An error analysis is planned to address the 5% of misclassifications, guiding improvements such as the integration of additional features or adjustments to the model architecture for handling complex codec configurations. This comprehensive methodology, which integrates data augmentation, cross-validation, regularization, and optimization, ensures that the model performs well during training. It also generalizes effectively to new video clips across various broadcasting conditions.
2.2.5. Model evaluation
The performance of the proposed hybrid model, which integrates CNN-LSTM networks with attention mechanisms and autoencoder-based anomaly detection, is evaluated using multiple metrics: accuracy (1), precision (2), recall (3), and F1-score (4). These metrics provide a comprehensive assessment of the model's ability to detect and classify video codec errors effectively.
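\[ \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{1} \]
\[ \mathrm{Precision} = \frac{TP}{TP + FP} \tag{2} \]
\[ \mathrm{Recall} = \frac{TP}{TP + FN} \tag{3} \]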
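\[ \mathrm{F1\text{-}score} = 2 \times \frac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{4} \]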
where true positives (TP) are instances correctly identified as positive; true negatives (TN) are instances correctly identified as negative; false positives (FP) are instances incorrectly identified as positive; and false negatives (FN) are instances incorrectly identified as negative.
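The four metrics can be computed directly from the confusion-matrix counts, as in the sketch below. The function name and the zero-denominator convention (returning 0.0) are illustrative assumptions; any counts passed to it in an example are hypothetical and are not results from this study.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall, and F1-score from confusion counts.

    A metric whose denominator is zero is reported as 0.0 (a convention
    chosen for this sketch).
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Because the F1-score is the harmonic mean of precision and recall, it always lies between the two and sits closer to the smaller one.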
The model evaluation is conducted using the test set, which consists of video clips not used during the training or validation phases. This separation ensures an unbiased assessment, measuring the model's generalizability to unseen data. In addition to evaluating classification performance, the autoencoder's reconstruction error is analyzed to identify subtle codec anomalies that traditional models might overlook. Reconstruction errors are measured using mean squared error (MSE), where higher errors indicate greater deviations from normal patterns, signaling potential incompatibilities. This dual-evaluation approach allows for the detection of both explicit and nuanced errors, enhancing the overall robustness of the system.
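The reconstruction-error check described above can be sketched as follows. The MSE computation follows the text; the thresholding rule (mean plus a multiple of the standard deviation of errors on normal clips) and all function names are assumptions made for illustration, since the paper does not state how the anomaly threshold is chosen.

```python
import statistics


def mean_squared_error(original, reconstructed):
    """MSE between a feature vector and its autoencoder reconstruction."""
    if len(original) != len(reconstructed) or not original:
        raise ValueError("vectors must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(original, reconstructed)) / len(original)


def threshold_from_normal_errors(errors, n_sigma=3.0):
    """Derive a threshold from reconstruction errors of known-good clips.

    Assumed rule: mean + n_sigma * population standard deviation.
    """
    return statistics.mean(errors) + n_sigma * statistics.pstdev(errors)


def is_anomalous(original, reconstructed, threshold):
    """Flag a clip whose reconstruction error exceeds the threshold."""
    return mean_squared_error(original, reconstructed) > threshold
```

A clip that the autoencoder reconstructs poorly is flagged even when the classifier's output is inconclusive, which is the role of the second evaluation layer.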
The trained model is integrated directly into the TV Laayoune broadcasting pipeline to ensure operational efficiency. The workflow begins with FFmpeg continuously extracting metadata from incoming video clips in real time. This metadata, including codec type, resolution, frame rate, and bitrate, is preprocessed and passed to the CNN-LSTM model with attention. The model evaluates the spatial and temporal patterns of the metadata while the autoencoder concurrently assesses the anomaly score. If the model detects a potential incompatibility, an alert is triggered for the broadcasting operator to review the flagged clip. Operators can then take corrective action, such as re-encoding the video or adjusting codec settings, preventing broadcasting errors before the clip reaches live transmission. This proactive process reduces manual intervention, minimizes errors, and enhances the overall broadcasting experience.
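The metadata-extraction step at the start of this workflow can be sketched with `ffprobe`, the inspection tool shipped with FFmpeg. The sketch assumes that `ffprobe` is available on the system path and that its JSON output is used. The helper names and the selected fields are illustrative, and the paper does not specify the exact FFmpeg invocation.

```python
import json
import subprocess
from fractions import Fraction


def run_ffprobe(path):
    """Return ffprobe's JSON description of a media file as text."""
    cmd = ["ffprobe", "-v", "error", "-show_format", "-show_streams",
           "-of", "json", path]
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def _to_float_rate(value):
    """Convert a rate such as '30000/1001' to a float, or None if unusable."""
    try:
        return float(Fraction(value))
    except (ValueError, ZeroDivisionError):
        return None


def parse_ffprobe_json(text):
    """Extract codec type, resolution, frame rate, and bitrate of the first video stream."""
    probe = json.loads(text)
    video = next((s for s in probe.get("streams", [])
                  if s.get("codec_type") == "video"), None)
    if video is None:
        raise ValueError("no video stream found")
    # Some containers report the bitrate only at the format level.
    bit_rate = video.get("bit_rate") or probe.get("format", {}).get("bit_rate")
    return {
        "codec": video.get("codec_name"),
        "width": video.get("width"),
        "height": video.get("height"),
        "frame_rate": _to_float_rate(video.get("r_frame_rate", "")),
        "bitrate_mbps": int(bit_rate) / 1e6 if bit_rate is not None else None,
    }
```

A call such as `parse_ffprobe_json(run_ffprobe("clip.mp4"))` (the file name is a placeholder) would yield the feature dictionary that is then preprocessed and passed to the model.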
The integration of autoencoder anomaly detection alongside the CNN-LSTM architecture with attention offers a robust and scalable solution for codec error detection. This approach ensures that codec errors are identified at several levels, through both spatial-temporal analysis and anomaly-based reconstruction. It provides greater accuracy and resilience in video transmission across diverse broadcasting conditions.
3. RESULTS AND DISCUSSION
3.1. Training methodology
The experiments were conducted using a dataset comprising video clips from both internal archives of TV Laayoune and publicly available sources. The dataset was divided into training, validation, and test sets in the ratio of 70:15:15. The following tools and libraries were used:
‒ FFmpeg for feature extraction and preprocessing of video metadata.
‒ TensorFlow and Keras for building and training the machine learning model.
‒ Scikit-learn for performance evaluation and metrics calculation.
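The 70:15:15 partition described above can be sketched as a simple shuffled split. The function name, the fixed seed, and the use of integer percentages are illustrative choices; the paper does not state whether the split was stratified by label.

```python
import random


def split_dataset(items, percents=(70, 15, 15), seed=42):
    """Shuffle `items` and split them into train, validation, and test lists.

    `percents` are integer percentages summing to 100; any rounding remainder
    goes to the test set.
    """
    if sum(percents) != 100:
        raise ValueError("percentages must sum to 100")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n = len(shuffled)
    n_train = n * percents[0] // 100
    n_val = n * percents[1] // 100
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test
```

Keeping the test portion untouched until the final evaluation is what allows the reported metrics to reflect performance on unseen clips.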
To assess the effectiveness of the hybrid CNN-LSTM model with attention and autoencoder anomaly detection, additional experiments were conducted under various conditions. These conditions reflect real-world broadcasting scenarios at TV Laayoune, including diverse codec types, resolutions, frame rates, and bitrate configurations. Table 3 summarizes the extended experimental results, highlighting accuracy, precision, recall, and F1-score across different parameter variations.
Table 3. Extended experimental results
Parameter     Value                     Accuracy (%)   Precision (%)   Recall (%)   F1-score (%)
Codec type    H.264, MPEG-4             95.2           94.5            95.7         95.1
Resolution    720p, 1080p               94.8           94.0            95.0         94.5
Frame rate    24 fps, 30 fps, 60 fps    95.4           94.7            95.9         95.3
Bitrate       1 Mbps, 5 Mbps            94.9           94.2            95.2         94.7
Audio codec   AAC, MP3                  95.1           94.4            95.6         95.0
The experimental results demonstrate consistently high accuracy (94.5%-95.2%) across various configurations, affirming the model's robustness and adaptability. Codec type and frame rate variations yielded the highest accuracy and recall, reflecting the model's effectiveness in managing temporal inconsistencies and diverse encoding formats. Similarly, variations in resolution and bitrate maintained strong performance, highlighting the model's capacity to process videos with differing visual quality and compression levels. The integration of autoencoder-based anomaly detection significantly enhances the model's sensitivity to subtle codec errors that may evade traditional spatial or temporal analysis. This dual-layer
approach improves the detection of rare or edge-case codec anomalies. It contributes to elevated recall (up to 95.7%) and minimizes the risk of undetected errors. These findings validate the model's generalizability across various broadcasting environments, confirming its suitability for deployment within TV Laayoune's broadcast pipeline. By addressing codec inconsistencies at multiple levels (spatial, temporal, and anomaly detection), the proposed solution provides a comprehensive framework. It helps reduce broadcasting disruptions and enhance television transmission reliability.
3.2. Model performance
The performance of the hybrid CNN-LSTM model, enhanced with an attention mechanism and autoencoder anomaly detection, was evaluated using key classification metrics: accuracy, precision, recall, and F1-score. These metrics provide a comprehensive assessment of the model's ability to detect video codec errors efficiently. The model achieves consistently high performance across all metrics. The accuracy of 97.0% underscores the model's reliability in correctly classifying video clips, while the precision of 96.3% highlights its effectiveness in minimizing false positives (FP). A recall of 97.5% reflects the model's strong sensitivity in identifying incompatible video clips, ensuring minimal undetected errors. The F1-score of 96.9% balances precision and recall, reinforcing the model's robustness and adaptability for real-world broadcasting conditions.
These results validate the model's scalability and dependability across diverse broadcasting environments. The high recall is particularly essential for live broadcasts, reducing the risk of codec errors bypassing detection and causing interruptions. Simultaneously, the model's high precision minimizes false alerts, allowing operators to focus only on genuinely incompatible clips. By integrating spatial, temporal, and anomaly-based detection mechanisms, the model outperforms traditional heuristic methods, establishing itself as a valuable addition to TV Laayoune's broadcasting workflow. The system enables real-time detection and automated alerts, preempting potential disruptions and enhancing overall broadcasting reliability.
3.3. Comparative analysis
To evaluate the effectiveness of the proposed CNN-LSTM hybrid model with attention and autoencoder anomaly detection, its performance was compared against traditional heuristic-based methods and a baseline logistic regression model. As shown in Table 4, the hybrid model consistently outperforms both traditional and baseline approaches. This holds across all key performance metrics [26], [27].
Table 4. Comparative experimental results
Model                         Accuracy (%)   Precision (%)   Recall (%)   F1-score (%)
Heuristic-based               85.3           84.1            86.5         85.3
Logistic regression           89.5           88.7            90.1         89.4
CNN-LSTM hybrid (Proposed)    97.0           96.3            97.5         96.8
The superior performance of the hybrid model reflects the effectiveness of combining convolutional and recurrent neural networks for video codec error detection. This aligns with recent research demonstrating the advantages of machine learning in video coding, quality adaptation, and real-time anomaly detection [28], [29]. Key factors contributing to the enhanced performance:
‒ CNN + LSTM synergy: CNNs extract spatial features, while LSTMs capture temporal dependencies, allowing the model to detect complex patterns in metadata.
‒ Data augmentation: simulating different codec configurations, altering frame rates, and injecting noise increased the model's generalizability to diverse broadcasting scenarios.
‒ Regularization techniques: dropout and L2 regularization effectively mitigated overfitting, improving the model's resilience to unseen data.
During live broadcasts at TV Laayoune, the model successfully detected codec incompatibilities and synchronization errors in video clips, preventing transmission disruptions. Its integration into the Origo server pipeline allowed for real-time error detection, reducing manual quality checks and enhancing operational efficiency.
Despite promising results, further improvements are necessary to ensure long-term scalability. Future work will focus on expanding the dataset to cover rare codec configurations, optimizing real-time processing, and incorporating visual/audio content analysis to address errors beyond metadata-based detection.
3.4. Discussion
The study demonstrates the effectiveness of integrating autoencoders, attention mechanisms, and a hybrid CNN-LSTM model for video codec error detection. The proposed approach consistently delivers high accuracy, reducing live broadcast disruptions by identifying incompatible video clips. Automating error detection streamlines quality control, minimizing manual inspections and conserving resources. The model integrates seamlessly with the Origo broadcast server, enabling real-time detection without adding significant computational load. While the results are promising, further improvements are needed. Expanding the dataset to include diverse codec configurations and generating synthetic data for rare errors will enhance the model's adaptability. Optimizing the system for larger video streams and lower latency is essential for real-time broadcasting. Future enhancements may also incorporate content-aware analysis to detect visual anomalies beyond metadata inconsistencies, strengthening the model's value for TV Laayoune and the broader SNRT network.
4. CONCLUSION
This paper introduces an enhanced machine learning-driven approach to improve television broadcasting reliability by proactively detecting video codec errors. Addressing recurring issues with incompatible codecs, the proposed model targets disruptions during live broadcasts on TV Laayoune. By integrating CNNs, LSTM networks, autoencoders, and attention mechanisms, the system effectively identifies potential codec anomalies and alerts operators before errors disrupt live broadcasts. Using a diverse dataset, metadata was extracted through FFmpeg, and the model was trained to prevent overfitting and maximize precision. The combination of autoencoder-based anomaly detection with convolutional and recurrent layers enables the capture of spatial, temporal, and latent patterns, enhancing the model's robustness and accuracy.
Experimental results demonstrate an accuracy of 97%, a notable improvement over traditional heuristic-based (85.3%) and baseline machine learning (89.5%) methods. The system integrates smoothly into the existing broadcasting pipeline, enabling real-time detection and automated alerts, ultimately boosting operational efficiency. Key advantages include improved detection rates through the hybrid CNN-LSTM architecture, enhanced anomaly detection with autoencoders, and reduced manual oversight by automating error identification. The seamless integration of the model into TV Laayoune's workflow minimizes disruptions and ensures reliable video playback. Tests conducted across various video clips underscore the model's capacity to prevent broadcast errors, reinforcing its potential as a critical tool for ensuring smooth and uninterrupted television transmission.
In conclusion, the proposed machine learning model, enriched by autoencoders and attention mechanisms, provides a scalable solution for video codec error detection, significantly enhancing television broadcasting reliability. The substantial improvements in accuracy and operational efficiency highlight the model's relevance in addressing complex broadcasting challenges, paving the way for broader applications across SNRT's channels.
ACKNOWLEDGMENTS
We express our sincere gratitude to Ibn Tofail University for their unwavering support and encouragement throughout this project. Our heartfelt thanks also go to TV Laayoune for providing invaluable resources and fostering a collaborative environment, which played a crucial role in the development and implementation of our machine learning model for detecting video codec errors.
FUNDING INFORMATION
The authors state no funding involved.
AUTHOR CONTRIBUTIONS STATEMENT
This journal uses the Contributor Roles Taxonomy (CRediT) to recognize individual author contributions, reduce authorship disputes, and facilitate collaboration.
Name of Author C M So Va Fo I R D O E Vi Su P Fu
Khalid El Fayq ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Said Tkatek ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Lahcen Idouglid ✓ ✓ ✓ ✓ ✓