IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 14, No. 2, April 2025, pp. 1441~1449
ISSN: 2252-8938, DOI: 10.11591/ijai.v14.i2.pp1441-1449

Journal homepage: http://ijai.iaescore.com
Event detection in soccer matches through audio classification using transfer learning

Bijal Utsav Gadhia1, Shahid S. Modasiya2
1Department of Computer Engineering, Gujarat Technological University, Ahmedabad, India
2Department of Electronics and Communication, Government Engineering College, Gandhinagar, India
Article Info

Article history:
Received Mar 26, 2024
Revised Oct 31, 2024
Accepted Nov 14, 2024

ABSTRACT
Addressing the complexities of generating sports summaries through machine learning, our research aims to bridge the gap in audio-based event detection, particularly in soccer games. We introduce an extended ResNet-50 deep learning approach for soccer audio that uses transfer learning to pick out key moments from large soccer content archives. The proposed model classifies soccer audio segments into two categories: i) events, representing crucial in-game occurrences and ii) no events, denoting less impactful moments. The approach involves complete audio preprocessing, the implementation of the proposed model using transfer learning, and the classification of events. The model's reliability is validated on the soccer action dataset compilation (SADC), a dataset annotated by football fans. Comparative analysis with pre-trained models such as VGG19, DenseNet121, and EfficientNetB7 demonstrates the superior performance of the extended ResNet-50 based approach. Results across different epochs reveal consistently high accuracy, precision, recall, and F1-score, emphasizing the proposed model's effectiveness in event detection through audio classification. The paper concludes that the proposed model offers a robust solution for detecting events from soccer audio, helping fans, analysts, and content creators identify moments of interest in a soccer game with a low failure rate.
Keywords:
Audio classification
Deep learning
ResNet-50
Soccer summarization
Transfer learning
This is an open access article under the CC BY-SA license.
Corresponding Author:
Bijal Utsav Gadhia
Department of Computer Engineering, Gujarat Technological University
Ahmedabad, Gujarat, India
Email: bij.1988@gmail.com
1. INTRODUCTION
The expansion of multimedia content, including videos used for both entertainment and professional purposes, has experienced unprecedented growth in recent years [1], [2]. The commercial potential of automatic sports video summarization techniques has gathered significant attention, sparking interest in various approaches to this problem [3], [4]. Soccer, often referred to as the world's most popular sport, fascinates millions of fans worldwide with its thrilling matches and iconic moments [5]. In the age of digital media, the availability of vast archives of sports content, including videos and audio recordings, has created a treasure trove of information waiting to be explored [6]. Soccer summarization, a growing field at the intersection of sports analytics and artificial intelligence, aims to unlock the full potential of this rich multimedia data.
Khan and Pawar [7] review recent work on key frame-based and dynamic video summarization techniques, discussing challenges and future directions in the field of sports. Jadon and Jasim [8] addressed video summarization with an unsupervised learning paradigm, applying conventional vision-based algorithms for precise feature extraction from video frames. Although these methods, including key frame-based and dynamic techniques, have made strides, they often lack the ability to efficiently differentiate between significant and less impactful moments in soccer [7], [8].
Soccer summaries are essential because they can reduce hours of video into concise and informative highlights. These highlights are valuable not only for fans seeking to relive the most exciting moments but also for analysts, coaches, and players striving to gain deeper insights into team strategies and player performance.
Rongved et al. [9] introduced a 3D convolutional neural network (3D-CNN) algorithm for automated event detection in soccer videos. Pablos et al. [10] proposed a 3D-CNN based deep neural network addressing the challenge of unedited user-generated kendo content. Emon et al. [11] suggested a deep cricket summarization network (DCSN) that provides concise synopses of long cricket matches using a CNN and long short-term memory (LSTM) approach. The proposed system was evaluated on the new cricsum dataset using mean opinion score (MOS).
A few researchers have delved into audio processing to predict precise events in diverse domains. Sound plays a pivotal role in capturing attention and can proficiently discern saliency to extract important occurrences from video [12]–[15]. Sanabria et al. [12] devised an architectural framework that employs a multiple instance learning (MIL) approach to consider the sequential interdependence among events. Additionally, it incorporates a hierarchical multimodal attention layer with audio features, designed to discern the significance of each event within an action context. Evangelopoulos et al. [13] integrated audio features obtained through waveform modulation with visual features to identify saliency in movie video streams and concluded that multimodal saliency produces subjectively high-quality summaries. Vanderplaetse and Dupont [14] detailed an experimental investigation of the integration of audio and video information at various stages of deep neural network architectures. Ilse et al. [15] addressed MIL by formulating the problem as learning the Bernoulli distribution of bag labels using neural networks. They introduced an attention-based operator that provides insight into the contribution of each instance to the bag label.
Deep learning techniques intricately extract feature representations [16]–[18]. Sanabria et al. [16] relied solely on the energy of the audio signal, a feature that has proven beneficial in other contexts, to enhance classification in soccer games. Agyeman et al. [17] presented a deep learning approach for summarizing lengthy soccer videos, utilizing a 3D-CNN and an LSTM recurrent neural network (RNN). Ji et al. [18] proposed a deep learning framework for video summarization that uses GoogleNet with a BiLSTM to address the challenges of relation discovery and semantic loss by integrating encoder-decoder attention and a semantic preserving loss.
Recent work in this field [19]–[23] underscores the effectiveness of these techniques in discriminating a wide array of key events within the context of sports summarization. Rafiq et al. [19] worked on scene classification in cricket by applying transfer learning to an AlexNet CNN to prevent the model from overfitting. Deliege et al. [20] introduced a large-scale annotated dataset of 500 untrimmed soccer broadcast videos, which is used by many researchers for action spotting, camera shot segmentation, and replay grounding. Liu et al. [21] also used visual and audio data to conduct an analysis that involves unsupervised shot clustering and supervised audio classification to capture mid-level patterns. Raventós et al. [22] proposed a methodology that relies on segmenting the video sequence into shots and places particular emphasis on leveraging audio information to enhance the overall robustness of the summarization system. Shih [23] extensively surveyed content-aware techniques for analyzing and summarizing sports videos, covering a broad spectrum of sports, challenges, approaches, datasets, and evaluation metrics.
The above studies indicate that the use of audio features enhances the performance of event detection and event classification. This paper addresses a critical gap by incorporating audio classification into the summarization process. Our contribution extends to explicitly modeling no events, enabling the exclusion of irrelevant sections. This improvement fine-tunes the summarization process, leading to a more effective utilization of audio classification. Comprehensive methodology details are provided in the following section.
In this paper, we focus on soccer summarization through the exploration of a deep learning-based audio classification method. We employ our proposed model, based on an extended ResNet-50 and transfer learning, to analyze audio files from soccer matches and predict the seconds that encompass significant in-game events. Our approach categorizes audio segments into two classes: i) events, representing crucial and thrilling moments and ii) no events, indicating less impactful parts. The identified crucial and thrilling moments can subsequently be used to generate highlights. To ensure accuracy, we carefully compiled our own dataset, the soccer action dataset compilation (SADC), as described in section 2 (proposed method). We conduct a comparative analysis of our proposed approach with pre-trained deep learning models, including VGG19, DenseNet121, and EfficientNetB7, and present the results in section 3 (results and discussion).
2. PROPOSED METHOD
Our goal is to detect significant events within soccer audio. Specifically, we target audio segments containing elements such as enthusiastic crowd cheering or heightened pitch in commentators' voices, which often correspond to key occurrences, as suggested in [17]. Our proposed approach categorizes the input soccer game audio, second by second, into important and non-important parts. By organizing the significant segments sequentially, we can create highlights. To achieve this, our methodology is divided into two parts, namely dataset compilation and the event recognition framework. Dataset compilation explains how our own dataset, named SADC, was formed. The event recognition framework describes the technique used to predict and classify important moments in seconds. The identified moments can subsequently be arranged visually in sequence to create highlights, as proposed in [8].
2.1. Dataset compilation
As indicated in [20], an optimal dataset is required to explore innovative tasks and approaches in the domain of soccer summarization. SADC, a dataset we created on our own, comprises 25 football match videos downloaded from YouTube with a cumulative runtime of 34 hours, 33 minutes and 58 seconds (124,038 seconds). A group of five football fans carefully examined these videos. The start and end times of a variety of game-related events were carefully recorded in a .csv file, whose format is shown in Table 1. The recorded events include important occurrences such as goals, goal attempts, penalty kicks, free kicks, penalty corners, and yellow cards. Figure 1 illustrates the process of event recording by the football fans. A segment is marked as an “event” when the audience cheering reaches a certain volume during the match; otherwise, it is considered “no event”.
Figure 1. Process of recording events
Our focus is on identifying crucial occurrences in order to synthesize significant insights. We therefore divided the recorded cases into two separate categories: i) events, which represent important occurrences and ii) no events, which represent all other instances, as per Table 1. To enhance the stability of our model and ensure accurate prediction of all moments, we recorded all events within specific time frames, such as 40, 50, 60, and 90 seconds, in the format shown in Table 2. To facilitate the training of the model, the video files were converted into .mp3 audio files. These audio files were then made available alongside the generated .csv file to ensure a comprehensive training approach.
Table 1. Recorded event of SADC
Event name   Start time (sec.)   End time (sec.)   File name
Goal         0                   40                Match1.mp3
No event     41                  101               Match1.mp3
No event     102                 457               Match1.mp3
Penalty      458                 466               Match1.mp3
…            …                   …                 Match1.mp3
Free kick    6165                6214              Match1.mp3
Table 2. Processed event of SADC
Event name   Start time (sec.)   End time (sec.)   File name
Event        0                   40                Match1.mp3
No event     41                  125               Match1.mp3
No event     126                 185               Match1.mp3
…            …                   …                 Match1.mp3
Event        5561                5650              Match1.mp3
Event        5651                5740              Match1.mp3
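The collapse of named occurrences (Table 1) into the two training classes (Table 2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the column names, the in-memory CSV text, and the helpers `to_binary_label` and `load_binary_records` are hypothetical, and the re-segmentation into fixed time frames is not shown.

```python
import csv
import io

# Hypothetical CSV mirroring the layout of Table 1 (column names are assumed).
RECORDED_CSV = """event_name,start_sec,end_sec,file_name
Goal,0,40,Match1.mp3
No event,41,101,Match1.mp3
No event,102,457,Match1.mp3
Penalty,458,466,Match1.mp3
Free kick,6165,6214,Match1.mp3
"""


def to_binary_label(event_name: str) -> str:
    """Collapse any named occurrence into 'Event'; keep 'No event' unchanged."""
    return "No event" if event_name.strip().lower() == "no event" else "Event"


def load_binary_records(csv_text: str) -> list:
    """Parse the CSV text and attach a binary class label to every row."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        records.append(
            {
                "label": to_binary_label(row["event_name"]),
                "start_sec": int(row["start_sec"]),
                "end_sec": int(row["end_sec"]),
                "file_name": row["file_name"],
            }
        )
    return records


if __name__ == "__main__":
    for record in load_binary_records(RECORDED_CSV):
        print(record)
```

Keeping the original start and end times next to the binary label means the same records can later drive the slicing of the audio files.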
2.2. Event recognition framework
This section presents the systematic methodology employed to achieve accurate audio-based classification using the SADC dataset. The approach consists of a number of steps that produce the predicted class labels "event" and "no event" for the audio data provided. A range of libraries, including Librosa, Pandas, TensorFlow, Keras, and PIL, are imported to facilitate these tasks. The objective is to create a systematic approach for classifying important events from soccer audio. The process diagram of the proposed approach is illustrated in Figure 2.
Figure 2. Process diagram of event recognition framework and prediction
2.2.1. Input audio with soccer action dataset compilation dataset
In this step, the SADC dataset is loaded, forming the foundation for subsequent operations. The data is manipulated and organized, discrepancies are rectified, and parameter values are standardized for accurate analysis. All .mp3 audio files are then loaded using “AudioFileClip” from the “MoviePy” library, which provides the audio's duration in seconds, labeled as "duration," a significant parameter. For effective analysis, subsets of the dataset are extracted based on specific audio files and event types and are then given as input to the data preprocessing stage.
2.2.2. Data-preprocessing
Sound features rely on psychoacoustic sound properties like loudness, pitch, and timbre. Commonly used cepstral features include mel-frequency cepstral coefficients (MFCC) and their derivatives [24]. In the preprocessing stage, raw audio is transformed into visually insightful spectrogram images. The stage coordinates the extraction of audio segments corresponding to the predefined start and end times, thereby slicing the audio into meaningful fragments. These fragments are transformed into MFCC images, as shown in Figures 3 and 4, and stored with appropriate filenames in predefined directories according to their classification category, “event” or “no event”.
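The slicing and per-class storage layout described above can be sketched in plain Python. This is an illustration only: the function names, the directory names, and the file-naming scheme are assumptions, and the MFCC computation itself (for example with Librosa, which the framework imports) is deliberately left out.

```python
from pathlib import PurePosixPath


def slice_segment(samples, sample_rate, start_sec, end_sec):
    """Return the samples between start_sec and end_sec (end exclusive)."""
    if start_sec < 0 or end_sec < start_sec:
        raise ValueError("expected 0 <= start_sec <= end_sec")
    start_index = int(start_sec * sample_rate)
    end_index = int(end_sec * sample_rate)
    return samples[start_index:end_index]


def segment_image_path(root, label, file_name, start_sec, end_sec):
    """Build a hypothetical output path inside the directory of the class."""
    class_dir = "event" if label == "Event" else "no_event"
    stem = PurePosixPath(file_name).stem  # "Match1.mp3" -> "Match1"
    return PurePosixPath(root) / class_dir / f"{stem}_{start_sec}_{end_sec}.png"


if __name__ == "__main__":
    # Toy signal: 10 seconds sampled at 4 Hz, so 40 samples in total.
    toy_signal = list(range(40))
    fragment = slice_segment(toy_signal, sample_rate=4, start_sec=2, end_sec=5)
    print(len(fragment))
    print(segment_image_path("mfcc", "Event", "Match1.mp3", 458, 466))
```

Storing each image under a directory named after its class is what later allows labels to be inferred from the subdirectory structure.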
Figure 3. MFCC image for “event”
Figure 4. MFCC image for “no event”
2.2.3. Transfer learning with ResNet-50
Transfer learning methods, applied across various domains, utilize knowledge acquired from a source domain to address classification, regression, and clustering challenges in a different target domain [25]. This section focuses on applying transfer learning to the ResNet-50 model, as shown in Figure 5. First, images are read from a specified directory and assigned labels inferred from the subdirectory structure. The categorical label mode is chosen, and images are resized to 256×256, as implemented in [19]. The ResNet-50 model serves as the foundational backbone for the classification architecture, as depicted in Figure 5. Initially, all layers of the ResNet-50 are designated as non-trainable. The backbone is then extended with extra layers: global average pooling, dense layers with dropout for regularization, and a final dense layer with softmax activation for binary classification. The training routine compiles and trains the model for a predetermined number of epochs. The binary cross-entropy loss function is employed, and accuracy is monitored during training. Additionally, the training history is logged for subsequent analysis. The trained model is saved at a specified location for future deployment.
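The extended head and its loss can be written compactly. The following is a sketch under stated assumptions rather than the authors' exact formulation: $f_\theta$ denotes the frozen ResNet-50 feature extractor, a single hidden dense layer with ReLU activation and three input channels are assumed, the loss is averaged over the two output units, and the symbols $W_1, b_1, W_2, b_2$ are introduced only for illustration.

```latex
\begin{align}
h &= \mathrm{GAP}\bigl(f_\theta(x)\bigr), \qquad x \in \mathbb{R}^{256 \times 256 \times 3},\\
u &= \mathrm{Dropout}\bigl(\mathrm{ReLU}(W_1 h + b_1)\bigr),\\
p &= \mathrm{softmax}(W_2 u + b_2) \in \mathbb{R}^{2},\\
\mathcal{L}(y, p) &= -\frac{1}{2} \sum_{k=1}^{2} \bigl[\, y_k \log p_k + (1 - y_k) \log (1 - p_k) \,\bigr].
\end{align}
```

For a one-hot label $y$ with true class $c$, the softmax constraint $p_1 + p_2 = 1$ makes both summands equal to $-\log p_c$, so $\mathcal{L}(y, p) = -\log p_c$. Because $\theta$ is frozen, only $W_1, b_1, W_2, b_2$ are updated during training.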
Figure 5. Flowchart of extended ResNet-50
2.2.4. Event prediction from audio images
This section introduces two crucial processes: “preprocess_image” and “predict_file_events”. “preprocess_image” handles image files, processes them, and readies them for prediction. “predict_file_events” is responsible for the entire process of image preprocessing, event prediction, and result recording. The preprocessing step involves loading an audio image from the specified path, converting it into an array, and normalizing pixel values. The test audio is divided into 60-second intervals. For each segment, a Mel spectrogram image is created, as shown in Figure 4, and saved with an appropriate filename. After the preprocessing stage, the binary classification model trained with our extended ResNet-50 architecture is loaded. Image files from the specified location are loaded, predictions are made for each image, and the model's output determines the predicted class label. This information is then stored with the corresponding start and end times in the predictions list, as per Table 3. After completing this process, we compare the observed event with the predicted event. If they match, we classify the prediction outcome as a "match"; otherwise, it is classified as a "no match." Based on this comparison, we calculate the classification metrics.
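The division into 60-second intervals and the pixel normalization can be sketched as below. The helper names `fixed_windows` and `normalize_pixels` are hypothetical and are not the paper's “preprocess_image” or “predict_file_events”; scaling by 255 is an assumption, since the text only states that pixel values are normalized.

```python
def fixed_windows(duration_sec, window_sec=60):
    """Return inclusive (start, end) second pairs that cover the whole file."""
    windows = []
    start = 0
    while start < duration_sec:
        # The last window is shortened if the file does not fill it completely.
        end = min(start + window_sec, duration_sec) - 1
        windows.append((start, end))
        start += window_sec
    return windows


def normalize_pixels(pixels):
    """Scale 8-bit pixel values (0 to 255) into the range [0.0, 1.0]."""
    return [[value / 255.0 for value in row] for row in pixels]


if __name__ == "__main__":
    print(fixed_windows(240))
    print(normalize_pixels([[0, 255]]))
```

With inclusive end points, the windows line up with the start and end times used in the predictions list, for example 0 to 59 and 60 to 119.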
Table 3. Event prediction evaluation
Observed event   Predicted event   Start time (sec.)   End time (sec.)   Prediction outcome   Class label
Event            No event          0                   59                No match             FN
No event         Event             60                  119               No match             FP
No event         No event          120                 179               Match                TN
Event            Event             180                 239               Match                TP
…                …                 …                   …                 …                    …
Event            Event             5,400               5,459             Match                TP
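The comparison rule behind Table 3 can be expressed as a small function. This is a sketch, with "Event" assumed to be the positive class; the function names are illustrative and not taken from the paper.

```python
from collections import Counter


def prediction_outcome(observed, predicted, positive="Event"):
    """Return (outcome, confusion cell) for one interval, as in Table 3."""
    is_match = observed == predicted
    if is_match:
        cell = "TP" if predicted == positive else "TN"
    else:
        cell = "FP" if predicted == positive else "FN"
    return ("Match" if is_match else "No match", cell)


def tally_outcomes(pairs):
    """Count confusion-matrix cells over (observed, predicted) pairs."""
    return Counter(prediction_outcome(obs, pred)[1] for obs, pred in pairs)


if __name__ == "__main__":
    # The first four rows of Table 3.
    rows = [
        ("Event", "No event"),
        ("No event", "Event"),
        ("No event", "No event"),
        ("Event", "Event"),
    ]
    for observed, predicted in rows:
        print(observed, predicted, prediction_outcome(observed, predicted))
    print(tally_outcomes(rows))
```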
3. RESULTS AND DISCUSSION
The proposed methodology was applied to two distinct soccer test audio inputs, each taken from a 90-minute soccer game downloaded from YouTube, with four different epoch settings: 25, 30, 35, and 40. Both test audio inputs were classified into “event” and “no event” at 60-second intervals by two different football fans, who precisely recorded each event. The algorithm's predicted events were then compared with the observed events noted by the football fans, and the results were generated and analyzed for further evaluation, as illustrated in Table 3. A confusion matrix is created in classification to evaluate the performance of a model. It supports the assessment of a classification model through precision, accuracy, recall, and F1-score, which consider both correct and incorrect predictions, as proposed in [26]. The results were quantitatively evaluated for accuracy and compared with those obtained from pre-trained models such as VGG19, DenseNet121, and EfficientNetB7. Table 4 shows the accuracy comparison of our proposed approach with the other pre-trained models, and Table 5 displays precision, recall, and F1-score values for the different methods at epoch 40. Accuracy measures the overall correctness of the model, calculated as the ratio of correctly predicted events to the total events [20].
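The four metrics follow directly from the confusion-matrix counts. The sketch below uses the standard definitions; the counts in the usage example are hypothetical and are not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute accuracy, precision, recall, and F1-score from raw counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the formulas.
    for name, value in classification_metrics(tp=8, fp=2, tn=6, fn=4).items():
        print(f"{name}: {value:.4f}")
```

The guards against zero denominators matter for a model that never predicts "event", where precision would otherwise be undefined.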
Test audio 1 contains a total of 101 events, while test audio 2 comprises 105 events. Our experiments were conducted in the Google Colab environment. In our observations, the proposed model achieves an accuracy close to 80% after 40 epochs. Figures 6 and 7 show the accuracy measurements of both test audio files over different epochs. Increasing the number of epochs can potentially improve accuracy. The other pre-trained models also showed enhanced performance with more epochs but encountered memory limitations, often resulting in crashes. This was not the case with the extended ResNet-50, for which increasing the number of epochs led to higher accuracy, precision, recall, and F1-score with reasonable processing time.
Table 4. Accuracy comparison of proposed model vs. pre-trained models of test audio (epoch=40)
Test audio     Model            Accuracy (%)   Precision   Recall   F1-score
Test audio-1   EfficientNetB7   58.42          0.36        0.22     0.27
Test audio-1   VGG19            48.51          0.22        0.25     0.23
Test audio-1   DenseNet121      48.51          0.22        0.25     0.23
Test audio-1   Proposed model   79.21          0.79        0.45     0.58
Test audio-2   EfficientNetB7   65.35          0.31        0.31     0.31
Test audio-2   VGG19            69.52          0.36        0.35     0.35
Test audio-2   DenseNet121      40.59          0.24        0.62     0.34
Test audio-2   Proposed model   79.05          0.54        0.77     0.63
Table 5. Performance metrics at epoch 40 for test audio-1 and test audio-2
Test audio     Model            Precision   Recall   F1-score
Test audio-1   EfficientNetB7   0.36        0.22     0.27
Test audio-1   VGG19            0.22        0.25     0.23
Test audio-1   DenseNet121      0.22        0.25     0.23
Test audio-1   Proposed model   0.79        0.45     0.58
Test audio-2   EfficientNetB7   0.31        0.31     0.31
Test audio-2   VGG19            0.36        0.35     0.35
Test audio-2   DenseNet121      0.24        0.62     0.34
Test audio-2   Proposed model   0.54        0.77     0.63
Figure 6. Accuracy measure of test audio-1 across various epochs
Figure 7. Accuracy measure of test audio-2 across various epochs
Figures 8 and 9 also show that our proposed model achieves high precision. Figures 10 and 11 illustrate that, while maintaining high precision, the extended ResNet-50 achieves a reasonable level of recall at epoch 40 for both test audio files. This suggests that the model identifies a substantial portion of the actual events while keeping false positives low, which enhances the relevance of the detected events.
Overall, the general observation is that as the number of training epochs increases, there is a noticeable improvement in the performance metrics for all models. Among them, the extended ResNet-50 consistently stands out, securing the highest accuracy and maintaining well-balanced precision, recall, and F1-score. VGG19 and EfficientNetB7 demonstrate only modest improvement in precision and recall. DenseNet121, on the other hand, falls behind the other models in overall accuracy and precision.
Despite the promising results, our study is limited by the reliance on a manually annotated dataset and the constraints of computational resources available during testing. While our model effectively distinguishes between "event" and "no event," the diversity of soccer match scenarios and varying audio qualities could affect the generalizability of our results. Further testing on more diverse and larger datasets is needed to validate the broader applicability of our method.

Figure 8. Proposed model vs. other pre-trained models: test audio-1 precision across different epochs

Figure 9. Proposed model vs. other pre-trained models: test audio-2 precision across different epochs

Figure 10. Measures of performance parameter over epoch 40 of test audio-1

Figure 11. Measures of performance parameter over epoch 40 of test audio-2

4. CONCLUSION
This paper presents a novel approach to soccer audio classification using an extended ResNet-50 based deep learning model. The proposed methodology, validated with the precisely compiled SADC, demonstrated superior performance in accurately classifying significant in-game events. A comparative analysis was conducted between the proposed model and pre-trained models such as VGG19, DenseNet121, and EfficientNetB7. Among these, the proposed model emerged as the most effective in extracting relevant events from soccer audio while filtering out irrelevant ones. The results, evaluated across different epochs, highlight the model's stability and accuracy in distinguishing important from unimportant events within the given soccer audio input. In the broader context of sports analytics, the proposed model stands out as a promising solution for content creators, analysts, and fans seeking concise and informative soccer highlights. Looking ahead, this approach could be applied to other field games like cricket or hockey and enhanced by incorporating visuals to further improve accuracy.
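
As an illustration of how per-segment "event" / "no event" predictions might feed a highlight summary, the sketch below merges consecutive "event" segments into time intervals. It is not part of the proposed model: the function name, the fixed segment length of 5 seconds, and the sample labels are all hypothetical assumptions made for this example.

```python
def merge_event_segments(labels, segment_seconds=5.0, positive="event"):
    """Merge runs of consecutive positive segments into (start, end) times.

    `labels` holds one predicted label per fixed-length audio segment, in
    playback order. `segment_seconds` is an assumed segment length.
    """
    intervals = []
    start = None  # index of the first segment in the current run, if any
    for i, label in enumerate(labels):
        if label == positive and start is None:
            start = i  # a new run of event segments begins
        elif label != positive and start is not None:
            intervals.append((start * segment_seconds, i * segment_seconds))
            start = None  # the run ended at the previous segment
    if start is not None:
        # The audio ended while still inside an event run.
        intervals.append((start * segment_seconds,
                          len(labels) * segment_seconds))
    return intervals


# Illustrative predictions only (not output of the paper's model).
predicted = ["no event", "event", "event", "no event", "event"]
highlights = merge_event_segments(predicted, segment_seconds=5.0)
```

Each returned pair marks a candidate highlight clip; segments labelled "no event" are simply left out of the summary.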

BIOGRAPHIES OF AUTHORS

Bijal Utsav Gadhia is pursuing Ph.D. in computer engineering from Gujarat Technological University (State University), Gujarat, India. Currently, she is a faculty member at Government Engineering College, Gandhinagar (Government Employee), Gujarat, India and has served several governmental activities around the university and outside. Her research interests are the application of deep learning, machine learning, image processing, and data science. She has published various research papers in the field of image processing and deep learning. She can be contacted at email: bij.1988@gmail.com.

Dr. Shahid S. Modasiya is an Assistant Professor at the Department of Electronics and Communication Engineering at Government Engineering College, Gandhinagar under the affiliation of Gujarat Technological University. His research interest areas are image processing, artificial intelligence, RF and microwave and antenna design. He has also published two patents and various papers in the field of his research interest. He can be contacted at email: shahid@gecg28.ac.in.