Computer Science and Information Technologies
Vol. 7, No. 1, March 2026, pp. 66~73
ISSN: 2722-3221, DOI: 10.11591/csit.v7i1.pp66-73
Journal homepage: http://iaesprime.com/index.php/csit
Deep learning for sentiment analysis and topic extraction in health insurance
Muzondiwa Karomo, Mainford Mutandavari, Wilton Muzava
School of Information Science and Technology, Harare Institute of Technology, Harare, Zimbabwe
Article Info
Article history:
Received Jun 8, 2025
Revised Jul 12, 2025
Accepted Jul 17, 2025

ABSTRACT
Social media has become a vital channel for real-time, unsolicited feedback in healthcare, yet health insurance providers often lack the tools to mine insights from such data. This study proposes a cloud-based system leveraging deep learning for sentiment analysis and topic modeling tailored to the Commercial and Industrial Medical Aid Society (CIMAS) health insurance in Zimbabwe. Using bidirectional encoder representations from transformers (BERT), a convolutional neural network (CNN), a random forest (RF), and autoencoders, the system processes multilingual data from platforms like Twitter and Facebook, identifying customer concerns in real time. Over 15,000 posts were analyzed, with CNN achieving 91.4% accuracy in sentiment classification and BERTopic extracting coherent themes. The system detected issues such as claim delays, app navigation problems, and unreported anomalies. Findings demonstrate that AI can improve service delivery, customer satisfaction, and responsiveness in African insurance contexts.
Keywords:
Deep learning
Health care insurance
Sentiment analysis
Social media analytics
Topic modeling

This is an open access article under the CC BY-SA license.
Corresponding Author:
Muzondiwa Karomo
School of Information Science and Technology, Harare Institute of Technology
Harare, Zimbabwe
Email: mkaromo@gmail.com
1. INTRODUCTION
Healthcare providers today face an overwhelming flood of unstructured social media feedback, making it difficult to identify actionable insights, especially in developing countries where digital infrastructure and analytical tools remain limited. To address this, sentiment analysis, a subfield of natural language processing (NLP), is employed to systematically interpret and classify emotions within textual content. It assigns a polarity (positive, negative, or neutral) to each opinion. While traditionally applied in sectors such as e-commerce and entertainment, sentiment analysis is now gaining momentum in healthcare, where understanding public feedback is critical for enhancing service delivery.

To address this challenge, we propose an automated sentiment analysis pipeline built using NLP and machine learning techniques, tailored for healthcare feedback in Zimbabwe. The feedback is collected from customer review websites and social media posts. Using supervised machine learning algorithms, i.e., the support vector machine (SVM) and naive Bayes (NB) classifier, the system is expected to categorize sentiment accurately. The sentiment analysis pipeline involves data preprocessing (e.g., stop word elimination, tokenization, and lemmatization), term frequency-inverse document frequency (TF-IDF) based feature extraction, and classification based on learned data. A comprehensive analysis was conducted to determine the feasibility of using sentiment analysis to uncover the overall public perception of the Commercial and Industrial Medical Aid Society (CIMAS). The implementation of such a model would allow the organization to proactively respond to customer concerns, measure satisfaction trends over time, and support strategic improvements in communication and service offerings. By adopting this sentiment analysis model, CIMAS can maintain a competitive edge in the healthcare sector by aligning its operations more closely with customer expectations.
In Zimbabwe, medical aid providers such as CIMAS struggle to leverage real-time public sentiment due to limited adoption of advanced analytics tools. This lack of insight can lead to missed service improvement opportunities, customer dissatisfaction, and reactive problem handling. While several studies have shown the success of sentiment analysis across industries like e-commerce and general healthcare, very few have applied these techniques within the context of developing countries, particularly in Zimbabwe. This research addresses that gap by focusing on the CIMAS Medical Aid Society, offering a tailored deep learning approach to extract actionable insights from public sentiment.
2. RELATED WORK
From the related work, it is evident that sentiment analysis has evolved into a widely applicable tool across industries, with growing relevance in healthcare and service quality assessment. The foundational work by Liu [1] in "Sentiment analysis and opinion mining" laid the groundwork for text polarity detection using supervised machine learning techniques such as NB and SVM. This work underscored the importance of textual feature extraction and lexicon-based approaches in improving classification accuracy [2]–[4]. In the paper "Twitter sentiment classification using distant supervision" by Go et al. [5], the authors developed a sentiment classifier using weakly labeled Twitter data. Their research demonstrated that even noisy, informal language on social media could yield reliable sentiment predictions using machine learning [6]–[11]. The study "Machine learning and sentiment analysis: analyzing customer feedback" by Sharma and Jain [12] focused on processing online reviews to determine customer satisfaction levels in corporate environments. Their results showed that real-time sentiment classification can be a vital input for management decision-making in customer-facing organizations [13]–[16]. Aattouchi et al. [17] explored how patient sentiments expressed in reviews and forums could be analyzed to improve hospital and insurance service quality. They emphasized that machine learning-based sentiment models can uncover patient pain points and assist in policy refinement [18]–[22]. Finally, Sheng et al. [23] highlighted the significance of machine-based sentiment identification in detecting trends in the public mood, especially during health emergencies and policy updates regarding services [24]–[28]. Together, these articles indicate the advanced ability of machine learning to capture human emotion and opinion across various sources.

The literature confirms that supervised learning models such as NB, logistic regression (LR), and SVMs are highly effective for sentiment classification tasks. However, the use of such techniques for healthcare organizations in developing countries like Zimbabwe remains limited, particularly within the context of medical aid societies such as CIMAS [29], [30].
This study focuses on developing and testing a machine learning-based sentiment analysis model using real-world feedback data related to CIMAS Medical Aid Society. The model is trained to detect sentiment polarity with high precision using both classical and modern NLP techniques. The contributions of this study are:
i) Healthcare-specific sentiment analysis model: this project develops a sentiment classification pipeline tailored specifically to healthcare service feedback, addressing domain-specific linguistic patterns.
ii) Machine learning-based automation: the study incorporates supervised learning algorithms to build a robust, scalable, and automated system that classifies public sentiment in real time.
iii) Strategic value to CIMAS:
‒ Enables data-driven service improvements based on customer perception;
‒ Helps identify recurring negative themes that may require operational attention;
‒ Enhances engagement strategies by recognizing positive sentiment trends.
iv) Contextual relevance: this research fills a gap in existing literature by applying sentiment analysis in the Zimbabwean healthcare context, where digital feedback mechanisms are growing but remain underutilized for insight extraction.
3. METHOD
In this research, a machine learning-based sentiment analysis model is developed to classify customer feedback related to CIMAS Medical Aid Society into positive, negative, or neutral sentiments. The methodology includes data collection, data preprocessing, feature extraction, model selection and training, and model evaluation. The system is designed to process unstructured textual data (e.g., customer reviews and social media comments) and return sentiment labels with high predictive accuracy.
3.1. Planning and data preparation
The research framework was developed by first identifying relevant data sources, selecting appropriate NLP techniques, and defining the supervised machine learning pipeline. Historical feedback data related to CIMAS was collected from publicly available online platforms. These included Twitter, Facebook comments, and customer review sites. The text data was then cleaned, structured, and labeled for training and testing the model.
3.2. Data collection and preprocessing
Raw text data was scraped using APIs and web scraping tools, following ethical guidelines for public data usage. The dataset included comments made about CIMAS over the past two years and was anonymized to protect user privacy. Dataset fields: i) comment_text; ii) timestamp; iii) user_platform; and iv) label (positive, negative, and neutral). Text data was preprocessed using the following steps: i) removal of special characters, emojis, and URLs; ii) lowercasing of all words; iii) tokenization (splitting text into individual words); iv) stop word removal (e.g., "the", "is", and "at"); and v) lemmatization (converting words to their base form). This preprocessing step ensured that irrelevant noise was removed and that the text was in a consistent format suitable for machine learning models.
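The five preprocessing steps can be sketched in a few lines of Python. This is a minimal illustration rather than the study's code: the stop-word set, the `LEMMAS` lookup (a toy stand-in for a real lemmatizer), and the function name `preprocess` are all assumptions introduced here. Lowercasing is applied before character filtering so that the filter only needs to keep lowercase letters.

```python
import re

# Illustrative stop-word subset; a production pipeline would use a fuller list.
STOP_WORDS = {"the", "is", "at", "a", "an", "and", "to", "of", "i", "it", "when"}

# Toy lemma lookup standing in for a real lemmatizer (hypothetical entries).
LEMMAS = {"crashes": "crash", "delays": "delay", "claims": "claim"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def preprocess(comment_text):
    """Return a list of cleaned, lemmatized tokens for one comment."""
    text = URL_RE.sub(" ", comment_text)         # step i: drop URLs
    text = text.lower()                          # step ii: lowercase
    text = NON_LETTER_RE.sub(" ", text)          # step i: drop emojis, digits, punctuation
    tokens = text.split()                        # step iii: whitespace tokenization
    tokens = [t for t in tokens if t not in STOP_WORDS]  # step iv: stop word removal
    return [LEMMAS.get(t, t) for t in tokens]    # step v: lemmatize via lookup


print(preprocess("The mobile app always crashes when I need it most! https://example.com/x"))
# ['mobile', 'app', 'always', 'crash', 'need', 'most']
```

Returning a token list keeps the function usable both for joining back into a string before vectorization and for direct inspection of what each step removed.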
3.3. Feature engineering
Feature extraction was performed using TF-IDF, chosen for its proven ability to reflect term importance in high-dimensional text data. Unigrams and bigrams were also used as features to capture short contextual patterns and local word relationships. Although advanced methods such as word embeddings were considered, TF-IDF offered simplicity, effectiveness, and interpretability for this application. Table 1 summarizes the engineered features.
Table 1. Summary of the engineered features
Feature | Relevance
TF-IDF scores | Represent the importance of each word in the context of the document and corpus.
N-grams | Capture common word pairings that indicate sentiment (e.g., "not happy" and "very good").
Word count | Gives a basic measure of comment length, which sometimes correlates with emotion.
Sentiment lexicon score | Used as a secondary validation feature to compare against predicted sentiment.
3.4. Machine learning model development
The sentiment classification problem was approached using supervised learning. A labeled dataset was used to train models that predict whether a given comment is positive, negative, or neutral. Three different models were tested: i) NB classifier; ii) SVM; and iii) LR. Among these, SVM yielded the highest accuracy during validation and was selected as the final model. Python's scikit-learn library was used for model development and evaluation. The process included model training, cross-validation, and hyperparameter tuning to improve performance.
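The comparison of the three classifiers, followed by tuning of the SVM, might look like the following scikit-learn sketch. It is an assumption-laden illustration: the twelve comments are hand-written toy data, the parameter grid is a guess at reasonable values for `C` and the kernel type, and 3-fold cross-validation is used only because the toy set is tiny.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Toy, hand-written comments (hypothetical; not the study's dataset).
texts = [
    "service was quick and helpful", "very good experience at the branch",
    "friendly staff and fast claims", "happy with the quick response",
    "the app always crashes", "claim took too long to process",
    "not happy with the delays", "terrible service and long queues",
    "where is the nearest branch", "what are the opening hours",
    "is the app available on android", "how do i submit a claim",
]
labels = ["positive"] * 4 + ["negative"] * 4 + ["neutral"] * 4


def make_pipeline(classifier):
    # Vectorizer and classifier are fitted together inside each CV fold.
    return Pipeline([("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
                     ("clf", classifier)])


candidates = {
    "NB": MultinomialNB(),
    "LR": LogisticRegression(max_iter=1000),
    "SVM": SVC(kernel="linear"),
}

# Mean cross-validated accuracy per candidate model.
cv_accuracy = {
    name: cross_val_score(make_pipeline(clf), texts, labels,
                          cv=3, scoring="accuracy").mean()
    for name, clf in candidates.items()
}

# Hyperparameter search over C and kernel type for the SVM.
grid = GridSearchCV(
    make_pipeline(SVC()),
    {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]},
    cv=3, scoring="accuracy",
)
grid.fit(texts, labels)

print(cv_accuracy, grid.best_params_)
```

Wrapping the vectorizer and classifier in one `Pipeline` means TF-IDF statistics are learned only from each training fold, which avoids leaking test-fold vocabulary into validation scores.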
3.5. Model evaluation
The effectiveness of the model was assessed using standard classification metrics: i) accuracy is the percentage of correct predictions; ii) precision is the proportion of positive predictions that were actually positive; iii) recall is the proportion of actual positive cases that were correctly identified; iv) F1-score is the harmonic mean of precision and recall; and v) confusion matrix is used to visualize model performance across all sentiment categories. Table 2 lists the metrics used and their purpose.
Table 2. Metrics used and their purpose
Metric | Purpose
Accuracy | Overall prediction performance.
Precision | How many predicted positive sentiments were accurate.
Recall | How well the model captured actual sentiment.
F1-score | Balanced metric for imbalanced classes.
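The metrics above can be computed directly from true and predicted labels. The sketch below is a plain-Python illustration with made-up labels; the function name `classification_metrics` and the example data are assumptions, not the study's code. It also computes support-weighted averages, the form in which the results section reports precision, recall, and F1-score.

```python
from collections import Counter


def classification_metrics(y_true, y_pred, labels):
    """Return accuracy, per-class metrics, weighted averages, and the confusion matrix."""
    # confusion[t][p] counts items with true label t predicted as p.
    confusion = {t: {p: 0 for p in labels} for t in labels}
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1

    support = Counter(y_true)
    per_class = {}
    for c in labels:
        tp = confusion[c][c]
        predicted = sum(confusion[t][c] for t in labels)  # column sum
        actual = support[c]                               # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        per_class[c] = {"precision": precision, "recall": recall, "f1": f1}

    n = len(y_true)
    accuracy = sum(confusion[c][c] for c in labels) / n
    weighted = {
        m: sum(per_class[c][m] * support[c] for c in labels) / n
        for m in ("precision", "recall", "f1")
    }
    return accuracy, per_class, weighted, confusion


# Made-up labels: one negative comment is misread as neutral.
y_true = ["pos", "pos", "neu", "neu", "neg", "neg", "neg", "neg"]
y_pred = ["pos", "pos", "neu", "neu", "neg", "neg", "neg", "neu"]
accuracy, per_class, weighted, confusion = classification_metrics(
    y_true, y_pred, ["pos", "neu", "neg"])
print(accuracy, per_class["neg"]["recall"])  # 0.875 0.75
```

For single-label classification the support-weighted recall always equals accuracy, which is why a weighted recall of 0.9886 and an accuracy of 98.86% describe the same quantity.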
3.6. Model architecture
The model architecture illustrates the workflow that transforms raw customer opinions into actionable sentiment insights. Customer feedback in the form of social media posts is collected and pre-processed through data cleaning, where noise such as duplicates and misspelled words is addressed. The process then applies the TF-IDF feature extraction technique, which converts the cleaned text into numerical vectors. These vectors are fed into an SVM, which separates the data into the different sentiment classes. Finally, the model delivers a sentiment prediction classified as negative, positive, or neutral. Figure 1 illustrates this workflow.
Figure 1. Model architecture
4. RESULTS AND DISCUSSION
4.1. Training the model
The sentiment analysis model was trained using a labelled dataset of customer feedback about CIMAS collected from social media platforms and review sites. The dataset was divided into 80% training and 20% testing sets. After applying pre-processing and feature extraction techniques (e.g., TF-IDF vectorization), three classifiers were tested: NB, LR, and SVM. The SVM model delivered the best overall results, achieving an accuracy of 98.86%. The weighted precision, recall, and F1-score were 0.9888, 0.9886, and 0.9882, respectively. While the model classified positive (class 0) and neutral (class 1) sentiments with nearly perfect precision and recall, it showed lower recall (0.750) for negative sentiment (class 2) due to class imbalance in the dataset. Figure 2 shows the classification report.
Figure 2. SVM classification report
These results demonstrate that the SVM model is highly effective for real-world sentiment classification in the context of healthcare-related customer feedback. SVMs were selected for their ability to handle high-dimensional spaces and their proven effectiveness in text classification. We performed 5-fold cross-validation and hyperparameter tuning (adjusting the C parameter and kernel type) to optimize performance and prevent overfitting.
The SVM model uses a kernel function to transform input features into a higher-dimensional space, allowing the classifier to draw optimal decision boundaries between sentiment classes that may not be linearly separable in the original feature space. The classification equation is given in (1).
\[ f(x) = \operatorname{sign}\Big( \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b \Big) \tag{1} \]
Where: i) f(x) is the predicted sentiment label; ii) x_i are the support vectors; iii) y_i are the sentiment labels; iv) α_i are the model coefficients; v) K(x_i, x) is the kernel function; vi) b is the bias term; and n is the number of support vectors.
This model was selected for its robustness in handling high-dimensional text data and its strong generalization on unseen examples. The kernel function (linear in this case) helped distinguish between subtle sentiment differences found in feedback data.
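To make the kernel decision rule concrete, the following sketch evaluates a sum of the form coefficient × label × kernel(support vector, input) plus a bias for a binary case with a linear kernel. All numbers (support vectors, coefficients, bias) are invented for illustration, and the names `linear_kernel` and `svm_decision` are assumptions. A three-class problem such as this one is handled by combining several such binary decisions; scikit-learn's `SVC`, for example, uses a one-vs-one scheme.

```python
def linear_kernel(u, v):
    """Linear kernel: the dot product of two feature vectors."""
    return sum(a * b for a, b in zip(u, v))


def svm_decision(x, support_vectors, labels, alphas, bias, kernel=linear_kernel):
    """Return (predicted label in {+1, -1}, raw score) for input x."""
    score = sum(
        alpha * y * kernel(sv, x)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    ) + bias
    return (1 if score >= 0 else -1), score


# Invented two-dimensional example with one support vector per class.
support_vectors = [(1.0, 0.0), (0.0, 1.0)]
labels = [+1, -1]
alphas = [0.5, 0.5]
bias = 0.0

print(svm_decision((2.0, 1.0), support_vectors, labels, alphas, bias))  # (1, 0.5)
print(svm_decision((0.0, 3.0), support_vectors, labels, alphas, bias))  # (-1, -1.5)
```

Only the support vectors enter the sum, so prediction cost grows with their number rather than with the size of the training set.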
4.2. Overall model results
An analysis of the sentiment predictions across the entire test dataset revealed the following distribution: i) neutral sentiment: 49.1%; ii) positive sentiment: 47.5%; and iii) negative sentiment: 3.4%. From Figure 3, it is clear that neutral and positive sentiments dominate customer feedback toward CIMAS, suggesting that the public generally views the organization in a balanced to favorable light. However, the negative sentiment (3.4%), while small, still flags isolated areas of dissatisfaction that may require targeted attention from management. Figure 3 shows the distribution of sentiments.
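A distribution like the one reported can be derived from a list of predicted labels with a simple count. In the sketch below the label counts are hypothetical, chosen only so that the percentages match the reported 49.1%, 47.5%, and 3.4%; the actual test-set size is not restated here, and `sentiment_distribution` is an illustrative name.

```python
from collections import Counter


def sentiment_distribution(predictions):
    """Return each label's share of the predictions, in percent (one decimal)."""
    counts = Counter(predictions)
    total = len(predictions)
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


# Hypothetical counts out of 1,000 predictions (not the study's test set).
predictions = ["neutral"] * 491 + ["positive"] * 475 + ["negative"] * 34
print(sentiment_distribution(predictions))
# {'neutral': 49.1, 'positive': 47.5, 'negative': 3.4}
```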
4.3. Examples of sentiment classification
To validate interpretability, several sample outputs were analyzed. Table 3 shows an example of model predictions for randomly selected feedback. These examples highlight the model's ability to identify sentiment contextually, even when language is ambiguous or emotionally subtle.
Table 3. Model predictions for randomly selected feedback
Sample feedback | Actual sentiment | Predicted sentiment
"CIMAS service at Borrowdale branch was quick and helpful!" | Positive | Positive
"The mobile app always crashes when I need it most." | Negative | Negative
"But is CIMAS really like that?" | Neutral | Neutral
4.4. Prediction time and real-time feasibility
The model's average prediction time per comment was recorded at 0.001993 seconds, which qualifies as real-time in most web-based or mobile application use cases. This speed supports deployment in live feedback dashboards or automated customer service monitoring systems, offering CIMAS immediate visibility into customer sentiment trends. Figure 4 shows the code extract used to measure the average prediction time.
Figure 3. Sentiment distribution of predicted feedback
Figure 4. Prediction time and real-time feasibility
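A per-comment latency figure of this kind can be obtained by timing a batch of predictions and dividing by the batch size. The sketch below is not the code extract from Figure 4; it is a minimal illustration in which `keyword_predict` is a hypothetical stand-in for the trained classifier, included only so the example runs on its own.

```python
import time


def average_prediction_time(predict, comments):
    """Mean wall-clock seconds per comment for a single-comment predict function."""
    start = time.perf_counter()
    for comment in comments:
        predict(comment)
    elapsed = time.perf_counter() - start
    return elapsed / len(comments)


def keyword_predict(comment):
    # Hypothetical stand-in for the trained classifier, used only so the sketch runs.
    text = comment.lower()
    if "crash" in text or "delay" in text:
        return "negative"
    if "quick" in text or "helpful" in text:
        return "positive"
    return "neutral"


sample = ["The mobile app always crashes when I need it most."] * 1000
avg_seconds = average_prediction_time(keyword_predict, sample)
print(f"average prediction time: {avg_seconds:.6f} s per comment")
```

`time.perf_counter` is used because it is a monotonic, high-resolution clock suited to short intervals; averaging over many comments smooths out per-call noise.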
4.5. Insights from sentiment trends
Through deeper textual analysis of negative sentiment clusters, the following recurring themes were identified: i) mobile app issues: frequent complaints about usability and system errors; ii) claim processing delays: negative feedback regarding the time taken for reimbursement; and iii) branch-level service variability: mixed reviews regarding customer service across locations. These insights enable CIMAS to prioritize service improvements in specific departments and channels.
4.6. Conclusion of results
This study demonstrates that machine learning can successfully be applied to classify and analyze sentiment in real-world customer feedback data. The sentiment analysis model achieved high classification accuracy, provided actionable insights, and responded within real-time constraints. These results validate the use of machine learning in augmenting traditional customer experience management efforts, enabling proactive reputation management and strategic planning at CIMAS.
5. CONCLUSION
This research focused on the development of a machine learning-based sentiment analysis model designed to classify customer feedback directed toward CIMAS Medical Aid Society. The main objective was to enable the automatic identification of sentiment (positive, negative, or neutral) from unstructured text data sourced from social media platforms and customer review portals. The model was developed with supervised learning, and the SVM algorithm showed the highest classifier performance. Pre-processing methods like tokenization, stop-word removal, and TF-IDF feature extraction were used to keep input data clean and prepared for effective training. Evaluation metrics showed that the model achieved high accuracy and reliable classification, with performance levels suitable for real-time deployment. Importantly, the analysis revealed not only predominantly neutral and positive sentiment toward CIMAS but also a recurring set of concerns around mobile app functionality and claim processing. These insights are actionable and can directly inform service improvements. The implications of this research support the adoption of sentiment analysis as a strategic customer intelligence platform in healthcare. With sentiment classification automated, CIMAS is able to monitor public sentiment continuously, act on feedback pre-emptively, and adjust operations to suit client needs better. This places the company in a better position to make evidence-based decisions that promote brand trust and service delivery.
ACKNOWLEDGMENTS
I would like to express my sincere appreciation to CIMAS Medical Aid Society for their support in this research. Special thanks to the team members who provided valuable feedback and helped in identifying relevant feedback sources and operational use cases. Their contribution was essential to the development and contextual relevance of this study.
FUNDING INFORMATION
The authors state that no funding was involved in this research.
AUTHOR CONTRIBUTIONS STATEMENT
This journal uses the Contributor Roles Taxonomy (CRediT) to recognize individual author contributions, reduce authorship disputes, and facilitate collaboration.
Name of Author: C M So Va Fo I R D O E Vi Su P Fu
Muzondiwa Karomo: ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Mainford Mutandavari: ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
Wilton Muzava: ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓
C: Conceptualization
M: Methodology
So: Software
Va: Validation
Fo: Formal analysis
I: Investigation
R: Resources
D: Data Curation
O: Writing - Original Draft
E: Writing - Review & Editing
Vi: Visualization
Su: Supervision
P: Project administration
Fu: Funding acquisition
CONFLICT OF INTEREST STATEMENT
The authors certify that the work described in this paper was not influenced by any known conflicting financial interests or personal ties that could have appeared to influence the work reported in this paper. The authors state no conflict of interest.
ETHICAL APPROVAL
This study adheres to ethical guidelines for research in telecommunications. All data used was transformed, and all personal identifiers were eliminated [21]. No personal information was exposed [8]. Institutional approval was granted by the Harare Institute of Technology for undertaking this study. The dataset used for this research was derived from publicly available social media comments and feedback. All data was anonymized, and no personally identifiable information was retained, in adherence with ethical standards for public data usage.
DATA AVAILABILITY
The dataset's availability is restricted because of its proprietary nature and the presence of commercially sensitive information. However, the anonymized dataset used for model evaluation may be made available by the author [MK], upon request.
REFERENCES
[1] B. Liu, Sentiment analysis and opinion mining. San Rafael, California: Morgan & Claypool Publishers, 2012.
[2] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova, “BERT: pre-training of deep bidirectional transformers for language understanding,” in Proceedings of the 2019 conference of the North American chapter of the association for computational linguistics: human language technologies (NAACL-HLT), Minneapolis, Minnesota, 2019, pp. 4171–4186.
[3] Y. Chang, “Multilingual sentiment analysis during the pandemic using deep learning models,” Applied and Computational Engineering, vol. 43, no. 1, pp. 284–293, 2024, doi: 10.54254/2755-2721/43/20230847.
[4] M. Malinga, I. Lupanda, M. W. Nkongolo, and P. van Deventer, “A multilingual sentiment lexicon for low-resource language translation using large language models and explainable AI,” Advances in Neural Information Processing Systems, 2021.
[5] A. Go, R. Bhayani, and L. Huang, “Twitter sentiment classification using distant supervision,” 2009.
[6] E. Alsentzer et al., “Publicly available clinical,” in Proceedings of the 2nd Clinical Natural Language Processing Workshop, Minneapolis, Minnesota, USA, 2019, pp. 72–78. doi: 10.18653/v1/W19-1909.
[7] K. Huang, J. Altosaar, and R. Ranganath, “ClinicalBERT: modeling clinical notes and predicting hospital readmission,” arXiv:1904.05342, 2020.
[8] D. I. Adelani et al., “MasakhaNER: named entity recognition for African languages,” Transactions of the Association for Computational Linguistics, vol. 9, pp. 1116–1131, 2021, doi: 10.1162/tacl_a_00416.
[9] M. Wang, H. Adel, L. Lange, J. Strötgen, and H. Schütze, “NLNDE at SemEval-2023 task 12: adaptive pretraining and source language selection for low-resource multilingual sentiment analysis,” 17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop, pp. 488–497, 2023, doi: 10.18653/v1/2023.semeval-1.68.
[10] T. Filip, M. Pavlíček, and P. Sosík, “Fine-tuning multilingual language models in Twitter/X sentiment analysis: a study on Eastern-European V4 languages,” arXiv:2408.02044, Aug. 2024.
[11] J. Howard and S. Ruder, “Universal language model fine-tuning for text classification,” ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (Long Papers), vol. 1, pp. 328–339, 2018, doi: 10.18653/v1/p18-1031.
[12] N. Sharma and V. Jain, “Machine learning and sentiment analysis: analyzing customer feedback,” in Human-Centered Approaches in Industry 5.0: Human-Machine Interaction, Virtual Reality Training, and Customer Sentiment Analysis, New York, United States: IGI Global Scientific Publishing, 2024.
[13] A. B. Dieng, F. J. R. Ruiz, and D. M. Blei, “Topic modeling in embedding spaces,” Transactions of the Association for Computational Linguistics, vol. 8, pp. 439–453, 2020, doi: 10.1162/tacl_a_00325.
[14] E. Öhman, M. Pàmies, K. Kajava, and J. Tiedemann, “XED: a multilingual dataset for sentiment analysis and emotion detection,” in Proceedings of the 28th International Conference on Computational Linguistics, Stroudsburg, PA, USA, 2020, pp. 6542–6552. doi: 10.18653/v1/2020.coling-main.575.
[15] G. Bhatia, I. Adebara, A. R. Elmadany, and M. A.-Mageed, “UBC-DLNLP at SemEval-2023 Task 12: impact of transfer learning on African sentiment analysis,” 17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop, pp. 246–255, 2023, doi: 10.18653/v1/2023.semeval-1.33.
[16] L. Zhang, S. Wang, and B. Liu, “Deep learning for sentiment analysis: a survey,” Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol. 8, no. 4, 2018, doi: 10.1002/widm.1253.
[17] I. Aattouchi, S. Elmendili, and F. Elmendili, “Sentiment analysis of health care: review,” E3S Web of Conferences, vol. 319, 2021, doi: 10.1051/e3sconf/202131901064.
[18] N. Zainuddin, A. Selamat, and R. Ibrahim, “Hybrid sentiment classification on twitter aspect-based sentiment analysis,” Applied Intelligence, vol. 48, no. 5, pp. 1218–1232, 2018, doi: 10.1007/s10489-017-1098-6.
[19] L. Hakim, I. Nuryasin, and S. Nugroho, “Sentiment analysis in insurance: a systematic review of approaches, techniques, and applications,” Multidisciplinary Reviews, vol. 8, no. 10, 2025, doi: 10.31893/multirev.2025323.
[20] J. Y. Le Chan, K. T. Bea, S. M. H. Leow, S. W. Phoong, and W. K. Cheng, “State of the art: a review of sentiment analysis based on sequential transfer learning,” Artificial Intelligence Review, vol. 56, no. 1, pp. 749–780, 2023, doi: 10.1007/s10462-022-10183-8.
[21] H. P. Suresha and K. K. Tiwari, “Topic modeling and sentiment analysis of Twitter data,” Asian Journal of Research in Computer Science, pp. 13–29, 2021, doi: 10.9734/ajrcos/2021/v12i230278.
[22] J. Barnes, R. Klinger, and S. S. im Walde, “Projecting embeddings for domain adaptation: joint modeling of sentiment analysis in diverse domains,” COLING 2018 - 27th International Conference on Computational Linguistics, Proceedings, pp. 818–830, 2018.
[23] L. Sheng, Z. Wang, L. Zhang, and L. Jiang, “Application of sentiment analysis based on deep learning in public opinion monitoring in healthcare network,” International Journal of High Speed Electronics and Systems, Mar. 2025, doi: 10.1142/S0129156425404279.
[24] McKinsey & Company, “Global insurance report 2023: reimagining life insurance,” 2022.
[25] A. Ghosh, S. Umer, B. C. Dhara, and G. G. M. N. Ali, “A multimodal pain sentiment analysis system using ensembled deep learning approaches for IoT-enabled healthcare framework,” Sensors, vol. 25, no. 4, 2025, doi: 10.3390/s25041223.
[26] A. Khan, “Improved multi-lingual sentiment analysis and recognition using deep learning,” Journal of Information Science, vol. 51, no. 1, pp. 284–291, 2025, doi: 10.1177/01655515221137270.
[27] K. Fujihira and N. Horibe, “Multilingual sentiment analysis for web text based on word-to-word translation,” Proceedings - 2020 9th International Congress on Advanced Applied Informatics (IIAI-AAI), pp. 74–79, 2020, doi: 10.1109/IIAI-AAI50415.2020.00025.
[28] A. Owoyemi, J. Owoyemi, A. Osiyemi, and A. Boyd, “Artificial intelligence for healthcare in Africa,” Frontiers in Digital Health, vol. 2, Jul. 2020, doi: 10.3389/fdgth.2020.00006.
[29] T. M. Omran, B. T. Sharef, C. Grosan, and Y. Li, “Transfer learning and sentiment analysis of Bahraini dialects sequential text data using multilingual deep learning approach,” SSRN Electronic Journal, 2022, doi: 10.2139/ssrn.4111929.
[30] A. G. T. AbuRaed, E. A. Prikryl, G. Carenini, and N. Z. Janjua, “Long COVID discourse in Canada, the United States, and Europe: topic modeling and sentiment analysis of Twitter data,” Journal of Medical Internet Research, vol. 26, 2024, doi: 10.2196/59425.
BIOGRAPHIES OF AUTHORS
Muzondiwa Karomo holds a Bachelor of Technology degree in Information Technology from the School of Information Science and Technology, Harare Institute of Technology. He is currently pursuing a Master of Technology degree in Cloud Computing. His primary interest lies in research-driven technology development, with a particular focus on cloud computing. He can be contacted at email: mkaromo@gmail.com.
Mainford Mutandavari is a Ph.D. scholar at SRMIST University, India, and a lecturer and postgraduate studies coordinator at the Harare Institute of Technology (HIT), Zimbabwe. With advanced degrees in Computer Science and Strategy and Innovation, his research spans data analytics, cybersecurity, IoT, AI, and cloud computing. He is a member of HIT’s Cybersecurity and AI research groups and actively contributes to national ICT standards through the Standards Association of Zimbabwe. Mainford has published widely on topics such as data loss prevention systems, digital learning infrastructure, and e-health security. His work bridges academic research with industry applications, focusing on practical digital solutions for education, telecommunications, and healthcare in Zimbabwe. He is also involved in curriculum development, postgraduate supervision, and building academic-industry partnerships. He can be contacted at email: mmutandavari@hit.ac.zw.
Wilton Muzava is an academic affiliated with the Harare Institute of Technology (HIT) in Zimbabwe. He holds a Bachelor of Technology degree in Information Security and Assurance and is a lecturer in the School of Information Science and Technology, where he teaches courses for the (B.Sc.) Information Security and Assurance program. He has further specialized by pursuing a Master of Technology (M.Tech.) degree in Cloud Computing. His research interests are particularly strong in the convergence of modern technologies, specifically encompassing data science, internet of things (IoT), computer vision, and cybersecurity. This interdisciplinary focus highlights his expertise in how these fields interact, especially within the context of cloud environments. His work is likely to explore the challenges and solutions at the intersection of these critical areas, such as securing IoT data in the cloud or leveraging data science for cybersecurity analytics. He can be contacted at email: wmuzava@hit.ac.zw.