IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 14, No. 6, December 2025, pp. 5201~5217
ISSN: 2252-8938, DOI: 10.11591/ijai.v14.i6.pp5201-5217
Journal homepage: http://ijai.iaescore.com
Arabic text classification using machine learning and deep learning algorithms

Rawad Awad Alqahtani¹, Hoda A. Abdelhafez¹,²
¹Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
²Faculty of Computer and Informatics, Suez Canal University, Ismailia, Egypt
Article Info

Article history:
Received Jan 29, 2025
Revised Aug 27, 2025
Accepted Oct 16, 2025

ABSTRACT
The classification of Arabic textual content presents considerable challenges due to the language's rich morphological structure and the wide variation among its dialects. This study aims to enhance classification accuracy by leveraging ensemble learning techniques and a deep bidirectional transformer-based model, specifically the multilingual autoregressive BERT (MARBERT). To address linguistic variability, advanced preprocessing techniques were employed, including Farasa, Tashaphyne, and Assem stemming methods. The Al Khaleej dataset served as the basis for supervised learning, providing a representative sample of Arabic text. Furthermore, term frequency-inverse document frequency (TF-IDF) with bigram and trigram feature extraction was utilized to effectively capture contextual semantics. Experimental results indicate that the proposed approach, particularly with the integration of MARBERT, achieves a peak classification accuracy of 98.59%, outperforming existing models. This research underscores the efficacy of combining ensemble learning with deep transformer-based models for Arabic text classification and highlights the critical role of robust preprocessing techniques in managing linguistic complexity and improving model performance.
Keywords:
Arabic text classification
Ensemble learning
Linguistic preprocessing
Machine learning
MARBERT
Stemming methods
This is an open access article under the CC BY-SA license.
Corresponding Author:
Hoda A. Abdelhafez
Department of Information Technology, College of Computer and Information Sciences
Princess Nourah bint Abdulrahman University
P.O. Box 84428, Riyadh 11671, Saudi Arabia
Email: hodaabdelhafez@gmail.com
1. INTRODUCTION
Natural language processing (NLP) is a field within data science that focuses on the creation of software with the ability to comprehend, scrutinize, and decipher human speech. The objective of this technology is to improve the interface between humans and computers by enabling communication through writing and speech, hence boosting the computer's ability to understand. Arabic has seen a surge in interest in NLP because of considerable study undertaken in English and other languages. As a result, dedicated Arabic NLP research laboratories have been established to advance various applications, including text classification, spam detection, and sentiment analysis. Nevertheless, the progress of Arabic NLP tools encounters difficulties associated with the incorporation of Arabic characters and the elimination of vowel diacritics [1]. Moreover, Arabic dialects exhibit a wide range of diversity, such as Levantine, Maghrebi, Egyptian, and Arabian Gulf variations. Comprehending these differences is difficult due to morphological variability, orthographic inconsistencies, and linguistic complexity. Arabic texts on social networks often appear in both modern standard Arabic (MSA) and dialectal forms, which can lead to different meanings for the same word. This complexity exemplifies the extensive linguistic variation characteristic of the Arabic language [2].
Text classification is an essential task in diverse NLP applications, such as sentiment analysis, subject labeling, question answering, and dialog act categorization. It entails the allocation of predetermined labels to textual content. Given the vast volume of available information, manually sorting and categorizing large text data is a laborious and time-consuming task. In addition, the precision of manual text classification can be readily affected by human variables, such as tiredness and proficiency. Using machine learning techniques to automate text classification is advantageous, as it leads to more dependable and less subjective outcomes. Furthermore, this can also improve the efficiency of retrieving information and reduce the issue of information overload by discovering the necessary information. Accurate text classification contributes significantly to the systematic organization of information, the extraction of actionable insights, and informed decision-making across various domains. Whether used for spam detection, topic categorization, or sentiment analysis, effective classification enhances both data comprehension and management [3].
Machine learning is a methodology that enables computers to acquire information and enhance their performance without depending on explicit programming. Machine learning has demonstrated significant advantages in intricate tasks such as NLP, obviating the need for specialist approaches. As a result, machine learning is widely used in areas such as automated NLP. Ensemble learning is an approach used in machine learning to enhance the accuracy of machine learning models [4]. Text categorization is a machine-learning process where a document is categorized into one or more predetermined categories based on its content. Texts can be composed of several genres, such as scientific articles, news reports, movie reviews, and ads. Genre refers to how a text was produced and edited, the linguistic style it employs, and the intended audience it targets [5].
Recent research used the bidirectional encoder representations from transformers (BERT) model, which integrates contextual word information. Transfer learning with embedding is a widely used and sophisticated deep learning technique that improves the effectiveness of various NLP applications. When it comes to categorizing Arabic text, both machine learning-based ensemble learning and bidirectional transformers have unique benefits. Ensemble learning uses the power of different models to increase accuracy and robustness by merging their predictions. Conversely, bidirectional transformers, such as BERT, are exceptionally effective in capturing contextual information and comprehending intricate linguistic patterns. The efficacy of each method in classifying Arabic text depends on several aspects, such as the particular target, characteristics of the dataset, and the computational resources available [6].
Furthermore, Arabic morphology is complicated, and words can have several root forms, which can affect the efficiency of categorization processes. Various stemming methods seek to mitigate this diversity by reducing word forms to their base forms. An assessment of the influence of various stemming methods on categorization accuracy can offer valuable insights into the most efficient approach for managing Arabic text data.
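To make the idea of reducing word forms to a base form concrete, the following is a minimal sketch of a light (affix-stripping) stemmer. It is not Farasa, Tashaphyne, or Assem; the function name `light_stem` and the short prefix and suffix lists are hypothetical and chosen only for illustration.

```python
# Toy light stemmer: strips at most one common prefix and one common suffix.
# The affix lists below are illustrative assumptions, not a complete inventory.
PREFIXES = ("وال", "بال", "كال", "فال", "ال", "لل", "و")  # longest first
SUFFIXES = ("ات", "ون", "ين", "ها", "ية", "ه", "ة", "ي")


def light_stem(word, min_len=3):
    """Return the word with one prefix and one suffix removed, if enough letters remain."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= min_len:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_len:
            word = word[:-len(suffix)]
            break
    return word


# Three surface forms of "teacher" collapse to the same base form.
forms = ["والمعلمون", "المعلمين", "معلم"]
stems = [light_stem(w) for w in forms]
```

A rule list this small over-stems and under-stems many words, which is one reason dedicated stemmers can differ in their effect on classification accuracy.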
This study investigates the integration of machine learning-based ensemble methods and deep bidirectional learning within the domain of NLP, with a particular focus on Arabic text. Despite growing interest, the influence of various stemming techniques on classification accuracy remains insufficiently examined. To address this gap, the study conducts a thorough evaluation of Arabic text classification approaches. The main objectives are: i) to compare the performance of traditional machine learning ensemble techniques with MARBERT, a deep bidirectional transformer model, on Arabic text datasets; ii) to evaluate the effect of different stemming methods (namely Assem, Farasa, and Tashaphyne) on classification performance; and iii) to propose a robust preprocessing framework that standardizes Arabic text and accommodates its complex morphological structure.
The main contributions of this study are as follows:
– Conduct a comparative analysis of ensemble machine learning methods versus transformer-based models for Arabic text classification.
– Assess and benchmark various stemming techniques with respect to Arabic morphological characteristics and their impact on classification accuracy.
– Develop a comprehensive preprocessing pipeline tailored for Arabic NLP tasks.
The contributions of this study are guided by the following research questions:
– RQ1: what is the comparative effectiveness of ensemble learning and bidirectional transformer models in Arabic text classification?
– RQ2: how do various stemming methods affect classification performance across diverse Arabic datasets?
The remainder of this paper is structured as follows: section 2 reviews related work in the field. Section 3 outlines the methods used in this study, detailing the proposed approach and algorithms. Section 4 presents the implementation process and experimental results. Section 5 discusses the key findings and their implications. Finally, section 6 concludes the study and suggests potential directions for future research.
2. RELATED WORK
This section provides a comprehensive survey of previous studies related to the classification of news articles, focusing on the Arabic language. The review is presented in three subsections: Arabic text classification without ensemble learning, Arabic text classification with ensemble learning techniques, and Arabic text classification using deep bidirectional transformer learning.
2.1. Arabic text classification using machine learning without ensemble learning
Several studies have addressed the challenges of Arabic text classification without employing ensemble learning strategies. Muaad et al. [7] conducted a comparative study involving seven classification algorithms (multinomial naïve Bayes (MNB), Bernoulli naïve Bayes (BNB), stochastic gradient descent (SGD), logistic regression, support vector classifier (SVC), linear SVC, and convolutional neural networks (CNN)) to classify Arabic text using the Al-Khaleej dataset. The study utilized three feature extraction techniques: term frequency-inverse document frequency (TF-IDF), bag-of-words (BoW), and character-level representation. The authors emphasized that data augmentation for the Arabic language remains a significant challenge and noted that the choice of feature representation techniques plays a critical role in influencing the performance of text classification models. The experimental results showed that linear SVC outperformed the other models in terms of classification accuracy.
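The three feature representations named here (BoW, TF-IDF, and character-level) can be illustrated with a small standard-library sketch. The helper names, the whitespace tokenization, the toy documents, and the smoothed IDF variant ln((1 + N) / (1 + df)) + 1 are assumptions for illustration; they are not taken from the cited study.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=1):
    """Return whitespace-token n-grams for every n in [n_min, n_max]."""
    tokens = text.split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def char_ngrams(text, n=3):
    """Character-level representation: overlapping character n-grams."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def bag_of_words(docs, analyzer):
    """BoW: one Counter of raw term frequencies per document."""
    return [Counter(analyzer(doc)) for doc in docs]


def tfidf(docs, analyzer):
    """TF-IDF weights per document, using an assumed smoothed IDF variant."""
    counts = bag_of_words(docs, analyzer)
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for c in counts for term in c)
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in df.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


# Toy corpus (hypothetical): unigram and bigram features.
docs = ["الاقتصاد ينمو بسرعة", "الرياضة تنمو بسرعة", "الاقتصاد والرياضة"]
weights = tfidf(docs, lambda d: word_ngrams(d, 1, 2))
```

Terms that occur in fewer documents receive a larger weight, which is why a word shared by two of the toy documents scores lower than a word unique to one of them.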
Elnagar et al. [8] compared several deep learning models based on CNN, recurrent neural networks (RNN), long short-term memory (LSTM), gated recurrent unit (GRU), and hierarchical attention network (HAN), and proposed deep learning models for Arabic text classification on two corpora: the single-label Arabic news articles dataset (SANAD) and the news articles dataset in Arabic (NADiA). The authors employed only one methodology for feature extraction, namely Word2Vec embedding models. They highlighted that machine learning approaches employed in single-label classification differ from those used in multi-label classification, as the former require adaptation or problem transformation. Conventional learning methods required adjustment, whereas deep learning-based models required less modification. Diverse strategies were utilized to address adaptation challenges, one of which involved converting multi-label scenarios into multiple single-label instances. The experimental results showed that all models performed well on the SANAD dataset, with the attention-GRU model achieving the highest accuracy of 96.94% [8].
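Converting a multi-label scenario into multiple single-label instances can be sketched as follows. This is one simple form of the transformation, shown under the assumption that each document is copied once per label; the function name and sample data are hypothetical and the cited study may have used a different scheme.

```python
def to_single_label_instances(samples):
    """Expand (text, [labels]) pairs into one (text, label) pair per label."""
    expanded = []
    for text, labels in samples:
        for label in labels:
            expanded.append((text, label))
    return expanded


# Hypothetical multi-label news items.
multi_label = [
    ("doc about a football club's finances", ["sports", "finance"]),
    ("doc about a book fair", ["culture"]),
]
single_label = to_single_label_instances(multi_label)
```

After this expansion, any single-label classifier can be trained on `single_label`, at the cost of presenting the same text under more than one target.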
Muaad et al. [9] introduced a novel deep learning-based system called Arabic computer-aided recognition (ArCAR), designed specifically to classify Arabic text using character-level representation. The study addressed critical challenges encountered by traditional machine learning approaches in classifying Arabic text, attributed to the language's complex morphology and variation. These challenges include stemming, dialects, phonology, orthography, and morphology. The ArCAR system was built using a deep CNN to recognize Arabic text at the character level and was validated through five-fold cross-validation for document classification. The ArCAR system proved proficient in accurately categorizing Arabic text at the character level, achieving an accuracy of 97.76%, an F-measure of 92.63%, a precision of 92.75%, and a recall of 92% on the AlKhaleej-balanced dataset.
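The evaluation quantities mentioned here (accuracy, precision, recall, F-measure, and five-fold cross-validation) can be computed as in the sketch below. Macro averaging over classes and contiguous, unshuffled folds are assumptions made for illustration; the source does not state which averaging or fold assignment was used.

```python
def macro_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    n = len(labels)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, test_idx) for k contiguous folds without shuffling."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


# Hypothetical labels for four documents.
acc, prec, rec, f1 = macro_scores(["a", "a", "b", "b"], ["a", "b", "b", "b"])
folds = list(k_fold_indices(10, k=5))
```

In a cross-validated experiment the scores would be computed on each test fold and then averaged across the five folds.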
Muaad et al. [10] proposed an enhanced method for Arabic document classification, evaluating the same set of machine learning classifiers mentioned earlier, including MNB, BNB, SGD, logistic regression, SVC, linear SVC, and CNN, on the Al-Khaleej dataset. This study focused on optimizing the feature engineering process, employing BoW, TF-IDF, and character-level features. Experimental results revealed that the CNN model using character-level representations achieved the highest accuracy, reaching 98%, thereby demonstrating the effectiveness of deep learning architectures in handling the complexities of the Arabic language.
2.2. Arabic text classification using machine learning with ensemble learning
Recent research has demonstrated the effectiveness of ensemble learning techniques in improving the performance of Arabic text classification models, particularly for news articles. Sabri et al. [11] applied ensemble learning strategies to the task of automatic Arabic news classification. The authors evaluated the performance of several base classifiers, including decision tree (DT), naïve Bayes (NB), k-nearest neighbors (KNN), and multilayer perceptron (MLP), in conjunction with ensemble techniques such as bagging, boosting, stacking, and voting. The models were tested on three widely recognized Arabic benchmark datasets: WATAN-2004, KHALEEJ-2004, and ANT Corpus. The study illustrated the advantages of ensemble learning in mitigating issues such as bias, overfitting, and noise, while enhancing model diversity, robustness, and scalability. Among the ensemble strategies, stacking and voting yielded the best results, with stacking achieving the highest accuracy of 95.20% on the ANT Corpus dataset. Additionally, the voting approach significantly improved classification accuracy, achieving 93.24% on the KHALEEJ-2004 dataset and 92.15% on the WATAN-2004 dataset.
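Of the ensemble techniques listed, voting is the simplest to illustrate. The sketch below shows hard (majority) voting over the label predictions of several base classifiers; the function name and the example predictions are hypothetical, and the cited study may have used a different voting rule such as weighted or probability-based voting.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Hard voting: for each sample, return the label predicted by most models."""
    n_samples = len(predictions_per_model[0])
    voted = []
    for i in range(n_samples):
        votes = [preds[i] for preds in predictions_per_model]
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


# Hypothetical predictions from three base classifiers on three articles.
dt_preds = ["sports", "politics", "culture"]
nb_preds = ["sports", "culture", "culture"]
knn_preds = ["politics", "politics", "culture"]
ensemble_preds = majority_vote([dt_preds, nb_preds, knn_preds])
```

Using an odd number of base classifiers reduces the chance of ties in the two-class case; with more classes, a tie-breaking rule still has to be chosen.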
Mohammed and Kora [12] explored the importance of ensemble learning and deep learning for enhanced text classification. They identified the selection of an optimal deep learning classifier as a key challenge. The study proposed an ensemble approach to improve classification effectiveness across six datasets in Arabic and English, including the Arabic Twitter Corpus, AJGT, IMDB reviews, SemEval, COVID-19 fake news detection, and ArSarcasm datasets. Several deep learning models were implemented, including LSTM, GRU, CNN, GRU-CNN, LSTM-CNN, and bidirectional long short-term memory (BiLSTM), as well as ensemble techniques such as voting and stacking. The results showed that the ensemble technique significantly improved the classification precision of the initial deep models and outperformed the most advanced ensemble methods. Experimental results demonstrated that the ensemble approach significantly enhanced the classification precision and outperformed individual deep learning models in most cases. The highest classification accuracy, 93.2%, was achieved on the Arabic corpus using the proposed ensemble technique.
Ahmad et al. [13] focused on fake news detection and evaluated the effectiveness of ensemble learning strategies such as bagging, boosting, and voting. The authors used the ISOT fake news dataset and compared the performance of multiple classifiers, including linear support vector machines (SVM), CNN, and BiLSTM networks. The ensemble-enhanced model achieved an accuracy of approximately 0.99 on the ISOT dataset and 0.96 on the DS3 dataset, indicating the effectiveness of ensemble methods in text classification for both factual and deceptive content.
Akhadam and Ayyad [14] contributed to the field of Arabic text classification by proposing a comprehensive processing pipeline that incorporates multiple preprocessing techniques aimed at improving classification accuracy. For feature extraction, the study utilized both BoW and TF-IDF methods. A range of machine learning and deep learning models was evaluated, including logistic regression, MNB, BNB, linear SVC, SGD, SVC, and CNN. The experiments were conducted using the Al-Khaleej dataset. Among the models tested, the CNN with word-level representation and stemming achieved the highest classification accuracy, reaching 97%.
2.3. Arabic text classification using deep bidirectional transformer learning
Transformer-based models have significantly enhanced the performance of Arabic text classification, particularly through the development of language models tailored to the linguistic and morphological characteristics of Arabic, as outlined below. Mageed et al. [15] introduced two Arabic-specific transformer-based models, ARBERT and MARBERT, developed to address the shortcomings of existing multilingual masked language models (MLMs) such as mBERT, XLM-R, and AraBERT in processing Arabic text. To evaluate the effectiveness of these models across diverse Arabic natural language understanding (NLU) tasks, the authors proposed a comprehensive benchmark, ArBench, specifically designed for multi-dialectal Arabic. ArBench comprises 41 datasets spanning five major NLU tasks: i) sentiment analysis (AJGT, AraNET, AraSenTi-Tweet, ArSarcasm, ArSAS, ArSenD-LEV, ASTD, ASTD-B, AWATIF, BBN, HARD, LABR, SAMAR, SemEval, SYTS datasets); ii) social meaning prediction (Arap-Tweet, AraDang, AraNET, Arap-Tweet, OSACT-B, FIRE2019, OSACT-A, and AraSarcasm datasets); iii) topic classification (Arabic news text, Khaleej, and OSAC datasets); iv) dialect identification (AOC, ArSarcasm, MADAR-TL, NADI, and QADI); and v) named entity recognition (ANERCorp, ACE-2003BN, ACE-2003BN, ACE-2004BN, and TW-NER dataset). Experimental results showed that ARBERT and MARBERT consistently outperformed the multilingual and earlier Arabic models across these tasks. The authors emphasized the importance of ArBench as a standardized evaluation framework for Arabic NLU and highlighted the significant contributions of ARBERT and MARBERT in advancing language modelling for Arabic.
Bahurmuz et al. [16] investigated the application of transformer-based deep learning models for Arabic rumor detection, employing a range of pre-trained models including AraBERT, MARBERT, ArElectra, ArBERT, and mBERT. The study utilized three Arabic-language datasets for model training and evaluation: a rumor vs. non-rumor tweets dataset, a general fake news detection dataset, and a COVID-19 misinformation dataset. Among the models evaluated, MARBERT demonstrated superior performance over AraBERT in terms of classification accuracy. To address the issue of dataset imbalance, the authors employed resampling techniques and conducted hyperparameter tuning to optimize model performance. The hyperparameters used for both AraBERT and MARBERT included an embedding size of 100, batch sizes of 40 for AraBERT and 32 for MARBERT, 8 training epochs, and a learning rate of 5e-5. As a result of these optimizations, both AraBERT and MARBERT achieved a maximum classification accuracy of 97% on the evaluated datasets.
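The source states only that resampling techniques were used to counter dataset imbalance. As one possible illustration, the sketch below performs random oversampling, duplicating minority-class samples until every class matches the largest one; the choice of oversampling, the function name, and the toy data are assumptions and not the cited authors' procedure.

```python
import random
from collections import defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class samples at random until all classes are equally large."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)
    target = max(len(items) for items in by_label.values())
    out_texts, out_labels = [], []
    for label, items in by_label.items():
        # Draw extra copies with replacement from the class's own samples.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical imbalanced tweet set: three rumors, one non-rumor.
texts = ["r1", "r2", "r3", "n1"]
labels = ["rumor", "rumor", "rumor", "non-rumor"]
balanced_texts, balanced_labels = random_oversample(texts, labels)
```

Resampling of this kind should be applied to the training split only, so that duplicated samples do not leak into the evaluation data.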
Nassif et al. [17] conducted a comprehensive study on Arabic fake news detection using deep contextualized embedding models. The authors developed and evaluated transformer-based classifiers using eight state-of-the-art Arabic-language pre-trained models: AraBert, QaribBert, ARBERT, MARBERT, Arabic-BERT, Arabert, GigaBERTv4, and XLM-Roberta. The study employed two datasets: the first was an original Arabic fake news dataset, collected via web scraping from Arabic Twitter posts; the second was an English-language fake news dataset sourced from Kaggle, which was translated into Arabic to expand the training data. The pre-trained models were fine-tuned by adjusting parameters such as the optimizer, learning rate, number of epochs, and dropout value. The study used AdamW as the optimizer, a learning rate of 1e-5, between 1 and 100 epochs, and a dropout value of 0.23. The Arabic-BERT and ARBERT models outperformed the other models, achieving an accuracy of 98%.
3. METHOD
This section explores pre-processing methods and procedures for textual data. It encompasses tasks such as cleansing and standardizing the text within the dataset, employing different stemming techniques, utilizing machine learning and deep learning algorithms, and employing evaluation methods. Figure 1 depicts the sequential phases that constitute the proposed solution.
Figure 1. Five-phase methodological framework for the proposed solution
3.1. Data collection
The SANAD data collection is specially designed for Arabic text categorization. The collection is large and labeled, comprising many Arabic articles. These articles are commonly classified into several subjects such as politics, economy, sports, and culture. SANAD is employed by researchers and developers to train and assess machine learning algorithms for the categorization of Arabic text, specifically articles. SANAD has three collection sources: Al Arabia, Al Khaleej, and Akhbarona-Alanba. The Al Khaleej dataset, a constituent of the SANAD corpus, is derived from the Al Khaleej newspaper and serves as a critical resource for a wide range of NLP applications in Arabic. It is particularly utilized in tasks such as text classification, sentiment analysis, and language modeling, where high-quality Arabic textual data is essential. The dataset consists of over 45,500 articles published between 2008 and 2018, with approximately 6,500 documents per category. The corpus is systematically organized into seven categories: culture, technology, politics, medicine, sports, finance, and religion, thereby enabling robust evaluation across diverse content domains [18].
3.2. Data preprocessing
The preprocessing phase includes Arabic text normalization and text cleaning. It also covers text encoding and tokenization. In addition, three distinct stemming methods are implemented.
3.2.1. Arabic text normalization, cleaning, and encoding
Text normalization is a fundamental step in NLP, aimed at converting textual data into a standardized and uniform format. In the context of Arabic, this process presents unique challenges due to the language's intricate word structure and morphology. Normalization is particularly vital in MSA, involving the substitution of specific characters with alternatives and the removal of certain elements, primarily frequent conjunctions [8]. Examples of potential actions for Arabic text normalization include:
– Replace the different forms of hamzated alif (آ, أ, إ) with bare alif (ا), without hamza.
– Replace the final letter of the word, alif maqsura (ى), with yaa (ي).
– Remove the first waaw (و) character if three or more characters are left.
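The normalization rules listed above can be sketched in a few lines of Python. This is a minimal illustration, not the study's implementation: the function names `normalize_word` and `normalize_arabic` are hypothetical, and the waaw rule is applied literally as stated (a leading waaw is dropped only when at least three characters remain).

```python
import re

# Hamzated alif forms: alif with madda, alif with hamza above, alif with hamza below.
HAMZATED_ALIF = re.compile("[\u0622\u0623\u0625]")
BARE_ALIF = "\u0627"
ALIF_MAQSURA = "\u0649"
YAA = "\u064A"
WAAW = "\u0648"


def normalize_word(word: str) -> str:
    """Apply the three normalization rules to a single word."""
    word = HAMZATED_ALIF.sub(BARE_ALIF, word)
    if word.endswith(ALIF_MAQSURA):
        word = word[:-1] + YAA
    # Assumption: "three or more characters left" is checked after removal.
    if word.startswith(WAAW) and len(word) - 1 >= 3:
        word = word[1:]
    return word


def normalize_arabic(text: str) -> str:
    """Normalize every whitespace-separated word in the text."""
    return " ".join(normalize_word(w) for w in text.split())
```

Applying the waaw rule blindly also strips a waaw that belongs to the word itself, so a real pipeline would likely need a lexicon check or a morphological analyzer at this step.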
Furthermore, Arabic text, particularly from informal sources, often includes non-standard symbols such as quotation marks, parentheses, asterisks, and punctuation. These characters are typically considered noise during preprocessing and are removed or replaced with whitespace. Additionally, repeated characters used for emphasis (e.g., “مبرووووك”) are normalized to their base form (“مبروك”). This cleaning process significantly reduces noise and helps restore words to their natural form, ultimately enhancing the performance and accuracy of text classification models [19]. For example, the phrase (دبي: «الخليج» تعاونت «دلسكو» مع مجمع دبي الصناعي) is transformed after noise removal into (دبي الخليج تعاونت دلسكو مع مجمع دبي الصناعي), demonstrating how the removal of punctuation and extraneous symbols preserves the semantic integrity of the text while reducing noise.
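A minimal sketch of the cleaning step described above, using only the Python standard library. The function name `clean_text` and the threshold of three repetitions are assumptions made for illustration; the source does not state how repeated characters are detected.

```python
import re
import unicodedata

# A letter repeated three or more times in a row (digits are left alone).
REPEATED_LETTER = re.compile(r"([^\W\d_])\1{2,}")


def clean_text(text: str) -> str:
    """Strip punctuation and symbols, then collapse elongated letters."""
    # Unicode categories starting with P (punctuation) or S (symbol) become spaces.
    chars = [" " if unicodedata.category(ch)[0] in "PS" else ch for ch in text]
    text = "".join(chars)
    # Collapse runs of the same letter to a single occurrence.
    text = REPEATED_LETTER.sub(r"\1", text)
    # Squeeze the whitespace left behind by the removed characters.
    return " ".join(text.split())
```

Used on the example phrase from the text, `clean_text` removes the colon and the guillemets and leaves the words in their original order.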
Stop word removal: stop words are frequently occurring words that hold little semantic value. They are typically removed during text preprocessing to minimize noise and improve the effectiveness of downstream NLP tasks. Common Arabic stop words include “في” (in), “إلى” (to), “ل” (for), and “عن” (about). Stop word removal filters these terms out of the text, as they usually do not add significant meaning [7].
Categorical data is commonly found in real-world datasets, yet most machine learning algorithms require numerical input. Therefore, it is essential to convert categorical variables into a numerical representation. One of the most widely used techniques for this purpose is one-hot encoding, which is considered a foundational method due to its simplicity and effectiveness, particularly for nominal variables [20]. In the Al Khaleej dataset, the categorical labels (tech, culture, medical, finance, politics, religion, and sport) are transformed into numerical format through one-hot encoding.
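As a small illustration of one-hot encoding for the seven labels named above, the sketch below maps each label to a vector with a single 1. The ordering of `LABELS` and the function name `one_hot` are assumptions; any fixed ordering works as long as it is used consistently.

```python
# Label order is an arbitrary but fixed choice made for this sketch.
LABELS = ["tech", "culture", "medical", "finance", "politics", "religion", "sport"]


def one_hot(label: str, labels=LABELS) -> list:
    """Return a vector with 1 at the label's position and 0 elsewhere."""
    index = labels.index(label)  # raises ValueError for an unknown label
    return [1 if i == index else 0 for i in range(len(labels))]
```

For instance, `one_hot("medical")` places the 1 in the third position because "medical" is the third entry in `LABELS`.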
3.2.2. Tokenization
Tokenization is a fundamental preprocessing step in NLP, as it transforms raw textual data into structured units (tokens) that can be represented numerically and processed by machine learning algorithms [21]. Traditional tokenization approaches often rely on whitespace delimiters or punctuation to segment text, but more advanced techniques such as sub-word tokenization and morphology-aware methods have demonstrated superior performance, particularly for morphologically rich languages like Arabic [22]. Arabic presents specific tokenization challenges due to its complex morphology, concatenative word structure, and the frequent absence of diacritic marks, which can lead to lexical ambiguity [23]. Effective tokenization methods for Arabic must account for these linguistic characteristics to enable accurate parsing and feature extraction. By segmenting text into linguistically meaningful units, tokenization enhances the ability of NLP systems to perform downstream tasks such as text classification and sentiment analysis, leading to improved model performance [24].
3.2.3. Stemming
In Arabic morphology, words frequently exhibit various prefixes, infixes, and suffixes. A prefix is an affix positioned before the stem of a word, while a suffix is appended to the end. Both prefixes and suffixes influence the meaning of the word they modify, often creating new word categories or altering existing ones [25]. An infix is an affix inserted within a word, unlike prefixes and suffixes, which are added to the beginning or end. In Arabic, as in other Semitic languages, words are built from root letters and patterns. In this system, infixes, typically vowels or combinations of vowels and specific consonants, are inserted between the base consonants. This process yields different words or expresses distinct grammatical functions. Recent research affirms that Arabic morphology relies heavily on templatic root-pattern structures, where infixes (often comprising vowels and occasionally consonants) are systematically inserted into consonantal roots to derive various grammatical forms. This phenomenon is well documented in computational morphology systems [26]. Stemming is used to reduce these words to their root form.
Some common stemming algorithms for Arabic are described next. Assem's stemmer, an Arabic light stemmer, is an algorithm designed to stem Arabic words; it uses a Snowball-based method to enhance search functionality in the Arabic language. Given Arabic's intricate structure of inflections, stemming poses challenges due to the language's propensity for alterations via
prefixes, infixes, and suffixes, potentially resulting in varied meanings [27].
Farasa is a comprehensive, full-stack Arabic NLP toolkit widely used in tasks such as search and machine translation. It integrates a range of high-performance modules including word segmentation, lemmatization, named entity recognition, part-of-speech tagging, diacritic recovery, and text classification [28]. The Farasa stemming algorithm plays a key role in reducing Arabic words to their root forms, thereby standardizing text and improving the efficiency of downstream applications such as information retrieval and linguistic analysis [29]. The Tashaphyne stemmer is another Arabic tool designed for the segmentation and stemming of text, leveraging an affix-based finite state automaton to extract prefixes and suffixes from a defined set of affixes. This tool finds applications in tasks such as named entity identification, sentiment analysis, and text classification [30]. In the sample shown in Table 1, each stem represents a different word root; the stemmers reduce words to their base form in order to enhance the accuracy of text analysis. However, it is important to recognize that the choice of stemming algorithm can significantly impact the classification results.
Table 1. Examples of word stems generated by three different Arabic stemmers
Arabic stemmer | Arabic text and extracted root | English translation
Sample article
ي
ب
د
عم
ج
م
«
عم
»
وك
س
ل
د
«
ت
ن
وا
ع
ت
»
ج
ي
ل
خ
لا
«
:
ي
ب
د
ةي
ب
لت
فد
ه
ب
ةلم
ا
ك
ت
م
ةي
ب
ط
ةد
ا
ي
ع
ل
وأ
ح
ا
ت
ت
ف
»
ي
ع
ا
ن
ص
لا
ن
م
ر
ث
ك
ة
ي
ب
طلا
ت
ا
ج
ا
ي
ت
ح
ا
42
ث
ث
ى
لع نوع
ز
وت
ي
لم
ا
ع
ف
لأ
ت
ا
ك
ر
ش
ت
ا
ي
ر
ب
ك
عم
ةد
ا
ي
ع
لا
ت
د
قا
ع
ت
و
.
عم
ج
م
لا
ن
م
ض
ةي
لا
م
ع
ى
ر
ق
ةي
ح
ص
لا
ةي
ا
ع
ر
لا
و
م
ع
د
لا
ت
ا
ي
وت
س
م
ى
لعأ
ر
ي
فوت
ل
ةلود
لا
ي
ف
ن
ي
م
أ
ت
لا
عم
ج
م
لا
ي
ف
لم
ع
لا
ى
وق
ل
Dubai: Al Khaleej: Dulsco collaborated with Dubai Industrial complex to open the first fully integrated medical clinic, aiming to meet the healthcare needs of more than 42,000 workers living across three labor villages within the complex. The clinic has signed agreements with major insurance companies in the country to provide the highest standards of support and healthcare for the workforce in the complex.
Assem stemmer
ل
وا
ح
ا
ت
ف
ع
ا
ن
ص
ب
د
عم
ج
م
ك
س
لد
ن
وا
ع
ت
ج
ي
ل
خ
ب
د
ع
ز
وت
ي
لم
ا
ع
ف
لا
ر
ث
ك
هي
ب
ط
ج
ا
ي
ت
ح
ا
ب
لت
فد
ه لما
ك
ت
م
ي
ب
ط
د
ا
ي
ع
ر
ي
فوت
ل
هلود
م
ا
ت
ا
ك
ر
ش
ا
ي
ر
ب
ك
هد
ا
ي
ع
د
قا
ع
ت
عم
ج
م
ن
م
ض
لا
م
ع
ر
ق
لع
عم
ج
م
لم
ع
وق
ل
هي
ح
ص
ا
ع
ر
ل
او
م
ع
د
ا
ي
وت
س
م
لعا
Dub Gulf collaborated with Dulsk with Manufacturers Complex to open the first integrated medical clinic aiming to meet the medical needs of more than a thousand workers distributed over labor villages within a complex contracted clinic major company in the country to provide the highest level of support and healthcare for the work complex.
Farasa stemmer
ح
ا
ت
فا
ي
ع
ا
ن
ص
ي
ب
د
عم
ج
م
وك
س
لد
ن
وا
ع
ت
ج
ي
ل
خ
ي
ب
د
لم
ا
ع
ف
لأ
ر
ث
ك
أ
ي
ب
ط
ج
ا
ي
ت
ح
ا
ى
ب
ل
فد
ه لما
ك
ت
م
ي
ب
ط
د
ا
ي
ع
لوأ
م
ا
ت
ةك
ر
ش
ر
ب
ك
أ
هد
ا
ي
ع
د
قا
ع
ت
عم
ج
م
ن
م
ض
لم
ا
ع
ي
ر
ق
ي
لع ع
ز
وت
عم
ج
م
لم
ع
ي
وق
هي
ح
ص
هي
ا
ع
ر
م
ع
د
ى
وت
س
م
ي
لعا
ر
ي
فوت
ةلود
Dubai Gulf collaborated with Dulsco Dubai industrial complex opening the first fully integrated medical clinic aimed to meet the medical needs of more than a thousand workers distributed across the labor villages of workers within the complex clinic contacted the largest company in the country providing the highest level of healthcare support strong work complex.
Tashaphyne stemmer
لو
ح
ا
ت
فا
ع
ا
ن
ص
ب
د
عم
ج
م
ك
س
لد
ن
وا
ع
ت
ج
ي
ل
خ
ب
د
ع
ز
وت
لم
ا
ع
فل
ر
ث
ك
ا
هي
ب
ط
ج
ا
ي
ت
ح
ا
ب
ل
فد
ه لما
ك
ت
م
ب
ط
د
ا
ي
ع
لع
ر
ي
فو
هلود
م
ا
ت
ك
ر
ش
ر
ب
هد
ا
ي
ع
د
قا
ع
ت
عم
ج
م
ن
م
ض
لا
م
ع
ر
ق
ى
لع
عم
ج
م
لم
ع
وق
هي
ح
ص
هي
ا
ع
ر
م
ع
د
وت
س
م
Dub Gulf cooperation Dulsk Dub complex manufacturers opening an integrated medical clinic goal to meet medical needs of more than a thousand workers distributed across labor villages within a complex contracting clinic with a complete state-of-the-art company, providing a level of support for healthcare, a strong work complex.
The Assem stemmer works by removing the characters 'ya' ('ي'), taa marbuta, and taa maftoha ('ة', 'ت') at the end of words. Additionally, it eliminates 'ال' and 'ب' at the beginning of words, along with any associated numerals and pronouns. The Tashaphyne stemmer is quite similar to Assem, with minor distinctions such as the exclusion of the initial alef ('ا') from words. However, the Farasa stemmer exhibited a notable difference compared to the other stemmers: most words reverted to their stems without significant alterations. The Assem and Tashaphyne stemmers employ aggressive approaches that effectively simplify text but may lose important semantic or morphological details. In contrast, Farasa preserves word integrity, making it ideal for tasks requiring semantic richness, such as sentiment analysis or dialect studies. However, Farasa's less aggressive nature may not be suitable for noisy data, where Assem and Tashaphyne perform better. Each stemmer demonstrates unique strengths tailored to different NLP tasks.
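To make the affix-stripping idea concrete, the following is a hypothetical light stemmer that removes only the prefixes and suffixes named above. It is not the Assem, Tashaphyne, or Farasa implementation; the function name `light_stem`, the affix lists, and the minimum stem length of two letters are all assumptions chosen for illustration.

```python
# Assumed affix lists, taken from the characters mentioned in the text.
PREFIXES = ("ال", "ب")
SUFFIXES = ("ي", "ة", "ت")
MIN_STEM_LENGTH = 2  # assumption: never shorten a word below two letters


def light_stem(word: str) -> str:
    """Strip at most one listed prefix and one listed suffix from a word."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LENGTH:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LENGTH:
            word = word[:-len(suffix)]
            break
    return word
```

Even this small rule set shows why aggressive stripping can discard information: a final letter that belongs to the word is removed just as readily as a true suffix.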
3.3. Feature extraction
TF-IDF is a widely used method in NLP and information retrieval that quantifies the importance of a term within a specific document relative to its occurrence across a larger collection of documents. It combines two key metrics: term frequency, which measures how often a word appears in a document, and inverse document frequency, which accounts for how rare the word is across the entire corpus. The TF-IDF
weight helps distinguish significant terms that are characteristic of a document while reducing the impact of frequently occurring but less meaningful words [31].
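The weighting can be illustrated with a small hand-rolled computation. This sketch assumes one common variant (relative term frequency multiplied by the natural log of N/df, with no smoothing); library implementations often use slightly different formulas, and the function name `tf_idf` is hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Map each tokenized document to a {term: weight} dictionary.

    docs is a list of token lists. A term found in every document gets
    weight 0 because log(N / df) is 0 in that case.
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights
```

With two documents `["a", "b"]` and `["a", "c"]`, the shared term "a" receives zero weight while "b" and "c" receive positive weights, which is the behavior the paragraph describes.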
N-grams play a critical role in text analysis by capturing contextual relationships and syntactic structures between words. Bigrams, which represent sequences of two consecutive words, offer richer contextual information than isolated terms, enhancing the performance of many NLP tasks. Trigrams, involving three-word sequences, provide deeper insights into linguistic patterns and structural dependencies within text. Recent research by Shannaq [32] demonstrates that optimizing n-gram length, including but not limited to bigrams and trigrams, significantly improves classification accuracy and generalizability, particularly within Arabic-language corpora.
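A short sketch of word n-gram extraction, included to show what bigrams and trigrams look like in practice. The function name `ngrams` is hypothetical, and the input is assumed to be an already tokenized list of words.

```python
def ngrams(tokens, n):
    """Return all runs of n consecutive tokens as tuples, in order."""
    # When the text has fewer than n tokens the range is empty.
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
```

For the token list `["a", "b", "c"]`, `ngrams(tokens, 2)` yields two bigrams and `ngrams(tokens, 3)` yields a single trigram.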
3.4. Modelling in text classification
This section presents machine learning techniques. It also introduces ensemble learning methods. Furthermore, it covers the MARBERT deep bidirectional transformer model.
3.4.1. Machine learning techniques
Machine learning classifiers are fundamental to text classification, providing an automated and effective means of detecting patterns within textual data. Broadly, machine learning can be categorized into three types: supervised learning, unsupervised learning, and reinforcement learning. This research focuses on the supervised learning approach due to its ability to leverage labeled data in building accurate predictive models. Specifically, this study employs logistic regression, MNB, and SGD classifiers, each chosen for its distinct strengths and proven effectiveness in handling text classification tasks.
Logistic regression is a widely adopted statistical technique commonly used for classification and predictive analytics. Its main objective is to predict categorical outcomes instead of continuous variables. By evaluating the likelihood of success versus failure, it transforms odds into probabilities that fall within the range of 0 to 1, as in (1). The simplicity and interpretability of logistic regression make it an excellent baseline model in text classification [33], [34].
$$P = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x)}} \tag{1}$$
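The logistic function referred to as (1) can be evaluated directly. The sketch below assumes a single feature `x` with an intercept `beta0` and a coefficient `beta1`; these names are illustrative and are not taken from the source.

```python
import math


def predict_proba(x: float, beta0: float, beta1: float) -> float:
    """Probability of the positive class for one feature value."""
    z = beta0 + beta1 * x  # linear score (log-odds)
    return 1.0 / (1.0 + math.exp(-z))
```

A score of zero maps to a probability of 0.5, and larger scores move the output toward 1 without ever leaving the interval from 0 to 1.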
The MNB classifier has been extensively used and evaluated in the context of Arabic text classification. MNB is a widely recognized supervised learning algorithm. It is a probabilistic method that uses Bayes' theorem to determine the likelihood of each tag in a sample, as in (2). It assumes that all features are conditionally independent, meaning that the presence or absence of one feature is assumed to have no influence on the others [35].
$$P(A \mid B) = \frac{P(A) \, P(B \mid A)}{P(B)} \tag{2}$$
The MNB variant is particularly well suited for categorical and text data; it is extensively utilized in NLP tasks due to its efficiency and effectiveness. The algorithm operates based on Bayes' theorem, computing the likelihood for each potential tag and selecting the one with the greatest probability as the output. The simplicity of NB, along with its robust efficacy in high-dimensional textual data, prompted its application in this research [36].
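The decision rule described above (score every tag, pick the most probable) can be sketched with a tiny multinomial Naive Bayes classifier. This is an illustrative toy, not the study's model: the function names are hypothetical, and add-one (Laplace) smoothing is an assumption introduced so that unseen term and class pairs do not produce a zero probability.

```python
import math
from collections import Counter, defaultdict


def train_mnb(docs, labels):
    """Collect class counts, per-class term counts, and the vocabulary."""
    class_counts = Counter(labels)
    term_counts = defaultdict(Counter)
    for tokens, label in zip(docs, labels):
        term_counts[label].update(tokens)
    vocab = {term for tokens in docs for term in tokens}
    return class_counts, term_counts, vocab


def predict_mnb(model, tokens):
    """Return the label with the highest log posterior score."""
    class_counts, term_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_terms = sum(term_counts[label].values())
        score = math.log(n_docs / total_docs)  # log prior
        for term in tokens:
            if term in vocab:  # terms never seen in training are skipped
                smoothed = (term_counts[label][term] + 1) / (total_terms + len(vocab))
                score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, which matters for long documents.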
An SGD classifier demonstrates strong performance in Arabic text classification tasks, underscoring its effectiveness and dependability as a machine learning method for processing Arabic text. This is especially valuable given the language's intricate morphology and diverse dialects. As an optimization technique, SGD works by iteratively updating model parameters to minimize a cost function. It does so by computing the gradient of the loss function using randomly selected data points [14]. The gradient of the loss function is calculated with respect to the model's parameters, using either a single training instance or a small batch of samples. The classifier seeks to minimize a predefined loss function, such as hinge loss for a linear SVM or log loss for logistic regression. The learning rate parameter controls the magnitude of the increments during parameter updates and has a substantial influence on both the rate at which the optimization process converges and its stability, as in (3). The size of the minibatch used in each iteration also plays a critical role in the algorithm's overall performance. To reduce overfitting and improve generalization to unseen data, SGD classifiers incorporate regularization techniques. They are highly scalable and well suited for handling high-dimensional data and large-scale datasets [37].
$$\theta^{(t+1)} = \theta^{(t)} - \alpha \nabla J\left(\theta^{(t)}\right) \tag{3}$$
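A single update of the kind referred to as (3) can be written out for one training example. The sketch assumes a squared-error loss on a linear model purely for illustration (the study's classifier uses a different loss), and the names `sgd_step`, `grad_squared_error`, and `learning_rate` are hypothetical.

```python
def grad_squared_error(theta, x, y):
    """Gradient of 0.5 * (theta . x - y)**2 with respect to theta."""
    error = sum(t * xi for t, xi in zip(theta, x)) - y
    return [error * xi for xi in x]


def sgd_step(theta, grad, learning_rate):
    """Move each parameter a small step against its gradient."""
    return [t - learning_rate * g for t, g in zip(theta, grad)]
```

Starting from zero parameters with the example `x = [1.0, 2.0]` and `y = 1.0`, one step with a learning rate of 0.1 moves the parameters toward values that reduce the error on that example.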
3.4.2. Ensemble learning
Ensemble learning is a machine learning technique that improves prediction accuracy by aggregating predictions from many models. By reducing mistakes, improving performance, and boosting overall prediction resilience, ensemble learning often yields superior outcomes across various machine learning tasks [38].
A voting classifier (majority rule) is regularly used for classification problems. It operates by aggregating the predictions of multiple base models, typically through majority voting or by taking the average of their outputs, to produce a final decision. Each model provides an estimate, and each estimate counts as one 'vote'. The most common 'vote' is picked as the prediction of the merged model; this method leverages the strengths of individual models to create a more balanced and accurate overall prediction [13].
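Hard (majority) voting reduces to counting labels. The sketch below assumes each base model has already produced one label for the same sample; the function name `majority_vote` is hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of base models.

    On a tie, the label that appears first in the input wins.
    """
    return Counter(predictions).most_common(1)[0][0]
```

With three base models predicting "sport", "finance", and "sport", the ensemble outputs "sport".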
Stacking is a powerful ensemble learning approach that combines the predictions of many base models to obtain a final prediction with higher performance. The process comprises training several base models on the same training dataset and then feeding their predictions into a higher-level model, also referred to as a meta-model, which produces the final prediction. The main idea behind stacking is to combine the predictions of several base models to achieve better predictive performance than any single model [39].
3.4.3. Deep bidirectional transformer learning
BERT is a deep learning model that employs bidirectional self-attention to evaluate all words in a phrase at the same time, taking into consideration both the left and right contexts. This characteristic makes it a notable deep learning language model. BERT is pre-trained using two objectives: masked language modeling (MLM) and next-sentence prediction (NSP). MLM involves randomly masking tokens in an input sequence and predicting the masked words based on the surrounding context. MLM assists BERT in comprehending the contextual meaning inside a sentence, while NSP aids BERT in capturing the correlation or association between pairs of sentences. Consequently, by training on both objectives simultaneously, BERT acquires a wide-ranging and thorough comprehension of language, encompassing both intricate aspects inside phrases and the coherence between sentences [8]. The motivation for using BERT in this research stems from its ability to handle complex aspects of language, including the relationships between sentences.
MARBERT is a pre-trained transformer-based model specifically designed for the Arabic language. The MARBERT model utilizes both dialectal Arabic and MSA as inputs for evaluating semantic sentence similarity [18]. Its tailored design for the complexities of the Arabic language prompted its use in this study to improve text classification efficacy [40]. This study selects models to encompass a variety of methodologies, ranging from traditional machine learning classifiers to advanced deep learning techniques, thereby optimizing performance for Arabic text classification.
3.5. Evaluation
Multiple metrics are used for evaluating the machine learning, ensemble learning, and deep learning models. The confusion matrix is the most widely employed criterion for assessing the effectiveness of classification models. This foundational tool serves as the basis for calculating other crucial performance metrics. Accuracy is a commonly employed measure in multi-class classification, which may be directly derived from the confusion matrix as in (4). Accuracy is the proportion of all predictions that are correct [41]. It is effective when the classes in a dataset are evenly balanced, as it provides a general measure of the model's correctness. However, for imbalanced datasets, accuracy alone can be misleading, making metrics like precision and recall more valuable for understanding the model's performance [42].
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{4}$$
Precision measures how confidently a model can classify an instance as positive, focusing on the proportion of true positives among all predicted positives, as in (5) [41]. It is crucial in situations where the cost of false positives is high. Emphasizing the accuracy of positive predictions helps prevent the model from incorrectly classifying negative instances as positive, which can have significant consequences in many real-world applications [43].
$$\text{Precision} = \frac{TP}{TP + FP} \tag{5}$$
Recall gauges the model's capacity to locate every positive instance in the dataset and assesses its prediction accuracy for the positive class, as in (6) [42]. It is crucial in scenarios where missing positive instances (false negatives) can have serious consequences. Therefore, recall provides valuable insight into the model's ability to capture all relevant positive cases [44].
$$\text{Recall} = \frac{TP}{TP + FN} \tag{6}$$
The F1-score evaluates a classification model by combining precision and recall into a single metric, calculated as their harmonic mean, as in (7) [43]. It offers a single performance metric that considers both the accuracy of positive predictions and the model's ability to identify all positive cases. This is especially useful in text classification and other domains where both false positives and false negatives have significant consequences [40].
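$$\text{F1-score} = \frac{2}{\text{Precision}^{-1} + \text{Recall}^{-1}} = 2 \times \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} \tag{7}$$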
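The four metrics can be computed from the confusion-matrix counts of one class. This is a minimal sketch: the function name `classification_metrics` is hypothetical, and returning 0.0 when a denominator is zero is an assumed convention, not something stated in the source.

```python
def classification_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

In a multi-class setting such as the seven-category task, these counts are typically taken per class (one class versus the rest) and the per-class scores are then averaged.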
4. IMPLEMENTATION AND EXPERIMENTAL RESULTS
4.1. Implementation
As outlined earlier in the method section, the study followed five main phases. The first phase involved selecting and understanding the dataset: the Al Khaleej dataset, which is part of the SANAD corpus. In the second phase, the raw data was pre-processed to prepare it for classification. The third phase focused on feature selection, followed by the application of classification models in the fourth phase. Finally, the fifth phase involved evaluating the performance of these models, as shown in Figure 1.
The experiments were conducted using the Google Colab and Kaggle platforms. A custom extract_features function, combined with TfidfVectorizer, was used to analyze textual data. This method evaluated term importance within a corpus using TF-IDF, with sublinear_tf applying logarithmic scaling that replaces TF with 1+log(TF) to better represent term significance. To enhance feature extraction and capture subtle linguistic patterns, n-gram representations (specifically bigrams and trigrams) were utilized. A maximum of 10,000 features was selected to strike a balance between computational efficiency and model accuracy. The dataset was split into 80% for training and 20% for testing, following standard practice.
Model training for logistic regression and MNB focused on hyperparameter optimization using randomized search with cross-validation. This approach evaluated multiple configurations across 100 iterations, employing a 3-fold cross-validation strategy to ensure robust and reliable parameter tuning.
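The tuning procedure can be sketched with scikit-learn's `RandomizedSearchCV`, as shown below. The search spaces, the scoring metric, the random state, and the synthetic data are assumptions, since the paper does not report them. The helper name `tune` is hypothetical. The demonstration calls use 10 iterations to keep the run short, whereas the study reports 100.

```python
from scipy.stats import loguniform
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import MinMaxScaler


def tune(estimator, param_distributions, X, y, n_iter=100):
    """Randomized search with 3-fold cross-validation (hypothetical helper)."""
    search = RandomizedSearchCV(
        estimator,
        param_distributions=param_distributions,
        n_iter=n_iter,       # number of sampled configurations
        cv=3,                # 3-fold cross-validation
        scoring="accuracy",  # assumption: metric not stated for tuning
        random_state=42,     # assumption: for repeatable sampling
    )
    return search.fit(X, y)


# Synthetic non-negative features standing in for TF-IDF vectors
X, y = make_classification(n_samples=120, n_features=20, random_state=0)
X = MinMaxScaler().fit_transform(X)  # MNB requires non-negative inputs

# Assumed search spaces; n_iter reduced from 100 for a quick demonstration
lr_search = tune(LogisticRegression(max_iter=1000),
                 {"C": loguniform(1e-3, 1e2)}, X, y, n_iter=10)
mnb_search = tune(MultinomialNB(),
                  {"alpha": loguniform(1e-3, 1e1)}, X, y, n_iter=10)
print(lr_search.best_params_, mnb_search.best_params_)
```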
In the SGD classifier, a squared hinge loss function and L2 regularization were implemented. The model was configured with several parameters, including α=0.0001 to control regularization strength, an l1_ratio of 0.15 to define the ElasticNet mixing ratio, a maximum of 1000 iterations, and a tolerance of 0.001 to specify when the model should halt. Data were shuffled before each iteration, all available processors were used for concurrent tasks, and a fixed seed value of 1 was set for repeatability. An 'optimal' technique was applied to adjust the learning rate, and the eta0 parameter was set to 0.0. The model also used a power_t value of 0.5 for inverse scaling and did not implement early stopping.
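For reference, the configuration described above maps onto scikit-learn's `SGDClassifier` roughly as in the following sketch. The synthetic data are an assumption. The initial learning rate is left at the library default here because the 'optimal' schedule does not use it, and `l1_ratio` and `power_t` are passed only to mirror the reported settings, since they have no effect with an L2 penalty and the 'optimal' schedule.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

sgd = SGDClassifier(
    loss="squared_hinge",     # squared hinge loss
    penalty="l2",             # L2 regularization
    alpha=0.0001,             # regularization strength
    l1_ratio=0.15,            # ElasticNet mixing ratio (ignored with penalty="l2")
    max_iter=1000,            # maximum number of passes over the data
    tol=1e-3,                 # stopping tolerance
    shuffle=True,             # reshuffle the training data between epochs
    n_jobs=-1,                # use all processors (multi-class one-vs-all only)
    random_state=1,           # fixed seed for repeatability
    learning_rate="optimal",  # schedule that does not rely on an initial rate
    power_t=0.5,              # exponent for inverse scaling (unused by "optimal")
    early_stopping=False,     # no held-out early stopping
)

# Synthetic data standing in for the TF-IDF features
X, y = make_classification(n_samples=200, n_features=20, random_state=0)
sgd.fit(X, y)
print(sgd.score(X, y))
```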
Voting and stacking ensemble techniques were implemented to enhance prediction performance by leveraging a combination of multiple models: logistic regression, MNB, and the SGD classifier. Both ensembles converted input data to NumPy arrays for compatibility, leveraged parallel processing to maximize computational efficiency, and provided detailed output during training. While the voting classifier aggregated predictions through a robust voting mechanism, the stacking classifier built a meta-model to learn from base model outputs, both effectively enhancing classification performance.
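The two ensembles can be sketched with scikit-learn as follows. Hard (majority) voting, a logistic regression meta-model, 3-fold stacking, and the synthetic data are assumptions, since these details are not given in the text. The `n_jobs` and `verbose` arguments that both classes accept are omitted to keep this toy run quiet.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import MinMaxScaler


def base_models():
    """Fresh copies of the three base learners named in the text."""
    return [
        ("lr", LogisticRegression(max_iter=1000)),
        ("mnb", MultinomialNB()),
        ("sgd", SGDClassifier(loss="squared_hinge", random_state=1)),
    ]


# Assumption: hard voting, because the squared-hinge SGD model
# does not expose class probabilities for soft voting.
voting = VotingClassifier(estimators=base_models(), voting="hard")

# Assumption: logistic regression as the meta-model, trained on
# cross-validated outputs of the base models.
stacking = StackingClassifier(
    estimators=base_models(),
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)

# Synthetic non-negative features standing in for TF-IDF vectors
X, y = make_classification(n_samples=150, n_features=20, random_state=0)
X = MinMaxScaler().fit_transform(X)

voting.fit(X, y)
stacking.fit(X, y)
print(voting.score(X, y), stacking.score(X, y))
```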
The MARBERT model (UBC-NLP/MARBERT) was implemented using the Transformers library for tokenization and sequence classification. TF-IDF vectorization was also applied to the text data for additional processing. The Al Khaleej dataset was processed using a Transformers tokenizer, with lambda functions applying three stemmers (Farasa, Assem, and Tashaphyne) to generate token IDs and encode the text. Tokens and input IDs were stored for use with a pre-trained BERT model. The dataset was then split into features and target variables (80/20 split with random state 42). Sequence padding was performed using the Keras Preprocessing library's "pad_sequences" function, with the maximum sequence length "MAX_LEN" set to 128, a value commonly used for BERT models. Attention masks were created to distinguish between actual tokens and padding, enabling BERT to focus on relevant input.
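To make the padding and masking step concrete, the sketch below reproduces its effect in plain Python rather than calling Keras. The helper name `pad_and_mask`, the padding ID of 0, padding and truncation at the end of the sequence, and the toy token IDs are all assumptions.

```python
MAX_LEN = 128  # maximum sequence length reported in the text
PAD_ID = 0     # assumption: ID used for padding positions


def pad_and_mask(sequences, max_len=MAX_LEN, pad_id=PAD_ID):
    """Truncate or right-pad each ID sequence to max_len and build
    the matching attention mask (1 = real token, 0 = padding)."""
    padded, masks = [], []
    for ids in sequences:
        ids = list(ids)[:max_len]       # truncate overly long sequences
        n_pad = max_len - len(ids)
        padded.append(ids + [pad_id] * n_pad)
        masks.append([1] * len(ids) + [0] * n_pad)
    return padded, masks


# Hypothetical token IDs; a short max_len keeps the output readable
toy_ids = [[2, 1045, 318, 3], [2, 77, 3]]
padded, masks = pad_and_mask(toy_ids, max_len=6)
print(padded)
print(masks)
```

The mask lets the model ignore padded positions during self-attention, so that sequence length differences do not affect the learned representation.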
The "train_test_split" function split the data into 90% training and 10% validation sets; reproducibility of the split was ensured by specifying a random state of 42. The input data, comprising labeled inputs and attention masks for both training and testing, was transformed into PyTorch tensors to facilitate efficient processing and smooth integration with neural networks.
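One way to carry out this split while keeping inputs, attention masks, and labels aligned is to pass all three to a single `train_test_split` call, as sketched below. The toy arrays are assumptions. The subsequent conversion to PyTorch tensors is omitted so that the example depends only on scikit-learn.

```python
from sklearn.model_selection import train_test_split

# Toy stand-ins: 20 padded sequences, their masks, and binary labels
input_ids = [[i, i + 1, 0, 0] for i in range(20)]
attention_masks = [[1, 1, 0, 0] for _ in range(20)]
labels = [i % 2 for i in range(20)]

# 90% training / 10% validation with a fixed random state of 42.
# Splitting all arrays in one call keeps the rows aligned.
(train_inputs, val_inputs,
 train_masks, val_masks,
 train_labels, val_labels) = train_test_split(
    input_ids, attention_masks, labels, test_size=0.1, random_state=42
)
print(len(train_inputs), len(val_inputs))
```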
Training utilized a batch size of 16 to enhance computation on both graphics processing units (GPUs) and central processing units (CPUs). The optimizer was then configured using "optim.AdamW" with a learning rate of 0.00001. Training was performed over 11 epochs with performance tracked throughout, followed by evaluation on the validation set using the "flat_accuracy" function to measure accuracy.
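The body of "flat_accuracy" is not listed in the paper. A common formulation, given here as an assumption, takes the arg-max over the class logits of a batch and compares the result with the flattened labels. The logits and labels below are hypothetical.

```python
import numpy as np


def flat_accuracy(logits, labels):
    """Fraction of rows whose arg-max class equals the true label."""
    preds = np.argmax(logits, axis=1).flatten()
    labels = np.asarray(labels).flatten()
    return float(np.sum(preds == labels) / len(labels))


# Hypothetical logits for four samples and three classes
logits = np.array([
    [2.0, 0.1, -1.0],
    [0.2, 1.5, 0.3],
    [0.0, 0.4, 0.9],
    [1.2, 1.1, 0.0],
])
labels = np.array([0, 1, 2, 1])
acc = flat_accuracy(logits, labels)
print(acc)
```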
During testing, the data underwent similar preprocessing, including tokenization, padding, and