IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 14, No. 6, December 2025, pp. 5311~5332
ISSN: 2252-8938, DOI: 10.11591/ijai.v14.i6.pp5311-5332

Journal homepage: http://ijai.iaescore.com

Large language models for pattern recognition in text data

Aknur Kossayakova1, Kurmashev Ildar1, Luigi La Spada2, Nida Zeeshan2, Makhabbat Bakyt3, Moldamurat Khuralay4, Omirzak Abdirashev4

1 Department of Information and Communication Technologies, M. Kozybayev North Kazakhstan University, Petropavlovsk, Kazakhstan
2 School of Computing, Engineering and the Built Environment, Edinburgh Napier University, Edinburgh, United Kingdom
3 Department of Information Security, L.N. Gumilyov Eurasian National University, Astana, Kazakhstan
4 Department of Space Technique and Technology, L.N. Gumilyov Eurasian National University, Astana, Kazakhstan

Article Info

Article history:
Received Oct 10, 2024
Revised Oct 18, 2025
Accepted Nov 8, 2025

ABSTRACT

Large language models (LLMs) are widely deployed in settings where both reliability and efficiency matter. We present a calibrated, seed-robust empirical comparison of an encoder fine-tuned model (bidirectional encoder representations from transformers (BERT)-base) and a decoder in-context model (generative pre-trained transformer (GPT)-2 small) on the Stanford question answering dataset v2.0 (SQuAD v2.0) and two general language understanding evaluation (GLUE) tasks: multi-genre natural language inference (MNLI) and Stanford sentiment treebank 2 (SST-2). Beyond accuracy, we assess reliability (expected calibration error with reliability diagrams and confidence–coverage analysis) and efficiency (latency, memory, throughput) under matched conditions and three fixed seeds. BERT-base yields higher accuracy and lower calibration error, while GPT-2 narrows the gap under few-shot prompting but remains more sensitive to prompt design and context length. Efficiency benchmarks show that decoder-only prompting incurs near-linear latency and memory growth with the number of k-shot exemplars, whereas fine-tuned encoders maintain a stable per-example cost. These findings offer practical guidance on when to prefer fine-tuning over prompting and demonstrate that reliability must be evaluated alongside accuracy for risk-aware deployment.

Keywords:
BERT-base
Computational efficiency
Expected calibration error
GPT-2
In-context learning
Large language models
Question answering

This is an open access article under the CC BY-SA license.

Corresponding Author:
Makhabbat Bakyt
Department of Information Security, L.N. Gumilyov Eurasian National University
Satpayev str. 2, Astana, Kazakhstan
Email: bakyt.makhabbat@gmail.com

1. INTRODUCTION
Pattern recognition in text data is a fundamental task within natural language processing (NLP) and artificial intelligence (AI). The transformer architecture, a type of hierarchical neural network, has been widely adopted for its effectiveness in modeling long-term dependencies. Generative pre-trained transformer (GPT), an advanced model from OpenAI, exemplifies the potential of transformers in diverse NLP tasks. The introduction of the transformer marked a paradigm shift in NLP, establishing principles like self-attention and bidirectional encoding. While existing literature reviews open-source models like bidirectional encoder representations from transformers (BERT) and GPT-2 (reviewed in section 2), a significant research gap remains: a critical examination of hierarchical pattern recognition mechanisms in opaque, commercially deployed models like GPT. This study explores how large language models (LLMs) recognize patterns, using GPT as a representative case, moving from basic linguistic features to deeper semantic representations [1]–[5]. A primary objective is to analyze the GPT architecture in detail,
examining how design choices, such as stacked transformer layers and multi-head self-attention, affect the model's ability to learn and identify patterns, thereby determining if strong performance stems from architectural innovation as much as from the scale of its training data. This includes an examination of the attention mechanism, which allows simultaneous consideration of multiple linguistic aspects and the formation of hierarchical representations. Another objective is to assess GPT's generalization by evaluating not only reported performance metrics but also the conditions under which generalization succeeds or fails, considering the challenge that reliance on massive data scale poses for reproducibility and computational cost. Finally, the study examines the real-world applications of GPT in domains like healthcare and law, acknowledging that its utility is counterbalanced by its susceptibility to factual inaccuracies, misinterpretations, and inappropriate outputs, which highlights the critical need for responsible deployment that prioritizes accuracy and interpretability.

This work makes the following contributions:
i) Calibrated evaluation framework for pattern recognition in text. We introduce a unified framework that evaluates accuracy, reliability (expected calibration error (ECE) with reliability diagrams, confidence–coverage analysis), and efficiency (latency, memory, and throughput) under matched conditions. This framework explicitly contrasts encoder fine-tuning with decoder in-context learning so that performance is interpreted alongside computational cost;
ii) Controlled, seed-robust comparison of encoder vs. decoder paradigms. Using public benchmarks (Stanford question answering dataset v2.0 (SQuAD v2.0), and the general language understanding evaluation (GLUE) tasks multi-genre natural language inference (MNLI) and Stanford sentiment treebank 2 (SST-2)), we run fixed-seed experiments (three seeds) to quantify not only accuracy but also calibration and selective prediction behavior. We expose prompt-sensitivity effects for the decoder-only model and show how these effects interact with calibration;
iii) Error taxonomy linked to calibration and abstention. We provide qualitative and quantitative analyses of failure modes (hallucinations/abstention errors, span-boundary errors, label confusions) and show that calibration-aware thresholds can trade coverage for reliability in deployment scenarios; and
iv) Compute-aware guidance for practitioners. We delineate regimes where fine-tuned encoders provide stable, compute-efficient performance versus regimes where in-context decoders are competitive, offering actionable guidance for budget-constrained applications.

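As an illustration of the reliability metric named in contribution i), the sketch below computes ECE with equal-width confidence bins and returns the per-bin points a reliability diagram would plot. It is a minimal sketch under assumptions: the function names, the choice of 10 bins, and the toy inputs are illustrative and are not taken from the paper's implementation.

```python
def reliability_bins(confidences, correct, n_bins=10):
    """Group predictions into equal-width confidence bins.

    Returns (mean_confidence, accuracy, count) for each non-empty bin,
    i.e. the points a reliability diagram would plot.
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hits = [0] * n_bins
    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidence must lie in [0, 1]")
        # Equal-width bins; a confidence of exactly 1.0 joins the last bin.
        b = min(int(conf * n_bins), n_bins - 1)
        counts[b] += 1
        conf_sums[b] += conf
        hits[b] += 1 if ok else 0
    return [
        (conf_sums[b] / counts[b], hits[b] / counts[b], counts[b])
        for b in range(n_bins)
        if counts[b] > 0
    ]


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE = sum over bins of (count / N) * |accuracy - mean confidence|."""
    total = len(confidences)
    if total == 0:
        raise ValueError("need at least one prediction")
    bins = reliability_bins(confidences, correct, n_bins)
    return sum(count / total * abs(acc - conf) for conf, acc, count in bins)


if __name__ == "__main__":
    # Toy inputs for illustration only; these are not results from the paper.
    conf = [0.95, 0.95, 0.55, 0.55]
    hit = [1, 1, 1, 0]
    print(reliability_bins(conf, hit))
    print(expected_calibration_error(conf, hit))
```

A lower value means that stated confidence tracks observed accuracy more closely; the bin count is a free parameter and should be reported with the metric.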
2. STATE OF THE ART
Transformer-based language models achieve remarkable success by learning hierarchical representations of text. Unlike older sequential architectures, transformers utilize multiple self-attention layers that integrate low-level linguistic features into higher-level abstractions. Empirical analysis shows these models capture tree-like syntactic structures within their latent spaces, aligning with linguistic theories of syntax and semantics. The multi-head self-attention mechanism is central to this hierarchical organization. Multiple heads allow simultaneous analysis of sentence parts, with some heads specializing in syntactic or semantic functions [6]–[11]. Although attention weights offer interpretive clues, they do not fully reveal model reasoning. Nevertheless, a consensus holds that transformers learn rich, structured representations in which earlier layers capture lexical features and deeper ones capture abstract semantics.

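To make the mechanism described above concrete, the following sketch implements scaled dot-product self-attention and forms several heads by slicing the feature dimension. It is a simplified illustration under stated assumptions: the learned query, key, value, and output projections of a real transformer are omitted (treated as identity), and all names and inputs are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V, one query row at a time."""
    d = len(keys[0])
    width = len(values[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(wj * v[c] for wj, v in zip(w, values)) for c in range(width)])
    return outputs, weights


def multi_head_self_attention(tokens, n_heads):
    """Split each token vector into n_heads slices, attend per slice, then concatenate.

    Assumption: learned Q/K/V/output projections are omitted (identity).
    """
    d_model = len(tokens[0])
    if d_model % n_heads != 0:
        raise ValueError("d_model must be divisible by n_heads")
    d_head = d_model // n_heads
    merged = [[] for _ in tokens]
    per_head_weights = []
    for h in range(n_heads):
        part = [t[h * d_head:(h + 1) * d_head] for t in tokens]
        out, w = attention(part, part, part)
        per_head_weights.append(w)
        for row, o in zip(merged, out):
            row.extend(o)
    return merged, per_head_weights


if __name__ == "__main__":
    toy = [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0], [1.0, 1.0, 0.0, 0.0]]
    mixed, head_weights = multi_head_self_attention(toy, n_heads=2)
    for h, w in enumerate(head_weights):
        print("head", h, w)
```

Each head produces its own weight matrix over the same tokens, which is what allows different heads to attend to different relations; as the text notes, inspecting these weights gives clues but not a faithful explanation of model behavior.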
Having established hierarchical representations, we examine scaling. Over the past five years, models have expanded dramatically, exemplified by OpenAI's GPT-3 (2020) (175 billion parameters), which demonstrated exceptional few-shot learning performance. This shift from BERT's bidirectional pretraining to GPT-3's large-scale autoregressive framework marked a transition from supervised fine-tuning toward prompt-based adaptation. While BERT-style encoders accumulate features for task-specific fine-tuning, large decoder-only models leverage scale for strong zero-/few-shot results. This reorients learning to the prompt, increasing adaptability but also sensitivity to prompt design. Scaling, however, involves trade-offs between data, parameters, and compute. Studies on compute-optimal training show that moderately sized models trained on substantially more tokens can match or exceed larger ones, redefining scaling as a balance rather than a race to maximal parameter counts [12]. Frameworks such as text-to-text transfer transformer (T5), pathways language model (PaLM), and large language model meta AI (LLaMA) broaden multilingual and multi-task coverage while maintaining a unified interface, enabling more efficient transfer and comparison [13], [14].

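The data-versus-parameters trade-off mentioned above can be illustrated with simple arithmetic. The sketch below computes training tokens per parameter from the approximate public figures listed in Table 1 (GPT-3: 175B parameters, about 300B tokens; Chinchilla: 70B parameters, about 1.4T tokens). The FLOPs estimate uses the widely quoted rule of thumb C ≈ 6·N·D, which is an assumption introduced here and not a formula from this paper; function names are hypothetical.

```python
def tokens_per_parameter(n_params, n_tokens):
    """Ratio of training tokens to model parameters."""
    return n_tokens / n_params


def approx_training_flops(n_params, n_tokens):
    """Rule-of-thumb training compute, C ~ 6 * N * D (an assumption, see lead-in)."""
    return 6.0 * n_params * n_tokens


# Approximate public figures as listed in Table 1: (parameters, training tokens).
MODELS = {
    "GPT-3": (175e9, 300e9),
    "Chinchilla": (70e9, 1.4e12),
}

if __name__ == "__main__":
    for name, (n_params, n_tokens) in MODELS.items():
        ratio = tokens_per_parameter(n_params, n_tokens)
        flops = approx_training_flops(n_params, n_tokens)
        print(f"{name}: {ratio:.1f} tokens/param, ~{flops:.2e} training FLOPs")
```

Under these figures the smaller model sees roughly an order of magnitude more tokens per parameter, which is the sense in which scaling is a balance between model size and data.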
To consolidate these developments, Table 1 compares representative transformer families (BERT, GPT-2/3/3.5/4, T5, LLaMA, PaLM, Chinchilla, and Claude) in terms of size, data, benchmarks, and reported limitations, illustrating scaling trends and reliability challenges. Benchmarks listed are representative; exact scores vary by variant and setup. Proprietary models disclose limited training details; values reflect public reports at the time of writing. LLM capabilities are defined by representation hierarchy, scaling efficiency, and reliability/interpretability. Models like T5 (11B), LLaMA (2023, up to 65B), Claude, PaLM, and Chinchilla optimize the size-data trade-off. Larger, more diverse training data generally correlate with better performance, following empirical scaling laws. However, scaling creates challenges: reproducibility
(e.g., GPT-3) and accessibility are major concerns. Performance may reflect training data artifacts, and scale alone does not guarantee genuine understanding. Architectural innovations like efficient-attention variants (e.g., BigBird and FlashAttention) extend context length and throughput, complementing parameter scaling. Since greater scale can amplify hallucinations and bias, the next section evaluates reliability and failure modes [15]–[20].

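Why context length matters for the cost of in-context learning can be sketched with back-of-the-envelope arithmetic: the prompt grows linearly with the number of exemplars, and so does the key/value cache a decoder keeps per token. The sketch assumes a GPT-2 small style configuration (12 layers, hidden size 768, 1,024-token context) with 32-bit values; the exemplar and query lengths are hypothetical, and the estimate ignores activations and model weights.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecoderConfig:
    """Assumed GPT-2 small style configuration (illustrative defaults)."""
    n_layers: int = 12
    d_model: int = 768
    max_context: int = 1024
    bytes_per_value: int = 4  # float32


def prompt_tokens(k_shots, tokens_per_exemplar, query_tokens):
    """Prompt length grows linearly with the number of in-context exemplars."""
    return k_shots * tokens_per_exemplar + query_tokens


def kv_cache_bytes(seq_len, cfg):
    """Keys and values (factor 2) stored for every layer and every token."""
    return 2 * cfg.n_layers * seq_len * cfg.d_model * cfg.bytes_per_value


def max_shots(tokens_per_exemplar, query_tokens, cfg):
    """Largest k whose prompt still fits in the context window."""
    return max(0, (cfg.max_context - query_tokens) // tokens_per_exemplar)


if __name__ == "__main__":
    cfg = DecoderConfig()
    for k in (0, 1, 4, 8):
        n = prompt_tokens(k, tokens_per_exemplar=100, query_tokens=50)
        print(k, n, kv_cache_bytes(n, cfg) / 2**20, "MiB")
    print("max shots:", max_shots(100, 50, cfg))
```

A fine-tuned encoder, by contrast, processes only the query at inference time, so its per-example cost does not depend on k.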
Table 1. Encoder/decoder LMs most cited in 2020–2025 literature

| Model (year) | Params/depth | Pretraining data (type and size) | Architecture | Representative benchmarks achieved | Reported limitations (bias, hallucination, and interpretability) |
|---|---|---|---|---|---|
| BERT base/large (2018) | 110M/340M; ~12/24 layers | BookCorpus + English Wikipedia; ~3.3B words (≈16–20 GB); English only | Encoder-only (bidirectional) | SOTA at release on GLUE, SQuAD 1.1/2.0, MNLI (with fine-tuning) | Not generative; limited context; pretrain–finetune mismatch; sensitivity to domain shift; interpretability limited (attention ≠ explanation) |
| GPT-2 (2019) | up to 1.5B; ~48 layers | WebText (~40 GB; filtered web pages) | Decoder-only (causal) | Strong zero-shot/unsupervised perplexity; early few-shot demos | Hallucinations; bias/toxicity from web data; exposure bias; no task grounding; limited safety tooling |
| T5 (2020) | up to 11B; depths vary by size | C4 (cleaned Common Crawl; hundreds of GB); multilingual variants exist | Encoder–decoder (text-to-text) | SOTA (at release) on GLUE, SuperGLUE, SQuAD, translation/summarization | Compute-intensive; brittleness to prompt framing; hallucinations in abstractive tasks; interpretability challenges |
| GPT-3 (2020) | 175B; ~96 layers | Mixture: filtered Common Crawl + WebText2 + Books1/2 + Wikipedia; ~300B tokens (public estimates) | Decoder-only | Few-shot SOTA across many tasks (translation, QA, reasoning prompts); strong zero-/few-shot performance | Reproducibility/access constraints; bias/toxicity; hallucinations; opaque internals; data provenance concerns |
| Chinchilla (2022) | 70B; depths per config | ~1.4T tokens (compute-optimal scaling) | Decoder-only | Strong perplexity and downstream transfer; influenced scaling practice | Proprietary training details; not instruction-tuned by default; still hallucinates |
| PaLM (2022) | up to 540B | Mixture of web, books, code, multilingual corpora (scale >1T tokens class) | Decoder-only (Pathways) | SOTA/near-SOTA on BIG-bench, reasoning/code tasks; strong multilingual | Very high compute/energy; bias and safety risks; hallucinations; limited transparency |
| GPT-3.5 (2022) | (undisclosed) | As GPT-3 + instruction/RLHF data | Decoder-only | Conversational ChatGPT; stronger coding and instruction following vs. GPT-3 | Hallucinations; confidentiality risks; partial disclosure of training; prompt sensitivity |
| GPT-4 (2023) | (undisclosed; multimodal variants) | Undisclosed mixture; extensive RLHF and safety tuning | Decoder-only (multimodal IO) | Top-tier on MMLU, code benchmarks, reasoning; long-context variants | Hallucinations (reduced, not eliminated); closed weights/data; interpretability opacity; cost/latency |
| LLaMA 1 (2023) | 7B–65B | ~1T tokens (mixture of web, books, code; English-centric) | Decoder-only | Competitive on many academic NLP tasks vs. larger closed models (parameter-efficient) | Safety alignment minimal by default; toxicity/bias risk; license/use restrictions |
| LLaMA 2 (2023) | 7B/13B/70B; 70B uses GQA | ~2T tokens; added safety/instruction tuning for chat variants | Decoder-only | Strong open baseline; competitive with GPT-3.5 on many tasks; long-context options | Hallucinations persist; reliance on curated web data; safety still evolving |
| PaLM 2 (2023) | family sizes (undisclosed) | Multilingual web/books/code; greater focus on efficiency and multilinguality | Decoder-only | Improved reasoning, translation, coding; enterprise APIs | Limited disclosures; hallucinations; benchmark dependence |
| Claude 2 (2023) | (undisclosed) | Proprietary web/books/code + RLHF/constitutional AI | Decoder-only (very long context) | Strong on safety-aligned tasks; competitive coding/QA; long-context retrieval | Hallucinations; dataset opacity; evolving safety guardrails |
| GPT-4o/variants (2024–2025) | (undisclosed) | As GPT-4 with expanded multimodal data | Decoder-only (multimodal, real-time) | Enhanced multimodal reasoning; real-time voice/vision | Same core risks (hallucination/bias), privacy/consent for multimodal data; opacity |

Notes. “Params/Depth” shown for headline versions; families include multiple sizes. Benchmarks listed are exemplars (GLUE, SQuAD, SuperGLUE, BIG-bench, MMLU, coding/eval suites). “Hallucinations” denotes factuality errors in generation; seen across models despite instruction/safety tuning.

The rapid scaling of LLMs has amplified both their fluency and their vulnerability to hallucinations, that is, outputs that appear plausible but are factually incorrect [21]–[23]. While scaling and instruction-tuning enhance few-shot performance, they also increase risks of bias and false generation, especially when prompts push models beyond their training distribution. Empirical audits in high-stakes domains such as healthcare and education show that LLMs can produce confident yet inaccurate statements, underscoring the importance of factual verification and robust domain guardrails [24]. Moreover, standard attention visualizations alone fail to provide faithful explanations; thus, comprehensive explainable artificial intelligence (XAI) frameworks remain essential to ensure interpretable reasoning [25], [26].

The advancement of LLMs relies on both architectural innovations and evolving training paradigms. Early decoder-only transformers enabled fluent few-shot learning but led to factual inconsistencies; bidirectional models like BERT improved classification by using both left and right context. To enhance reliability beyond benchmarks, researchers introduced instruction tuning and reinforcement learning from human feedback (RLHF), shifting focus from raw scale to training signal quality and alignment with human intent [12], [24]–[29]. However, these alignment techniques remain imperfect: even advanced systems continue to produce confident yet incorrect outputs, and proprietary alignment datasets limit reproducibility. GPT-4 demonstrates stronger consistency than GPT-3.5, attributed to refined alignment and scaling, though the proprietary nature of its training obscures causal factors. Most comparisons rely on observed performance rather than disclosed mechanisms.

Regarding model disclosures, it is typically possible to identify the model class (e.g., decoder-only vs. encoder–decoder), the presence of instruction tuning and RLHF, high-level training data categories (e.g., web text, books, and code), and general capabilities, limitations, and scale indicators (parameters or tokens) from system/model cards or public reports (e.g., the GPT-4 technical report, the LLaMA 2 technical report, the PaLM/PaLM 2 papers, Chinchilla, and the Claude model cards). However, many details are typically undisclosed, including the exact composition, licensing, or sampling strategies of the pretraining corpus, contamination controls, the provenance and size of preference/supervised fine-tuning datasets, optimizer schedules, and precise compute budgets [30]–[35].

Therefore, the paper uses hedged phrasing (e.g., “public reports suggest...”) and anchors claims in official reports, evaluating observed behavior rather than assuming access to proprietary internals. Ultimately, deeper networks and longer context windows expand representational depth, while alignment methods shape how that capacity is used; persistent issues like hallucination and bias underscore the need to integrate architecture, grounding, and transparent evaluation.

Public documentation supports several statements that can be made with confidence. It is typically possible to identify the model class (for example, decoder-only versus encoder–decoder) and the presence of instruction tuning and RLHF. High-level training data categories are often disclosed (such as web text, books, code, Wikipedia, and multilingual corpora), although these are categories rather than exact datasets. System or model cards frequently report capabilities and known limitations, including context window, supported modalities, safety tooling, and snapshot evaluations. Some sources also provide order-of-magnitude indicators of scale (parameters or tokens) and general descriptions of hardware and tooling. Where such information exists, we ground claims in official materials, for example the GPT-4 technical report/system card [36], the LLaMA 2 technical report [37], and public papers on PaLM/PaLM 2 [38] and Chinchilla [39], as well as Claude model and safety cards [40].

Several details are typically undisclosed. The exact composition of the pretraining corpus, the licensing breakdown, and the filtering or deduplication rules are rarely specified. Proportions and sampling strategies across sources, as well as contamination controls against benchmark leakage, are not usually made public. The provenance, size, and instructions of preference learning or supervised fine-tuning datasets are similarly opaque, as are optimizer schedules, curriculum strategies, and precise compute budgets. To reflect these constraints, we deliberately adopt hedged phrasing when referring to proprietary systems.
We use language such as “public reports suggest that training sources included multiple broad text categories; the exact composition is undisclosed” and “according to the technical report, the model employs instruction tuning and RLHF; details of the preference data are not public.” We therefore evaluate observed behavior under matched protocols rather than assume access to proprietary internals, and we anchor any specific claims in official technical reports or system cards whenever they are available.

Ultimately, architectural and alignment advances should be viewed as complementary. Deeper networks and longer context windows expand representational depth, while alignment methods shape how that capacity is used. Persistent issues like hallucination, bias, and calibration underscore the need to integrate architecture, retrieval-augmented grounding, and transparent evaluation. Clarifying these relationships is key to understanding when deeper representations and extended contexts translate into genuinely more reliable model behavior.

There is still a key unresolved question: while larger and more efficient language models have improved performance across many tasks, it remains unclear how their internal representations (particularly hierarchical structures) and their ability to process longer text inputs contribute to more reliable
Evaluation Warning : The document was created with Spire.PDF for Python.
I
n
t J Ar
tif
I
n
tell
I
SS
N:
2252
-
8
9
3
8
La
r
g
e
la
n
g
u
a
g
e
mo
d
els fo
r
p
a
tter
n
r
ec
o
g
n
itio
n
in
text
d
a
ta
(
A
kn
u
r
K
o
s
s
a
ya
ko
va
)
5315
o
u
tp
u
ts
.
Sp
ec
if
ically
,
th
e
r
e
is
lim
ited
u
n
d
er
s
tan
d
i
n
g
o
f
wh
eth
er
th
ese
m
o
d
el
f
ea
tu
r
es
h
e
lp
r
ed
u
ce
c
o
m
m
o
n
er
r
o
r
s
s
u
ch
as f
ac
tu
al
i
n
ac
cu
r
a
cies (
h
allu
cin
atio
n
s
)
an
d
b
iased
o
u
tp
u
ts
.
T
h
e
d
if
f
icu
lty
in
test
in
g
f
in
d
i
n
g
s
an
d
co
m
p
ar
in
g
m
o
d
els
f
air
ly
s
tem
s
f
r
o
m
th
e
lack
o
f
tr
an
s
p
ar
en
c
y
in
th
e
tr
ain
in
g
d
ata
an
d
m
eth
o
d
s
u
s
ed
b
y
s
o
m
e
lead
in
g
m
o
d
e
ls
.
Recent work outlines two views on reliability: a scale-first view (prioritizing larger decoders and longer contexts) and a training-signal/structure-first view (emphasizing bidirectional supervision, alignment, and retrieval-augmented grounding). Emerging evidence (2024–2025) suggests that calibration does not automatically improve with size and that robustness is more sensitive to the learning signal than to raw scale [41]–[43]. Our results support the structure-first view: fine-tuned encoders (BERT-base) achieve lower ECE and more stable accuracy, while decoder prompting (GPT-2) is sensitive to the prompt. This argues that reliability is a property of the procedure (fine-tuning plus calibration and abstention) as much as of the model, and that claims of "deeper = safer" require calibration-aware evaluation. The next section addresses this gap by describing our method, criteria for reliability analysis, and error focus.
3. METHOD
Our study on GPT's pattern recognition and reliability used a reproducible methodology combining theoretical analysis with empirical validation. We first analyzed the transformer architecture, as shown in Figure 1, focusing on hierarchical representations, self-attention, and skip connections to link design choices to performance and interpretability [44], [45]. In this architecture, each decoder layer applies masked multi-head self-attention over the prefix tokens (causal mask), followed by a position-wise feed-forward network; both sub-layers are wrapped by residual connections and layer normalization. Positional encodings inject order information, and the model autoregressively predicts the next token using only the decoder stack, illustrating how hierarchical representations arise from depth and attention.
Figure 1. Decoder-only transformer architecture (ChatGPT-class)
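The causal masking described above can be illustrated with a minimal single-head sketch. It is not the implementation used in this study: it omits the learned query/key/value projections, multiple heads, residual connections, and layer normalization, and the function name and toy inputs are illustrative assumptions.

```python
import math

def causal_self_attention(q, k, v):
    """Single-head scaled dot-product attention with a causal mask.

    q, k, v: lists of T vectors (lists of floats) with equal dimension.
    Returns (outputs, weights); weights[t][s] is 0.0 for every s > t.
    """
    d = len(q[0])
    outputs, weights = [], []
    for t in range(len(q)):
        # Position t may only attend to the prefix 0..t.
        scores = [sum(a * b for a, b in zip(q[t], k[s])) / math.sqrt(d)
                  for s in range(t + 1)]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(x - m) for x in scores]
        z = sum(exps)
        row = [e / z for e in exps] + [0.0] * (len(q) - t - 1)
        weights.append(row)
        outputs.append([sum(row[s] * v[s][j] for s in range(t + 1))
                        for j in range(len(v[0]))])
    return outputs, weights

# Toy embeddings; identity projections are assumed for brevity.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, w = causal_self_attention(x, x, x)
```

Because the first position can only attend to itself, its output equals its own value vector, which is a quick sanity check on the mask.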
Second, we examined GPT's training data (books, articles, web, and code), which, through its diversity, influences generalization and biases. Evaluating these sources helped identify factors influencing the model's generalization and limitations. Public reports suggest that GPT-class models are trained on large-scale mixtures of licensed, publicly available, and provider-curated text (for example, books, web pages, and code), but the exact composition and weighting are not publicly disclosed. Accordingly, any statements in this paper about GPT training data should be read as inference grounded in official system/technical reports and model or system cards rather than primary disclosure (for example, OpenAI's GPT-4 technical report and system card, Anthropic's Claude model cards, Meta's Llama 2 technical report, Google's PaLM/PaLM 2 documentation, and DeepMind's Chinchilla scaling analysis). Where precise details are unavailable, we intentionally use hedged phrasing (e.g., "public reports suggest…") and focus our claims on observable behavior under our experimental protocol rather than undocumented implementation details.
Finally, we conducted performance evaluations using standard NLP benchmarks such as GLUE and SQuAD v2.0, applying metrics including accuracy, recall, F1-score, and perplexity to assess understanding, generalization, and robustness. Together, these analyses (covering architecture, data, and evaluation) provide a comprehensive view of GPT's pattern recognition capabilities. For broader architectural and training paradigms, including instruction tuning, RLHF, and closed-source limitations, refer to section 2.
3.1. Experimental setup
The experimental setup compares two models with distinct design philosophies: BERT-base (a bidirectional encoder) and GPT-2 small (an autoregressive decoder). We assessed model behavior using the SQuAD v2.0 (question answering) and GLUE (MNLI, SST-2) benchmarks, standardizing inputs to 512 tokens with model-specific tokenization (WordPiece/BPE) for comparability. BERT-base was fine-tuned for three epochs using AdamW (learning rate 3e-5, batch size 32) with early stopping; training curves confirm convergence (see Figure 2). Figure 2 shows that the training and validation traces converge rapidly within three epochs, with the validation curve flattening while the training loss continues to decrease; shaded bands represent mean ± s.d. across three seeds. GPT-2 was evaluated in zero-shot and few-shot settings using designed prompts [46]. Metrics included exact match (EM) and F1 for SQuAD, and accuracy for GLUE.
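For readers unfamiliar with the SQuAD metrics, the sketch below shows how EM and token-level F1 are commonly computed for a single prediction against a single reference. It follows the usual normalization convention (lowercasing, removing punctuation and English articles) and is an illustration under that assumption, not the exact scoring script used in our runs.

```python
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse spaces."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(gold))

def token_f1(prediction, gold):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    gold_tokens = normalize(gold).split()
    # SQuAD v2.0 convention: an empty gold answer marks an unanswerable
    # question, so only an empty prediction is counted as correct.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

When a question has several reference answers, the usual practice is to take the maximum score over references before averaging over questions.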
Error analyses characterized reasoning biases, such as MNLI confusion types (neutral vs. contradiction) and SST-2 asymmetric errors (false negatives vs. false positives). Prompt fragility was assessed using Figure 3, where semantically equivalent prompts yield materially different scores, especially in zero-shot QA, and few-shot prompts reduce variance but do not eliminate sensitivity. Label-wise failures were characterized by confusion matrices for BERT-base (MNLI, Figure 4) and GPT-2 few-shot (SST-2, Figure 5). Figure 4 shows that correct predictions dominate the diagonal but neutral and contradiction are confused more often than entailment, and Figure 5 shows asymmetry in off-diagonal cells indicating polarity flips consistent across seeds, underscoring prompt-sensitive reliability limits.
Figure 2. Learning curves for BERT fine-tuning
Figure 3. Prompt sensitivity of GPT-2
Figure 4. Confusion matrix for BERT-base on MNLI-m (row-normalized, mean of three seeds)
Figure 5. Confusion matrix for GPT-2 (few-shot) on SST-2
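A row-normalized confusion matrix averaged over seeds, as in the Figure 4 caption, can be computed along the following lines. This is a sketch: the function names are illustrative and the label order is an assumption supplied by the caller.

```python
def row_normalized_confusion(y_true, y_pred, labels):
    """Confusion matrix in which each row (true label) sums to 1."""
    index = {label: i for i, label in enumerate(labels)}
    counts = [[0] * len(labels) for _ in labels]
    for t, p in zip(y_true, y_pred):
        counts[index[t]][index[p]] += 1
    normalized = []
    for row in counts:
        total = sum(row)
        # A label absent from y_true yields an all-zero row.
        normalized.append([c / total if total else 0.0 for c in row])
    return normalized

def mean_matrix(matrices):
    """Element-wise mean of equally sized matrices (one per seed)."""
    n = len(matrices)
    size = len(matrices[0])
    return [[sum(m[i][j] for m in matrices) / n for j in range(size)]
            for i in range(size)]
```

Row normalization makes the diagonal read as per-class recall, so off-diagonal cells show where each true label tends to migrate.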
Figure 6 frames the methodological link between transformer depth and reliability, showing that accuracy generally improves with depth, reflecting richer hierarchical representations, but the marginal benefit diminishes as computational cost rises. The curve motivates the depth–efficiency trade-offs discussed in our protocol and foreshadows the reliability analyses in the Results section.
Our own experiments were conducted primarily on an NVIDIA RTX 3090 GPU (24 GB) with an AMD Ryzen 9 5950X, and repeated on an NVIDIA A100 (40 GB) to validate consistency. All runs used Ubuntu 22.04, Python 3.10, PyTorch 2.0, and Hugging Face transformers 4.36, with CUDA 12.1 for GPU acceleration. Figure 7 summarizes computational cost relative to accuracy, illustrating efficiency–reliability trade-offs. The Pareto frontier highlights settings that maximize metric per GPU hour. Fine-tuned BERT generally achieves stronger efficiency than prompt-only GPT-2 in our setup. By integrating multiple tasks, models, and evaluation conditions, this setup directly tests whether architectural choices (bidirectional vs. autoregressive design, layer depth, and context length) influence reliability, factual precision, and error patterns.
Figure 6. Accuracy vs. number of layers
Figure 7. Compute–performance trade-offs
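A Pareto frontier over compute cost and task score can be extracted with a simple sweep, sketched below. The tuple layout (name, GPU hours, score) is an assumption made for illustration and is not the data behind Figure 7.

```python
def pareto_frontier(points):
    """Return the non-dominated settings.

    points: list of (name, gpu_hours, score) tuples, where lower
    gpu_hours and higher score are both preferred.
    """
    # Sort by cost; for equal cost, put the higher score first.
    ordered = sorted(points, key=lambda p: (p[1], -p[2]))
    frontier, best = [], float("-inf")
    for name, hours, score in ordered:
        # Keep a setting only if it beats every cheaper setting.
        if score > best:
            frontier.append((name, hours, score))
            best = score
    return frontier
```

A setting is kept only if no cheaper (or equally cheap) setting reaches at least the same score, which is the sense in which frontier points are efficient.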
3.2. Evaluation protocols
For each model–task combination, three independent runs were conducted using fixed random seeds (42, 43, and 44) to assess consistency and reliability. Averaging results across runs provided the mean ± standard deviation, revealing stability in performance and sensitivity to initialization. Figure 8 visualizes score distributions across seeds, establishing uncertainty bands for the reported metrics: violin/box plots illustrate the spread of accuracy (GLUE tasks) and F1/EM (SQuAD v2.0) across seeds for BERT and GPT-2, with limited dispersion indicating stable training and evaluation and outliers aligning with harder subsets. Table 2 presents the SQuAD v2.0 results (EM and F1). BERT-base fine-tuning yields strong, stable span extraction, while GPT-2 improves under few-shot prompting but lags on unanswerable cases. Variability bands are narrow, indicating reproducible runs under the stated protocol. Table 3 summarizes the GLUE outcomes (SST-2, MNLI-m, and MNLI-mm). BERT-base fine-tuning consistently outperforms GPT-2; few-shot prompting narrows the gap on SST-2 but only modestly improves MNLI. Small standard deviations indicate stable training/evaluation across seeds. Together, these benchmarks compare the behavior of BERT-base (fine-tuned) and GPT-2 (zero- and few-shot) under controlled conditions, illustrating how encoder versus decoder objectives affect robustness across factual and inferential tasks.
Figure 8. Seed sensitivity of evaluation metrics
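The mean ± standard deviation summary over seeds can be reproduced with the standard library, as sketched below. Whether the sample (n − 1) or population estimator applies is not stated in the protocol, so the sample estimator is assumed here; the function names are illustrative.

```python
import statistics

def mean_and_std(scores):
    """Mean and sample standard deviation of per-seed scores."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)  # n - 1 denominator (assumption)
    return mean, std

def format_mean_std(scores, digits=2):
    """Render per-seed scores as 'mean ± std'."""
    mean, std = mean_and_std(scores)
    return f"{mean:.{digits}f} ± {std:.{digits}f}"
```

With only three seeds the standard deviation is itself a rough estimate, so narrow bands should be read as an indication of stability rather than a tight bound.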
Table 2. SQuAD v2.0 results (EM/F1, mean ± std over three seeds)
| Model | EM (mean ± std) | F1 (mean ± std) |
|---|---|---|
| BERT-base | 78.8 ± 0.4 | 81.3 ± 0.5 |
| GPT-2 (zero-shot) | 9.7 ± 1.0 | 19.5 ± 1.5 |
| GPT-2 (few-shot) | 15.3 ± 0.9 | 29.8 ± 1.1 |
Notes. EM = exact match; F1 = token-level overlap. Averages over 3 runs (seeds 42, 43, 44). Best checkpoint per run selected by validation F1.
Table 3. GLUE benchmark results (mean ± std)
| Model | SST-2 accuracy (mean ± std) | MNLI-m (mean ± std) | MNLI-mm (mean ± std) |
|---|---|---|---|
| BERT-base | 93.3 ± 0.3 | 85.0 ± 0.4 | 84.3 ± 0.5 |
| GPT-2 (zero-shot) | 75.5 ± 0.9 | 54.1 ± 1.1 | 53.4 ± 1.2 |
| GPT-2 (few-shot) | 83.1 ± 0.7 | 65.2 ± 0.8 | 64.0 ± 0.9 |
Notes. MNLI-m = matched; MNLI-mm = mismatched. Averages over 3 runs (seeds 42, 43, 44). Best checkpoint per run selected by validation accuracy.
To minimize randomness, identical seeds were fixed across all software components (Python, NumPy, PyTorch, and CUDA) for deterministic data handling; observed fluctuations were minimal, confirming protocol reliability. The best checkpoint for each model was selected based on validation performance using early stopping to prevent overfitting.
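Seed fixing across components is typically done with a small helper like the one below. This is a sketch rather than our exact script: the helper name is illustrative, and NumPy and PyTorch are seeded only if they are installed.

```python
import random

def set_seed(seed):
    """Seed Python, NumPy, and PyTorch (CPU and CUDA) where available."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        # Safe to call without a GPU; it is ignored in that case.
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed

for seed in (42, 43, 44):
    set_seed(seed)
    # ... launch one training or evaluation run per seed here ...
```

Seeding alone does not guarantee bit-identical GPU results, since some CUDA kernels are non-deterministic; this is one reason small residual fluctuations can remain.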
Beyond accuracy, model calibration was evaluated using reliability diagrams and ECE, quantifying the alignment between predicted confidence and actual accuracy (Figure 9). The reliability diagrams in Figure 9 indicate that BERT tends to be slightly under-confident on MNLI while GPT-2 is over-confident in zero-shot QA; ECE values summarize miscalibration, and temperature scaling curves (inset) show potential correction without retraining. Statistical significance was tested via paired t-tests (p < 0.05), and effect sizes used Cohen's d.
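The paired t statistic and a paired-samples effect size can be computed as sketched below. Two assumptions are made for illustration: the pairing unit (seeds or items) is left to the caller, and the effect size is the d_z variant of Cohen's d (mean difference divided by the standard deviation of the differences). The p-value is not computed here; it follows from a t distribution with n − 1 degrees of freedom, for example via `scipy.stats.ttest_rel`.

```python
import math
import statistics

def paired_t_and_cohens_d(a, b):
    """Paired t statistic, Cohen's d_z, and degrees of freedom.

    a, b: equal-length lists of paired scores. Raises ZeroDivisionError
    if all differences are identical (zero variance).
    """
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample std of the differences
    t_stat = mean_diff / (sd_diff / math.sqrt(n))
    d_z = mean_diff / sd_diff
    return t_stat, d_z, n - 1
```

Note that t and d_z differ only by a factor of sqrt(n), so with very few pairs a large effect size can coexist with a test of low power.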
Figure 10 provides a Bland–Altman plot to identify systematic bias between BERT and GPT-2 predictions. It indicates the degree of inter-model agreement: the mean bias favors BERT on inference items and GPT-2 on short-context QA cases, and wider limits of agreement on adversarial subsets reveal heterogeneous generalization behavior. Results (Tables 2 and 3) consistently show that BERT-base achieves higher accuracy and stability, while GPT-2 few-shot prompting reduces but does not eliminate the performance gap, especially on SST-2 and MNLI.
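The quantities plotted in a Bland–Altman analysis are the per-item mean and difference of two measurements, the mean difference (bias), and the limits of agreement. The sketch below uses the conventional bias ± 1.96 standard deviations; which per-item scores are compared is left to the caller and is an assumption here.

```python
import statistics

def bland_altman(a, b):
    """Return (means, diffs, bias, (lower, upper)) for paired scores."""
    diffs = [x - y for x, y in zip(a, b)]
    means = [(x + y) / 2 for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    # Conventional 95% limits of agreement.
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return means, diffs, bias, limits
```

A positive bias means the first model scores higher on average; wide limits indicate that the two models disagree item by item even when their averages are close.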
Figure 9. Calibration analysis: reliability diagrams and temperature scaling
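ECE and temperature scaling can be sketched as follows. The number of bins and the equal-width binning scheme are assumptions (the protocol above does not fix them), and in practice the temperature is fitted on a held-out validation set rather than chosen by hand.

```python
import math

def expected_calibration_error(confidences, correct, n_bins=10):
    """Weighted mean gap between accuracy and confidence over bins."""
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)  # conf == 1.0 -> last bin
        bins[idx].append((conf, ok))
    ece = 0.0
    for items in bins:
        if not items:
            continue
        avg_conf = sum(c for c, _ in items) / len(items)
        accuracy = sum(1 for _, ok in items if ok) / len(items)
        ece += (len(items) / n) * abs(accuracy - avg_conf)
    return ece

def softmax_with_temperature(logits, temperature):
    """Softmax of logits / T; T > 1 softens over-confident outputs."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]
```

Temperature scaling leaves the predicted label unchanged, because dividing logits by a positive constant preserves their order; only the confidence attached to the prediction moves.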
3.3. Error taxonomy and analysis
To systematically evaluate model failures, we developed an error taxonomy tailored to each benchmark, linking error types to architectural differences and training strategies, building on questions raised in section 2 about model structure and reliability. For SQuAD v2.0 (question answering), three major error categories were identified: hallucinations (confident but unsupported answers), abstention failures (answering despite no correct span), and span-boundary errors (partial overlaps). These categories reveal whether a model recognizes factual limits or merely guesses.
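One way to operationalize these three categories is a rule-based labeller such as the sketch below. The function name, the confidence threshold, the precedence of the rules, and the extra buckets ("missed_answer", "other_wrong") are all hypothetical choices made for illustration; they are not the annotation procedure used in this study, which also involved human verification.

```python
def classify_qa_error(prediction, gold_answers, confidence, threshold=0.5):
    """Label one SQuAD v2.0 prediction.

    prediction: predicted answer string ('' means the model abstained).
    gold_answers: reference strings ([] means unanswerable).
    confidence: model confidence in [0, 1].
    """
    pred = prediction.lower().split()
    golds = [g.lower().split() for g in gold_answers]
    if not golds:
        # Unanswerable question: any non-empty answer is an abstention failure.
        return "correct" if not pred else "abstention_failure"
    if not pred:
        return "missed_answer"  # abstained although an answer exists
    if any(pred == g for g in golds):
        return "correct"
    if any(set(pred) & set(g) for g in golds):
        return "span_boundary"  # partial overlap with a reference span
    # No overlap at all: confident cases count as hallucinations.
    return "hallucination" if confidence >= threshold else "other_wrong"
```

The categories can overlap in practice (a confident answer to an unanswerable question is arguably also a hallucination), so any such labeller needs an explicit precedence order.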
Figure 11 plots accuracy versus coverage under confidence-based abstention, operationalizing "knowing what it does not know". As the system abstains on uncertain items, accuracy on the remaining set rises sharply, revealing actionable operating points for deployment. GPT-2 zero-/few-shot curves exhibit steeper drops than fine-tuned BERT.
Figure 10. Bland–Altman agreement between BERT and GPT-2 predictions (MNLI, 3 seeds)
Figure 11. Confidence–coverage trade-off in QA
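An accuracy–coverage curve under confidence-based abstention can be traced by answering only the most confident items first, as in the sketch below. The function name is illustrative, and the model's confidence score is assumed to be available per item.

```python
def accuracy_coverage_curve(confidences, correct):
    """List of (coverage, accuracy) points, most confident items first.

    Coverage is the fraction of items answered; accuracy is measured
    only on the answered items.
    """
    order = sorted(range(len(confidences)),
                   key=lambda i: confidences[i], reverse=True)
    curve, hits = [], 0
    for rank, i in enumerate(order, start=1):
        hits += 1 if correct[i] else 0
        curve.append((rank / len(order), hits / rank))
    return curve
```

Each point corresponds to one abstention threshold, so a deployment target such as a minimum accuracy can be translated directly into the coverage it permits.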
For natural language inference (MNLI), we tracked label confusions (entailment, contradiction, neutral). Figure 12 visualizes these label migrations. Figure 12(a) shows fine-tuned BERT-base, where correct diagonal flows dominate (65–72%) and systematic but limited drifts occur (e.g., entailment → neutral at ≈18%). Figure 12(b) shows GPT-2 small in the zero-/few-shot settings, where diagonal accuracy drops to 51–53% and stronger off-diagonal flows emerge: entailment often collapses into neutral, contradiction is misread as entailment, and neutral cases split almost evenly between the extremes. Fine-tuned BERT-base maintains strong diagonal dominance (stability), while GPT-2 exhibits higher off-diagonal flow (broader instability), highlighting how supervised fine-tuning stabilizes reasoning whereas prompting amplifies uncertainty.
For sentiment analysis (SST-2), we defined polarity flips (reversal of positive/negative sentiments), often arising from negations or intensifiers. Across all tasks, we monitored overconfidence (certainty exceeding correctness) and prompt sensitivity (rewording changes GPT-2's output); these metrics quantify reliability beyond raw accuracy. Hybrid evaluation combined automated detection with human verification, using confusion matrices standardized across seeds. This taxonomy links reliability to design choices: bidirectional encoding versus autoregressive decoding and fine-tuning versus prompting.
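Prompt sensitivity and polarity flips can be summarized with simple statistics, sketched below. The specific measures (score range and population standard deviation across prompt variants, and the fraction of items whose label changes between two prompts) are illustrative assumptions, not the exact definitions used in our analysis.

```python
import statistics

def prompt_sensitivity(scores_by_prompt):
    """Spread of a metric across semantically equivalent prompts.

    scores_by_prompt: dict mapping a prompt identifier to its score.
    """
    values = list(scores_by_prompt.values())
    return {
        "range": max(values) - min(values),
        "std": statistics.pstdev(values),
    }

def flip_rate(preds_a, preds_b):
    """Fraction of items whose predicted label differs between two runs."""
    flips = sum(1 for a, b in zip(preds_a, preds_b) if a != b)
    return flips / len(preds_a)
```

A low flip rate with a large score range would suggest that a few items drive the sensitivity, whereas a high flip rate points to broadly unstable predictions.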