IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 14, No. 4, August 2025, pp. 3311~3323
ISSN: 2252-8938, DOI: 10.11591/ijai.v14.i4.pp3311-3323
Journal homepage: http://ijai.iaescore.com

Hybrid convolutional vision transformer for extrusion-based 3D food-printing defect classification

Cholid Mawardi1,2, Agus Buono1, Karlisa Priandana1, Herianto3
1Computer Science Study Program, School of Data Science, Mathematics, and Informatics, Institut Pertanian Bogor, Bogor, Indonesia
2Department of Graphics Engineering, Faculty of Industrial Technology, Politeknik Negeri Media Kreatif, Jakarta, Indonesia
3Department of Mechanical and Industrial Engineering, Universitas Gadjah Mada, Yogyakarta, Indonesia

Article Info

Article history: Received Dec 12, 2024; Revised Jun 11, 2025; Accepted Jul 10, 2025

ABSTRACT

Deep learning is generally used to perform remote monitoring of three-dimensional (3D) printing results, including extrusion-based 3D food printing. One of the widely used deep learning algorithms for defect detection in 3D printing is the convolutional neural network (CNN). However, the process requires high computational costs and a large dataset. This research proposes the Con4ViT model, a hybrid model that combines the strengths of vision transformer with the inherent feature extraction capabilities of CNN. The locally extracted features in the CNN were merged using the transformers' global features with four transformer encoder blocks. The proposed model has a smaller number of parameters compared to other lightweight pre-trained deep learning models such as VGG16, VGG19, EfficientNetB2, InceptionV3, and ResNet50. Thus, the proposed model is simplified. Simulations were conducted to classify defect and non-defect images obtained from the printing results of a developed extrusion-based 3D food printing device. Simulation results showed that the model produced an accuracy of 95.43%, higher than the state-of-the-art techniques, i.e., VGG16, VGG19, MobileNetV2, EfficientNetB2, InceptionV3, and ResNet50, with accuracies of 77.88%, 86.30%, 82.95%, 90.87%, 84.62%, and 93.83%, respectively. This research shows that the proposed Con4ViT model can be used for 3D food printing defect detection with high accuracy.

Keywords: 3D food printing; Convolutional neural network; Hybrid convolutional; Image classification; Vision transformer

This is an open access article under the CC BY-SA license.

Corresponding Author:
Karlisa Priandana
Computer Science Study Program, School of Data Science, Mathematics, and Informatics, Institut Pertanian Bogor
Bogor 16680, Indonesia
Email: karlisa@apps.ipb.ac.id

1. INTRODUCTION

Three-dimensional (3D) food printing technology is an innovation that enables the creation of foods with complex shapes and high precision using specialized 3D printers [1]. This technology works similarly to conventional 3D printers but uses food materials as 'ink' to print food [2]. Food printing offers a range of advantages that enhance the culinary experience. It enables personalization, allowing for unique shapes, textures, and flavors according to individual preferences. In addition, since it uses precise techniques, it reduces food waste [3]. Food printing also encourages creativity in the kitchen, allowing new combinations of ingredients and designs that are impossible with conventional methods [4]. Luxury restaurants use 3D food printing to create unique dishes with artistic presentation [5], [6]. In the future, it has the potential to be used in the food industry for uniform and efficient mass production of food [7].

Defects in 3D food printing can affect the quality, appearance, and texture of the printed foods [8]. The causes of these defects can vary from technical issues with the printer and errors in printing parameters to the nature of the food material used [9]. Resolving these defects requires adjustments to the mold design, temperature, printing speed, and material settings [10]. Early detection of these defects can save the food materials used in 3D food printing by streamlining the process [11]. Early defect detection can be done remotely by taking an image of the food printing results with a camera. The images are then classified as defect or non-defect.

A widely used method for defect and non-defect classification is the deep learning convolutional neural network (CNN) [12]. However, the CNN method requires high computing costs and large data. Several models have been developed to reduce the computational cost of CNN, namely lightweight CNN models such as VGG16 [13], VGG19 [14], MobileNetV2 [15], EfficientNetB2 [16], InceptionV3 [17], and ResNet50 [18]. These algorithms use a parameter reduction approach to lower CNN computational costs. Another approach that has not been explored is simplifying the feature extraction process from large data.

Vision transformer (ViT) is an algorithm that can perform global feature extraction from large amounts of data [19]. The ViT method has been proven to be able to classify tomography images for pulmonary nodule detection and diagnosis with good accuracy [20]. ViT offers several advantages over traditional CNN for computer vision tasks, including improved efficiency, scalability, transfer learning, performance, and flexibility [21]. With further research and development, ViT has the potential to become a powerful tool for a wide range of computer vision applications, such as crop pest image recognition [22].

In this research, we propose a hybrid model of CNN and ViT to combine the ability of local feature extraction in CNN with global feature extraction in ViT. The proposed method is called Con4ViT, which combines CNN with four transformer encoder blocks of ViT. Simulations were conducted to prove the performance of the proposed Con4ViT method for the developed extrusion-based 3D food printing device. Con4ViT is used to classify food printing images into two classes, namely defect and non-defect. Then, the proposed Con4ViT method is compared with the state-of-the-art techniques that have been mentioned, namely VGG16, VGG19, MobileNetV2, EfficientNetB2, InceptionV3, and ResNet50.

The rest of this paper is structured as follows. Section 2 describes the related works about deep learning models in 3D printing. Section 3 presents the methodology of this research, including the dataset acquired from a developed extrusion-based 3D food printing device and the proposed architecture of Con4ViT. Section 4 provides the results and discussion of the model performance, and section 5 presents the main conclusions of this work.

2. RELATED WORKS

Baumann and Roller [23] conducted early research on defect control in 3D printing machines. The study used computer vision for fault diagnosis, dividing the defect classification into five classes, namely detachment, missing material flow, deformed object, surface errors, and deviation from the model. Three of the five classes were successfully detected, with a detection rate of 60 to 80% [23]. Rachmawati et al. [24] introduced data augmentation for 3D printing to vary the amount of data to help reduce overfitting. The study used a regular CNN, and the accuracy of the study was 95.45%. Other studies that utilize deep learning in 3D printing are summarized in Table 1.

As seen in Table 1, previous research using the ResNet50 model with a 3D food printing image dataset of chocolate objects resulted in an accuracy of 93.80% [25]. The study used pre-trained InceptionV3 and ResNet50 models with additional hyperparameter tuning on learning rate to obtain the optimum value. Then, the research conducted by Paraskevoudis et al. [26] monitored defects in fused filament fabrication (FFF) 3D printing. The study used the VGG16 pre-trained architecture model as a base network with 16 convolutional layers and 3 fully connected layers. The resulting model accuracy is 92.70%.

Table 1. Performance comparison of different deep learning models in 3D printing

| Works | Machine (material) | Method | Accuracy (%) | Sensitivity (%) | Precision (%) | F1-score (%) |
|---|---|---|---|---|---|---|
| Rachmawati et al. [24] | 3D Printing | CNN + MobileNet | 95.45 | - | - | - |
| Baumgartl et al. [27] | 3D Printing | CNN + Classic ML | 96.80 | 96.80 | 96.52 | 96.42 (kappa score) |
| Mawardi et al. [25] | 3D food printing | ResNet50 | 93.80 | 96.56 | 96.84 | 96.70 |
| Paraskevoudis et al. [26] | 3D Printing | VGG16 | 92.70 | 92.00 | 75.01 | 82.10 |

Defect classification in 3D food printing generally exhibits lower accuracy compared to traditional 3D printing. This discrepancy arises from the differences in printing materials, which pose challenges for computer vision systems. While 3D food printing utilizes soft materials like chocolate and pasta, traditional 3D printing employs more rigid materials that are easier to analyze for object detection and image classification [28], [29]. Given these challenges, deep learning models are considered well-suited for defect detection in 3D food printing.

3. METHODOLOGY

This section outlines the process for detecting and classifying print results from 3D food printing devices into two categories: defect and non-defect. Classification is performed using a newly proposed algorithm, a hybrid model that combines a CNN with a ViT, on images captured from the 3D food printing device. The defect detection process is illustrated in Figure 1.

The first stage involves data collection, where videos of the food being printed are recorded using an Ender-V3 3D printer equipped with a Luckybot extruder. Video capture is facilitated by OctoPrint plugins. These videos are then segmented into individual image frames, which are manually labeled as either defect or non-defect based on the actual condition of the printed results. Then, data preprocessing is conducted on the labeled images. The dataset is then split into 80% training data and 20% validation data. The next step involves developing the hybrid model, which integrates CNN and ViT components through several transformer encoder blocks. During the training phase, validation is performed to mitigate the risk of overfitting. To assess the model's performance, a confusion matrix is employed.

Figure 1. The research methodology

3.1. Data collection

The initial process for capturing images of 3D food printing involves using a Logitech C270 webcam to record the printing procedure, as illustrated in Figure 2. A Raspberry Pi single-board computer serves as the interface between the 3D food printing device and the computer, enabling the detection and recording of the printing process. Various printing tasks are conducted using different designs, from which two outcomes are selected: one representing a defect and the other a non-defect. After the printing is completed, the recordings are segmented into individual images.

For the defect category, which includes the failed print video with a duration of 2 minutes and 11 seconds, images are extracted every second, resulting in a total of 262 images. Similarly, the non-defect category, which has a duration of 2 minutes and 12.5 seconds, produces 265 images. The images from the defect process are categorized as defect samples, while those from the non-defect process are classified as non-defect samples. In addition, the dataset is supplemented with images from a regular 3D printing device. This inclusion adds variety to the dataset and enhances data representation, ultimately improving the accuracy of the model.

Figure 2. Experimental setup for defect detection in 3D food printing

3.2. Data preprocessing

This section describes the data preprocessing steps necessary to prepare the images for effective processing by the deep learning model. The preprocessing techniques include resizing, rescaling, and data augmentation. Initially, the 3D food printing images are captured from the camera and resized to 128×128 pixels [30]. Following this, image scaling is applied to adjust the pixel values from the range of [0, 255] to [0, 1] [31]. This rescaling is crucial for preventing pixel values from becoming excessively large or small, which can lead to numerical instability and slow down the computational process [32]. All experiments are conducted using the Keras library in Python, utilizing an A100 GPU with 150 GB of memory.

In this research, various data augmentation techniques are employed to enhance the dataset and facilitate the hybrid modeling process between CNN and ViTs. These techniques are designed to mitigate overfitting and improve the overall accuracy of the model. The augmentation methods used include width shift, height shift, zoom range, flip, and rotation range [33]. A complete summary of the augmentation techniques applied to the 3D food printing image dataset is presented in Table 2.

Table 2. Values and parameters of the applied transformation techniques

| Parameters | Value of parameters | Action |
|---|---|---|
| Width shift range | 0.2 | Randomly shifts the image horizontally by up to 20% of its width. |
| Height shift range | 0.2 | Randomly shifts the image vertically by up to 20% of its height. |
| zoom_range | 0.2 | Randomly zooms the image by up to 0.2 around the center. |
| shear_range | 0.2 | Applies a random shear to the image with an intensity of 0.2. |
| rotation_range | 10 | Randomly rotates the image between -10 and 10 degrees. |
| rescale | 1./255 | Scales (normalizes) the image pixel values to fall within the range of 0 to 1, from an initial value range of 0 to 255. |
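
The settings in Table 2 can be gathered into a single configuration. The sketch below is an illustration under stated assumptions, not the authors' code: the names `AUGMENTATION` and `rescale_pixels` are hypothetical, and the dictionary keys are chosen to mirror the argument names of Keras's `ImageDataGenerator`, which the text does not name explicitly (it only states that Keras was used). Flipping is mentioned in the text but not listed in Table 2, so it is left out here.

```python
# Augmentation settings transcribed from Table 2.
# Key names follow Keras conventions; that mapping is an assumption.
AUGMENTATION = {
    "width_shift_range": 0.2,   # horizontal shift, fraction of width
    "height_shift_range": 0.2,  # vertical shift, fraction of height
    "zoom_range": 0.2,          # random zoom amount
    "shear_range": 0.2,         # shear intensity
    "rotation_range": 10,       # degrees, sampled in [-10, 10]
    "rescale": 1.0 / 255,       # maps [0, 255] to [0, 1]
}


def rescale_pixels(pixels, factor=AUGMENTATION["rescale"]):
    """Map raw 8-bit pixel values in [0, 255] to floats in [0, 1]."""
    return [value * factor for value in pixels]


scaled = rescale_pixels([0, 128, 255])
```

In Keras versions that still ship `ImageDataGenerator`, a dictionary like this could be unpacked as keyword arguments (`ImageDataGenerator(**AUGMENTATION)`); whether the authors configured it this way is not stated.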

3.3. Data splitting

The dataset is divided in an 80:20 ratio, with 80% allocated for training and 20% reserved for validation. This split is consistently applied to both the Con4ViT model and other benchmark models to ensure that the results are comparable. By maintaining the same training and validation data distribution across all models, we can confidently attribute any observed differences in performance to variations in model architecture rather than inconsistencies in the data. This approach enhances the reliability of the evaluation and strengthens the conclusions drawn from the comparative analysis.
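
One way to keep the 80:20 split identical across models is to draw it once with a fixed random seed. The sketch below assumes a per-class (stratified) split; the paper does not say how the split was drawn, and the function name `stratified_split`, the seed, and the toy labels are all hypothetical.

```python
import random


def stratified_split(labels, val_fraction=0.2, seed=42):
    """Return (train_idx, val_idx), holding out val_fraction of each class."""
    rng = random.Random(seed)  # fixed seed: every model sees the same split
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    train_idx, val_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_val = int(len(indices) * val_fraction)  # floor, applied per class
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])
    return sorted(train_idx), sorted(val_idx)


# Toy labels for illustration only (not the paper's dataset).
labels = ["defect"] * 10 + ["non-defect"] * 15
train_idx, val_idx = stratified_split(labels)
```

Calling the function again with the same seed returns the same indices, which is what makes a comparison between architectures depend on the models rather than on the data partition.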

3.4. Hybrid method CNN-ViT (Con4ViT)

This section explains the functionality of the Con4ViT model, which combines the strengths of CNNs and ViTs to effectively capture both local and global features in images. The model begins with local feature extraction through a convolutional block comprising three layers. After the convolutional operations are performed, the resulting multi-dimensional output is flattened into a one-dimensional vector. This vector is then processed by the transformer encoder, which utilizes a self-attention mechanism to recognize the relationship between elements in the vector across four transformer encoder blocks. A complete block diagram illustrating the architecture of the proposed Con4ViT hybrid model is shown in Figure 3.

Figure 3. The proposed Con4ViT hybrid method

The input image of size 128×128×3 is fed into the CNN to extract local features [34], using a sequential CNN consisting of 3 convolutional and max-pooling layers. Each convolutional layer uses the rectified linear unit (ReLU) activation function, and the convolution operation is given in (1) [34].

$$O(i,j) = \sum_{m=0}^{k-1}\sum_{n=0}^{k-1} I(i+m,\, j+n)\cdot K(m,n) \tag{1}$$

Where $I(i,j)$ is the image input at pixel $(i,j)$, $K(m,n)$ is the weight of the kernel/filter with size $k \times k$, and $O(i,j)$ is the output after the convolution operation at position $(i,j)$. Then, the pooling layer applies (2).

$$P(i,j) = \max\left(\{\, O(2i+m,\, 2j+n) \mid m, n \in \{0, 1\} \,\}\right) \tag{2}$$

Where $P(i,j)$ is the output after the max pooling operation at position $(i,j)$, the indices $m$ and $n$ iterate over the 2×2 pooling window, and the stride $s$ is 2, indicating that the pooling window moves 2 pixels at a time in both dimensions. The pooling operation reduces the input dimension by taking the maximum value of each sub-area in the input matrix. If the pooling size is 2×2, the maximum value of each 2×2 block is taken as the pooling result.
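
The convolution and max-pooling steps can be traced on a small example. The sketch below is dependency-free Python written for illustration only: it assumes a single-channel input, a square kernel, no padding, and a convolution stride of 1, none of which the text specifies, and the function names are hypothetical. The paper's actual layers are built with Keras.

```python
def conv2d_valid(image, kernel):
    """O(i, j) = sum_m sum_n I(i+m, j+n) * K(m, n); no padding, stride 1."""
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [
        [
            sum(image[i + m][j + n] * kernel[m][n]
                for m in range(k) for n in range(k))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


def relu(feature_map):
    """Replace negative responses with zero."""
    return [[max(0.0, value) for value in row] for row in feature_map]


def max_pool_2x2(feature_map):
    """P(i, j) = max of the 2x2 window starting at (2i, 2j); stride 2."""
    rows = len(feature_map) // 2
    cols = len(feature_map[0]) // 2
    return [
        [
            max(feature_map[2 * i + m][2 * j + n]
                for m in (0, 1) for n in (0, 1))
            for j in range(cols)
        ]
        for i in range(rows)
    ]


# 5x5 toy image whose pixel at (i, j) equals 5*i + j.
image = [[5 * i + j for j in range(5)] for i in range(5)]
kernel = [[1, 0], [0, 1]]  # adds each pixel to its lower-right neighbour

pooled = max_pool_2x2(relu(conv2d_valid(image, kernel)))
# pooled == [[18, 22], [38, 42]]
```

The 5×5 input becomes a 4×4 feature map after the 2×2 convolution and a 2×2 map after pooling, which shows how each convolution and pooling stage shrinks the spatial size.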

After the pooling operation, the output is flattened and reshaped into the format expected by the standard transformer encoder. In the flatten step, the image is processed into patches so that it can be converted into a sequence of vectors.

As seen in Figure 4, the 3D food printing input image is processed into non-overlapping patches. In this process, the original image in Figure 4(a) is first divided into multiple smaller regions in Figure 4(b), each of size 20×20 pixels. These patches are then transformed into one-dimensional vectors through a flatten operation. Flatten converts a multi-dimensional tensor into a one-dimensional vector without changing the values of the elements in the tensor. For example, if the input is a tensor of size (batch_size, height, width, channels) (e.g., from the convolution layer), then flatten will convert it into a 2D tensor of size (batch_size, height × width × channels). Mathematically, for an input $X$ with elements $x_{1,1,1}, x_{1,1,2}, \ldots$ of size $(h, w, c)$, the result is (3).

$$\mathrm{Flatten}(X) = [\,x_{1,1,1},\ x_{1,1,2},\ \ldots,\ x_{h,w,c}\,] \tag{3}$$

Figure 4. Patch flattening of a 3D food printing image: (a) input image and (b) patch image
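
Splitting an image into non-overlapping patches and flattening each patch can be sketched as follows. This is an illustrative example with hypothetical function names, a single-channel image, and a tiny 4×4 input with 2×2 patches rather than the 20×20 patches described in the text. How the paper treats border pixels when the image size is not a multiple of the patch size is not stated, so dropping incomplete patches is an assumption.

```python
def extract_patches(image, patch_size):
    """Split a 2D image (list of rows) into non-overlapping square patches,
    scanning left to right, then top to bottom.
    Incomplete patches at the right and bottom borders are dropped."""
    rows = len(image) // patch_size
    cols = len(image[0]) // patch_size
    patches = []
    for r in range(rows):
        for c in range(cols):
            patch = [
                row[c * patch_size:(c + 1) * patch_size]
                for row in image[r * patch_size:(r + 1) * patch_size]
            ]
            patches.append(patch)
    return patches


def flatten(patch):
    """List the elements in row-major order without changing their values."""
    return [value for row in patch for value in row]


image = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
sequence = [flatten(patch) for patch in extract_patches(image, 2)]
# sequence == [[0, 1, 4, 5], [2, 3, 6, 7], [8, 9, 12, 13], [10, 11, 14, 15]]
```

Each inner list is one token of the sequence passed to the transformer encoder. Under this sketch, a 128×128 input with 20×20 patches would keep 6×6 = 36 full patches, since 128 is not a multiple of 20.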

Then, the flattened image patches enter the transformer encoder, where layer normalization is applied before multi-head attention. Multi-head attention in the transformer model calculates the attention weights using (4), as explained in [19].

$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V \tag{4}$$

Where $Q$, $K$, and $V$ are the query, key, and value matrices and $d_k$ is the dimension of the keys. The softmax function converts the scaled attention scores into a probability distribution. It allows the model to focus on the more important or relevant parts of the input based on the $Q$ and $K$ values and to assign weights to the corresponding information in $V$.
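
Scaled dot-product attention for a single head can be written in a few lines. The sketch below is plain Python for illustration, assuming the standard form softmax(QKᵀ/√d_k)V from [19]; the function names are hypothetical, and the learned projections that produce Q, K, and V, as well as the combination of several heads, are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over one row of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(Q, K, V):
    """Single-head scaled dot-product attention.
    Q, K, V are lists of row vectors; returns (output, weights)."""
    d_k = len(K[0])
    output, weights = [], []
    for q in Q:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in K
        ]
        row = softmax(scores)
        weights.append(row)
        # Weighted average of the value vectors.
        output.append(
            [sum(w * v[c] for w, v in zip(row, V)) for c in range(len(V[0]))]
        )
    return output, weights


Q = [[1.0, 0.0], [0.0, 1.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 2.0], [3.0, 4.0]]
out, attn = attention(Q, K, V)
```

Each row of `attn` sums to 1, and the first query places more weight on the first key because their dot product is larger, so the first output row lies closer to the first value vector.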

After the attention process, the output goes to the feed-forward network (FFN), which applies linear transformations [19], as shown in (5):

$$\mathrm{FFN}(x) = \max(0,\ xW_1 + b_1)\,W_2 + b_2 \tag{5}$$

Where $x$ is the input to the FFN, $W_1$ and $W_2$ are weight matrices, and $b_1$ and $b_2$ are bias vectors. The operation involves first applying a linear transformation, followed by a ReLU activation function (represented by $\max(0, \cdot)$), and then applying another linear transformation. The term $xW_1 + b_1$ is the linear operation of the first (hidden) layer, and its result passes through the ReLU activation function. The result of the activation function is passed to the next layer, where it is multiplied by the weights $W_2$ and added to the bias $b_2$ to give the final output.

The next step is to calculate the loss function, described in (6).

$$L = -\frac{1}{N}\sum_{i=1}^{N}\log(p_i) \tag{6}$$

Where $L$ is the overall loss value for the batch of predictions, $N$ is the number of samples, and $p_i$ is the predicted probability of the correct class for sample $i$. The loss function measures how large the difference is between the model's predicted probability and the actual label.
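
The loss can be checked by hand on a few values. The sketch below assumes the loss is the average negative log-probability of the correct class, L = -(1/N) Σ log(p_i); the function name `cross_entropy` and the small clipping constant `eps` (added only to avoid `log(0)`) are not from the paper.

```python
import math


def cross_entropy(correct_class_probs, eps=1e-12):
    """Average negative log-probability assigned to the true class.
    correct_class_probs[i] is the model's probability for sample i's label."""
    n = len(correct_class_probs)
    return -sum(math.log(max(p, eps)) for p in correct_class_probs) / n


confident = cross_entropy([0.9, 0.9])  # predictions close to the labels
hesitant = cross_entropy([0.6, 0.6])   # predictions further from the labels
```

The loss is zero when every correct class receives probability 1 and grows as the predicted probability of the correct class falls, so `confident` is smaller than `hesitant`.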

3.5. Model training and validation

The next step involves training the model using the dataset for a total of 30 epochs, during which model parameters are adjusted to enhance performance. Validation data is utilized to assess the model's effectiveness throughout this process. The Adam optimizer is employed to optimize the model, ensuring efficient convergence during training. Training is conducted multiple times to cover all architectures being compared, including the proposed Con4ViT model, VGG16, VGG19, MobileNetV2, EfficientNetB2, InceptionV3, and ResNet50. This comprehensive approach allows for a thorough evaluation of each model's performance.

3.6. Model evaluation and performance evaluation

In this study, the performance of the Con4ViT model for 3D food printing defect classification is evaluated using a confusion matrix [35]. This matrix summarizes the counts of true positives (TP), false positives (FP), true negatives (TN), and false negatives (FN). These four categories enable the calculation of key performance metrics: accuracy, recall, precision, and F1-score, defined by (7)-(10) [36].

$$\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{7}$$

$$\mathrm{Recall} = \frac{TP}{TP + FN} \tag{8}$$

$$\mathrm{Precision} = \frac{TP}{TP + FP} \tag{9}$$

$$\mathrm{F1\text{-}score} = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \tag{10}$$

Additionally, the gradient-weighted class activation mapping (Grad-CAM) method will be used to analyze the image regions that are crucial for determining classification results [37]. Grad-CAM is a visualization technique in deep learning that highlights important areas of an image that influence the model's predictions. It generates a heatmap indicating the significant regions for the predicted class, calculated using (11).

$$L^{c}_{\text{Grad-CAM}} = \mathrm{ReLU}\!\left(\sum_{k} \alpha_{k}^{c} A^{k}\right) \tag{11}$$

Here, the weights $\alpha_{k}^{c}$ are computed through global average pooling of the gradients as in (12).

$$\alpha_{k}^{c} = \frac{1}{Z}\sum_{i}\sum_{j}\frac{\partial y^{c}}{\partial A_{ij}^{k}} \tag{12}$$

Where $A^{k}$ represents the activation from the $k$-th filter in the last layer, $y^{c}$ is the score for class $c$, and $Z$ is the number of spatial positions in the feature map. Once the heatmap is generated, it is reshaped to 28×28 pixels and overlaid onto the original image, with color coding to highlight the important areas. Grad-CAM provides valuable visual insights into the regions that the model focuses on, enhancing interpretability and understanding of the model's decision-making process.
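
Given the feature maps of the last convolutional layer and the gradients of the class score with respect to them, the Grad-CAM heatmap reduces to an average followed by a weighted sum. The sketch below assumes the standard Grad-CAM formulation from [37] and takes both inputs as nested lists; computing the gradients themselves requires a deep learning framework and is outside this example. The function name and toy values are hypothetical.

```python
def grad_cam(activations, gradients):
    """activations[k][i][j]: feature map A^k of the last conv layer.
    gradients[k][i][j]: derivative of the class score w.r.t. A^k at (i, j).
    Returns the heatmap ReLU(sum_k alpha_k * A^k) as a 2D list."""
    height = len(activations[0])
    width = len(activations[0][0])
    z = height * width
    # Global average pooling of the gradients: one weight per feature map.
    alphas = [sum(sum(row) for row in grad) / z for grad in gradients]
    # Weighted sum of the feature maps, then ReLU to keep positive evidence.
    return [
        [
            max(0.0, sum(a * fmap[i][j] for a, fmap in zip(alphas, activations)))
            for j in range(width)
        ]
        for i in range(height)
    ]


# Two toy 2x2 feature maps with constant gradients of +1 and -2.
activations = [
    [[1.0, 2.0], [3.0, 4.0]],
    [[4.0, 3.0], [2.0, 1.0]],
]
gradients = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[-2.0, -2.0], [-2.0, -2.0]],
]
heatmap = grad_cam(activations, gradients)
# heatmap == [[0.0, 0.0], [0.0, 2.0]]
```

Only the bottom-right position keeps a positive value, because the second map, which has a negative weight, dominates everywhere else and the ReLU discards negative evidence.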

4. RESULTS AND DISCUSSION

This section presents the results of data collection, data preprocessing, Grad-CAM analysis, and experiments for the performance evaluation of the proposed and developed Con4ViT model for defect and non-defect classification in 3D food printing. The comparative performance of the proposed model with other pre-trained based models is also explained in this section.

4.1. Data collection

As a result of the data collection stage, we obtained 2,085 images as a combination of 527 print result images from a 3D food printing device and 1,558 print result images from a 3D printing device. Based on the 80:20 ratio, the training data consists of 1,669 images, and the validation data consists of 416 images. Figure 5 shows examples from the 3D food printing dataset, divided into defect and non-defect categories.

Figure 5. Image dataset example from the 3D food printing device, utilizing chocolate as the material

4.2. Data preprocessing

Data preprocessing in the form of data augmentation produces the images shown in Figure 6. Figure 6(a) shows the original image of 3D food printing, Figure 6(b) shows a rotation of 10 degrees from the initial position, and Figure 6(c) enlarges the display with a zoom_range of 20%. In Figure 6(d), width_shift_range is applied by shifting the image by 20%, and in Figure 6(e), the image is shifted vertically by the height shift range of 20% of the original image.

Figure 6. Data preprocessing results for 3D food printing images: (a) original image, (b) rotation, (c) zoom_range, (d) width_shift_range, and (e) height_shift_range

4.4. Training and validation

Figure 7 shows training and validation of the Con4ViT model over 30 epochs. The training curve is shown by the blue line, while the validation curve is shown by the red line. The model reached a training accuracy of 98.20% and a validation accuracy of 95.91%.

Figure 7. Training and validation Con4ViT model
4.
3
.
M
od
e
l
e
val
u
at
ion
Figure 8 shows the confusion matrices of the Con4ViT model. Figure 8(a) shows that the proposed model performs well on the 416 validation images: 199 images were correctly classified as defect (TP) and 200 were correctly classified as non-defect (TN), while 9 were incorrectly classified as defect (FP) and 8 were incorrectly classified as non-defect (FN). Figure 8(b) shows the model applied to the entire dataset of 2,085 images: 1,024 images were correctly classified as defect (TP) and 1,010 were correctly classified as non-defect (TN), while 13 were incorrectly classified as defect (FP) and 30 were incorrectly classified as non-defect (FN). On the validation data, the Con4ViT model reached an accuracy of 95.91%, a precision of 95.69%, a sensitivity (recall) of 96.15%, and an F1-score of 95.92%.
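As a rough check, the four metrics can be derived from the validation confusion-matrix counts using the standard binary definitions. The sketch below is an illustration, not the authors' evaluation code; `binary_metrics` is a hypothetical helper. Computed this way, accuracy matches the reported value and the other three land within a few hundredths of a percentage point of the reported figures; the source does not state the exact averaging or rounding it used.

```python
def binary_metrics(tp, tn, fp, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)   # share of predicted defects that are real defects
    recall = tp / (tp + fn)      # share of real defects that were detected
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Counts reported for the 416 validation images.
val = binary_metrics(tp=199, tn=200, fp=9, fn=8)
for name, value in val.items():
    print(f"{name}: {value:.2%}")
```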
4.4. Grad-CAM analysis
An additional visualization analysis was performed on the model to understand which parts of the image are considered important and contribute to its predictions [37]. Figure 9 shows the heatmap visualization, which highlights the critical areas the model focuses on in the 3D food printing images. The visual focus lies close to the lighter or blue boundary of the heatmap, which marks the surrounding area with the most significant influence on the model prediction. Applying Grad-CAM to the Con4ViT model, Figure 9(a) shows orange and red colors on the edge of the design and slightly below the nozzle of the 3D food printing extruder. Figure 9(b) shows red and orange areas around the print under the nozzle of the printing head, with blue at the nozzle point. Figure 9(c) covers more of the surface around the print area, with the hot colors spread over a wider area and blue in the inner part of the print. Overall, Grad-CAM can recognize relevant visual features to identify or monitor print activity and indicates the parts of the image that are essential for the predicted class.
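For reference, the heatmaps discussed above follow from the standard Grad-CAM formulation, sketched here in its usual notation. The symbols are those of the general method, not of this paper: $y^c$ is the score for class $c$, $A^k$ is the $k$-th feature map of the chosen convolutional layer, and $Z$ is the number of spatial positions in that map. Which layer of Con4ViT was used is not stated in the source.

```latex
% Channel weights: gradients of the class score, averaged over spatial positions
\alpha_k^c = \frac{1}{Z} \sum_{i} \sum_{j} \frac{\partial y^c}{\partial A_{ij}^k}

% Heatmap: weighted sum of feature maps, keeping only positive evidence
L_{\text{Grad-CAM}}^c = \mathrm{ReLU}\!\left( \sum_{k} \alpha_k^c A^k \right)
```

The ReLU keeps only the regions that raise the score of the predicted class, which is why the heatmap can be read as the areas that support the defect or non-defect decision.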
Figure 8. Predicted labels with confusion matrix of (a) model evaluation on the validation data and (b) model evaluation on all data
Figure 9. Grad-CAM images of 3D food printing: (a) Grad-CAM non-defect, (b) Grad-CAM defect with nozzle focus, and (c) Grad-CAM defect with wider area
4.5. Comparison of Con4ViT model with another pre-trained model
To further evaluate the model, the proposed Con4ViT model was compared with pre-trained CNN models, namely VGG16, VGG19, MobileNetV2, EfficientNetB2, InceptionV3, and ResNet50. Table 3 compares the performance of the proposed Con4ViT model with these pre-trained deep learning models. Con4ViT obtained 95.91% accuracy, 95.69% precision, 96.15% recall, and 95.92% F1-score.
Table 3. Comparison of the Con4ViT model approach with other pre-trained models
Model            Parameters (million)   Accuracy (%)   Precision (%)   Recall (%)   F1-score (%)
VGG16            17.9                   77.88          85.89           66.80        75.15
VGG19            23.2                   86.30          86.10           86.83        86.46
MobileNetV2      2.4                    82.95          87.28           77.29        81.98
Con4ViT          6.7                    95.91          95.69           96.15        95.92
EfficientNetB2   9.3                    90.87          90.76           91.11        90.93
InceptionV3      22.3                   84.62          91.24           90.51        90.97
ResNet50         23.8                   93.83          96.84           96.56        96.70
One key finding of this comparison is that Con4ViT has a relatively low parameter count of 6.7 million, especially compared with larger models such as VGG19 (23.2 million) and ResNet50 (23.8 million). The smaller parameter count makes Con4ViT more lightweight and a good choice for deployment in resource-constrained environments with limited computing power. Despite having fewer parameters than most of the compared models, Con4ViT achieved the highest accuracy on the evaluation dataset, 95.91%, ahead of ResNet50 (93.83%) and EfficientNetB2 (90.87%).
On precision, recall, and F1-score, Con4ViT is ahead of every compared model except ResNet50. Its precision of 95.69% means that an image it labels as a defect is very likely a true defect, which matters in applications where false positives are costly. Its recall of 96.15% indicates that it finds a large proportion of the actual defect cases, and its F1-score of 95.92% reflects a good balance between the two. In comparison, VGG16 and VGG19 have higher parameter counts but lower performance metrics, particularly in recall and F1-score. MobileNetV2, while lightweight with only 2.4 million parameters, does not reach Con4ViT's level on any metric. EfficientNetB2 and InceptionV3 deliver competitive results but remain below Con4ViT on all four metrics. EfficientNetB2 pairs a moderate parameter count (9.3 million) with solid accuracy yet still falls short of Con4ViT, which indicates that parameter efficiency alone does not guarantee better results. ResNet50 reports slightly higher precision (96.84%), recall (96.56%), and F1-score (96.70%) than Con4ViT in Table 3, but its accuracy is lower (93.83%) and its parameter count is about 3.5 times larger (23.8 million). In summary, Con4ViT achieves the highest accuracy of the compared models and competitive precision, recall, and F1-score with far fewer parameters than the strongest alternative. This makes it a good choice for tasks that need both high accuracy and model efficiency, especially where computational resources are constrained.
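The comparison can be reproduced directly from the values printed in Table 3. The sketch below takes the table at face value and reports which model scores highest on each metric, plus the parameter ratio between ResNet50 and Con4ViT; `TABLE_3`, `best_per_metric`, and `param_ratio` are hypothetical names introduced for illustration.

```python
# Rows of Table 3: (model, parameters in millions, accuracy, precision, recall, F1), metrics in %.
TABLE_3 = [
    ("VGG16", 17.9, 77.88, 85.89, 66.80, 75.15),
    ("VGG19", 23.2, 86.30, 86.10, 86.83, 86.46),
    ("MobileNetV2", 2.4, 82.95, 87.28, 77.29, 81.98),
    ("Con4ViT", 6.7, 95.91, 95.69, 96.15, 95.92),
    ("EfficientNetB2", 9.3, 90.87, 90.76, 91.11, 90.93),
    ("InceptionV3", 22.3, 84.62, 91.24, 90.51, 90.97),
    ("ResNet50", 23.8, 93.83, 96.84, 96.56, 96.70),
]
METRICS = ("accuracy", "precision", "recall", "f1")


def best_per_metric(rows):
    """Return the name of the top-scoring model for each metric column."""
    best = {}
    for col, metric in enumerate(METRICS, start=2):
        best[metric] = max(rows, key=lambda row: row[col])[0]
    return best


def param_ratio(rows, larger, smaller):
    """Ratio of parameter counts between two models named in the table."""
    params = {row[0]: row[1] for row in rows}
    return params[larger] / params[smaller]


best = best_per_metric(TABLE_3)
ratio = param_ratio(TABLE_3, "ResNet50", "Con4ViT")
print(best)
print(f"ResNet50 has {ratio:.2f}x the parameters of Con4ViT")
```

With the table values as printed, Con4ViT has the highest accuracy, while ResNet50 has the highest precision, recall, and F1-score at roughly 3.5 times the parameter count.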
Figure 10 shows that the training and validation performance of Con4ViT on 3D food printing defect classification fluctuates little. Even with few parameters, Con4ViT reaches good final performance. EfficientNetB2 maintains a more stable validation accuracy, which suggests better generalization. Overall, the tested pre-trained models perform well, but the proposed Con4ViT model achieves good accuracy and could therefore be applied to other datasets, whether large or small.
Figure 10. Training and validation performance of Con4ViT compared with some other methods
5. CONCLUSION
This paper proposes a hybrid method combining CNN with ViT for 3D food printing defect classification. For this purpose, we conducted experiments with 2,085 data from the 3D food printing