IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 14, No. 6, December 2025, pp. 4902–4912
ISSN: 2252-8938, DOI: 10.11591/ijai.v14.i6.pp4902-4912
Optimizing sparse ternary compression with thresholds for communication-efficient federated learning

Nithyaniranjana Murthy Chittaiah, Manjula Sunkadakatte Haladappa

Department of Computer Science and Engineering, University Visvesvaraya College of Engineering, Bangalore University, Bengaluru, India
Article Info

Article history:
Received Oct 26, 2024
Revised Oct 24, 2025
Accepted Nov 8, 2025

Keywords:
Communication efficiency
Distributed machine learning
Federated learning
Sparse ternary compression
STC threshold
ABSTRACT

Federated learning (FL) enables decentralized model training while preserving client data privacy, yet suffers from significant communication overhead due to frequent parameter exchanges. This study investigates how varying sparse ternary compression (STC) thresholds impact communication efficiency and model accuracy across the CIFAR-10 and MedMNIST datasets. Experiments tested thresholds ranging from 1.0 to 1.9 and batch sizes of 10, 15, and 20. Results demonstrated that selecting thresholds between 1.2 and 1.5 reduced total communication costs by approximately 10–15%, while maintaining acceptable accuracy levels. These findings suggest that careful threshold tuning can achieve substantial communication savings with minimal compromise in model performance, offering practical guidance for improving the efficiency and scalability of FL systems.

This is an open access article under the CC BY-SA license.
Corresponding Author:
Nithyaniranjana Murthy Chittaiah
Department of Computer Science and Engineering, University Visvesvaraya College of Engineering
Bangalore University
Bengaluru, Karnataka, India
Email: nithya.semantic@gmail.com
1. INTRODUCTION

Federated learning (FL) represents a decentralized approach to machine learning, where several clients work together to train a shared model without directly sharing local datasets. Unlike traditional approaches that rely on centralized data aggregation for model development, FL ensures that data remains on each client device, thus enhancing privacy. This decentralized setup is particularly advantageous in domains like healthcare, finance, and mobile systems, where data sensitivity is a primary concern [1].
Despite the potential advantages, FL faces significant challenges, particularly in terms of communication efficiency. In a typical FL setup, model updates are iteratively sent by clients to a central server, which aggregates these updates and returns a global model to the clients. Modern machine learning models often consist of millions of parameters, and communicating these updates can lead to substantial network bandwidth consumption, especially in resource-constrained environments. As a result, the frequency and volume of these communications can become a bottleneck, affecting both the scalability and efficiency of FL systems [2].
FL integrates data from a variety of devices, such as sensors, mobile phones, and IoT systems, each possessing distinct data characteristics and serving applications across domains like healthcare, finance, and online education. However, FL encounters challenges including privacy risks, implementation difficulties, hardware constraints, communication costs, and device unavailability [3]-[5].
Many applications across various domains involve sensitive personal data, making it crucial for all stakeholders, including companies, agencies, and researchers, to take collective responsibility in safeguarding this information from misuse and abuse [6]-[8].
Various model compression strategies have emerged to mitigate communication bottlenecks. One such technique is sparse ternary compression (STC), which reduces the size of the model updates by compressing them into three discrete states: -1, 0, and 1. By representing the model updates in this ternary form, the number of bits required for transmission is significantly reduced, leading to lower communication costs. However, the effectiveness of STC is closely tied to the selection of the compression threshold, which influences the extent of sparsity in the updates. Selecting an appropriate threshold helps balance communication savings with model performance; however, a poorly chosen value may cause significant information degradation or fail to compress sufficiently [9].
FL can be applied in various domains, including smart retail, energy management, smart cities, finance, vehicle-to-vehicle communication, and healthcare [10], [11]. Applications such as patient monitoring, moisture detection in crops, and predictive text in keyboards use FL to obtain data-driven insights. With the involvement of sensitive personal data, ensuring privacy is paramount. For instance, the General Data Protection Regulation (GDPR) is one of several regulatory frameworks adopted in regions including the EU and UK to safeguard personal data [12], [13].
This paper investigates the impact of varying STC thresholds, specifically focusing on values of 1.0, 1.2, 1.5, 1.7, and 1.9, on communication efficiency and model accuracy in FL. Systematic experimentation is conducted with different threshold values, followed by an analysis of the effects on a well-established machine learning task. Insights are offered on how STC thresholds affect the trade-off between communication cost and model performance. Our study aims to offer guidance on choosing optimal STC thresholds to enhance efficiency, scalability, and the practical use of FL in real-world scenarios.
2. RELATED WORK

McMahan et al. [14] introduced the concept of FL through the development of the Federated Averaging (FedAvg) algorithm, which enables aggregation of locally computed gradients from multiple decentralized devices. Following this work, FL began receiving increased attention due to its ability to perform machine learning directly on local client data without requiring central data transfer. This foundational study demonstrated the feasibility of training models across distributed systems while preserving user privacy, paving the way for wider adoption in both research and industry.
Sattler et al. [15] introduced STC to reduce communication costs in FL by compressing model updates. STC builds on top-k gradient sparsification and adds ternarization with optimal Golomb encoding, enabling more efficient downstream communication.
Quantized SGD (QSGD), proposed by Alistarh et al. [16], is a technique that minimizes communication overhead in distributed machine learning by quantizing gradients. QSGD enhances training efficiency and scalability by reducing the data exchanged between workers during gradient descent while maintaining convergence.
Bernstein et al. [17] presented signSGD, a compression technique that provides theoretical convergence guarantees on independent and identically distributed (IID) data. The technique reduces the bit size for each gradient update by a factor of 32 by converting each update into a binary sign. Additionally, signSGD performs download compression by aggregating binary updates from all clients through a majority voting mechanism.
Rothchild et al. [18] introduced FetchSGD to address communication bottlenecks and convergence issues in FL.
In research by Wang et al. [19], communication cost optimization was identified as a crucial factor, with three primary techniques suggested for reduction: i) allowing local updates to reduce the frequency of communication, ii) compressing messages to lower the volume of data transmitted, and iii) reducing communication traffic at the server by limiting the number of participating clients per round.
Research conducted by Yang et al. [20] focused on wireless communication using the principle of over-the-air computation. A model for FL was designed by exploiting signal superposition of wireless devices, combined with beamforming design and joint device selection, to enhance statistical learning performance.
Chen et al. [21] identified challenges in wireless communication and proposed a model that jointly selects users and allocates resources, effectively reducing packet loss and improving FL model performance. Emphasis was also placed on reducing energy consumption during local training and transmission, which is a key consideration for improving FL efficiency.
Cheng et al. [22] examined two primary factors for reducing communication costs: i) increasing computational power on local devices to allow for more local updates before global aggregation, and ii) selecting more participants for each round by enhancing parallelism. Simulation results indicated that increasing parallelism significantly reduced communication costs. However, its effectiveness became apparent only after reaching a specific threshold, compared to simply increasing the computational power on participant devices.
Liu et al. [7] extended the concept to vertical federated learning (VFL), which involves the same set of users or participants with different feature sets. For example, in the same city, a local bank and a local retail company may share the same customer base but maintain distinct feature sets, facilitating collaborative model building. A novel algorithm called Federated Stochastic Block Coordinate Descent (FedBCD) was introduced, enabling several local updates before communicating with the global server for aggregation. The algorithm also ensured local convergence with fewer communication rounds.
FedZip, a compression framework for FL, was introduced by Malekijoo et al. [23] to address communication overhead, energy consumption, and performance degradation. Sparsification was achieved using Top-z pruning, while K-means clustering, quantization of model weights, and Huffman encoding were employed for model compression. When applied to client-to-server communication, FedZip improved communication efficiency in FL systems with minimal effects on accuracy and convergence.
Haddadpour et al. [24] focused on reducing uplink communication costs by compressing messages sent from client devices to the central server. To prevent information loss during compression, the central server generates a convex combination of the previous global model and the aggregated updates from local models.
3. METHODOLOGY
3.1. Research design

The study aimed to evaluate the impact of varying STC thresholds on communication efficiency and model accuracy in an FL context. Experiments tested STC threshold values of 1.0, 1.2, 1.5, 1.7, and 1.9 across batch sizes of 10, 15, and 20. Two diverse image classification datasets, CIFAR-10 and MedMNIST, were selected to represent both natural images and medical imaging tasks, supporting broader applicability. Data was partitioned in a non-IID (non-independent and identically distributed) manner to mimic realistic client heterogeneity typical in federated settings. This design enabled a systematic analysis of how different threshold settings and batch sizes influence communication cost and model performance across varying data characteristics.
3.2. Experimental procedure

The experimental procedure involved several key steps designed to systematically evaluate the impact of STC thresholds on FL performance.

Data partitioning: the CIFAR-10 and MedMNIST datasets were used to provide diverse evaluation scenarios. Data was partitioned among multiple simulated clients (devices) to replicate an FL system with non-IID (non-independent and identically distributed) distributions (see Table 1 for environment configuration). This setup aimed to closely mimic real-world heterogeneity, where data is unevenly distributed across participants.
Table 1. Federated learning configuration details
Configuration attribute     Assigned value
Client count                10
Selection fraction          0.1
Labels per client           2
Mini-batch size             10, 15, 20
Dataset balance factor      1.0
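To make this setup concrete, the sketch below shows one common way to realize the label-limited split in Table 1 (10 clients, 2 labels per client). It is a minimal, McMahan-style sort-and-shard illustration under our own naming, not the authors' released code.

```python
import numpy as np

def partition_noniid(labels, n_clients=10, shards_per_client=2, seed=0):
    """Sort-and-shard split: each shard is label-homogeneous, so a client
    holding two shards sees at most two labels (cf. Table 1)."""
    rng = np.random.default_rng(seed)
    order = np.argsort(labels, kind="stable")          # indices grouped by label
    shards = np.array_split(order, n_clients * shards_per_client)
    deal = rng.permutation(len(shards))                # shuffle shard ownership
    return [np.concatenate([shards[s] for s in
                            deal[c * shards_per_client:(c + 1) * shards_per_client]])
            for c in range(n_clients)]

# Toy usage with CIFAR-10-like labels (10 classes, 50,000 samples).
labels = np.random.default_rng(1).integers(0, 10, size=50_000)
parts = partition_noniid(labels)
assert sum(len(p) for p in parts) == len(labels)
```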
Model training: a convolutional neural network (CNN) architecture based on VGG11* was employed as the client-side model. Hyperparameters and model configurations used for the CIFAR-10 and MedMNIST experiments are summarized in Table 2. Each client trained the model locally on its partitioned data for a fixed number of iterations. After local training, model parameters were compressed using the STC technique before transmission to the central server, aiming to reduce communication costs.
Table 2. Models and hyperparameters used in experiments
Parameter            CIFAR-10 value    MedMNIST value
Model architecture   VGG11*            VGG11*
Learning rate        0.016             0.010
Optimizer            SGD               SGD
Loss function        Cross-entropy     Cross-entropy
Iterations           20,000            20,000
Note: VGG11* denotes a streamlined adaptation of the VGG11 model [25], where dropout and batch normalization layers have been excluded, and both the convolutional filter count and the dimensions of the fully connected layers have been scaled down to half.
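As a rough illustration of that description, the following PyTorch sketch builds a VGG11-style network with dropout and batch normalization omitted and all widths halved. The exact layer sizes are an assumption inferred from the note, not the authors' implementation, and the sketch assumes 32x32 inputs (MedMNIST images would need resizing).

```python
import torch.nn as nn

# Halved VGG11 plan: conv widths 64->32, 128->64, ...; "M" marks a 2x2 max-pool.
CFG = [32, "M", 64, "M", 128, 128, "M", 256, 256, "M", 256, 256, "M"]

def vgg11_star(num_classes=10, in_channels=3):
    layers, c = [], in_channels
    for v in CFG:
        if v == "M":
            layers.append(nn.MaxPool2d(kernel_size=2, stride=2))
        else:
            layers += [nn.Conv2d(c, v, kernel_size=3, padding=1), nn.ReLU(inplace=True)]
            c = v
    return nn.Sequential(
        *layers,
        nn.Flatten(),                                  # 32x32 input -> 256x1x1 features
        nn.Linear(256, 2048), nn.ReLU(inplace=True),   # FC widths halved from 4096
        nn.Linear(2048, 2048), nn.ReLU(inplace=True),
        nn.Linear(2048, num_classes),
    )
```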
STC: the STC technique reduces model updates to three states: -1, 0, and 1, effectively lowering the number of bits needed for communication. Compression is controlled by a threshold parameter τ, which governs the sparsity of the updates. The STC algorithm operates as in (1).

$$\operatorname{STC}(g_i) = \begin{cases} 1 & \text{if } g_i \ge \tau \\ -1 & \text{if } g_i \le -\tau \\ 0 & \text{if } -\tau < g_i < \tau \end{cases} \tag{1}$$
where $g_i$ represents an individual model gradient component. Elements with absolute values exceeding τ are retained with their sign, while others are zeroed out. Experiments systematically varied τ across values 1.0, 1.2, 1.5, 1.7, and 1.9 to study its influence on both communication cost and accuracy. This range enabled a detailed analysis of how increased sparsification impacts the trade-off between compression efficiency and model performance. The ternarization approach also enables encoding gradients with fewer bits, reducing communication overhead significantly.
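For readers who want to experiment with thresholds, rule (1) amounts to a few lines of NumPy; this minimal sketch (our naming) maps an update vector to {-1, 0, +1}.

```python
import numpy as np

def stc_ternarize(g, tau):
    """Apply equation (1): keep only the sign of components with |g_i| >= tau."""
    out = np.zeros_like(g)
    out[g >= tau] = 1.0
    out[g <= -tau] = -1.0
    return out

g = np.array([2.3, -0.4, 1.1, -1.8, 0.2])
print(stc_ternarize(g, tau=1.2))   # -> [ 1.  0.  0. -1.  0.]
```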
Model aggregation and update: the central server aggregated compressed updates from all participating clients using weighted averaging, as in (2).

$$w^{(t+1)} = \sum_{k=1}^{K} \frac{n_k}{n} \, w_k^{(t)} \tag{2}$$
Here, $w^{(t+1)}$ denotes the global model at round t+1, $w_k^{(t)}$ are the parameters from client k, $n_k$ is the number of samples on client k, and n is the total number of samples across all clients. The aggregated global model was then redistributed to all clients for the next round of local training.
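The aggregation in (2) is plain sample-weighted averaging, as the small sketch below illustrates (NumPy arrays stand in for model parameter vectors; names are ours).

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Equation (2): w^(t+1) = sum_k (n_k / n) * w_k^(t)."""
    n = sum(client_sizes)
    return sum((n_k / n) * w_k for w_k, n_k in zip(client_weights, client_sizes))

clients = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
sizes = [100, 300, 600]                   # n_k per client, n = 1000
print(fedavg(clients, sizes))             # -> [0.7 0.9]
```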
Performance evaluation: performance was assessed using two primary metrics: model accuracy and communication cost, measured in bits transmitted. Accuracy was evaluated on held-out validation data, while communication cost accounted for the total number of bits sent during each communication round. Experiments were repeated over multiple rounds to analyze convergence behavior and to compare the effects of different thresholds and batch sizes across both datasets.
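One idealized way to tally such a bit count is sketched below: 32 bits per parameter for a dense float32 update versus log2(3) ≈ 1.58 bits per parameter for a ternary one. The paper does not state its exact codec (Sattler et al. [15] use Golomb coding of sparse positions), so these per-parameter constants are assumptions for illustration only.

```python
import math

def update_cost_bits(n_params, ternary=True):
    """Idealized cost: 32 bits/param dense vs log2(3) bits/param ternary."""
    per_param = math.log2(3) if ternary else 32.0
    return n_params * per_param

total_bits = 0.0
for rnd in range(10):                              # toy loop over rounds
    total_bits += update_cost_bits(n_params=1_000_000)
print(f"total uplink: {total_bits / 8 / 1e6:.1f} MB")
```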
3.3. Testing and data acquisition

The testing phase involved running the FL training process with each STC threshold value. Results, including model accuracy and communication cost, were recorded at each iteration. The data acquisition process was automated using scripts that logged the necessary metrics, ensuring consistent and reliable data collection. The recorded data was then analyzed to identify trends and draw conclusions about the effectiveness of different STC thresholds in FL systems. The research followed a rigorous experimental protocol, ensuring that the results are scientifically valid and reproducible. The methods used for testing and data acquisition were based on established practices in FL research, providing a solid foundation for the conclusions drawn in the study.
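A logging helper of the kind described might look like the following minimal sketch; the file name and column schema are illustrative assumptions rather than the authors' scripts.

```python
import csv

FIELDS = ["round", "tau", "batch_size", "accuracy", "bits_sent"]  # assumed schema

def log_round(path, **row):
    """Append one round's metrics to a CSV (header handling omitted)."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([row[k] for k in FIELDS])

log_round("stc_runs.csv", round=1, tau=1.2, batch_size=10,
          accuracy=0.59, bits_sent=1.5e9)
```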
4. RESULTS

The experimental findings evaluating the impact of STC thresholds on communication efficiency and model performance in FL are presented in this section. Results were systematically analyzed for different thresholds (1.0, 1.2, 1.5, 1.7, and 1.9) and batch sizes (10, 15, and 20), using two datasets: CIFAR-10 and MedMNIST. Key aspects examined include total communication cost, model accuracy across training iterations, and maximum accuracy achieved across batch sizes. These analyses reveal the balance between reducing communication overhead and maintaining model effectiveness, offering practical guidance for selecting appropriate STC thresholds in diverse FL scenarios. Threshold 1.9 is excluded from all plots due to consistently low accuracy; its inclusion would distort axis scaling and obscure meaningful comparisons across the other thresholds.
4.1. Total bits sent up vs. batch size for different sparse ternary compression thresholds (in MB)

The first aspect of the analysis focused on evaluating communication efficiency in the FL system by measuring the total bits sent during the training process. Figures 1 and 2 illustrate how total bits sent varied with different batch sizes and STC thresholds across two datasets: CIFAR-10 and MedMNIST.

For the CIFAR-10 dataset (Figure 1), the results indicated a general trend where increasing the STC threshold resulted in lower total bits sent across batch sizes. Higher thresholds (such as 1.5) effectively reduced the volume of communication, particularly at larger batch sizes. This behavior suggests that aggressive compression successfully reduces the data that must be transmitted between clients and the server, thereby lowering communication overhead.

The MedMNIST dataset results (Figure 2) generally reinforced these findings. However, while higher thresholds still reduced communication costs, the extent of this reduction was less uniform across batch sizes, likely attributable to the dataset's inherent complexity and distribution characteristics.

Overall, the findings across both datasets emphasize the value of tuning STC thresholds to optimize communication efficiency in FL setups. Careful selection of the threshold parameter enables significant reductions in transmission volume, but it remains essential to balance this gain with potential impacts on model accuracy, as discussed in subsequent sections. Results for threshold 1.9 are excluded from the table and figures due to consistently poor accuracy, illustrating that excessive compression degrades model performance beyond acceptable limits. The numerical values corresponding to Figures 1 and 2 are summarized in Table 3, detailing total bits sent for each STC threshold and batch size combination.
Figure 1. Total bits sent up vs. batch size for different STC thresholds on CIFAR-10

Figure 2. Total bits sent up vs. batch size for different STC thresholds on MedMNIST
Table 3. Total bits sent up (MB) vs. batch size for different STC thresholds on CIFAR-10 and MedMNIST
              CIFAR-10                                MedMNIST
Batch size    1.0       1.2       1.5       1.7       1.0       1.2       1.5       1.7
10            1574.49   1573.26   1574.03   1570.65   1572.18   1572.49   1575.15   1573.39
15            1574.06   1573.24   1571.68   1572.66   1572.70   1572.30   1572.20   1572.23
20            1572.60   1573.50   1571.53   1571.08   1573.52   1571.55   1571.65   1570.90
Note: threshold 1.9 is excluded due to consistently poor accuracy.
4.2. Accuracy vs. iterations for different sparse ternary compression thresholds

The second aspect of the analysis focused on evaluating model accuracy as training progressed over iterations, highlighting convergence behavior under different STC thresholds. Figures 3 and 4 show the accuracy curves for the CIFAR-10 and MedMNIST datasets, respectively, across thresholds 1.0, 1.2, 1.5, 1.7, and 1.9.

For the CIFAR-10 results (Figure 3), accuracy improved steadily over 20,000 iterations for thresholds 1.0, 1.2, and 1.5. Threshold 1.0 consistently delivered the highest accuracy at convergence, demonstrating more stable learning with less accuracy drop. Threshold 1.5 showed competitive early convergence but exhibited slightly more variability in late training stages. Thresholds 1.7 and 1.9 led to noticeably degraded accuracy, with threshold 1.9 failing to improve significantly during training. This pattern indicates that overly aggressive compression (high thresholds) can harm learning by discarding too much gradient information.
Figure 3. CIFAR-10: accuracy vs. iterations for different STC thresholds

Figure 4. MedMNIST: accuracy vs. iterations for different STC thresholds
For MedMNIST (Figure 4), similar trends were observed. Thresholds 1.0 and 1.5 achieved higher and smoother accuracy gains across training rounds, with threshold 1.0 generally converging at slightly higher accuracy. Threshold 1.2 also delivered competitive performance, reinforcing that moderately low thresholds preserve critical learning signals. Conversely, thresholds 1.7 and 1.9 consistently lagged in accuracy across iterations, indicating diminished learning capacity due to excessive sparsification.

These results confirm that selecting lower STC thresholds (e.g., 1.0, 1.2) supports better convergence and model accuracy while still offering meaningful compression benefits. The trade-off between communication reduction and model performance is evident: while higher thresholds reduce data transfer costs, they risk compromising accuracy. Careful selection of STC thresholds is thus essential for balancing efficiency and learning quality in FL settings.
4.3. Max accuracy vs. batch size for different sparse ternary compression thresholds

The final aspect of the analysis examined the maximum accuracy achieved at different batch sizes for each STC threshold. Figures 5 and 6 display these results for the CIFAR-10 and MedMNIST datasets, respectively, showing how threshold selection and batch size interact to influence model performance.

For the CIFAR-10 dataset (Figure 5), thresholds 1.0 and 1.2 consistently achieved higher maximum accuracy across all batch sizes compared to higher thresholds such as 1.5, 1.7, and 1.9. The results showed that as batch size increased from 10 to 20, accuracy improved most noticeably for lower thresholds. For instance, threshold 1.0 peaked near 71% at batch size 20, highlighting its stability and robustness even as communication cost increased. By contrast, thresholds 1.7 and 1.9 showed lower and flatter trends, indicating degraded performance likely due to excessive sparsification of gradient updates. This behavior suggests that aggressive compression harms learning signal retention, particularly in larger batch configurations.

For the MedMNIST dataset (Figure 6), a different pattern emerged. Threshold 1.5 generally achieved competitive or even superior accuracy at batch size 15, peaking above 70%. Threshold 1.0 maintained strong performance but did not always outperform 1.5, especially at mid-sized batches. Meanwhile, thresholds 1.7 and 1.9 again showed weaker accuracy across batch sizes, reinforcing that overly aggressive sparsification reduces learning capacity. The variability across batch sizes and thresholds in MedMNIST emphasizes that optimal threshold selection may be dataset-specific and should consider data complexity and distribution.

Overall, these results highlight the importance of selecting STC thresholds carefully to balance communication efficiency with model accuracy. Lower thresholds (e.g., 1.0, 1.2) generally favor accuracy at the cost of higher communication volume, while higher thresholds reduce communication but risk learning degradation, especially at larger batch sizes. Tailoring thresholds to dataset properties and batch size can thus enhance FL performance. The detailed numerical values corresponding to Figures 5 and 6 are provided in Table 4, summarizing the maximum accuracy achieved for each threshold and batch size combination.
Figure 5. CIFAR-10: max accuracy vs. batch size for different STC thresholds
Figure 6. MedMNIST: max accuracy vs. batch size for different STC thresholds
Table 4. Max accuracy (%) vs. batch size for different STC thresholds on CIFAR-10 and MedMNIST
              CIFAR-10                            MedMNIST
Batch size    1.0      1.2      1.5      1.7      1.0      1.2      1.5      1.7
10            65.83    66.38    62.78    62.12    65.71    65.54    63.29    55.42
15            68.82    68.36    67.24    62.19    69.03    65.38    70.38    56.46
20            71.12    69.37    68.85    63.39    68.15    66.07    64.62    66.84
4.4. Summary heatmaps of accuracy and communication cost

To complement the detailed analyses presented in the previous subsections, Figures 7 and 8 provide heatmap visualizations summarizing the combined effects of STC threshold and batch size on both maximum accuracy and total communication cost. For the CIFAR-10 dataset (Figure 7), Figure 7(a) shows that lower STC thresholds (1.0, 1.2) combined with larger batch sizes (20) achieve the highest maximum accuracy (exceeding 71%), while Figure 7(b) reveals a clear reduction in communication cost as the threshold increases, confirming the trade-off between model performance and transmission efficiency.
Figure 7. CIFAR-10 heatmaps: (a) max accuracy (%) by STC threshold and batch size and (b) total communication (MB) by STC threshold and batch size
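Readers wishing to recreate a Figure 7(a)-style view can start from the Table 4 numbers; this matplotlib sketch (our code, not the authors') renders the CIFAR-10 max-accuracy grid as an annotated heatmap.

```python
import matplotlib.pyplot as plt
import numpy as np

taus = ["1.0", "1.2", "1.5", "1.7"]
batches = ["10", "15", "20"]
acc = np.array([[65.83, 66.38, 62.78, 62.12],    # batch size 10 (Table 4)
                [68.82, 68.36, 67.24, 62.19],    # batch size 15
                [71.12, 69.37, 68.85, 63.39]])   # batch size 20

fig, ax = plt.subplots()
im = ax.imshow(acc)
ax.set_xticks(range(len(taus)))
ax.set_xticklabels(taus)
ax.set_yticks(range(len(batches)))
ax.set_yticklabels(batches)
ax.set_xlabel("STC threshold")
ax.set_ylabel("Batch size")
for i in range(acc.shape[0]):                    # annotate each cell
    for j in range(acc.shape[1]):
        ax.text(j, i, f"{acc[i, j]:.2f}", ha="center", va="center", color="white")
fig.colorbar(im, ax=ax, label="Max accuracy (%)")
plt.show()
```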
For MedMNIST (Figure 8), a similar pattern is observed, though with notable variations across thresholds and batch sizes as presented in Figures 8(a) and 8(b). The accuracy heatmap highlights that threshold 1.5 at batch size 15 yields strong accuracy, while communication cost remains consistently sensitive to threshold adjustments. These visual summaries underscore the importance of careful threshold tuning tailored to both the data distribution and the desired balance between accuracy and communication savings in FL deployments.
Figure 8. MedMNIST heatmaps: (a) max accuracy (%) by STC threshold and batch size and (b) total communication (MB) by STC threshold and batch size
These findings provide a comprehensive understanding of the role of STC threshold selection in FL across different datasets and batch sizes. The results demonstrate a clear trade-off between communication efficiency and model accuracy that varies with both dataset characteristics and training configuration. Higher STC thresholds generally reduced total communication costs in both CIFAR-10 and MedMNIST experiments, confirming their effectiveness for minimizing data transmission overhead. However, this benefit came at the cost of accuracy, particularly evident at larger batch sizes and for thresholds above 1.5, where excessive sparsification degraded learning performance.

In the accuracy vs. iterations analysis, lower thresholds (1.0, 1.2) consistently supported smoother and higher convergence, while higher thresholds failed to maintain learning stability across training rounds. Similarly, in maximum accuracy vs. batch size evaluations, CIFAR-10 results favored lower thresholds for achieving the highest accuracy, especially at larger batch sizes. MedMNIST showed more varied behavior, with threshold 1.5 performing competitively at mid-sized batches but still revealing accuracy losses at higher thresholds like 1.7 and 1.9. Overall, these results highlight the importance of carefully selecting and tuning STC thresholds based on dataset complexity and batch size to balance communication efficiency with model fidelity in FL systems.
4.5. Comparative analysis

The study extends the foundational work of Sattler et al. [15] by thoroughly investigating STC thresholds beyond the baseline value (τ = 1.0). Our experimental results, as detailed in Figures 1-6 and Tables 3-4, reveal distinct advantages of our proposed thresholds (τ = 1.2, τ = 1.5, and τ = 1.7) over the base paper's τ = 1.0. We consistently observe reduced communication overhead, with significant savings (e.g., up to ∼2.5 MB), while simultaneously achieving competitive or superior accuracy. For instance, τ = 1.5 frequently outperforms τ = 1.0 in MedMNIST at mid-sized batches, demonstrating that careful STC tuning can yield better communication-accuracy trade-offs without significant performance compromise.
Beyond STC-specific comparisons, our findings contribute to the broader field of communication-efficient FL. While methods like QSGD [16], signSGD [17], and FedZip [23] offer various compression mechanisms, our lightweight threshold tuning strategy for STC stands out. It provides dataset-adaptive compression that avoids the more complex server-side decoding or architectural changes often associated with other techniques.
5. CONCLUSION

This study investigated the effect of varying STC thresholds, specifically τ = 1.2, 1.5, 1.7, and 1.9, on communication efficiency and model accuracy in FL using the CIFAR-10 and MedMNIST datasets. The experimental results demonstrate that careful threshold tuning plays a vital role in balancing communication cost with model performance. Threshold τ = 1.2 consistently achieved the best trade-off by offering high accuracy with reduced communication, while τ = 1.5 performed particularly well on MedMNIST with mid-sized batches. Though thresholds τ = 1.7 and 1.9 resulted in higher compression, they showed reduced convergence quality at larger batch sizes, indicating potential over-sparsification. Overall, our findings emphasize the need to explore thresholds beyond the conventional baseline (τ = 1.0), as moderate values (1.2–1.5) can provide meaningful improvements without compromising accuracy. Future research may extend this work by quantifying energy consumption, evaluating computational overhead, and designing adaptive thresholding mechanisms that dynamically adjust to varying data distributions and training conditions in heterogeneous FL environments.
FUNDING INFORMATION

The authors declare that no funding was received for conducting this study.
AUTHOR CONTRIBUTIONS STATEMENT

This journal uses the Contributor Roles Taxonomy (CRediT) to recognize individual author contributions, reduce authorship disputes, and facilitate collaboration.
Name of Author                      C  M  So  Va  Fo  I  R  D  O  E  Vi  Su  P  Fu
Nithyaniranjana Murthy Chittaiah    ✓  ✓  ✓   ✓   ✓   ✓  ✓  ✓  ✓  ✓
Manjula Sunkadakatte Haladappa      ✓  ✓  ✓   ✓   ✓

C: Conceptualization       M: Methodology            So: Software
Va: Validation             Fo: Formal Analysis       I: Investigation
R: Resources               D: Data Curation          O: Writing - Original Draft
E: Writing - Review & Editing                        Vi: Visualization
Su: Supervision            P: Project Administration Fu: Funding Acquisition
CONFLICT OF INTEREST STATEMENT

The authors state no conflict of interest.
DATA AVAILABILITY

The supporting data of this study are openly available in the following sources: CIFAR-10: https://www.cs.toronto.edu/~kriz/cifar.html, and MedMNIST: https://medmnist.com/.
REFERENCES
[1] J. Chen, H. Yan, Z. Liu, M. Zhang, H. Xiong, and S. Yu, "When federated learning meets privacy-preserving computation," ACM Computing Surveys, vol. 56, no. 12, 2024, doi: 10.1145/3679013.
[2] M. Liu, H. Jiang, J. Chen, A. Badokhon, X. Wei, and M. C. Huang, "A collaborative privacy-preserving deep learning system in distributed mobile environment," Proceedings - 2016 International Conference on Computational Science and Computational Intelligence (CSCI), pp. 192–197, 2017, doi: 10.1109/CSCI.2016.0043.
[3] Y. Liu, J. J. Q. Yu, J. Kang, D. Niyato, and S. Zhang, "Privacy-preserving traffic flow prediction: a federated learning approach," IEEE Internet of Things Journal, vol. 7, no. 8, pp. 7751–7763, 2020, doi: 10.1109/JIOT.2020.2991401.
[4] J. Xu, B. S. Glicksberg, C. Su, P. Walker, J. Bian, and F. Wang, "Federated learning for healthcare informatics," Journal of Healthcare Informatics Research, vol. 5, no. 1, 2021, doi: 10.1007/s41666-020-00082-4.
[5] K. Yang, T. Jiang, Y. Shi, and Z. Ding, "Federated learning via over-the-air computation," IEEE Transactions on Wireless Communications, vol. 19, no. 3, pp. 2022–2035, 2020, doi: 10.1109/TWC.2019.2961673.
[6] Y. Zhao, M. Li, L. Lai, N. Suda, D. Civin, and V. Chandra, "Federated learning with non-IID data," arXiv:1806.00582v2, 2018.
[7] Y. Liu et al., "A communication efficient vertical federated learning framework," The 2nd International Workshop on Federated Learning for Data Privacy and Confidentiality (in Conjunction with NeurIPS 2019), Vancouver, Canada, 2019.