IAES International Journal of Artificial Intelligence (IJ-AI)
Vol. 14, No. 5, October 2025, pp. 4171-4180
ISSN: 2252-8938, DOI: 10.11591/ijai.v14.i5.pp4171-4180

BonoNet: a deep convolutional neural network for recognizing Bangla compound characters

Kazi Rifat Ahmed1, Nusrat Jahan2,3, Adiba Masud1,4, Nusrat Tasnim5, Sazia Sharmin6, Nusrat Jahan Mim1, Imran Mahmud1
1 Department of Software Engineering, Faculty of Science and Information Technology, Daffodil International University, Dhaka, Bangladesh
2 Department of Information Technology and Management, Faculty of Science and Information Technology, Daffodil International University, Dhaka, Bangladesh
3 Faculty of Electronic Engineering and Technology (FKTEN), Universiti Malaysia Perlis, Arau, Malaysia
4 Department of Computer Science, College of AI, Cyber and Computing, University of Texas at San Antonio, San Antonio, United States
5 Department of Information and Communication Technology, Bangladesh University of Professionals, Dhaka, Bangladesh
6 Department of Computer Science, Faculty of Science and Technology, American International University, Dhaka, Bangladesh

Article history: Received Aug 11, 2024; Revised Jun 28, 2025; Accepted Aug 6, 2025
Keywords: Bangla; BonoNet; Compound characters; Deep convolutional neural network; Handwritten; Optical character recognition

ABSTRACT
The Bangla alphabet includes vowels, consonants, and compound symbols. Compound characters arise from combining two or more root Bangla characters into a single glyph. They are difficult to differentiate because they have sophisticated geometric shapes and appear in an immense variety of writing styles across regions and individuals. This is one of the greatest challenges in creating effective optical character recognition (OCR) systems for Bangla.
In this paper, a deep convolutional neural network (DCNN)-based system is presented to identify Bangla compound characters with high precision. The model was trained on the AIBangla dataset, which contains about 171 classes of Bangla compound characters. A DCNN, BonoNet, was designed to classify these compound characters; it outperformed state-of-the-art architectures on the test set. By accurately identifying these complex compound characters, BonoNet can greatly improve the automation and analysis of the Bangla language. This is an open access article under the CC BY-SA license.

Corresponding Author:
Nusrat Jahan
Department of Information Technology and Management, Faculty of Science and Information Technology
Daffodil International University, Dhaka, Bangladesh
Email: nusrat.swe@diu.edu.bd

1. INTRODUCTION
Bangla is the seventh most widely spoken language on earth, spoken by nearly 300 million people in South Asia's Bengali region. It is the official and national language of Bangladesh and is spoken by close to 98% of Bangladesh's population. The script has vowels, consonants, and complex letters with distinctive and separate visual structures, making for a distinctive and elaborate writing system. This complexity has proven difficult for optical character recognition (OCR) to handle, especially for character recognition in handwritten documents on physical media. OCR technology has long been prized as an invaluable resource for digitizing written materials, but Bangla's compound characters pose unique challenges owing to their structural complexity and diversity.
Journal homepage: http://ijai.iaescore.com
Various computational approaches have been applied to handwritten character recognition, including classical machine learning techniques, artificial neural networks (ANN), multilayer perceptrons (MLP), support vector machines (SVM), and, with growing emphasis, deep learning models such as convolutional neural networks (CNNs) [1], [2]. CNNs have long been in demand for their greater precision and reduced reliance on human-crafted feature extraction. Their ability to learn visual features automatically at a hierarchical level makes them extremely effective in image recognition and classification tasks.
The Bangla script consists of 50 simple characters, of which 11 are vowels and 39 are consonants [3]. Together they create more than 171 compound characters through combinations of simple ones. Despite remarkable advancements, the majority of earlier research aimed to recognize simple characters only. Compound characters, due to their unpredictability and strong variability, have been less explored in existing OCR models [4]. There thus remains enormous scope for systems capable of handling such complexity with robust accuracy and generalizability. Figure 1 shows examples of the simple and compound characters used in this work.

Figure 1. Bangla basic and compound characters example

In the recent past, several deep learning-inspired models have been introduced for Bangla character recognition. Ahmed et al. [5] introduced a deep convolutional neural network (DCNN) with 76,000 training images for character classification, while Ashiquzzaman et al. [6] employed exponential linear unit (ELU)-based methods to enhance performance on the CMATERDB 3.1.3.3 dataset. Azad et al. [7] introduced DConvAENNet, an autoencoder-DCNN combination, on datasets such as BanglaLekha-Isolated and Ekush. Uddin et al.
[8] used a hybrid ConvLSTM to show good performance in identifying Bangla handwritten digits. Begum et al. [9] used longest run (LR) + chain code histogram (CH) features, whereas Chakraborty and Paul [10] performed bidirectional conversion between simple and compound characters. Chowdhury et al. [11] achieved improved accuracy using a CNN with data augmentation, whereas Hasan et al. [12], [13] experimented with VGG-16, ResNet-50, and DenseNet, identifying DenseNet as particularly effective for both simple and compound characters on the AIBangla dataset. Other approaches relied on handcrafted features and combination strategies. Kibria et al. [14] employed SVM and MLP classifiers with local receptive field (LRF), histogram of oriented gradients (HOG), and diagonal features, and Khan et al. [15] achieved high performance on the BanglaLekha-Exclusive dataset using SE-ResNeXt. Mukherjee et al. [16] experimented with various learning methods on 10,000 Bangla web images. Saha et al. [17], [18] introduced BBCNet-15 for improved basic character recognition and compared local binary pattern (LBP)-based descriptors across various classes. Sarika et al. [19] demonstrated VGG-16 performance for Telugu script, and Rabbi et al. [20] demonstrated excellent results for KDANet on BanglaLekha. Pramanik and Bag [21] used chain-code features for compound character recognition on the ICDAR and CMATERdb databases. Koiso et al. [22] extended OCR research to Japanese script. Separately, Jishan et al. [1] integrated NLP with hybrid neural networks for text-image recognition, utilizing grammar analysis and language modeling techniques; other researchers have likewise used NLP to recognize different content from images and texts [23]-[25]. Despite such a heterogeneous body of work, most studies still emphasize individual character recognition.
Handwritten compound characters remain difficult to classify because they are both visually and contextually variable. To address this issue, this work introduces a shallow DCNN architecture named BonoNet that is specifically targeted at the accurate recognition of Bangla compound characters. BonoNet outperforms the state-of-the-art models ResNet and DenseNet on the AIBangla dataset. Unlike other methods, BonoNet automates feature extraction and tackles the high intra-class variability and inter-class similarity common in compound Bangla characters.
2. METHOD
This section explains the approach used to recognize the selected Bangla compound characters. A dedicated DCNN, 'BonoNet', has been designed to identify Bangla compound characters efficiently. The proposed approach is illustrated in Figure 2.

Figure 2. Proposed methodology for compound character recognition

2.1. Data collection
The proposed methodology builds on the AIBangla dataset created by Hasan et al. [12]. The dataset consists of handwritten Bangla characters submitted by more than 2,000 individuals from various institutes in Bangladesh, and it serves as a new benchmark in the field with a holistic use case and performance baseline. AIBangla covers a large Bangla character set, including 249,911 images of compound characters and 80,403 images of simple characters in 50 classes. The dataset contains no numeral data; in total, AIBangla gathered 330,314 images across 221 classes. Its subset of 171 Bangla compound character classes is the one used in this work. Samples from the AIBangla dataset are shown in Figure 3.

Figure 3. A few examples from the dataset

2.2. Data preprocessing
Preprocessing improves accuracy and reduces the complexity of an image; Python OpenCV was used to implement the steps, which are shown in Figure 4. First, RGB images were transformed to gray-scale to lower their dimensionality and reduce the load on the model; gray-scale conversion also cancels tone variability, and Gaussian blur removes noise. Image thresholding then simplifies analysis by converting images to binary black and white.
Multi-Otsu thresholding in particular was used, which classifies pixels into classes according to their gray-level intensity; it was applied over the dataset using OpenCV in Python. To isolate the handwritten compound character precisely, the unnecessary parts of the image are eliminated: contour detection from Python's OpenCV detects the edges of the Bangla compound character, after which the image is cropped to the size of the character. Image resizing is another critical step, which accelerates neural network training by reducing the number of pixels. Here, images are resized to 28 × 28, leading to better model results, as presented in Figure 5. The results before preprocessing are shown in Figure 5(a) and after preprocessing in Figure 5(b).
Figure 4. Data preprocessing steps for train and test data

Figure 5. The dataset (a) before preprocessing and (b) after preprocessing

The data have been divided into training, test, and validation sets: 80% of the data were devoted to training, 10% to testing, and the remaining 10% to validation. The focus is on compound characters, which are classified into 171 categories. A total of 199,803 samples are available for training the model, 25,123 samples for testing its performance, and 24,908 samples for validating its performance during training. With this provision, the model can be exhaustively trained, tested, and validated.

2.3. Proposed method: BonoNet architecture
This DCNN takes 28 × 28 images and has 7 convolutional layers, 3 fully connected layers, and 5 dropout layers. The first and second convolutional layers use a kernel of size 3 × 3 with 32 filters. A batch-normalization layer is used after each convolutional layer, and max pooling uses a pool size of 2 × 2 without a strides value. The rectified linear unit (ReLU) is the activation function for all convolutional layers; max pooling is skipped after the first layer. To reduce overfitting, a dropout layer is applied after the second layer, followed by a max pooling layer. The third and fourth convolutional layers each consist of 64 filters with ReLU activation, and a max pooling layer is applied after the fourth layer. Batch normalization is applied everywhere except the output layer, and a dropout layer follows the max pooling layer. The 5th, 6th, and 7th convolutional layers have 128, 128, and 256 filters, each followed by batch normalization, max pooling, and dropout. A flatten layer then feeds three fully connected layers of 512, 512, and 171 neurons.
Batch normalization is applied at the final dense stage, and dropout is used in the lower two fully connected layers. The final layer uses softmax activation. Figure 6 presents the proposed DCNN model.

2.4. Model breakdown
The BonoNet architecture is organized into three main components: a feature extractor, a classifier, and the training parameters that optimize the overall performance of the model.
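Under the assumption that the stack described above is implemented in Keras (the paper does not name a framework), one plausible sketch of BonoNet follows. The filter counts (32, 32, 64, 64, 128, 128, 256), the 512-512-171 dense head, and the softmax output come from the text; the dropout rates and the exact positions of the pooling layers are illustrative guesses, and fewer pooling stages are used than a literal reading suggests so that the 28 × 28 input is not downsampled below 1 × 1.

```python
from tensorflow.keras import layers, models

def build_bononet(num_classes=171):
    """Sketch of the BonoNet stack; pooling/dropout placement is approximate."""
    return models.Sequential([
        layers.Input((28, 28, 1)),
        # Stage 1: two 3x3 conv layers, 32 filters; pooling skipped after the first
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.Conv2D(32, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.Dropout(0.25),
        layers.MaxPooling2D(2),                      # 28 -> 14
        # Stage 2: two conv layers with 64 filters
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.Conv2D(64, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(2),                      # 14 -> 7
        layers.Dropout(0.25),
        # Stage 3: 128, 128, 256 filters; the last stage skips pooling so the
        # feature map is not reduced below 1x1
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(2),                      # 7 -> 3
        layers.Dropout(0.25),
        layers.Conv2D(128, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(2),                      # 3 -> 1
        layers.Dropout(0.25),
        layers.Conv2D(256, 3, padding="same", activation="relu"),
        layers.BatchNormalization(),
        # Classifier head: 512-512-171 with softmax output
        layers.Flatten(),
        layers.Dense(512, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(512, activation="relu"),
        layers.BatchNormalization(),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])

model = build_bononet()
```

The sketch keeps the paper's layer inventory (7 conv, 3 dense) while remaining buildable at the stated 28 × 28 input size.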
2.4.1. Feature extractor
Feature extraction lies at the foundation of the BonoNet architecture, handling input image data efficiently. Raw inputs are converted into structured features through several layers, beginning with an input layer and followed by the convolutional layers that recognize features.
- Input layer: the model begins with an input layer that takes in the image data.
- Convolutional layers (Conv2D): multiple convolutional layers extract features from the image. Each layer applies filters to detect patterns like edges, textures, and shapes at different levels of abstraction.
- ReLU activation: ReLU introduces non-linearity, helping the model capture complex patterns.
- Batch normalization: this layer normalizes the output of the previous layer.
- Max pooling: max pooling downsamples the feature maps, reducing dimensionality and computational cost while preserving important information.
- Dropout: dropout randomly deactivates a portion of neurons during training, preventing overfitting.

2.4.2. Classifier
Dense layers are fully connected layers that use the extracted features to determine the class of the image. The architecture includes an initial dense layer with 512 units, followed by a layer with 1,024 units. Finally, there is an output layer with as many units as there are classes.

2.4.3. Model training parameters
The proposed model was trained using a set of designated parameters, carefully selected to ensure effective convergence and generalization. Table 1 shows the parameters used in the proposed methodology.

Figure 6. BonoNet architecture

Table 1. Training parameters used for BonoNet
Parameter                 Value
Learning rate             0.0001
Decay factor              0.2
Early stopping patience   2
Loss function             Categorical cross-entropy
Total epochs              100
Epochs before stopping    36
Evaluation dataset        Fresh validation set

2.5. Benefits for image processing
The structure of a CNN is well suited to image processing problems because it can be trained to learn hierarchical representations of visual features. The convolutional layers detect local features, and the pooling layers provide dimensionality reduction and translation invariance. The dense layers then combine these features to enable accurate prediction. Batch normalization and dropout improve training efficiency and generalization, making the model robust to variation in the image data.
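Before turning to the results, the settings in Table 1 can be mapped onto standard training machinery. Assuming a Keras setup (the framework is not named in the paper) and interpreting the 0.2 decay factor as a plateau-based learning-rate reduction, a rough sketch is:

```python
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.optimizers import Adam

optimizer = Adam(learning_rate=1e-4)              # learning rate from Table 1
early_stop = EarlyStopping(monitor="val_loss", patience=2,
                           restore_best_weights=True)
lr_decay = ReduceLROnPlateau(monitor="val_loss", factor=0.2)  # decay factor 0.2

# Usage sketch (model and data stand for the BonoNet network and the
# train/validation splits described in section 2.2):
# model.compile(optimizer=optimizer, loss="categorical_crossentropy",
#               metrics=["accuracy"])
# model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=100,
#           callbacks=[early_stop, lr_decay])
```

With patience 2 on the validation loss, a run capped at 100 epochs can halt early, consistent with the 36 epochs reported in Table 1.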
3. RESULTS AND DISCUSSION
3.1. Setup and environment
The experiments used the following resources and specifications: an Intel(R) Core(TM) i5-8265U CPU running at 1.60 GHz (boosting to 1.80 GHz) with 8 GB of RAM, Intel(R) UHD Graphics 620 and an NVIDIA GeForce MX110 GPU, a Toshiba MQ04ABF100 HDD, and the Windows 11 Home operating system. The program was executed using Jupyter Notebook 6.4.12 on the Anaconda platform, version 2.3.

3.2. Experiment results
The 'BonoNet' model was tested on 10% of the images from the dataset. Across the 171 image classes, the model achieved around 90.01% training accuracy and 89.99% validation accuracy, with 90.01% precision on the training set and 90.01% on the validation set. Overall, it provided 90.01% accuracy in recognizing Bangla compound characters. Table 2 shows the classification results. The evaluation relies on the model's performance on the validation dataset, which contains 24,908 samples. The model's precision, recall, and F1 score are all 0.90, indicating 90% effectiveness in correctly identifying the predicted classes (precision), in covering the actual instances (recall), and in overall performance considering both (F1 score). This shows that the model is reliable and effective at predicting the correct classes for the validation data. Figure 7 shows the training and validation accuracy of the 'BonoNet' model for identifying Bangla compound characters.

Table 2. Detailed metrics of the BonoNet model
Class         Precision  Recall  F1-score  Support
class 0       0.89       0.89    0.89      1169
class 1       0.90       0.90    0.90      1170
...           ...        ...     ...       ...
class 170     0.89       0.89    0.89      1169
micro avg     0.89       0.90    0.90      199803
macro avg     0.90       0.89    0.89      199803
weighted avg  0.90       0.90    0.90      199803

Figure 7. BonoNet model training and validation accuracy and loss
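Per-class tables like Table 2, including the micro, macro, and weighted averages, are the standard output of scikit-learn's `classification_report`. A minimal sketch on invented dummy labels (a 5-class stand-in for the 171 classes, with roughly 90% of predictions correct):

```python
import numpy as np
from sklearn.metrics import classification_report, precision_recall_fscore_support

rng = np.random.default_rng(0)
y_true = rng.integers(0, 5, size=200)                      # dummy ground truth
# Corrupt ~10% of labels to mimic a ~90%-accurate classifier.
y_pred = np.where(rng.random(200) < 0.9, y_true, (y_true + 1) % 5)

print(classification_report(y_true, y_pred, digits=2))      # Table-2-style output
prec, rec, f1, _ = precision_recall_fscore_support(y_true, y_pred,
                                                   average="weighted")
```

The weighted averages weight each class by its support, which is why they can differ slightly from the macro averages when class sizes are uneven.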
The charts display how well the model performed over 35 epochs. As the left plot shows, the model's accuracy increases as it trains: both training and validation accuracy begin at a low level and improve quickly, reaching a plateau of about 90%, indicating improved predictive ability. The model's loss, a measure of error, is displayed in the right plot. Both training and validation loss start quite high and drop off significantly until both stabilize at a low level. This shows that the model is learning effectively, with errors decreasing as training proceeds. In general, the data indicate that the model performs uniformly well on both the training and validation sets, achieving high accuracy with minimal error. The 'BonoNet' model successfully classified compound characters with intricate structures, leading to lower errors compared with simple and numeral characters. Table 3 compares the accuracy and class coverage of the 'BonoNet' model with existing models.

Table 3. Comparison with existing and proposed models
Name                        Number of classes/images   Accuracy (%)
Chakraborty and Paul [10]   300,000 images             89.20
Hasan et al. [12]           171 classes                81.83
Kibria et al. [14]          171 classes                85.91
Pramanik and Bag [21]       171 classes                88.74
Saha et al. [17]            171 classes                73.3
Proposed model (BonoNet)    171 classes                90.01

Table 3 illustrates the accuracy of various models in identifying Bangla compound characters. Chakraborty and Paul [10] obtained 89.20% accuracy using a vast dataset containing 300,000 images. Hasan et al. [12], Kibria et al. [14], Pramanik and Bag [21], and Saha et al. [17] used datasets comprising 171 classes and attained accuracies of 81.83%, 85.91%, 88.74%, and 73.3%, respectively.
Similarly, the proposed BonoNet model used a dataset of 171 categories and reached an accuracy of 90.01%, the highest among all models, indicating that BonoNet is more accurate than the other models for this particular task. The 'BonoNet' model surpasses various models to achieve improved results in recognizing compound characters.

4. CONCLUSION
CNNs have recently gained much attention for their advanced ability to categorize images effectively. The 'BonoNet' model, developed with a CNN, outperformed prior models in accurately recognizing Bangla compound characters, achieving high recognition accuracy for their identification. These conclusions were verified against the graphs generated for the accuracy and loss functions at each epoch. The proposed model achieved a 90.01% level of accuracy. Several directions could enhance the model's accuracy in the future: training on more advanced and capable hardware than the device used here, which would also save training time; increasing the size of the training and validation datasets; or training with larger input image sizes. The proposed method handles only individual Bangla compound characters. In future work, we aim to combine simple and compound Bangla characters to recognize a complete Bangla word within a sentence.

ACKNOWLEDGMENTS
We would like to thank all the authors for their contributions. Daffodil International University provided great support by providing the environment in which to do the research.
FUNDING INFORMATION
No financial support was received for the completion of this study.
AUTHOR CONTRIBUTIONS STATEMENT
This journal uses the Contributor Roles Taxonomy (CRediT) to recognize individual author contributions, reduce authorship disputes, and facilitate collaboration.

Authors: Kazi Rifat Ahmed, Nusrat Jahan, Adiba Masud, Nusrat Tasnim, Sazia Sharmin, Nusrat Jahan Mim, Imran Mahmud
Role abbreviations: C: Conceptualization; M: Methodology; So: Software; Va: Validation; Fo: Formal Analysis; I: Investigation; R: Resources; D: Data Curation; O: Writing - Original Draft; E: Writing - Review & Editing; Vi: Visualization; Su: Supervision; P: Project Administration; Fu: Funding Acquisition

CONFLICT OF INTEREST STATEMENT
The authors affirm that this study was conducted without any conflicting interests.

DATA AVAILABILITY
The data supporting this research are directly available on Kaggle via https://www.kaggle.com/datasets/awmium/handwritten-bangla-characterdataset-aaibangla, originally published by the dataset authors in association with the paper available at https://doi.org/10.1109/ICBSLP47725.2019.201481. The dataset, titled "Handwritten bangla character dataset (AI-Bangla)", was used under the terms specified by its public release.

REFERENCES
[1] M. A. Jishan, K. R. Mahmud, A. K. Al Azad, M. R. A. Rashid, B. Paul, and M. S. Alam, "Bangla language textual image description by hybrid neural network model," Indonesian Journal of Electrical Engineering and Computer Science, vol. 21, no. 2, pp. 757-767, 2020, doi: 10.11591/ijeecs.v21.i2.pp757-767.
[2] M. G. Hussain, B. Sultana, M. Rahman, and M. R. Hasan, "Comparison analysis of Bangla news articles classification using support vector machine and logistic regression," TELKOMNIKA (Telecommunication Computing Electronics and Control), vol. 21, no. 3, pp. 584-591, 2023, doi: 10.12928/TELKOMNIKA.v21i3.23416.
[3] A. Hasan, M. H. Jobayer, M. A. A. M. Pias, T. Alam, and R.
Khan, "Bangla sign language recognition with multimodal deep learning fusion," Engineering Reports, vol. 7, no. 4, 2025, doi: 10.1002/eng2.70139.
[4] M. Kabir, O. B. Mahfuz, S. R. Raiyan, H. Mahmud, and M. K. Hasan, "BanglaBook: a large-scale Bangla dataset for sentiment analysis from book reviews," arXiv-Computer Science, 2023, doi: 10.48550/arXiv.2305.06595.
[5] S. Ahmed, F. Tabsun, A. S. Reyadh, A. I. Shaafi, and F. M. Shah, "Bengali handwritten alphabet recognition using deep convolutional neural network," 5th International Conference on Computer, Communication, Chemical, Materials and Electronic Engineering (IC4ME2), 2019, doi: 10.1109/IC4ME247184.2019.9036572.
[6] A. Ashiquzzaman, A. K. Tushar, S. Dutta, and F. Mohsin, "An efficient method for improving classification accuracy of handwritten Bangla compound characters using DCNN with dropout and ELU," 2017 3rd IEEE International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), pp. 147-152, 2017, doi: 10.1109/ICRCICN.2017.8234497.
[7] M. A. Azad, H. S. Singha, and M. M. H. Nahid, "Bangla handwritten character recognition using deep convolutional autoencoder neural network," 2020 2nd International Conference on Advanced Information and Communication Technology (ICAICT), 2020, doi: 10.1109/ICAICT51780.2020.9333472.
[8] A. H. Uddin, J. Khatun, M. A. Meghna, and P. Mahmud, "Bangla handwritten digit recognition using RNN-CNN hybrid approach," 2022 25th International Conference on Computer and Information Technology (ICCIT), pp. 288-293, 2022, doi: 10.1109/ICCIT57492.2022.10055089.
[9] H. Begum, A. Rad, and M. M. Islam, "Recognition of bangla handwritten characters using feature combinations," 2018 5th IEEE Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON), 2018, doi: 10.1109/UPCON.2018.8597076.
[10] S. Chakraborty and S.
Paul, "Bengali handwritten character transformation: basic to compound and compound to basic using convolutional neural network," International Conference on Robotics, Electrical and Signal Processing Techniques, pp. 142-146, 2021, doi: 10.1109/ICREST51555.2021.9331247.
[11] R. R. Chowdhury, M. S. Hossain, R. U. Islam, K. Andersson, and S. Hossain, "Bangla handwritten character recognition using convolutional neural network with data augmentation," 2019 Joint 8th International Conference on Informatics, Electronics and Vision (ICIEV) and 2019 3rd International Conference on Imaging, Vision and Pattern Recognition (icIVPR), pp. 318-323, 2019, doi: 10.1109/ICIEV.2019.8858545.
[12] M. M. Hasan, M. M. Abir, M. Ibrahim, M. Sayem, and S. Abdullah, "AIBangla: a benchmark dataset for isolated Bangla handwritten basic and compound character recognition," 2019 International Conference on Bangla Speech and Language Processing (ICBSLP), 2019, doi: 10.1109/ICBSLP47725.2019.201481.
[13] M. N. Hasan, R. I. Sultan, and M. Kasedullah, "An automated system for recognizing isolated handwritten Bangla characters using deep convolutional neural network," ISCAIE 2021 - IEEE 11th Symposium on Computer Applications and Industrial Electronics (ISCAIE), pp. 13-18, 2021, doi: 10.1109/ISCAIE51753.2021.9431799.
[14] M. R. Kibria, A. Ahmed, Z. Firdawsi, and M. A. Yousuf, "Bangla compound character recognition using support vector machine (SVM) on advanced feature sets," 2020 IEEE Region 10 Symposium (TENSYMP), pp. 965-968, 2020, doi: 10.1109/TENSYMP50017.2020.9230609.
[15] M. M. Khan, M. S. Uddin, M. Z. Parvez, and L. Nahar, "A squeeze and excitation ResNeXt-based deep learning model for Bangla handwritten compound character recognition," Journal of King Saud University - Computer and Information Sciences, vol. 34, no. 6, pp. 3356-3364, 2022, doi: 10.1016/j.jksuci.2021.01.021.
[16] P. Mukherjee, S. Sen, K. Roy, and R. Sarkar, "Recognition of online handwritten Bangla characters using supervised and unsupervised learning approaches," International Journal of Computer Vision and Image Processing (IJCVIP), vol. 10, no. 3, pp. 18-30, 2020, doi: 10.4018/ijcvip.2020070102.
[17] C. Saha, R. H. Faisal, and M. M. Rahman, "Bangla handwritten basic character recognition using deep convolutional neural network," 2019 Joint 8th International Conference on Informatics, Electronics and Vision (ICIEV) and 3rd International Conference on Imaging, Vision and Pattern Recognition (icIVPR), pp. 190-195, 2019, doi: 10.1109/ICIEV.2019.8858575.
[18] C. Saha, R. H. Faisal, and M. M. Rahman, "Bangla handwritten character recognition using local binary pattern and its variants," 2018 International Conference on Innovations in Science, Engineering and Technology (ICISET), pp. 236-241, 2018, doi: 10.1109/ICISET.2018.8745645.
[19] N. Sarika, N. Sirisala, and M. S. Velpuru, "CNN based optical character recognition and applications," Proceedings of the 6th International Conference on Inventive Computation Technologies (ICICT), pp. 666-672, 2021, doi: 10.1109/ICICT50816.2021.9358735.
[20] K. K. Rabbi, A. Hossain, P. Dev, A. Sadman, D. Z. Karim, and A. A. Rasel, "KDANet: handwritten character recognition for Bangla language using deep learning," 2022 25th International Conference on Computer and Information Technology (ICCIT), pp. 651-656, 2022, doi: 10.1109/ICCIT57492.2022.10054708.
[21] R. Pramanik and S. Bag, "Shape decomposition-based handwritten compound character recognition for Bangla OCR," Journal of Visual Communication and Image Representation, vol. 50, pp. 123-134, Jan. 2018, doi: 10.1016/j.jvcir.2017.11.016.
[22] N. Koiso, Y. Takemoto, Y. Ishikawa, and M. Takata, "Proposed method of acquiring train data for early-modern Japanese printed character recognizers," Journal of Supercomputing, vol. 81, no. 6, 2025, doi: 10.1007/s11227-024-06866-4.
[23] F. M. Rusli, K. A. Adhiguna, and H.
Irawan, "Indonesian ID card extractor using optical character recognition and natural language post-processing," 2021 9th International Conference on Information and Communication Technology (ICoICT), pp. 621-626, 2021, doi: 10.1109/ICoICT52021.2021.9527510.
[24] H. Moussaoui, N. E. Akkad, and M. Benslimane, "License plate text recognition using deep learning, NLP, and image processing techniques," Statistics, Optimization and Information Computing, vol. 12, no. 3, pp. 685-696, 2024, doi: 10.19139/SOIC-2310-5070-1966.
[25] S. Rajendran, M. A. Kumar, R. Rajalakshmi, V. Dhanalakshmi, P. Balasubramanian, and K. P. Soman, "Tamil NLP technologies: challenges, state of the art, trends and future scope," Communications in Computer and Information Science, pp. 73-98, 2023, doi: 10.1007/978-3-031-33231-9_6.

BIOGRAPHIES OF AUTHORS
Kazi Rifat Ahmed completed his B.Sc. in Software Engineering from Daffodil International University and his M.Sc. from the Institute of Information Technology, Jahangirnagar University. He is currently working as a lecturer in the Department of Software Engineering, Daffodil International University. His research interests are machine learning, deep learning, NLP, and computer vision. He has published in high-impact journals and conferences, aiming to advance AI-driven solutions in healthcare and security. He can be contacted at email: rifat.swe@diu.edu.bd.
Nusrat Jahan is working as an assistant professor and head of the Department of Information Technology & Management at Daffodil International University, Bangladesh. She completed her M.Sc. and B.Sc. in Information Technology from the Institute of Information Technology, Jahangirnagar University. She is pursuing her Ph.D. in the Department of Computer Engineering, Universiti Malaysia Perlis (UniMAP). She is interested in technology management, computer networks, machine learning, and artificial intelligence.
She can be contacted at email: nusrat.swe@diu.edu.bd.
Adiba Masud is currently pursuing her Ph.D. in the Department of Computer Science, University of Texas at San Antonio, Texas, USA. She completed her B.Sc. and M.Sc. from the Institute of Information Technology, Jahangirnagar University. She is currently on study leave as a lecturer in the Department of Software Engineering, Daffodil International University. Her research interests are machine learning, deep learning, NLP, and computer vision. She can be contacted at email: adiba.swe@diu.edu.bd.

Nusrat Tasnim completed her B.Sc. and M.Sc. from the Institute of Information Technology, Jahangirnagar University. She is currently working as a lecturer in the Department of Information and Communication Technology, Bangladesh University of Professionals. She was formerly a lecturer in the Department of Software Engineering, Daffodil International University. Her research interests are machine learning, deep learning, NLP, and computer vision. She can be contacted at email: nusrattasnim17@gmail.com.

Sazia Sharmin completed her B.Sc. and M.Sc. from the Institute of Information Technology, Jahangirnagar University. She is currently working as a lecturer in the Department of Computer Science at the American International University, and she was previously a lecturer in the Department of Software Engineering at Daffodil International University. Her research interests are machine learning, deep learning, NLP, and computer vision. She can be contacted at email: sazia.sharmin@aiub.edu.

Nusrat Jahan Mim completed her B.Sc. and M.Sc. from the Department of Software Engineering, Daffodil International University. She is currently working as a lecturer in the Department of Software Engineering, Daffodil International University. Her research interests are machine learning, deep learning, NLP, and computer vision. She can be contacted at email: nusratjahan.swe@diu.edu.bd.
Imran Mahmud received the master's degree in software engineering from the University of Hertfordshire, U.K., in 2008, and the Ph.D. degree in technology management from Universiti Sains Malaysia in 2017. He is currently the head and a professor with the Department of Software Engineering, Daffodil International University, Bangladesh. He is also a visiting professor with the Graduate School of Business, Universiti Sains Malaysia, where he was previously a senior lecturer. He was a visiting lecturer with the Institute of Technology, Bandung, Indonesia, and the Hong Kong Management Association, Hong Kong. He has received several awards, including the Hall of Fame and Prestigious Publication Award from Universiti Sains Malaysia, the Young Researcher award from Kasetsart University, Thailand, and the Young Scientist in Technology Management award from the Venus International Foundation, India. He can be contacted at email: imranmahmud@daffodilvarsity.edu.bd.