Articles

Access the latest knowledge in applied science, electrical engineering, computer science and information technology, education, and health.

28,428 Article Results

Machine learning in detecting and interpreting business incubator success data and datasets

10.11591/ijict.v14i2.pp446-456
Mochammad Haldi Widianto , Puji Prabowo
This research contributes a proposed architectural model that combines several machine learning (ML) algorithms, heatmap correlation, and ML interpretation. Algorithms ranging from K-nearest neighbors (KNN) to adaptive boosting (AdaBoost) are applied, and heatmap correlation is used to examine the relationships between variables. Select K-best is then applied to the results, showing that several of the proposed ML algorithms, such as AdaBoost, CatBoost, and XGBoost, achieve accuracy, precision, and recall of 94% and an F1-score of 93%. In terms of computing time, AdaBoost is the fastest at 0.081 s. Interpreting AdaBoost with select K-best identifies "last revenue" and "first revenue" as the strongest features, with k-feature values of 0.58 and 0.196; these features influence the success of the business. The results show that the proposed model successfully combines classification, correlation, and interpretation. It still has weaknesses, such as relying on older ML models and offering relatively few interpretation features. Future research could adopt newer ML models and interpretation methods, for example ML algorithms that are more robust to data uncertainty and interpretation over broader data.
Volume: 14
Issue: 2
Page: 446-456
Publish at: 2025-08-01
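
The classification-plus-interpretation pipeline described above can be sketched with scikit-learn; this is a minimal illustration, not the authors' code, and the file name incubator.csv and the success column are placeholders:

```python
# Minimal sketch (not the paper's code): SelectKBest feature scoring plus an
# AdaBoost classifier, assuming a tabular dataset with a binary "success" label.
import pandas as pd
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

df = pd.read_csv("incubator.csv")          # hypothetical file name
X, y = df.drop(columns=["success"]), df["success"]

# Score features (e.g., "last revenue", "first revenue") and keep the top k.
selector = SelectKBest(mutual_info_classif, k=2).fit(X, y)
X_sel = selector.transform(X)

X_tr, X_te, y_tr, y_te = train_test_split(X_sel, y, random_state=42)
clf = AdaBoostClassifier(random_state=42).fit(X_tr, y_tr)
print(classification_report(y_te, clf.predict(X_te)))
print(dict(zip(X.columns[selector.get_support()],
               selector.scores_[selector.get_support()])))
```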

A survey of missing data imputation techniques: statistical methods, machine learning models, and GAN-based approaches

10.11591/ijai.v14.i4.pp2876-2888
Rifaa Sadegh , Ahmed Mohameden , Mohamed Lemine Salihi , Mohamedade Farouk Nanne
Efficiently addressing missing data is critical in data analysis across diverse domains. This study evaluates traditional statistical, machine learning, and generative adversarial network (GAN)-based imputation methods, emphasizing their strengths, limitations, and applicability to different data types and missing data mechanisms (missing completely at random (MCAR), missing at random (MAR), missing not at random (MNAR)). GAN-based models, including generative adversarial imputation network (GAIN), view imputation generative adversarial network (VIGAN), and SolarGAN, are highlighted for their adaptability and effectiveness in handling complex datasets, such as images and time series. Despite challenges like computational demands, GANs outperform conventional methods in capturing non-linear dependencies. Future work includes optimizing GAN architectures for broader data types and exploring hybrid models to enhance imputation accuracy and scalability in real-world applications.
Volume: 14
Issue: 4
Page: 2876-2888
Publish at: 2025-08-01
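
As a minimal illustration of the statistical and ML baselines the survey covers (GAN-based imputers such as GAIN are not reproduced here), the following sketch compares mean and iterative imputation under an MCAR mask on synthetic data:

```python
# Illustrative sketch: simple vs. iterative imputation under an MCAR mask.
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import SimpleImputer, IterativeImputer

rng = np.random.default_rng(0)
X_true = rng.normal(size=(500, 5))
X_true[:, 1] += 0.8 * X_true[:, 0]             # correlated columns help imputers

X = X_true.copy()
X[rng.random(X.shape) < 0.2] = np.nan          # MCAR: 20% missing at random

for imp in (SimpleImputer(strategy="mean"), IterativeImputer(random_state=0)):
    X_hat = imp.fit_transform(X)
    rmse = np.sqrt(np.mean((X_hat - X_true) ** 2))
    print(type(imp).__name__, f"RMSE={rmse:.3f}")
```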

Power of blockchain technology for enhancing efficiency, transparency, and data provenance in supply chain management

10.11591/ijai.v14.i4.pp3452-3461
Kanimozhi Thirunavaukkarasu , Inbavalli Mani
Global supply chains face increasing challenges in improving efficiency, transparency, and compliance with regulatory requirements. Traditional supply chain systems often suffer from inefficiencies due to fragmented data and manual processes, which result in delays and higher costs. Blockchain technology has emerged as a potential solution by offering decentralization, data immutability, and automation through smart contracts. However, existing blockchain implementations struggle with issues like scalability and transaction speed, which limits their effectiveness in supply chain management. This study introduces a new framework based on distributed ledger technology (DLT) with enhanced smart contract functions and data provenance tracking. The framework aims to improve transaction throughput, reduce latency, and provide better data integrity, enabling more efficient and transparent supply chain operations. By incorporating mechanisms to track the origin and movement of goods, the framework ensures that stakeholders have real-time access to accurate information, improving decision-making and trust across the supply chain. We evaluate the performance of this framework using the AnyLogic simulation platform, comparing it to traditional blockchain systems. Metrics such as transaction throughput, latency, and efficiency are analyzed to demonstrate the improvements achieved by the proposed system. The results show significant enhancements in transaction speed and operational efficiency, offering a practical solution for optimizing supply chains in various industries.
Volume: 14
Issue: 4
Page: 3452-3461
Publish at: 2025-08-01
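
The data-provenance idea, stripped of the DLT framework itself, can be illustrated with a toy hash-chained ledger; this is a pedagogical sketch, not the proposed system:

```python
# Toy illustration (not the paper's DLT framework): an append-only,
# hash-chained record of custody events, showing why tampering is detectable.
import hashlib, json, time

def _digest(record: dict) -> str:
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()

class ProvenanceLedger:
    def __init__(self):
        self.chain = []

    def append(self, item_id: str, event: str):
        prev = self.chain[-1]["hash"] if self.chain else "0" * 64
        body = {"item": item_id, "event": event, "ts": time.time(), "prev": prev}
        self.chain.append({**body, "hash": _digest(body)})

    def verify(self) -> bool:
        for i, block in enumerate(self.chain):
            body = {k: block[k] for k in ("item", "event", "ts", "prev")}
            if block["hash"] != _digest(body):
                return False
            if i and block["prev"] != self.chain[i - 1]["hash"]:
                return False
        return True

ledger = ProvenanceLedger()
ledger.append("pallet-17", "shipped from factory")
ledger.append("pallet-17", "received at warehouse")
print(ledger.verify())                 # True
ledger.chain[0]["event"] = "forged"    # any edit breaks the chain
print(ledger.verify())                 # False
```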

Deep learning algorithms for breast cancer detection from ultrasound scans

10.11591/ijict.v14i2.pp427-437
Lawysen Lawysen , Gede Putra Kusuma
Breast cancer is a highly dangerous disease and the leading cause of cancer-related deaths among women. Early detection is challenging but offers significant benefits, as treatment interventions can be initiated earlier. The focus of this research is to develop a model that detects breast cancer from ultrasound results using deep learning algorithms. In the initial stages, several preprocessing steps, including image transformation and image augmentation, were performed. Two types of models were developed, one using mask files and one without, each built with four deep learning algorithms: residual network (ResNet)-50, VGG16, vision transformer (ViT), and data-efficient image transformer (DeiT). Various algorithms, including optimization algorithms, loss functions, and hyperparameter tuning algorithms, were employed during model training. Accuracy was used as the performance metric to measure the models' effectiveness. ResNet-50 produced the best model using mask files, achieving an accuracy of 94%, while ResNet-50 and DeiT produced the best models without mask files, with an accuracy of 80%. It can therefore be concluded that using mask files is crucial for producing the best-performing model.
Volume: 14
Issue: 2
Page: 427-437
Publish at: 2025-08-01
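
A hedged sketch of the transfer-learning setup the abstract describes, assuming ultrasound images arranged in class folders (the path ultrasound/train is a placeholder); it fine-tunes ResNet-50 with a new classification head in PyTorch:

```python
# Illustrative transfer learning with a pre-trained ResNet-50; not the authors' code.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

tfm = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),            # simple augmentation
    transforms.ToTensor(),
])
train_ds = datasets.ImageFolder("ultrasound/train", transform=tfm)  # placeholder path
loader = torch.utils.data.DataLoader(train_ds, batch_size=16, shuffle=True)

model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, len(train_ds.classes))   # new head

opt = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()
for x, y in loader:                      # one epoch shown for brevity
    opt.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    opt.step()
```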

Deep learning for grape leaf disease detection

10.11591/ijict.v14i2.pp653-662
Pragati Patil , Priyanka Jadhav , Nandini Chaudhari , Nitesh Sureja , Umesh Pawar
Agriculture is crucial to India's economy. Agriculture supports almost 75% of the world's population and much of its gross domestic product (GDP). Climate and environmental changes pose a threat to agriculture. India is recognized for its grapes, a commercially important fruit, yet diseases reduce grape yields by 10-30%, and if not recognized and treated early, grape diseases can cost farmers dearly. The main grape diseases include downy and powdery mildew, leaf blight, esca, and black rot. This work creates an Android grape disease detection app that uses machine learning. When a farmer submits a snapshot of a diseased grape leaf, the app identifies the ailment and offers grape plant disease prevention tips. The app detects grape plant illnesses using convolutional neural network (CNN) and AlexNet architectures. We investigated and compared the two architectures' efficacy for grape disease detection using accuracy and other metrics, on a dataset from Kaggle. The CNN and AlexNet architectures yielded 98.04% and 99.03% accuracy, respectively; AlexNet was more accurate than the CNN in the final result.
Volume: 14
Issue: 2
Page: 653-662
Publish at: 2025-08-01
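
A minimal Keras sketch in the spirit of this pipeline, with in-model augmentation and a small CNN; the directory grape_leaves/train and the four-class head are assumptions, not the paper's exact setup:

```python
# Illustrative augmentation + CNN classifier for leaf images (placeholder paths).
import tensorflow as tf
from tensorflow.keras import layers

train_ds = tf.keras.utils.image_dataset_from_directory(
    "grape_leaves/train", image_size=(128, 128), batch_size=32)  # placeholder path

model = tf.keras.Sequential([
    layers.Rescaling(1.0 / 255),
    layers.RandomFlip("horizontal"),             # augmentation inside the model
    layers.RandomRotation(0.1),
    layers.Conv2D(32, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(4, activation="softmax"),       # 4 disease classes assumed
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=5)
```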

Leveraging machine learning for column generation in the dial-a-ride problem with driver preferences

10.11591/ijai.v14.i4.pp2826-2838
Sana Ouasaid , Mohammed Saddoune
The dial-a-ride problem (DARP) is a significant challenge in door-to-door transportation, requiring the development of feasible schedules for transportation requests while respecting various constraints. This paper addresses a variant of DARP with time windows and drivers’ preferences (DARPDP). We introduce a solution methodology integrating machine learning (ML) into a column generation (CG) algorithm framework. The problem is reformulated into a master problem and a pricing subproblem. Initially, a clustering-based approach generates the initial columns, followed by a customized ML-based heuristic to solve each pricing subproblem. Experimental results demonstrate the efficiency of our approach: it reduces the number of newly generated columns by up to 25%, accelerating the convergence of the CG algorithm. Furthermore, it achieves a solution cost gap of only 1.08% relative to the best-known solutions on large instances, while significantly reducing computation time.
Volume: 14
Issue: 4
Page: 2826-2838
Publish at: 2025-08-01
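
The column generation loop itself can be shown compactly on the classic cutting-stock problem (DARP pricing is far more involved); this PuLP sketch alternates a restricted master LP with a knapsack pricing step, as the abstract's CG framework does:

```python
# CG skeleton on cutting stock: master LP duals price new columns (patterns).
import pulp

lengths = [3, 5, 7]          # piece lengths demanded
demand  = [25, 20, 18]
roll    = 10                 # raw roll length

# Start with trivial patterns: one piece type per roll.
patterns = [[roll // L if j == i else 0 for j, L in enumerate(lengths)]
            for i in range(len(lengths))]

while True:
    # Restricted master: minimise rolls used subject to demand coverage.
    master = pulp.LpProblem("master", pulp.LpMinimize)
    x = [pulp.LpVariable(f"x{p}", lowBound=0) for p in range(len(patterns))]
    master += pulp.lpSum(x)
    for i in range(len(lengths)):
        master += (pulp.lpSum(patterns[p][i] * x[p]
                              for p in range(len(patterns))) >= demand[i]), f"d{i}"
    master.solve(pulp.PULP_CBC_CMD(msg=0))
    duals = [master.constraints[f"d{i}"].pi for i in range(len(lengths))]

    # Pricing: integer knapsack over duals; a column enters if reduced cost < 0.
    price = pulp.LpProblem("pricing", pulp.LpMaximize)
    a = [pulp.LpVariable(f"a{i}", lowBound=0, cat="Integer")
         for i in range(len(lengths))]
    price += pulp.lpSum(duals[i] * a[i] for i in range(len(lengths)))
    price += pulp.lpSum(lengths[i] * a[i] for i in range(len(lengths))) <= roll
    price.solve(pulp.PULP_CBC_CMD(msg=0))
    if pulp.value(price.objective) <= 1 + 1e-6:
        break                                   # no improving column remains
    patterns.append([int(v.value()) for v in a])

print(len(patterns), "patterns; LP rolls =", pulp.value(master.objective))
```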

A fusion convolution neural network-local binary pattern histogram algorithm for emotion recognition in human

10.11591/ijai.v14.i4.pp2734-2740
Arpana G Katti , Chidananda Murthy M V
This paper proposes a fusion of two algorithms, convolutional neural networks (CNN) and the local binary pattern histogram (LBPH) technique, to recognize human emotions in greyscale images. The work combines the advantages of CNN, namely its feature-extraction ability and suitability for image processing, with the LBPH algorithm's ability to identify emotions in human images. Although there are other enhanced fusion algorithms built on CNN for image processing, the combination of LBPH with CNN is precise and simple in design. A secondary data sample is used to recognize human emotions: a dataset of 160 samples covering the emotions happy, anger, sad, and surprise. In comparison, the accuracy of the proposed method is higher than that of the other algorithms.
Volume: 14
Issue: 4
Page: 2734-2740
Publish at: 2025-08-01
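
A sketch of the LBPH half of the fusion, assuming greyscale inputs; the LBP histogram below would be concatenated with a CNN feature vector (a random stand-in here) to form the fused representation:

```python
# Uniform LBP codes pooled into a normalised histogram, then fused with
# CNN features by concatenation; stand-in arrays replace real data.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray: np.ndarray, P: int = 8, R: float = 1.0) -> np.ndarray:
    """Uniform LBP codes, pooled into a normalised histogram."""
    codes = local_binary_pattern(gray, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
    return hist

img = np.random.randint(0, 256, (64, 64)).astype(np.uint8)  # stand-in image
cnn_features = np.random.rand(128)                           # stand-in CNN output
fused = np.concatenate([cnn_features, lbp_histogram(img)])   # fusion vector
print(fused.shape)                                           # (138,)
```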

Data-driven support vector regression-genetic algorithm model for predicting the diphtheria distribution

10.11591/ijai.v14.i4.pp2909-2921
Wiwik Anggraeni , Yeyen Sudiarti , Muhammad Ilham Perdana , Edwin Riksakomara , Adri Gabriel Sooai
Indonesia has one of the largest numbers of diphtheria sufferers in the world. Diphtheria is a re-emerging disease, especially in Indonesia. It can be prevented by immunization, which has drastically reduced mortality and susceptibility, but it remains a significant childhood health problem. This study predicted the number of diphtheria patients in several regions using support vector regression (SVR) combined with a genetic algorithm (GA) for parameter optimization. The regions are grouped into three clusters based on the number of cases. The proposed method is shown to overcome overfitting and avoid local optima. Model robustness tests were carried out in several other regions in each cluster. Across experiments in three scenarios and 12 areas, the hybrid model shows good forecasting results, with an average mean squared error (MSE) of 0.036 and a symmetric mean absolute percentage error (SMAPE) of 41.2%, with standard deviations of 0.075 and 0.442, respectively. Across various scenarios, the SVR-GA model outperforms the others, and two-sample means tests on MSE and SMAPE were conducted to confirm this. The results of this forecasting can be used as a basis for policy-making to minimize the spread of diphtheria cases.
Volume: 14
Issue: 4
Page: 2909-2921
Publish at: 2025-08-01
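
A minimal GA-over-SVR sketch (not the paper's implementation): candidate (C, gamma, epsilon) triples are evolved in log-space against cross-validated MSE on synthetic data:

```python
# Tiny genetic algorithm tuning SVR hyperparameters; illustrative only.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=200, n_features=4, noise=5.0, random_state=0)
rng = np.random.default_rng(0)

def fitness(genes):
    C, gamma, eps = np.exp(genes)          # genes live in log-space, so all > 0
    svr = SVR(C=C, gamma=gamma, epsilon=eps)
    return cross_val_score(svr, X, y, cv=3,
                           scoring="neg_mean_squared_error").mean()

pop = rng.normal(size=(20, 3))             # 20 candidate parameter triples
for _ in range(15):                        # generations
    scores = np.array([fitness(g) for g in pop])
    parents = pop[np.argsort(scores)[-10:]]            # selection: keep best half
    children = parents[rng.integers(10, size=10)] + rng.normal(0, 0.3, (10, 3))
    pop = np.vstack([parents, children])               # mutation-only offspring

best = pop[np.argmax([fitness(g) for g in pop])]
print("best C, gamma, epsilon:", np.exp(best))
```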

Contract-based federated learning framework for intrusion detection system in internet of things networks

10.11591/ijai.v14.i4.pp3324-3333
Yuris Mulya Saputra , Divi Galih Prasetyo Putri , Jimmy Trio Putra , Budi Bayu Murti , Wahyono Wahyono
A plethora of national vital infrastructures connected to internet of things (IoT) networks may trigger serious data security vulnerabilities. To address the issue, intrusion detection systems (IDS) have been investigated, in which the behavior and traffic of IoT networks are monitored to determine whether or not malicious attacks occur, through centralized learning on a cloud. Nonetheless, such a method requires IoT devices to transmit their local network traffic data to the cloud, thereby risking data breaches. This paper proposes a federated learning (FL)-based IDS for IoT networks that aims to improve intrusion detection accuracy without privacy leakage from the IoT devices. Specifically, an IoT service provider first motivates IoT devices to participate in the FL process via a contract-based incentive mechanism based on their local data. Then, the FL process is executed to predict IoT network traffic types without sending the devices' local data to the cloud: each IoT device performs the learning process locally and only sends the trained model to the cloud for the model update. On a real-world IoT network traffic dataset, the proposed FL-based system achieves a higher utility (up to 44%) than a non-contract-based incentive mechanism and a higher prediction accuracy (up to 3%) than the local learning method.
Volume: 14
Issue: 4
Page: 3324-3333
Publish at: 2025-08-01
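
The FL mechanism described here, minus the contract-based incentive, reduces to a FedAvg-style round; this NumPy sketch with logistic-regression clients is illustrative only:

```python
# FedAvg-style rounds: clients train locally and only weights travel to the server.
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Logistic-regression client step on private traffic data."""
    w = weights.copy()
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)   # gradient of the log-loss
    return w

rng = np.random.default_rng(0)
clients = [(rng.normal(size=(50, 8)), rng.integers(0, 2, 50)) for _ in range(5)]
global_w = np.zeros(8)

for round_ in range(10):
    locals_ = [local_update(global_w, X, y) for X, y in clients]
    global_w = np.mean(locals_, axis=0)    # server aggregates by averaging
print(global_w)
```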

Enhancing logo security: VGG19, autoencoder, and sequential fusion for fake logo detection

10.11591/ijict.v14i2.pp506-515
Debani Prasad Mishra , Prajna Jeet Ojha , Arul Kumar Dash , Sai Kanha Sethy , Sandip Ranjan Behera , Surender Reddy Salkuti
This paper presents a way of detecting fake logos through the integration of visual geometry group-19 (VGG19), an autoencoder, and a sequential model. The approach is applied to a variety of datasets that have undergone resizing and augmentation, using VGG19 to extract features effectively and the autoencoder to abstract them. Combining these elements in a sequential model accounts for improved accuracy, precision, recall, and F1-score compared with existing approaches. The article assesses the strengths and limitations of the method and its comprehension of brand identity symbols, and a comparative analysis of competing approaches reveals the benefits of the fusion. In sum, this paper not only contributes to the domain of counterfeit logo detection but also suggests prospects for enhancing brand security in the digital world.
Volume: 14
Issue: 2
Page: 506-515
Publish at: 2025-08-01
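
A hedged Keras sketch of the three-stage pipeline: frozen VGG19 features, an autoencoder bottleneck, and a small sequential classifier; input sizes and layer widths are assumptions:

```python
# VGG19 features -> autoencoder bottleneck -> sequential classifier (illustrative).
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import VGG19

backbone = VGG19(weights="imagenet", include_top=False, pooling="avg",
                 input_shape=(224, 224, 3))          # yields 512-dim features
backbone.trainable = False
# feats = backbone.predict(images)                   # (n, 512) feature matrix

inp = layers.Input(shape=(512,))
code = layers.Dense(64, activation="relu")(inp)      # compressed representation
out = layers.Dense(512, activation="linear")(code)
autoencoder = Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")
encoder = Model(inp, code)

classifier = tf.keras.Sequential([
    layers.Input(shape=(64,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),           # genuine vs. fake logo
])
classifier.compile(optimizer="adam", loss="binary_crossentropy",
                   metrics=["accuracy",
                            tf.keras.metrics.Precision(),
                            tf.keras.metrics.Recall()])
# Train: autoencoder.fit(feats, feats, ...), then classifier.fit(encoder(feats), labels, ...)
```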

Performance analysis and comparison of machine learning algorithms for predicting heart disease

10.11591/ijai.v14.i4.pp2849-2863
Neha Bhadu , Jaswinder Singh
Heart disease (HD) is a serious medical condition that has an enormous effect on people's quality of life. Early and accurate identification is crucial for preventing and treating HD, and traditional methods of diagnosis may not always be reliable. Non-intrusive methods like machine learning (ML) are proficient at distinguishing between patients with HD and those in good health. The prime objective of this study is to find a robust ML technique that can accurately detect the presence of HD. Several ML algorithms were chosen based on the relevant literature. Two heart datasets, Cleveland and Statlog, were downloaded from Kaggle, and the analysis was carried out using the Waikato environment for knowledge analysis (WEKA) 3.9.6 software. To assess how well the algorithms predicted HD, the study employed a variety of performance evaluation metrics and error rates. The findings showed that, for both datasets, random forest is the better option for predicting HD, with accuracy and receiver operating characteristic (ROC) values of 94% and 0.984 on the Cleveland dataset and 90% and 0.975 on the Statlog dataset. This work may aid researchers in creating early HD detection models and assist medical practitioners in identifying HD.
Volume: 14
Issue: 4
Page: 2849-2863
Publish at: 2025-08-01
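
A scikit-learn analogue of the WEKA comparison, cross-validating accuracy and ROC AUC over several classifiers; the file cleveland.csv and the target column are placeholders:

```python
# Cross-validated comparison of classifiers on a heart-disease table (placeholder file).
import pandas as pd
from sklearn.model_selection import cross_validate
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

df = pd.read_csv("cleveland.csv")                 # hypothetical file name
X, y = df.drop(columns=["target"]), df["target"]

models = {
    "random forest": RandomForestClassifier(random_state=42),
    "logistic regression": LogisticRegression(max_iter=1000),
    "k-NN": KNeighborsClassifier(),
    "decision tree": DecisionTreeClassifier(random_state=42),
}
for name, clf in models.items():
    cv = cross_validate(clf, X, y, cv=10, scoring=["accuracy", "roc_auc"])
    print(f"{name:20s} acc={cv['test_accuracy'].mean():.3f} "
          f"auc={cv['test_roc_auc'].mean():.3f}")
```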

Domain-specific knowledge and context in large language models: challenges, concerns, and solutions

10.11591/ijai.v14.i4.pp2568-2578
Kiran Mayee Adavala , Om Adavala
Large language models (LLMs) are ubiquitous today, with major usage in industry, research, and academia. LLMs involve unsupervised learning on large natural language datasets, obtained mostly from the internet. Several challenges arise from these data sources; one concerns domain-specific knowledge and context. This paper deals with the major challenges LLMs face due to their data sources, such as lack of domain expertise, difficulty with specialized terminology, contextual understanding, data bias, and the limitations of transfer learning. It also discusses solutions for mitigating these challenges, such as pre-training LLMs on domain-specific corpora, expert annotations, improving transformer models with enhanced attention mechanisms, memory-augmented models, context-aware loss functions, balanced datasets, and knowledge distillation techniques.
Volume: 14
Issue: 4
Page: 2568-2578
Publish at: 2025-08-01
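
One mitigation named above, pre-training on domain-specific corpora, can be sketched as continued pretraining with Hugging Face tools; the base model and corpus file are placeholders:

```python
# Continued (domain-adaptive) pretraining of a small causal LM; illustrative only.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

tok = AutoTokenizer.from_pretrained("gpt2")       # placeholder base model
tok.pad_token = tok.eos_token
model = AutoModelForCausalLM.from_pretrained("gpt2")

ds = load_dataset("text", data_files={"train": "domain_corpus.txt"})  # placeholder
ds = ds.map(lambda b: tok(b["text"], truncation=True, max_length=512),
            batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", num_train_epochs=1,
                           per_device_train_batch_size=4),
    train_dataset=ds["train"],
    data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
)
trainer.train()
```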

Using the ResNet-50 pre-trained model to improve the classification output of a non-image kidney stone dataset

10.11591/ijai.v14.i4.pp3182-3191
Kazeem Oyebode , Anne Ngozi Odoh
Kidney stone detection based on urine samples appears to be a cost-effective way of detecting stone formation. Urine features are usually collected from patients to determine the likelihood of kidney stone formation. Existing machine learning models, such as the support vector machine (SVM) and deep learning (DL) models, can be used to classify whether a stone exists in the kidney. We propose a DL network built on a pre-trained ResNet-50 model, making non-image urine features work with an image-based pre-trained model. Six urine features collected from patients are projected onto 172,800 neurons, and this output is reshaped into a 240 by 240 by 3 tensor that serves as the input to ResNet-50. The output is then sent to a binary classifier to determine whether a kidney stone exists. The proposed model is benchmarked against SVM, XGBoost, and two variants of DL networks, and it shows improved performance on the AUC-ROC, accuracy, and F1-score metrics. We demonstrate that combining non-image urine features with an image-based pre-trained model improves classification outcomes, highlighting the potential of integrating heterogeneous data sources for enhanced predictive accuracy.
Volume: 14
Issue: 4
Page: 3182-3191
Publish at: 2025-08-01
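
The reshaping idea is concrete enough to sketch in PyTorch: six tabular features are projected to 172,800 values and viewed as a 3×240×240 pseudo-image for ResNet-50 (an illustration consistent with the abstract, not the authors' code):

```python
# Non-image features -> learned projection -> pseudo-image -> pre-trained ResNet-50.
import torch
import torch.nn as nn
from torchvision import models

class TabularToResNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.project = nn.Linear(6, 3 * 240 * 240)   # 6 features -> 172,800 values
        backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        backbone.fc = nn.Linear(backbone.fc.in_features, 1)
        self.backbone = backbone

    def forward(self, x):                            # x: (batch, 6)
        img = self.project(x).view(-1, 3, 240, 240)  # pseudo-image tensor
        return torch.sigmoid(self.backbone(img))     # stone / no stone probability

model = TabularToResNet()
print(model(torch.rand(2, 6)).shape)                 # torch.Size([2, 1])
```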

Image analysis and machine learning techniques for accurate detection of common mango diseases in warm climates

10.11591/ijai.v14.i4.pp2935-2944
Md Abdullah Al Rahib , Naznin Sultana , Nirjhor Saha , Raju Mia , Monisha Sarkar , Abdus Sattar
Mangoes are valuable crops grown in warm climates, but they often suffer from diseases that harm both the trees and the fruits. This paper proposes a new way to use machine learning to detect these diseases early in mango plants. We focused on common issues like mango fruit diseases, leaf diseases, powdery mildew, anthracnose/blossom blight, and dieback, which are particularly problematic in places like Bangladesh. Our method starts by improving the quality of images of mango plants and then extracting important features from these images. We use a technique called k-means clustering to divide the images into meaningful parts for analysis. After extracting ten key features, we tested various ways to classify the diseases. The random forest algorithm stood out, accurately identifying diseases with a 97.44% success rate. This research is crucial for Bangladesh, where mango farming is essential for the economy. By spotting diseases early, we can improve mango production, quality, and the livelihoods of farmers. This automated system offers a practical way to manage mango diseases in regions with similar climates.
Volume: 14
Issue: 4
Page: 2935-2944
Publish at: 2025-08-01
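
The k-means segmentation step can be sketched as clustering pixel colours into regions whose statistics would feed the random forest; the image here is a random stand-in:

```python
# K-means over pixel colours splits an image into regions for feature extraction.
import numpy as np
from sklearn.cluster import KMeans

img = np.random.randint(0, 256, (120, 120, 3)).astype(np.float32)  # stand-in image
pixels = img.reshape(-1, 3)

km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(pixels)
segments = km.labels_.reshape(img.shape[:2])   # region map: leaf/lesion/background

# Per-region mean colour: the kind of feature later fed to RandomForestClassifier.
for k in range(3):
    print(f"region {k}: mean RGB = {pixels[km.labels_ == k].mean(axis=0).round(1)}")
```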

Investigation on low-performance tuned-regressor of inhibitory concentration targeting the SARS-CoV-2 polyprotein 1ab

10.11591/ijai.v14.i4.pp3003-3013
Daniel Febrian Sengkey , Angelina Stevany Regina Masengi , Alwin Melkie Sambul , Trina Ekawati Tallei , Sherwin Reinaldo Unsratdianto Sompie
Hyperparameter tuning is a key optimization strategy in machine learning (ML), often used with GridSearchCV to find optimal hyperparameter combinations. This study aimed to predict the half-maximal inhibitory concentration (IC50) of small molecules targeting the SARS-CoV-2 replicase polyprotein 1ab (pp1ab) by optimizing three ML algorithms: histogram gradient boosting regressor (HGBR), light gradient boosting regressor (LGBR), and random forest regressor (RFR). Bioactivity data, including duplicates, were processed using three approaches: untreated, aggregation of quantitative bioactivity, and duplicate removal. Molecular features were encoded using twelve types of molecular fingerprints. To optimize the models, hyperparameter tuning with GridSearchCV was applied across a broad parameter space. The results showed that the performance of the models was inconsistent, despite comprehensive hyperparameter tuning. Further analysis showed that the distribution of Murcko fragments was uneven between the training and testing datasets. Key fragments were underrepresented in the testing phase, leading to a mismatch in model predictions. The study demonstrates that hyperparameter tuning alone may not be sufficient to achieve high predictive performance when the distribution of molecular fragments is unbalanced between training and testing datasets. Ensuring fragment diversity across datasets is crucial for improving model reliability in drug discovery applications.
Volume: 14
Issue: 4
Page: 3003-3013
Publish at: 2025-08-01
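
The fragment-distribution check described above can be sketched with RDKit's Murcko scaffold utilities; the SMILES lists are placeholders:

```python
# Compare Murcko scaffold distributions between training and testing sets.
from collections import Counter
from rdkit import Chem
from rdkit.Chem.Scaffolds import MurckoScaffold

def scaffold_counts(smiles_list):
    counts = Counter()
    for smi in smiles_list:
        mol = Chem.MolFromSmiles(smi)
        if mol is not None:
            scaf = MurckoScaffold.GetScaffoldForMol(mol)
            counts[Chem.MolToSmiles(scaf)] += 1
    return counts

train = ["c1ccccc1CCN", "c1ccc2ccccc2c1O"]      # placeholder SMILES
test  = ["c1ccncc1CC(=O)O"]

train_scafs, test_scafs = scaffold_counts(train), scaffold_counts(test)
missing = set(test_scafs) - set(train_scafs)    # scaffolds unseen in training
print("test scaffolds absent from training:", missing)
```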
