Articles

Access the latest knowledge in applied science, electrical engineering, computer science and information technology, education, and health.

29,922 Article Results

Improved YOLOv8 for rail squat detection and identification

10.11591/ijeecs.v40.i2.pp1129-1140
Van-Dinh Do , Phuong-Ty Nguyen , Minh-Tuan Ha
Rail transport plays a vital part in the country's economy by ensuring the safe movement of both goods and passengers. Therefore, maintaining rail safety through consistent surface defect inspection is extremely important. However, squat defect detection on rail surfaces faces considerable difficulties due to weather impacts, lighting changes, and variations in image contrast. These challenges hinder the accuracy and reliability of traditional inspection methods. To solve this problem, this study proposes an improved YOLOv8 model for the identification and classification of squat defects. The methodology involves capturing images of the rail track, preprocessing them to enhance image quality, labeling squat defects for training purposes, and training the proposed model using the labeled dataset. The improved YOLOv8 model incorporates enhancements such as multi-scale convolution modules and attention mechanisms to improve feature extraction and defect recognition. Experimental results demonstrate the effectiveness of the proposed method, achieving an accuracy of 0.92 in detecting and categorizing squat defects. These findings highlight the potential of the proposed approach to enhance railway safety by providing a reliable and efficient solution for rail surface inspection.
Volume: 40
Issue: 2
Page: 1129-1140
Publish at: 2025-11-01

Classification of voice pathologies using one dimensional feature vector and two dimensional scalogram

10.11591/ijeecs.v40.i2.pp654-666
Ranita Khumukcham , Sharmila Meinam , Kishorjit Nongmeikapam
Most research focuses only on binary classification of voice pathologies, i.e., normal versus pathological. However, the current work gives importance to multiclass classification too. The paper compares one-dimensional (1D) feature vector based machine learning (ML) techniques and a two-dimensional (2D) scalogram image based deep learning (DL) model for binary and multiclass classification of voice pathology. The multiclass classification assigns the voice signal to one of four categories: healthy, hyperkinetic dysphonia, hypokinetic dysphonia, and reflux laryngitis. The current work evaluates 1D feature vectors extracted from the speech signal, such as MFCC (mel-frequency cepstral coefficient) and pitch, with various ML techniques like K-nearest neighbor (KNN), Naïve Bayes, and discriminant analysis (DA). In the second technique, time-frequency scalograms derived using three different wavelets, i.e., analytical Morlet (amor), Bump, and Morse, are used to train a pretrained GoogleNet architecture, which is a very popular DL model. Experimental results show that the 2D scalogram image based DL model for binary (96.05%) and multiclass (89.8%) classification of voice pathology performs better than the 1D feature vector based ML techniques.
Volume: 40
Issue: 2
Page: 654-666
Publish at: 2025-11-01

Query keyword extraction in discriminative marginalized probabilistic neural method for multi-document summarization

10.11591/ijeecs.v40.i2.pp907-915
Bambang Subeno , Indra Budi , Evi Yulianti
The large number of textual documents in the medical field makes it very difficult for readers to obtain comprehensive information. Users usually use a query approach to get the desired information. Using the correct query will produce relevant information. In the existing discriminative marginalized probabilistic neural method, referred to as DAMEN, used for multi-document summarization, a background sentence query is used to retrieve the top-K relevant documents and then generate a summary of these documents. However, the background sentence query used to retrieve the top-K documents did not provide accurate summary results. The authors improved the DAMEN model by adding a keyword extraction process to the background sentence query. We call this model Q-DAMEN. Our model shows significant improvement over the original DAMEN method, with the best results achieved by the variation in which a keyword query is fed into the discriminator component and a background sentence query is fed into the generator component. The MultipartiteRank keyword extraction method shows the best results with a Rouge-1 value of 29.12, Rouge-2 of 0.79, and Rouge-L of 15.53. The results demonstrate that the more accurate the keywords extracted from the background sentence query, the more accurate the multi-document summaries generated.
Volume: 40
Issue: 2
Page: 907-915
Publish at: 2025-11-01
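
The Rouge-1 score reported in the abstract above measures unigram overlap between a generated summary and a reference summary. The sketch below is a minimal illustration of that idea, assuming lowercase whitespace tokenization and clipped unigram counts; the function name `rouge_1` and the tokenization are assumptions for illustration, and the exact Rouge variant and preprocessing used in the paper are not stated in the abstract.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Unigram-overlap precision, recall, and F1 (illustrative sketch only)."""
    # Assumption: simple lowercase whitespace tokenization.
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Clipped overlap: each unigram counts at most as often as it
    # appears in the other text.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())
    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge_1("the cat sat", "the cat sat on the mat")
    print(scores)
```

Published Rouge values are usually reported as percentages, so a value such as 29.12 would correspond to 0.2912 on the scale used in this sketch.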

Deep-learning-based hand gestures recognition applications for game controls

10.11591/ijeecs.v40.i2.pp883-897
Huu-Huy Ngo , Hung Linh Le , Man Ba Tuyen , Vu Dinh Dung , Tran Xuan Thanh
Hand gesture recognition is among the emerging technologies of human-computer interaction, and an intuitive and natural interface is preferable for such applications to a total solution. It is also widely used in multimedia applications. In this paper, a deep learning-based hand gesture recognition system for controlling games is presented, showcasing its significant contributions toward advancing the frontier of natural and intuitive human-computer interaction. It utilizes MediaPipe to get real-time skeletal information of hand landmarks and translates the gestures of the user into smooth control signals through an optimized artificial neural network (ANN) that is tailored for reduced computational expenses and quicker inference. The proposed model, which was trained on a carefully selected dataset of four gesture classes under different lighting and viewing conditions, shows very good generalization performance and robustness. It gives a recognition rate of 99.92% with much fewer parameters than deeper models such as ResNet50 and VGG16. By achieving high accuracy, computational speed, and low latency, this work addresses some of the most important challenges in gesture recognition and opens the way for new applications in gaming, virtual reality, and other interactive fields.
Volume: 40
Issue: 2
Page: 883-897
Publish at: 2025-11-01

Mediterranean and Northern European archaeology: a computational comparison

10.11591/csit.v6i3.p326-334
Hamza Kchan , Saira Noor
Despite the proliferation of computational tools in archaeology, few studies systematically compare their regional adaptations or explore the epistemological assumptions guiding their application. This paper addresses four critical research gaps: (i) the lack of comparative regional analysis between the Mediterranean and Northern Europe in computational archaeology, (ii) the insufficient integration of philosophical and epistemological frameworks in predictive modeling, (iii) the underexplored application of artificial intelligence (AI) and network theory in spatial analysis, and (iv) the limited interdisciplinary synthesis of biological, geospatial, and digital data. By examining representative case studies from both regions, we highlight the methodological innovations, theoretical orientations, and institutional dynamics that shape regional practices. The study underscores the necessity of integrating computational methods with interpretive depth and interdisciplinary collaboration to foster a more reflective and inclusive digital archaeology. 
Volume: 6
Issue: 3
Page: 326-334
Publish at: 2025-11-01

Enhancing document text classification using hybrid deep contextual and correlation network

10.11591/ijeecs.v40.i2.pp1100-1108
Shilpa Shilpa , Shridevi Soma
Document analysis involves the extraction and processing of information from documents, a task increasingly automated through the use of deep learning (DL) technologies. Despite the high predictive power of DL models, their black-box nature poses challenges to transparency and interpretability, hindering their integration into the industry. This paper introduces the hybrid deep contextual and correlation network (HDCCNet), a novel methodology designed to improve both the accuracy and interpretability of multi-category classification tasks. HDCCNet leverages a hybrid layer category correlation module to deepen category connections, thereby enhancing the understanding and prediction of category interrelations. To address potential prediction divergence, residual connections are incorporated, ensuring stable and reliable performance. Furthermore, HDCCNet reduces model parameters, accelerating convergence and making the model more efficient. This efficiency is particularly beneficial for practical applications, allowing faster deployment and scalability. By bridging the gap between DL’s capabilities and industry needs for transparency, HDCCNet provides a robust solution for automated document processing, paving the way for broader adoption of DL technologies in business environments.
Volume: 40
Issue: 2
Page: 1100-1108
Publish at: 2025-11-01

Robot vision and virtual reality integration to help paralyzed patients mobility

10.11591/ijeecs.v40.i2.pp610-618
Abdul Jalil , I Wayan Suparno
This study aims to develop a device that can assist the mobility of paralyzed patients, enabling them to communicate with family and caregivers by integrating robot vision and virtual reality (VR). The method used to connect audio and visual data communication between robot vision and VR is by utilizing the robot operating system (ROS2) middleware communication node through topics over a wireless network. In this research, paralyzed individuals can maneuver based on the movement direction of robot vision, which is remotely controlled via a joystick through Bluetooth communication. The input devices used in this system include a camera, microphone, joystick, and ultrasonic sensors. The processing part uses a Raspberry Pi as the data processing center, and the output includes a DC motor, servo motor, speaker, 5-inch monitor, and headset. The results indicate that the integration of robot vision and VR can assist paralyzed individuals in communicating with family or caregivers at distances of up to 10 meters. This is due to the maximum joystick control range for moving the robot via Bluetooth communication being 10 meters. Furthermore, this study shows that the use of robot vision and VR can improve paralyzed patients’ motivation, supporting the medical field in patient care.
Volume: 40
Issue: 2
Page: 610-618
Publish at: 2025-11-01

Intrusion detection system using hybrid CNN-LSTM model in cloud computing

10.11591/ijeecs.v40.i2.pp840-849
Maha Mohammad Alshehri , Shoog Abdullah Alshehri , Samah Hazzaa Alajmani
Cloud computing has revolutionized online service delivery with its flexibility and cost efficiency. Nevertheless, the growing importance of stored data makes it a target for cyberattacks, posing security and privacy risks. This calls for effective solutions to safeguard data and infrastructure, particularly with regard to intrusion attacks and distributed attacks such as distributed denial of service (DDoS). Therefore, there is a need to develop an effective intrusion detection system (IDS) using deep learning to ensure the protection of cloud data and infrastructure. In this paper, a hybrid model leverages convolutional neural networks (CNNs) to analyze spatial features and extract complex patterns, while long short-term memory (LSTM) networks model temporal data sequences and detect attacks that evolve over time. The model detects intrusions in cloud computing environments and is evaluated on the CSE-CIC-IDS2018 dataset. The model was trained and tested on DDoS attacks, and the results demonstrated high performance in detecting attacks with high accuracy and efficiency. This hybrid model achieved an accuracy of 99.88%, a precision of 99.83%, a recall of 99.94%, and an F1-score of 99.88%.
Volume: 40
Issue: 2
Page: 840-849
Publish at: 2025-11-01
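
The F1-score quoted in the abstract above is the harmonic mean of precision and recall, so the three reported figures can be cross-checked against each other. The sketch below does this with the values from the abstract; the helper name `f1_score` is an assumption for illustration and is not taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Precision and recall as reported in the abstract, converted to fractions.
    f1 = f1_score(0.9983, 0.9994)
    print(f"F1 = {f1 * 100:.2f}%")
```

With a precision of 99.83% and a recall of 99.94%, the harmonic mean comes out at roughly 99.88%, which agrees with the F1-score stated in the abstract.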

A deep learning-integrated proxy model for efficient cryptocurrency payments

10.11591/ijeecs.v40.i2.pp1023-1039
Vinay Kumar Kasula , Akhila Reddy Yadulla , Bhargavi Konda , Mounica Yenugula , Supraja Ayyamgari
Blockchain technology allows decentralized cryptocurrencies to change digital finances by providing secure, pseudonymous transactions to users. Since blockchain ledgers operate in a public environment, users can face potential privacy risks due to the exposure of their transaction patterns. Conventional cryptocurrency systems use block generation for transaction confirmation, yet this process produces latency and impacts the real-time efficiency of transactions. This paper develops a proxy-assisted cryptocurrency payment system that employs blind signature principles to achieve better system privacy and enhanced speed. The core functionality of this proposed system aims to protect transaction secrecy as it speeds up confirmation processes. A proxy node handles transaction requests through blind signature protocols that guarantee data confidentiality as part of the methodology. The proposed system utilizes deep learning tools, which include recurrent neural networks (RNN), graph neural networks (GNN), and reinforcement learning (RL) to forecast confirmation results, identify scams, and control proxy functions dynamically. Research indicates that the introduced method substantially boosts privacy features, decreases transaction latencies, and enhances the security of all transactions by providing an encouraging roadmap for secure cryptocurrency systems that preserve privacy.
Volume: 40
Issue: 2
Page: 1023-1039
Publish at: 2025-11-01

Laryngeal pathology detection using EMD-based voice acoustic features analysis and SVM-RBF

10.11591/ijeecs.v40.i2.pp640-653
Sofiane Cherif , Abdelhafid Kaddour , Abdelmoudjib Benkada , Said Karoui , Ouissem Chibani Bahi , Asmaa Bouzid Daho
Traditional techniques for detecting laryngeal pathologies, such as laryngoscopy and endoscopy, are costly and invasive. This study presents a novel approach for detecting laryngeal disorders using empirical mode decomposition (EMD)-based acoustic features analysis and support vector machine (SVM) with a radial basis function (RBF) kernel. The experiments were conducted using the Saarbrucken voice database (SVD). The voice signals were then decomposed using EMD to extract the intrinsic mode functions (IMFs). The IMF with the highest energy value was selected as the most relevant. A set of acoustic features, including mel-frequency cepstral coefficients (MFCCs), linear predictive cepstral coefficients (LPCCs), Pitch (fundamental frequency), higher-order statistics (HOSs), zero-crossing rate (ZCR), spectral centroid (SC), and spectral roll-off (SRO), is derived from the most relevant IMFs and fed into an SVM classifier to differentiate between healthy and pathological voices. Experimental results demonstrate the effectiveness of the proposed methodology, achieving a high classification accuracy of 94.5%, a sensitivity of 94.2%, a specificity of 95.3%, and an F1 score of 96.1%, outperforming conventional approaches. These results highlight the potential of EMD-based voice analysis as a non-invasive and reliable tool for early diagnosis of laryngeal disorders.
Volume: 40
Issue: 2
Page: 640-653
Publish at: 2025-11-01
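
The abstract above selects the intrinsic mode function (IMF) with the highest energy and then derives features such as the zero-crossing rate (ZCR) from it. The sketch below illustrates only those two steps, assuming the IMFs have already been produced by some EMD implementation and are given as plain lists of samples; the function names and the toy data are hypothetical, and energy is taken here as the sum of squared samples, which the abstract does not specify.

```python
from typing import List, Sequence


def energy(signal: Sequence[float]) -> float:
    """Signal energy, assumed here to be the sum of squared samples."""
    return sum(x * x for x in signal)


def most_energetic_imf(imfs: List[Sequence[float]]) -> int:
    """Index of the IMF with the largest energy."""
    return max(range(len(imfs)), key=lambda i: energy(imfs[i]))


def zero_crossing_rate(signal: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(signal) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(signal) - 1)


if __name__ == "__main__":
    # Toy IMFs standing in for the output of an EMD routine (hypothetical data).
    imfs = [
        [0.1, -0.1, 0.1, -0.1],
        [1.0, -2.0, 1.5, -0.5],
        [0.3, 0.3, 0.2, 0.2],
    ]
    best = most_energetic_imf(imfs)
    print("selected IMF:", best, "ZCR:", zero_crossing_rate(imfs[best]))
```

In a full pipeline the remaining features listed in the abstract (MFCCs, LPCCs, pitch, and so on) would be computed from the same selected IMF before classification.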

Efficient lung disease detection using a hybrid vision transformer and YOLO framework with transfer learning

10.11591/ijeecs.v40.i2.pp1141-1148
Kashaf Khan , Abdul Aleem
Lung diseases are among the most important causes of morbidity and mortality worldwide; they require prompt and accurate diagnosis methods. A novel hybrid deep learning framework is presented that integrates you only look once version 8 (YOLOv8) for real-time detection with a vision transformer (ViT-B/16) for global context-based classification of lung diseases in chest X-ray images. Based on transfer learning and a two-stage detection-classification pipeline, the proposed model is able to deal with inter-image variability, overlapping disease features, and a lack of annotated medical examples. Our hybrid model achieves the highest classification accuracy of 96.8% and 0.98 AUC-ROC on the National Institutes of Health (NIH) Chest X-ray dataset, which consists of over 112,000 images covering 14 diseases, and outperforms several current state-of-the-art models. In addition, attention heatmaps and bounding box visualizations highly correlate with clinical variables and enhance interpretability. This paper demonstrates the practicability of hybrid vision-driven architectures for better medical image analysis and shows their integration into clinical decision-support systems.
Volume: 40
Issue: 2
Page: 1141-1148
Publish at: 2025-11-01

Detection of short circuit faults in two-level voltage source inverter using convolution neural network

10.11591/ijeecs.v40.i2.pp580-589
Sai Aioub , Belghiti Zakariya , El Menzhi Lamiaâ
Voltage source inverters (VSIs) play a critical role in modern industrial systems, particularly in controlling the operation of equipment such as induction motors. Ensuring their reliable performance is crucial, as faults like short circuits can severely disrupt industrial processes. This paper introduces a new diagnostic approach for detecting and localizing short circuit faults in VSIs. The method uses Lissajous curves derived from the Clarke transformation of the VSI’s 3-phase voltage components (Vα, Vβ). These curves serve as input data for a convolutional neural network (CNN) model, enabling the accurate classification of single and double short circuit faults. Simulation results using MATLAB/Simulink demonstrate that the proposed method achieves 100% classification accuracy within 100 ms, highlighting its suitability for real-time applications. The approach offers significant advantages in speed and accuracy over traditional techniques, with potential implications for enhancing the reliability and safety of inverter-driven systems in industrial environments.
Volume: 40
Issue: 2
Page: 580-589
Publish at: 2025-11-01
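
The Lissajous curves mentioned in the abstract above come from plotting Vβ against Vα after the Clarke transformation of the three phase voltages. The sketch below uses the amplitude-invariant form of the transformation and a balanced sinusoidal supply, both of which are assumptions for illustration; the abstract does not state which scaling the authors use, and the function names and the 311 V amplitude are hypothetical.

```python
import math
from typing import List, Tuple


def clarke(va: float, vb: float, vc: float) -> Tuple[float, float]:
    """Amplitude-invariant Clarke transformation (assumed scaling of 2/3)."""
    v_alpha = (2.0 / 3.0) * (va - 0.5 * vb - 0.5 * vc)
    v_beta = (vb - vc) / math.sqrt(3.0)
    return v_alpha, v_beta


def lissajous_points(amplitude: float, n: int = 360) -> List[Tuple[float, float]]:
    """(V_alpha, V_beta) samples over one period of a balanced three-phase supply."""
    points = []
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        va = amplitude * math.cos(theta)
        vb = amplitude * math.cos(theta - 2.0 * math.pi / 3.0)
        vc = amplitude * math.cos(theta + 2.0 * math.pi / 3.0)
        points.append(clarke(va, vb, vc))
    return points


if __name__ == "__main__":
    pts = lissajous_points(311.0, n=8)
    for v_alpha, v_beta in pts:
        # For a balanced supply the radius stays equal to the phase amplitude.
        print(f"{v_alpha:9.3f} {v_beta:9.3f} r={math.hypot(v_alpha, v_beta):.3f}")
```

For a balanced supply the curve is a circle whose radius equals the phase amplitude, so a departure from that circle is the kind of shape change an image classifier could pick up on. How the paper renders the curves as CNN input is not described in the abstract.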

Vehicle recognition on Indian roads using data augmentation and VGG-16 model

10.11591/ijeecs.v40.i2.pp1177-1186
Arunkumar K. L. , Poornima K. M. , Ajit Danti , Manjunatha H. T.
In an advanced intelligent transportation system, vehicle recognition and classification is very significant. In the current research trend, recognition of vehicles is done by using machine learning (ML) and computer vision techniques. Multi-view images or videos of vehicles under different lighting conditions are annotated and given to a deep neural network to build an automated system that recognizes vehicle models. Data augmentation can increase the number of samples available for learning when only small datasets exist. Geometric transformations, brightness changes, and different filter operations are applied to the data through data augmentation. Furthermore, through orthogonal experiments we determine the optimal data augmentation method, obtaining 96% accuracy. Detailed information is reported based on the classification of four different types of vehicles, and the results show that a 16-layer-deep convolutional neural network is effective in solving challenging tasks such as recognizing moving vehicles.
Volume: 40
Issue: 2
Page: 1177-1186
Publish at: 2025-11-01

Evaluation of the impact of machine learning on the prediction of residential energy consumption

10.11591/ijeecs.v40.i2.pp567-579
Richar Martín Machaca-Casani , Luis Alfredo Figueroa-Mayta , Joel Contreras-Nuñez
The objective of this research was to compare the performance of machine learning models and traditional statistical methods for the prediction of residential energy consumption, using a dataset with relevant variables such as consumption, temperature, time of day, type of housing, and energy usage habits. A quantitative and comparative methodology was applied, involving data preprocessing, variable encoding, and normalization, as well as division into training and testing sets. The random forest, support vector machine (SVM), deep neural network (MLP), and linear regression models were trained and evaluated using standard metrics such as mean absolute error (MAE), root mean squared error (RMSE), and R² on test and cross-validation sets. Results show that SVM and linear regression achieved better accuracy and generalization capability, while random forest and the deep neural network exhibited lower explanatory power, reflected in negative R² values. Using the trained models, a projection of residential energy consumption for the 2026–2030 period was performed, revealing a generally increasing trend across all models, although with differences in the magnitude of the predictions. In conclusion, under the current conditions, traditional models demonstrate greater robustness, highlighting the need to tailor algorithm selection to the data context. These projections provide a valuable tool for future energy planning.
Volume: 40
Issue: 2
Page: 567-579
Publish at: 2025-11-01
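
The abstract above evaluates models with MAE, RMSE, and R², and notes that some models produced negative R² values. The sketch below computes the three metrics from their standard definitions and shows how R² turns negative when predictions fit worse than simply predicting the mean of the observed values. The function names and the toy numbers are illustrative assumptions, not data from the study.

```python
import math
from typing import Sequence


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    observed = [1.0, 2.0, 3.0, 4.0]   # toy consumption values
    reversed_trend = [4.0, 3.0, 2.0, 1.0]  # predictions worse than the mean
    print("MAE :", mae(observed, reversed_trend))
    print("RMSE:", rmse(observed, reversed_trend))
    print("R2  :", r2(observed, reversed_trend))
```

An R² of 1 means a perfect fit and 0 matches a constant mean predictor; anything below 0 indicates the model explains less variance than that baseline, which is how the abstract's "lower explanatory power" should be read.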

Ensemble recursive feature elimination-based ensemble classification for medical diagnosis

10.11591/ijeecs.v40.i2.pp758-771
Thirumalaimuthu Thirumalaiappan Ramanathan , Md. Jakir Hossen , Abdullah Al Mamun , Joseph Emerson Raja
The application of data mining techniques for the extraction of patterns from medical datasets is useful in the prediction of various diseases from the data of patients. An appropriate feature selection method is required for the medical datasets to give better results for the medical data mining process. In data preprocessing, feature selection is an important process that finds the most relevant features from the dataset. Considering all features of the medical dataset without using any feature selection process may sometimes lead to inaccurate results. Most of the medical datasets contain meaningless data that are not relevant to the data mining process. These data can be eliminated through the feature selection process. This paper presents an integration of an ensemble feature selection approach and an ensemble classification approach through a classifier called the ensemble recursive feature elimination-based ensemble classifier (ERFE-EC) for the classification of medical data. Four different medical datasets were used for testing the ERFE-EC method, which showed promising results.
Volume: 40
Issue: 2
Page: 758-771
Publish at: 2025-11-01