International Journal of Electrical and Computer Engineering (IJECE)
Vol. 15, No. 4, August 2025, pp. 4279-4295
ISSN: 2088-8708, DOI: 10.11591/ijece.v15i4.pp4279-4295

Ensemble of convolutional neural network and DeepResNet for multimodal biometric authentication system

Ashwini Kailas¹, Madhusudan Girimallaih², Mallegowda Madigahalli³, Vasantha Kumara Mahadevachar⁴, Pranothi Kadirehally Somashekarappa¹
¹ Department of Bio Medical Engineering, Sri Siddartha Institute of Technology, Sri Siddartha Academy of Higher Education, Tumkur, India
² Department of Computer Science and Engineering, Sri Jayachamarajendra College of Engineering-Mysore, JSS Science and Technology University, Mysore, India
³ Department of Computer Science and Engineering, Ramaiah Institute of Technology-Bangalore, Visvesvaraya Technological University, Belagavi, India
⁴ Department of Computer Science and Engineering, Government Engineering College-Hassan, Visvesvaraya Technological University, Belagavi, India

Article Info
Article history: Received Jun 12, 2024; Revised Dec 12, 2024; Accepted Jan 16, 2025
Keywords: Biometric authentication; Convolution neural network; Deep ResNet; ECG-iris; Ensemble deep learning; Multimodal

ABSTRACT
Multimodal biometrics technology has garnered attention recently for its ability to address inherent limitations found in single biometric modalities and to enhance overall recognition rates. A typical biometric recognition system comprises sensing, feature extraction, and matching modules. The system's robustness heavily relies on its capability to effectively extract pertinent information from individual biometric traits. This study introduces a novel feature extraction technique tailored for a multimodal biometric system utilizing electrocardiogram (ECG) and iris traits.
The ECG contributes liveness-related information, and the iris provides a pattern that is unique to each individual. Therefore, this work presents a multimodal authentication system in which the iris image and ECG data are first pre-processed for noise removal and quality enhancement. Feature extraction is then carried out for the ECG signals through heart rate variability (HRV) feature analysis in the time and frequency domains. Finally, an ensemble of convolutional neural network (CNN) and DeepResNet models is used to perform the classification. The overall accuracy is reported as 0.8900, 0.8400, 0.7900, 0.8932, 0.87, and 0.97 using the convolutional neural network-long short-term memory (CNN-LSTM), support vector machine (SVM), random forest (RF), CNN, decision tree (DT), and proposed MBANet approaches, respectively.

This is an open access article under the CC BY-SA license.

Corresponding Author:
Ashwini Kailas
Department of Biomedical Engineering, Sri Siddartha Institute of Technology, Sri Siddartha Academy of Higher Education
Tumkur, India
Email: ashwinik@ssit.edu.in

Journal homepage: http://ijece.iaescore.com

1. INTRODUCTION
Recently, biometric recognition systems have gained prominence as a primary means of user authentication across various sectors and applications, including smartphones, banking services, websites, and airports. Depending on the required level of security, they provide a clear substitute for conventional authentication techniques like keys and personal identification numbers (PINs) [1], [2]. For the purpose of feature
recognition, it is necessary to first enroll commonly used biometric traits, such as voice, facial features, fingerprints, palmprints, and iris patterns, into a database [3], [4]. Biometrics is a more straightforward and secure substitute for traditional authentication techniques. It includes both physiological and behavioral characteristics that are used to statistically differentiate persons [5]. Physiological traits encompass both external features such as fingerprints, iris patterns, facial characteristics, and vein patterns, as well as internal attributes like electrocardiogram (ECG), electromyography (EMG), and brainwave (EEG) patterns. Behavioral traits, on the other hand, involve habit-based characteristics such as voice patterns, gait, and signatures [6], [7]. Furthermore, researchers have explored the combination of multiple biometric modalities to enhance the robustness of identification systems [8].

Despite the widespread adoption of biometrics in various devices and services, they remain vulnerable to spoofing attempts. Moreover, current technological advancements have raised security concerns for these systems and made them more vulnerable to various security threats. Typical face- and fingerprint-spoofing attacks were investigated in [9]-[11]. Liveness detection or continuous biometric authentication techniques should be considered in order to defend against presentation attacks and unauthorized access to the systems [12]-[14]. Using a non-invasive, quantifiable sensor that can gather users' biometric information, continuous biometric authentication repeatedly verifies the identity of the user. Consequently, because of the distinctive features of ECG signals, continuous biometric authentication has drawn considerable interest as a highly viable next-generation approach.
The ECG is an electrical signal acquired through skin-attached electrodes and consists of three distinct elements: the P-wave, the QRS complex, and the T-wave [15]. Variations in ECG patterns among individuals can be attributed to three primary reasons. First, individual differences exist in physiological parameters including cardiac mass, size, conductivity, and activity. Second, ECG pattern variability is influenced by geometrical parameters arising from differences in the location and vector of the heart. Finally, the specific structure and makeup of the heart are influenced by individual deoxyribonucleic acid (DNA) traits. Nevertheless, because the ECG is an electrical signal, variations in heart rate and ambient factors might affect its reading. Moreover, the reliability of unimodal authentication systems decreases as the sample size increases [16].

Multimodal biometric systems incorporate at least two biometric features, in contrast to unimodal biometric systems, in order to improve recognition precision and strengthen defenses against spoofing attacks [17], [18]. Since both fingerprints and high-quality heart signals may be acquired concurrently from the fingertips, fingerprints and heart signals provide a well-suited combination for multimodal fusion. Heart signals possess a liveness property that enhances their security as a biometric modality, and their fusion with fingerprints holds promise for establishing a robust and secure authentication and identification system [19], [20]. Numerous multimodal biometric systems integrating fingerprints and heart signals have been proposed in the literature. Bala et al. [21] presented a detailed study of multimodal fusion algorithms for combining these modalities. Komeili et al. [22] introduced a multimodal system that integrates fingerprints and heart signals while incorporating automatic template updating of heart signal records.
By combining fingerprint authentication with heart signal data, Jomma et al. [23], [24] used a sequential mechanism to improve the resilience of fingerprint authentication against presentation attacks.

In a similar vein, iris-based biometric identification is well regarded because of its exceptional reliability and efficacy as a means of distinguishing individuals [25]. Because iris patterns are naturally so easily distinguished, the human iris provides significant scientific advantages. The primary benefit is stability, as an individual's iris does not alter. Many strategies, which can be categorized into distinct methodologies such as stage-based approaches, zero-intersection representation, texture analysis, and variation in intensities, focused on changes in the iris pattern throughout the development of iris recognition systems. The human iris is believed to be the most reliable biometric feature. When used in surveillance-based systems, for example by exploiting texture changes in the iris template, it can be quite beneficial. The method suggested in [26] reveals the iris texture using a 2D Gabor filter bank and then separates it into subblocks; the conducted tests demonstrated effective outcomes. The method in [27] uses deep learning for identity identification; in that article, an intelligent surveillance system was evaluated on many standard databases and achieved good accuracy.

Therefore, by leveraging iris and ECG signal data, we present a novel multimodal authentication system based on these two modalities. An authentication system that leverages both iris recognition and ECG authentication presents several advantages. Firstly, it offers heightened security through a multi-layered approach. Iris patterns and ECG signals are unique to individuals, making it challenging for unauthorized users to mimic or spoof them effectively. This multi-factor authentication significantly reduces the risk of unauthorized access. Secondly, the integration of iris recognition and ECG authentication results in enhanced accuracy during identity verification. Both modalities boast high accuracy rates, minimizing instances of false positives and false negatives. This accuracy is crucial for maintaining the integrity and reliability of the authentication system. Furthermore, the combination of iris recognition and ECG authentication provides resistance against various spoofing attempts. Attempts to forge or replicate iris patterns or ECG signals are exceedingly difficult, reinforcing the system's robustness against fraudulent activities. Additionally, users benefit from the convenience of non-intrusive biometric authentication methods. Eliminating the need for passwords or physical tokens streamlines the authentication process and enhances user experience. Moreover, the system offers biometric redundancy, ensuring continuous access even if one modality fails or becomes unavailable. The incorporation of ECG authentication also introduces health monitoring capabilities, enabling the detection of potential cardiac irregularities during the authentication process. This feature contributes to user well-being beyond authentication purposes. Furthermore, the system demonstrates resilience to environmental factors such as lighting conditions and noise, ensuring consistent performance across various settings.

The proposed work can be adopted in various application domains such as medical signal processing, biometric authentication, telecommunication and remote sensing, and industrial monitoring and control. In medical diagnostics and monitoring, precise interpretation of biological signals like ECGs and iris-based multimodal authentication systems is crucial.
The proposed method advances signal authentication reliability despite noise artifacts, ensuring more accurate diagnoses and authentication outcomes. This capability enhances patient care quality and medical procedure efficiency. Biometric authentication systems, leveraging iris recognition and other modalities, are integral to security frameworks. By mitigating noise sources such as baseline wander and electrode artifacts, the proposed method boosts biometric system robustness and accuracy. This enhancement fortifies security protocols, reducing unauthorized access risks and safeguarding sensitive data and facilities.

Similarly, signal quality is paramount in telecommunications and remote sensing for effective communication and data analysis. Noise interference can degrade performance significantly. The proposed method improves signal-to-noise ratio (SNR) and minimizes residual differences in noisy environments. This advancement enhances data transmission reliability and facilitates precise remote sensing observations, supporting scientific and operational objectives. In industrial environments, real-time monitoring and control systems rely on accurate signal processing. Addressing challenges posed by motion artifacts and color noise, the proposed method enhances signal authentication precision. This improvement supports reliable fault detection, predictive maintenance, and process optimization, reducing downtime and enhancing productivity across industrial operations.

Lastly, the use of iris patterns and ECG signals preserves user privacy by avoiding the collection of personally identifiable information. This aspect is critical for maintaining user trust and compliance with privacy regulations.
In conclusion, an authentication system combining iris recognition and ECG authentication offers a comprehensive solution characterized by robust security, accuracy, user convenience, health monitoring capabilities, and privacy preservation. Based on these advantages, the main contributions of this work are as follows: i) to present a data pre-processing method for ECG and iris image data; ii) to perform ECG filtering and image denoising, where ECG filtering is carried out with the help of an extended Kalman filter, whereas image filtering uses a wavelet transform model; iii) to present a heart rate feature analysis in the time and frequency domains for ECG signals; and iv) to present an ensemble of CNN and DeepResNet-based transfer learning models for classification.

2. PROPOSED MBANET MODEL FOR REAL TIME AUTHENTICATION
In this section we describe the MBANet approach for real-time authentication using ECG and iris modalities. For each user, the ECG and iris data are captured and stored. This data is processed through several stages, which are described below. The complete architecture of the proposed model is depicted in Figure 1. Generally, an electrocardiogram is recorded by affixing electrodes to the patient's body, through which electrical signals are received by the device. Consequently, the quality of the ECG signal obtained is directly influenced by the contact between these electrodes and the user's skin. Furthermore, proximity to equipment using alternating current (AC) power introduces interference from the power grid to the human body. These two forms of noise significantly impact the received ECG signal quality, necessitating their elimination. Similarly,
the quality of iris images is affected by different types of noise. Therefore, the first phase focuses on developing an efficient approach for ECG signal filtering and noise removal from iris images. These signals and image data contain certain patterns, which are known as their key attributes. Arranging these attributes and annotating the data play an important role in machine learning applications. Thus, in the next stage, we present the feature extraction process for both ECG and iris image data. Finally, these attributes are used to train the machine learning model to verify user authenticity. A fixed fraction of the dataset is used for training, and the remaining samples are used for testing and validation.

Figure 1. Proposed MBANet architecture

2.1. Noise model for ECG signal
As discussed before, ECG signals get contaminated by different types of noise. In this work, we consider Gaussian noise, baseline wander, muscle artifacts, and power line interference. The details of these noise types and their expressions are described below:

a. Gaussian noise
Gaussian white noise is often used to model random fluctuations in the ECG signal. It is characterized by a constant variance and zero mean. In the dynamic model, $w_k$, representing process noise, can be modeled as Gaussian white noise. Similarly, in the measurement model, $v_k$, representing measurement noise, can also be modeled as Gaussian white noise. The covariance matrices $Q$ and $R$ in the prediction and update steps of the extended Kalman filter (EKF) reflect the variance of the process and measurement noise, respectively. The Gaussian white noise is expressed as (1):

$$w(t) \sim \mathcal{N}(0, \sigma^2) \tag{1}$$

where $\mathcal{N}(0, \sigma^2)$ represents a Gaussian distribution with mean 0 and variance $\sigma^2$.

b. Baseline wander noise
Baseline wander refers to low-frequency drifts in the ECG signal caused by various factors such as respiration and movement artifacts. A simple mathematical model for baseline wander is a random walk process, where the signal drifts randomly over time. The baseline wander $b(t)$ can be expressed as (2):

$$b(t + 1) = b(t) + \epsilon \tag{2}$$

where $b(t)$ represents the baseline wander at time $t$, and $\epsilon$ is a random noise component at each time step.

c. Muscle artifacts
Muscle artifacts introduce high-frequency noise spikes in the ECG signal, often caused by muscle contractions or movement. These artifacts can be modeled as impulsive noise, where sporadic spikes occur randomly. The muscle artifacts $m(t)$ can be modeled as (3):

$$m(t) = A \cdot \delta(t - t_i) \tag{3}$$

where
- $A$ represents the amplitude of the artifact,
- $\delta(t - t_i)$ is the Dirac delta function, representing the spike occurring at time $t_i$.
d. Power line interference
Power line interference introduces periodic noise at the frequency of the power supply (e.g., 50 Hz or 60 Hz). A sinusoidal model is commonly used to represent power line interference. It can be expressed as (4):

$$p(t) = A \cdot \sin(2 \pi f t + \phi) \tag{4}$$

where
- $A$ is the amplitude of the interference,
- $f$ is the frequency of the power supply,
- $\phi$ is the phase angle.

2.2. ECG filtering and image denoising
This subsection presents the solution for ECG signal filtering and image denoising. The ECG filtering model uses the extended Kalman filtering model to eliminate the noise from the ECG signal. The standard Kalman filtering model is a recursive technique for data filtering and is widely adopted in data pre-processing and filtering tasks. ECG signals can be modeled as a combination of various components such as the QRS complex, P-wave, T-wave, baseline wander, and noise. A common model for the ECG signal can be represented as (5):

$$y(t) = s(t) + n(t) \tag{5}$$

where $y(t)$ is the observed ECG signal, $s(t)$ is the true underlying signal, and $n(t)$ represents the noise.

Initially, we present the dynamic modeling of the underlying signal to represent the evolution of the ECG signal over time. In the case of ECG signal filtering, this could be a first-order model for the state evolution. For instance, it can be expressed as (6):

$$x_{k+1} = F \cdot x_k + w_k \tag{6}$$

where $x_k$ represents the state of the system at time step $k$, which could include parameters such as amplitude and frequency, $F$ is the state transition matrix, and $w_k$ represents the process noise.

In the next stage, we apply the measurement model, which describes how the observed signal is related to the true state of the system. In this case, it could be a linear or non-linear function depending on the specific characteristics of the ECG signal.
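As a hedged illustration, the four noise components of subsection 2.1 and the additive observation model (5) can be simulated in plain Python. This is a minimal sketch and not the authors' implementation: the function names, the Gaussian choice for the random-walk step $\epsilon$, the use of a unit-sample spike as a discrete stand-in for the Dirac delta, and all numeric values (sampling rate, amplitudes, spike positions, the sinusoidal stand-in for $s(t)$) are assumptions made for the example.

```python
import math
import random


def gaussian_noise(n, sigma, rng):
    """White Gaussian noise w(t) ~ N(0, sigma^2), following (1)."""
    return [rng.gauss(0.0, sigma) for _ in range(n)]


def baseline_wander(n, step_sigma, rng):
    """Random-walk drift b(t+1) = b(t) + eps, following (2); starts at 0."""
    drift, value = [], 0.0
    for _ in range(n):
        drift.append(value)
        value += rng.gauss(0.0, step_sigma)  # eps drawn as Gaussian (assumption)
    return drift


def muscle_artifact(n, amplitude, spike_indices):
    """Spikes of height A at sample indices t_i, a discrete stand-in for (3)."""
    spikes = [0.0] * n
    for t_i in spike_indices:
        if 0 <= t_i < n:  # ignore spikes that fall outside the record
            spikes[t_i] += amplitude
    return spikes


def power_line(n, amplitude, freq_hz, fs_hz, phase=0.0):
    """Sampled sinusoid A*sin(2*pi*f*t + phi) with t = k / fs_hz, following (4)."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * k / fs_hz + phase)
            for k in range(n)]


def contaminate(clean, *noises):
    """Additive model y(t) = s(t) + n(t), following (5)."""
    noisy = list(clean)
    for noise in noises:
        noisy = [y + v for y, v in zip(noisy, noise)]
    return noisy


rng = random.Random(0)   # fixed seed so the example is repeatable
fs, n = 250.0, 500       # hypothetical sampling rate (Hz) and record length
clean = [math.sin(2.0 * math.pi * 1.2 * k / fs) for k in range(n)]  # stand-in for s(t)
noisy = contaminate(
    clean,
    gaussian_noise(n, 0.05, rng),
    baseline_wander(n, 0.01, rng),
    muscle_artifact(n, 0.8, [120, 310]),
    power_line(n, 0.1, 50.0, fs),
)
```

Keeping each component in its own function makes it possible to contaminate a clean record with one noise type at a time, which is how the filtering results are later reported per noise type.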
The measurement model can be expressed as (7):

$$z_k = H \cdot x_k + v_k \tag{7}$$

where $z_k$ represents the observed ECG signal at time step $k$, $H$ is the measurement matrix, and $v_k$ represents the measurement noise.

Further, we apply the extended Kalman filtering model, which is completed in three main steps: initialization, prediction, and update. These steps can be described as follows:

Initialization: Initialize the state vector $x_0$ and the error covariance matrix $P_0$.

Prediction: Predict the next state using the dynamic model:

$$\hat{x}_{k+1|k} = F \cdot \hat{x}_k$$

Predict the error covariance matrix:

$$P_{k+1|k} = F \cdot P_k \cdot F^T + Q$$

where $Q$ represents the process noise covariance matrix.

Update: Compute the Kalman gain:

$$K_{k+1} = P_{k+1|k} \cdot H^T \cdot \left( H \cdot P_{k+1|k} \cdot H^T + R \right)^{-1}$$

where $R$ is the measurement noise covariance matrix. Update the state estimate:

$$\hat{x}_{k+1} = \hat{x}_{k+1|k} + K_{k+1} \cdot \left( z_{k+1} - H \cdot \hat{x}_{k+1|k} \right)$$

Update the error covariance matrix:

$$P_{k+1} = (I - K_{k+1} \cdot H) \cdot P_{k+1|k}$$

Repeat the prediction and update steps for each time step, incorporating new measurements and refining the state estimate. The final output of the filter is the estimated ECG signal, which is the state estimate $\hat{x}_k$ at each time step.

Similarly, we apply an image filtering model using the wavelet transform approach. The wavelet transform differs from the Fourier transform by employing a finite decaying wavelet basis in place of the infinite trigonometric basis. Unlike the Fourier basis, the wavelet basis possesses finite energy, typically focusing around a singular point, and integrates to zero. While the Fourier transform relies solely on the variable $\omega$, the wavelet transform introduces two variables: scale $a$ and translation $b$. The scale parameter $a$ corresponds to frequency, whereas the translation parameter $b$ corresponds to time.
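Returning to the ECG filter, the initialization, prediction, and update steps given earlier in this subsection can be sketched in a few lines of Python. The recursion as written is linear in $F$ and $H$, so this sketch uses a scalar linear Kalman filter in which every matrix collapses to a number; a full extended Kalman filter would additionally linearize a nonlinear ECG model at each step, which is not shown. The function name `kalman_filter_1d` and the default values of `F`, `H`, `Q`, `R`, `x0`, and `P0` are illustrative assumptions, not values taken from the paper.

```python
def kalman_filter_1d(z, F=1.0, H=1.0, Q=1e-4, R=0.05, x0=0.0, P0=1.0):
    """Scalar Kalman filter over the measurement sequence z.

    Returns the state estimate after each update step.
    """
    x, P = x0, P0                      # initialization: state and error covariance
    estimates = []
    for z_k in z:
        # Prediction: propagate the state and its covariance through the model.
        x_pred = F * x
        P_pred = F * P * F + Q
        # Update: blend the prediction with the new measurement.
        K = P_pred * H / (H * P_pred * H + R)    # Kalman gain
        x = x_pred + K * (z_k - H * x_pred)      # corrected state estimate
        P = (1.0 - K * H) * P_pred               # corrected error covariance
        estimates.append(x)
    return estimates


# Constant measurements pull the estimate from x0 = 0 toward the measured value.
smoothed = kalman_filter_1d([1.0] * 50)
```

The ratio of `Q` to `R` sets how strongly the filter smooths: a small `Q` relative to `R` trusts the model and suppresses measurement noise, while a large `Q` lets the estimate follow the measurements more closely.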
Consequently, the wavelet transform enables time-frequency analysis, facilitating the extraction of the time-frequency spectrum of the signal. By utilizing the scaling and translation of the mother wavelet function, a wavelet sequence can be generated, with its general form expressed as (8):

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}} \, \psi\!\left( \frac{t - b}{a} \right), \quad a, b \in \mathbb{R} \tag{8}$$

During the wavelet transform process, the scale factor $a$ and time shift $b$ are theoretically continuous, which poses computational challenges for finite-time execution. To address this, the discrete wavelet transform (DWT) discretizes the scale factor $a$ and time shift $b$ based on specific rules. By adopting discrete values for $a$ and $b$, the DWT enables computationally feasible analysis. Opting for power-of-2 values for $a$ and $b$ enhances the accuracy and efficiency of signal analysis. The wavelet function can be expressed as (9):

$$\psi_{m,n}(k) = 2^{-m/2} \, \psi\!\left( 2^{-m} \cdot k - n \right), \quad m, n \in \mathbb{Z} \tag{9}$$

The wavelet transform decomposes the original image data into approximation and detail components; the detail components primarily reveal the noise present in the image. Following this, by applying wavelet reconstruction to the thresholded detail components, we can obtain smoother image information. The overall process of wavelet transform denoising is depicted in Figure 2.

Figure 2. Wavelet transform for image denoising

2.3. Feature extraction
For the ECG signal, we use the Pan-Tompkins peak detection approach to identify the various peaks of the ECG signal. Further, we extract time- and frequency-domain heart rate variability (HRV) features from the ECG signals. Table 1 lists the time-domain features used as important attributes of ECG signals. Similarly, we extract frequency-domain features for the ECG signal. In this process, the low-frequency (LF), high-frequency (HF), very-low-frequency (VLF), and ultra-low-frequency (ULF) bands are considered.

Table 1. HRV features time domain
Feature | Description | Measurement unit
SDNN | Standard deviation of NN intervals | ms
SDANN | Standard deviation of the mean of NN intervals in 5 min windows | ms
RMSSD | Square root of the mean of the sum of the squares of differences between adjacent NN intervals | ms
SDNN index | Mean of the standard deviation of all NN intervals performed on all 5-minute segments of the entire recording | ms
SDSD | Standard deviation of differences between adjacent NN intervals | ms
NN50 | The number of pairs of adjacent NN intervals differing by more than 50 ms | count
pNN50 | NN50 count divided by the total number of all NN intervals | %

2.4. Classification
In this work, we apply two different deep learning classifiers, and their combined result is taken as the final outcome. For example, if the ECG signal is authenticated and iris image authentication
fails, then the system treats the input as an imposter. To classify ECG signals, we use an ensemble of three single CNN classifiers. The CNN model relearns the features produced by the single network. The three CNN models use rectified linear units (ReLU), leaky ReLU (LReLU), and exponential linear units (ELU), respectively. Figure 3 depicts the overall architecture of the MBANet model. The frequency-domain HRV features are listed in Table 2.

Figure 3. ECG classification

Table 2. HRV features frequency domain
Feature | Description | Measurement unit
LF peak | Peak frequency of the low-frequency band (0.04–0.15 Hz) | Hz
HF peak | Peak frequency of the high-frequency band (0.15–0.4 Hz) | Hz
LF power | Absolute power of the low-frequency band (0.04–0.15 Hz) | ms²
LF power | Relative power of the low-frequency band (0.04–0.15 Hz) in normal units | nu
LF power | Relative power of the low-frequency band (0.04–0.15 Hz) | %
HF power | Absolute power of the high-frequency band (0.15–0.4 Hz) | ms²
HF power | Relative power of the high-frequency band (0.15–0.4 Hz) in normal units | nu
HF power | Relative power of the high-frequency band (0.15–0.4 Hz) | %
VLF power | Absolute power of the very-low-frequency band (0.0033–0.04 Hz) | ms²
ULF power | Absolute power of the ultra-low-frequency band | ms²
LF/HF | Ratio of LF-to-HF power | %

In the next stage, we perform classification for iris images. For this task, we use a transfer learning approach combined with the DeepResNet model to enhance classification performance; this module also uses a deep transfer learning based ImageNet model. The ResNet model introduces a shortcut connection that skips one or more layers. The basic architecture of ResNet is depicted in Figure 4.
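The building blocks named in this subsection can be sketched in plain Python: the three activation functions used by the CNN ensemble, the shortcut connection of a residual block, and the decision rule that combines the two modalities. This is only an illustrative sketch under stated assumptions. The slope `0.01` and `alpha=1.0` are common defaults rather than values from the paper, `residual_block` and `fuse_decisions` are hypothetical names, and the fusion rule assumes that failure of either modality (not only the iris) leads to an imposter decision.

```python
import math


def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes the rest."""
    return max(0.0, x)


def leaky_relu(x, slope=0.01):
    """Leaky ReLU: small non-zero slope for negative inputs (slope assumed)."""
    return x if x > 0 else slope * x


def elu(x, alpha=1.0):
    """Exponential linear unit: smooth saturation toward -alpha (alpha assumed)."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def residual_block(x, transform):
    """Shortcut connection: output is F(x) + x, with F given by `transform`."""
    fx = transform(x)
    return [f + xi for f, xi in zip(fx, x)]


def fuse_decisions(ecg_genuine, iris_genuine):
    """Accept only when both modalities authenticate (assumed symmetric rule)."""
    return "genuine" if (ecg_genuine and iris_genuine) else "imposter"


# ECG passes but iris fails, so the input is treated as an imposter.
decision = fuse_decisions(ecg_genuine=True, iris_genuine=False)
```

Because the shortcut adds the input back to the transformed signal, a block whose transform outputs zeros reduces to the identity, which is what lets deeper stacks train without degrading.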
Figure 4. ECG classification

This model is trained with the cross-entropy loss function. The loss is further regularized with the $L_2$ norm, which helps to reduce overfitting. Thus, the final loss function for this model can be expressed as (10):

$$L_{\text{final}} = L_{\text{class}} + \lambda_1 \left\| W_{fc} \right\|_F^2 \tag{10}$$

where $L_{\text{class}}$ represents the classification loss (cross-entropy loss), and $\left\| W_{fc} \right\|_F^2$ denotes the squared Frobenius norm of the weight matrix $W_{fc}$ in the last layer. Finally, the Adam optimizer is used to minimize the overall loss.

3. RESULTS AND DISCUSSION
This section presents the outcome of the MBANet model along with a comparative analysis against existing classification and authentication approaches. The first subsection gives brief details about the dataset used in this work, the next subsection describes the performance measurement parameters, and finally the outcome of the MBANet approach is demonstrated and compared with existing models. No publicly available dataset combines these modalities; therefore, we use a synthetically created dataset assembled from different sources.

3.1. Performance measurement
The performance of ECG signal denoising is measured using the following parameters:

Mean squared error (MSE)

$$\text{MSE} = \frac{1}{N} \sum_{i=1}^{N} \left( X(i) - Y(i) \right)^2 \tag{11}$$

Root mean square error (RMSE)

$$\text{RMSE} = \sqrt{ \frac{1}{N} \sum_{i=1}^{N} \left( X(i) - Y(i) \right)^2 } \tag{12}$$

Peak signal-to-noise ratio (PSNR)

$$\text{PSNR} = 10 \cdot \log_{10} \left( \frac{ \sum_{i=1}^{N} \text{MAX}^2 }{ \text{MSE} } \right) \tag{13}$$

Percent root mean square difference (PRD)

$$\text{PRD} = 100 \times \sqrt{ \frac{ \sum_{i=1}^{N} \left( Y(i) - X(i) \right)^2 }{ \sum_{i=1}^{N} X(i)^2 } } \tag{14}$$
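The denoising measures above can be sketched directly in Python. The sketch assumes that $X$ is the clean reference signal and $Y$ the filtered signal, which the text does not state explicitly. PSNR is computed here in its common form $10 \cdot \log_{10}(\text{MAX}^2 / \text{MSE})$, without the summation that appears in (13), and MAX defaults to the peak absolute value of the reference; both choices are assumptions of this example.

```python
import math


def mse(x, y):
    """Mean squared error between reference x and estimate y, as in (11)."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def rmse(x, y):
    """Root mean square error, as in (12)."""
    return math.sqrt(mse(x, y))


def psnr(x, y, max_value=None):
    """Peak signal-to-noise ratio in dB, common form 10*log10(MAX^2 / MSE)."""
    peak = max(abs(v) for v in x) if max_value is None else max_value
    err = mse(x, y)
    if err == 0:
        return math.inf  # identical signals: no error power
    return 10.0 * math.log10(peak ** 2 / err)


def prd(x, y):
    """Percent root mean square difference, as in (14)."""
    num = sum((b - a) ** 2 for a, b in zip(x, y))
    den = sum(a * a for a in x)
    return 100.0 * math.sqrt(num / den)


reference = [1.0, 2.0, 3.0, 4.0]   # toy values, not data from the paper
filtered = [1.0, 2.0, 3.0, 2.0]
scores = (mse(reference, filtered), rmse(reference, filtered),
          psnr(reference, filtered), prd(reference, filtered))
```

Lower MSE, RMSE, and PRD and higher PSNR all indicate that the filtered signal is closer to the reference.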
The performance of the MBANet approach is evaluated using confusion matrix calculations. The confusion matrix is generated based on true positive, false positive, false negative, and true negative values. Table 3 provides a sample representation of the confusion matrix.

Table 3. Confusion matrix
Actual class | Predicted: Genuine user | Predicted: Imposter user
Genuine user | True positive | False negative
Imposter user | False positive | True negative

Based on this confusion matrix, we quantify several statistical performance measures for the suggested technique, including accuracy, precision, and F1-score. Accuracy is the proportion of correctly classified instances relative to the total number of instances, and is calculated as (15):

$$\text{Acc} = \frac{TP + TN}{TP + TN + FP + FN} \tag{15}$$

Next, we calculate the precision of the suggested approach, which is the ratio of true positives to all predicted positives (true and false):

$$P = \frac{TP}{TP + FP} \tag{16}$$

Lastly, we use the sensitivity and precision to calculate the F-measure, which may be written as (17):

$$F = \frac{2 \cdot P \cdot \text{Sensitivity}}{P + \text{Sensitivity}} \tag{17}$$

3.2. Parameters and hyperparameters
This section presents the parameters and hyperparameters used in this work to train the deep learning model for ECG and iris authentication. The model uses an image size of 224×224; thus the input shape becomes (4, 3, 224, 224), where 4 is the batch size and 3 is the number of image channels. Similarly, the ECG signal is represented as (4, 1, 100) with batch size 4. The image model produces an output of similar size to the input image, and the ECG processing module likewise generates data of similar size. In this work, 100 samples are considered, split equally between ECG and iris samples, with 50 samples each. The dataset is divided using a 70%-30% train-test split.
This ensures that the models are trained on a sufficient amount of data while retaining a separate portion for evaluation to gauge their performance effectively. To account for noise, two levels of noise intensity are examined: 5 dB and 10 dB. These levels simulate different degrees of noise interference commonly encountered in real-world scenarios. Finally, different deep learning training parameters are used to train the proposed MBANet model. Table 4 presents the considered parameters.

Table 4. Simulation parameters
Parameters | Considered value
Total samples | 100
ECG sample | 50
Iris sample | 50
Train test ratio | 70%-30%
Noise type | White noise, color noise, motion artifact, electrode artifact, baseline wander
Noise levels | 5 dB, 10 dB
Learning rate | 0.001
Batch size | 4
Optimizer | Adam
Scheduler | ReduceLROnPlateau
Epochs | 100
Loss | CrossEntropyLoss
Cross validation | 10 fold
Simulation tool | Python 3.8
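One plausible way to realize the two noise levels in Table 4 is to scale a noise record so that the contaminated signal reaches a target signal-to-noise ratio. The sketch below assumes that the 5 dB and 10 dB values denote target SNR, which the text does not state explicitly; the function names and the sinusoidal test signal are likewise illustrative and not taken from the paper.

```python
import math
import random


def signal_power(x):
    """Mean power of a sampled signal."""
    return sum(v * v for v in x) / len(x)


def add_noise_at_snr(clean, snr_db, rng):
    """Add white Gaussian noise scaled so that the result has the target SNR."""
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    # SNR_dB = 10*log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB/10)
    target_noise_power = signal_power(clean) / (10.0 ** (snr_db / 10.0))
    scale = math.sqrt(target_noise_power / signal_power(noise))
    return [s + scale * v for s, v in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB recovered from a clean/noisy pair."""
    noise = [y - s for s, y in zip(clean, noisy)]
    return 10.0 * math.log10(signal_power(clean) / signal_power(noise))


rng = random.Random(0)                                     # fixed seed for repeatability
clean = [math.sin(2.0 * math.pi * k / 50.0) for k in range(200)]  # toy signal
noisy_5db = add_noise_at_snr(clean, 5.0, rng)
noisy_10db = add_noise_at_snr(clean, 10.0, rng)
```

Under this reading, the 5 dB setting is the harsher condition, since a lower SNR means more noise power relative to the signal.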
3.3. Comparative analysis

First, we process the iris image data, performing image annotation, labelling, boundary identification, mask extraction, and normalization. Figure 5 depicts a sample outcome of these steps. The normalized image is then used for feature extraction and classification. Similarly, we perform several tasks on the ECG signals, such as filtering, because these signals are prone to various types of noise. Figure 6 depicts the original signal, the noisy signal, and their corresponding filtered signals.

Figure 5. ECG classification

To measure the filtering performance, we consider different types of noise (white noise, color noise, motion artifact, electrode artifact, and baseline wander) and vary the noise level between 5 dB and 10 dB. Table 5 shows the obtained performance in terms of PSNR, MSE, mean absolute error (MAE), RMSE, PRD, and correlation coefficient (CC). For 5 dB noise, the maximum PSNR of 46.10 dB is attained for baseline wander; for 10 dB noise, the maximum PSNR of 44.82 dB is attained for white noise.

Under 5 dB noise conditions, the proposed MBANet model achieved an average improvement of approximately 15% in PSNR, indicating better preservation of signal quality compared to existing models. This improvement translates to a noticeable reduction in noise distortion, as evidenced by a 10% decrease in MSE and RMSE values, signifying closer agreement between predicted and actual values. Moreover, the MBANet model exhibited a 12% decrease in MAE, indicating more accurate predictions, and a 3% improvement in PRD, reflecting a reduction in residual differences relative to reference signals.

Under 10 dB noise conditions, the improvements were even more pronounced, with the MBANet model achieving around 20% higher PSNR values compared to existing models.
This enhancement highlights the model's capability to maintain superior image quality despite higher noise levels. The model also demonstrated a 15% reduction in MSE and RMSE, underscoring its ability to minimize prediction errors. Furthermore, a 5% improvement in PRD and a 2% increase in CC were observed, indicating enhanced accuracy and stronger linear relationships between predicted and actual values.

Finally, we measured the classification accuracy. In this work, we have considered 100 user cases, divided into 50% for training and 50% for testing. In the testing phase, 25 users belong to the genuine category and the remaining 25 users belong to the imposter category. This section presents the classification accuracy for real-time cases using the MBANet model. The attained results are then contrasted with standard classification approaches. Results are depicted in Figure 7.

According to this experiment, the random forest misclassified 21 entities, which degrades the performance of the RF classifier. Similarly, SVM misclassified 16 entities, whereas the proposed approach misclassified only 3 entities, resulting in increased accuracy. Table 6 shows the performance obtained by using different classifiers.

According to this experiment, the overall accuracy is reported as 0.8900, 0.8400, 0.7900, 0.8932, 0.87, and 0.97 by using convolutional neural network-long short-term memory (CNN-LSTM), support vector machine (SVM), random forest (RF), convolutional neural network (CNN), decision tree (DT), and the MBANet approach, respectively. The existing models rely on single modalities. Some recent methods have focused on developing multimodal authentication, but these methods do not consider the noise in ECG signals and iris images, whereas the proposed model introduces a multimodal authentication system with a comprehensive filtering model.
Similarly, the proposed model uses pre-trained deep learning models to improve the training speed and accuracy. The training speed performance is depicted in Table 7.
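The relation between the misclassification counts and the reported accuracies can be checked with a short sketch of the measures in (15)-(17). It assumes the counts of 21, 16, and 3 misclassified entities are out of 100 evaluated entities, since that assumption reproduces the reported accuracies of 0.79, 0.84, and 0.97. The split of the 3 MBANet errors into false negatives and false positives is hypothetical, chosen only to illustrate precision and F-measure, and the function names are illustrative.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Accuracy (15), precision (16), sensitivity, and F-measure (17) from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_measure": f_measure,
    }


def accuracy_from_errors(misclassified, total):
    """Accuracy when only the number of misclassified entities is known."""
    return (total - misclassified) / total


# Misclassification counts stated in the text; the total of 100 is an assumption.
misclassified = {"RF": 21, "SVM": 16, "MBANet": 3}
for name, errors in misclassified.items():
    print(name, round(accuracy_from_errors(errors, 100), 2))

# Hypothetical breakdown of the 3 MBANet errors: 1 false negative, 2 false positives.
example = confusion_metrics(tp=49, fn=1, fp=2, tn=48)
print({key: round(value, 4) for key, value in example.items()})
```

Accuracy depends only on the total number of errors, whereas precision and the F-measure also depend on how those errors divide between false positives and false negatives, so the latter cannot be recovered from the misclassification counts alone.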