Indonesian Journal of Electrical Engineering and Computer Science
Vol. 41, No. 1, January 2026, pp. 283-299
ISSN: 2502-4752, DOI: 10.11591/ijeecs.v41.i1.pp283-299

Remaining useful life estimation of turbofan engine: a sliding time window approach using deep learning

Alawi Alqushaibi 1, Mohd Hilmi Hasan 1,2, Said Jadid Abdulkadir 1,2, Shakirah Mohd Taib 1,2, Safwan Mahmood Al-Selwi 1, Mohammed Gamal Ragab 1, Ebrahim Hamid Sumiea 1
1 Department of Computer and Information Sciences, Universiti Teknologi PETRONAS, Seri Iskandar, Malaysia
2 Centre for Research in Data Science (CERDAS), Universiti Teknologi PETRONAS, Seri Iskandar, Malaysia

Article Info
Article history: Received Dec 1, 2023; Revised Nov 9, 2025; Accepted Dec 13, 2025
Keywords: CMAPSS; Convolutional neural network; Deep features; Prognostics; Recurrent neural networks; RUL prediction; Turbofan engine

ABSTRACT
System degradation is a common and unavoidable process that frequently occurs in the aerospace sector. Thus, prognostics is employed to avoid unforeseen breakdowns in intricate industrial systems. In prognostics, the system health status and its remaining useful life (RUL) are evaluated using numerous sensors. Numerous researchers have utilized deep-learning techniques to estimate RUL based on sensor data. Most of the studies proposed solving this problem with a single deep neural network (DNN) model. This paper developed a novel turbofan engine RUL predictor based on several DNN models. The method includes a time window technique for sample preparation, enhancing the DNN's ability to extract features and learn the pattern of turbofan engine degradation. Furthermore, the effectiveness of the proposed approach was confirmed using well-known model evaluation metrics.
The experimental results demonstrated that, among four different DNNs, the long short-term memory (LSTM)-based predictor achieved the best scores on an independent testing dataset, with a root-mean-square error of 15.30, a mean absolute error of 2.03, and an R-squared score of 0.4354, outperforming previously reported turbofan RUL estimation methods.

This is an open access article under the CC BY-SA license.

Corresponding Author:
Alawi Alqushaibi
Department of Computer and Information Sciences, Universiti Teknologi PETRONAS
Seri Iskandar, 32610 Perak, Malaysia
Email: alawi_18000555@utp.edu.my

1. INTRODUCTION
Prognostics and health management (PHM) stands as a burgeoning discipline with the primary objective of predicting the prospective health condition of a given system, pinpointing latent faults, and facilitating punctual maintenance interventions to improve the reliability and operational availability of the system [1]. As contemporary engineering systems, including those in the aerospace, automotive, and manufacturing domains, continue to grow in complexity, there arises an escalating demand for advanced PHM methodologies adept at managing substantial datasets and furnishing precise prognostications [2], [3]. The concept of PHM has evolved over the past few decades, driven by the need to improve system reliability, safety, and efficiency [4]. Initially, PHM was mainly used in the aerospace industry to monitor the health of aircraft engines and to predict their remaining useful life [5]. With advancements in sensor technology, data analytics, and machine learning algorithms, the scope of PHM has expanded to other domains [6]. Today, PHM is applied in a wide range of applications, including wind turbines [7], medical devices, and infrastructure systems [8].
Journal homepage: http://ijeecs.iaescore.com
Despite the numerous advantages inherent in PHM, it is imperative to acknowledge the existence of several challenges that warrant attention. A predominant challenge resides in the absence of standardized practices within the discipline, a factor that introduces complexity in the comparative evaluation of diverse PHM methodologies [9]. An additional obstacle pertains to the requisite acquisition of extensive sets of high-quality data for the purpose of training predictive models. This process often incurs significant costs and consumes substantial time, thereby presenting a formidable challenge [10]. Additionally, PHM requires interdisciplinary expertise, which may not always be readily available [11]. Deep learning (DL), a subset of machine learning, has demonstrated remarkable promise in the domain of PHM owing to its capacity to comprehend intricate correlations between input characteristics and predictive outcomes. Over recent years, DL methodologies, including convolutional neural networks (CNNs), recurrent neural networks (RNNs), and autoencoders, have found extensive application within PHM research endeavors [12]. Within the realm of fault diagnosis, deep learning methodologies have been instrumental in discerning and categorizing faults by analyzing sensor data [13]. CNNs have been shown to be effective in feature extraction from sensor signals, while RNNs have been used to capture temporal dependencies between sensor measurements [14]. Autoencoders have also been used for fault detection by learning the normal operating conditions of a system and detecting deviations from these conditions [15]. In remaining useful life (RUL) prediction, DL models have been used to predict the RUL of a system based on its current and past health status [16].
CNNs and RNNs have been used to model the temporal evolution of system health, while autoencoders have been used to learn the underlying feature representations of sensor data [17]. Another important application of deep learning in PHM is anomaly detection [18]. DL models have been used to recognize abnormal behaviour in sensor data, which can indicate potential faults or anomalies [19]. CNNs and autoencoders have been shown to be effective in detecting anomalies in sensor data [10]. However, this study focuses on RUL prediction, which sits at the third level of PHM [20].
This article endeavors to furnish a comparative analysis of prevalent deep learning architectures employed in prognostics for predicting RUL. Our emphasis is on examining CNNs, RNNs, LSTM networks, and gated recurrent unit (GRU) models. The performance of these architectures is evaluated based on prediction accuracy, computational complexity, and generalization to unseen data. Our aim is to provide practitioners and researchers with an inclusive overview of these architectures and their relative strengths and weaknesses for RUL prediction in prognostics. The validation of this methodology's effectiveness was conducted using the commercial modular aero-propulsion system simulation (C-MAPSS) turbofan aero-engine benchmark datasets supplied by NASA.
The subsequent sections of this manuscript are structured as follows: section 2 furnishes a comprehensive overview delineating the background and pertinent literature that form the foundation of this study. Section 3 delineates the suggested research approach, while section 4 examines and scrutinizes the results obtained from the empirical experiments. Section 5 discusses the findings and analysis. Finally, section 6 encapsulates the conclusions drawn from this study and delineates prospective avenues for future research.
2.
RELATED WORK
In the aerospace sector, ensuring safety and reliability stands as a paramount consideration governing operational efficiency. Across various industries, rotating machinery assumes a pivotal role, yet remains susceptible to failure due to demanding operational environments and prolonged usage hours [21]. Failures within these systems can lead to operational disruptions and substantial financial ramifications. Exploring the monitored relationship between device data and its associated RUL has garnered significant attention in data-driven prognostics. Numerous machine learning algorithms, particularly NN methods, have been devised to unveil the correlation between the collected feature data and the anticipated RUL. The benefit of employing NNs for PHM lies in their ability to model intricate, highly nonlinear, multidimensional structures without a prior understanding of the system's physical behavior. Diverse forms of device data, like raw sensor readings, can serve as direct inputs for these models. However, establishing natural confidence limits for NN methodologies applied to prognostic problems remains a challenge [22], although deep neural network (DNN)-based approaches to prognostics show promising results [20].
Fentaye et al. [22] employed the traditional multilayer perceptron (MLP) technique to forecast the RUL of bearings during laboratory testing, demonstrating superior predictive performance compared to reliability-based alternatives. Fink et al. [23] presented a multi-layer neural network approach employing
multi-valued neurons specifically designed to address the challenge of forecasting performance and degradation time series. A case study was carried out with a specific focus on the deterioration of a railway turnout system. Khawaja et al. [24] devised a neural network method for predicting confidence that includes a confidence distribution node, addressing the limitation of neural network techniques whereby obtaining explicit confidence limits for RUL predictions proves challenging. Additionally, several fuzzy logic approaches have been integrated into MLP networks to enhance learning acquisition for PHM.
Malhi et al. [25] proposed employing RNNs and competitive learning techniques for long-term prognostics regarding the health status of machinery. They utilized the continuous wavelet transform (WT) to preprocess vibration signals obtained from a faulty rolling bearing, subsequently employing these preprocessed indicators as inputs for their model. In another study, the authors recommended a long short-term memory (LSTM) approach for RUL prediction in aero engines. This method was proposed to address scenarios involving highly intricate operations, hybrid faults, and substantial noise levels, thereby enhancing the capabilities beyond those offered by the standard RNN. Zhao et al. [26] applied LSTM networks to a tool wear health monitoring task, integrating both frequency- and time-domain features within their approach. Ren et al. [20] introduced an optimized DL technique designed for collaborative estimation of RUL in multiple bearings. They substantiated the method's viability and superiority through numerical evaluations conducted on a real dataset. Liao et al. [27] introduced an innovative restricted Boltzmann machine designed for representation learning aimed at determining the RUL of machines.
This approach incorporates a novel regularization term along with an unsupervised self-organizing map algorithm. Zhang et al. [28] presented a multi-objective deep belief network (DBN) ensemble approach. This method combined an evolutionary algorithm with a conventional DBN training approach to concurrently develop multiple DBNs, emphasizing both accuracy and diversity in their construction. In another study, Zheng et al. [29] used the C-MAPSS benchmark dataset to predict the RUL of the turbofan engine using LSTM based on an on-time sequence representation. The use of CNN to estimate the RUL of the same engine was proposed in [20]. The process uses a time window method to form the input features of the suggested model; hence, more degradation data must be collected. As a result, the dimension of the model inputs increases, causing difficulty in the development of the DNN model, that is, how to set up network nodes and layers to avoid overfitting and reduce time and computational expense while also avoiding getting stuck in local minima. Muneer et al. [30] provide four data-driven prognostic models that employ DNNs with an attention mechanism to precisely estimate turbofan engines' RUL. Without requiring prior knowledge of prognostics or signal processing, the models improve DNN feature extraction by utilizing a sliding time window method. To enhance the prediction of RUL for turbofan engines, Muneer et al. [31] also provide a novel attention-based deep CNN design. The suggested model makes use of multivariate temporal information by selecting features based on a processability metric and preparing samples using a time window technique. Another study was recently conducted by Peng et al. [1], in which the combination of 1-D CNNs with LSTM and a fully convolutional layer (1-FCLCNN) was proposed as a technique for RUL prediction.
This technique extracts spatial and temporal characteristics from the FD003 and FD001 turbofan engine datasets using LSTM and 1-FCLCNN. Researchers have also placed a great deal of emphasis on CNN applications in RUL-related disciplines [16]. In Babu et al. [19], the deep CNN method was first applied for RUL prediction; according to the data, the CNN fared better than the MLP, SVM, and SVR models. The CNN method suggested by [19] was examined and tested using the C-MAPSS dataset, yielding an RMSE of 18.45. Similarly, Li et al. [10] suggested a deep CNN time window method for improved signal extraction. The method was tested on NASA's turbofan engine (C-MAPSS dataset) degradation problem and demonstrated a significant advantage. Even with the CNN model's high performance, additional optimization is still needed because it takes longer to train than other shallow approaches. Furthermore, the recommended method has a heavy computational load. Wen et al. [32] created a brand-new residual CNN (ResCNN). ResCNN makes use of the residual block, which can help solve the vanishing/exploding gradient problem by using shortcut connections to bypass several convolutional layer blocks. Moreover, the k-fold ensemble method helped to enhance ResCNN. NASA's C-MAPSS benchmark dataset was used to test the suggested ensemble ResCNN. A new technique for deep feature learning for RUL prediction utilizing a multi-scale CNN (MS-CNN) and time-frequency representation (TFR) was provided in another work suggested by [33]. The bearing deterioration signal's non-stationary character can be efficiently shown by TFR. Using the WT, the authors accumulated time series deterioration signals and created TFRs rich in valuable information.
These TFRs were high-dimensional; thus, bilinear interpolation was utilized to reduce their size before they were utilized
as inputs for the DL models. Nevertheless, the suggested method [33] exhibits a few limitations. First, the algorithm's training is slow, necessitating an improvement in computational speed. Second, a graphics processing unit becomes imperative to assist in handling the primary TFR processing. Additionally, Li et al. [34] aimed to enhance machines' RUL estimation by introducing a network structured as a directed acyclic graph that merges LSTM and CNN to predict RUL. Li et al. [34] observed that when employing a single timestamp as input, padding signals within the same training batch adversely impacted the overall predictive capability of the integrated approach. To mitigate this issue, the authors adapted their proposed method to create short-term sequences, moving the time window (TW) in single-step increments. Additionally, they replaced the conventional linear function, based on the degradation mechanism, with a piece-wise RUL technique. In conclusion, the authors affirmed that augmenting the length of the time window could enhance the accuracy of their proposed model. In another study, Zhang et al. [35] employed CNN-based extreme gradient boosting (CNN-XGB) utilizing an extended TW. This approach aimed to address challenges within aero-engine systems that often function across diverse operating conditions. These variations might impact the system's degradation path differently, potentially hindering the accuracy of RUL prediction. The suggested method underwent validation utilizing NASA C-MAPSS turbofan aero-engine datasets. It resulted in an RMSE of 20.3, with a reported training duration of 621.7 seconds. Wang et al. [36] proposed an MS-CNN to estimate the RUL of rolling bearings. The suggested approach of Liu et al.
[36] aims to overcome the limited capability of conventional CNNs to learn local and global features synchronously. Convolution filters with varying dilation rates were combined to form a dilated convolution block capable of learning features over a variety of receptive fields. Concatenating numerous stacked, integrated, and dilated convolution blocks of varied depths allowed for the extraction of local and global features. The proposed method's effectiveness was validated on a bearing benchmark dataset called PRONOSTIA. Hence, in this study, we aimed to investigate different DNN models for RUL estimation to determine the technique with excellent feature extraction and a high capability to predict the RUL of a turbofan engine.
3. MATERIALS AND METHODS
This research utilizes a comparative analysis method to assess how effectively four prominent deep learning models can predict the RUL of various engine units. The proposed DNN-based models are rigorously trained and evaluated using well-known performance metrics. The initial section of the proposed methodology is dedicated to describing the four candidate deep learning models, while the subsequent sections outline the final two stages of the methodology.
3.1. Candidate model training and optimization
This part offers an in-depth overview of the DNN structures and optimization strategies implemented for creating candidate models to predict the RUL of turbofan engines. To achieve this goal, several commonly used NN architectures, including CNNs, RNNs, gated recurrent units (GRU), and LSTM, were employed in this study. In addition, we applied the randomized hyperparameter search method, similar to that described in [34], to enhance the performance of the DNN models. This approach involves conducting a random search across a broad hyperparameter space, allowing for the identification of optimal hyperparameters with limited computational effort.
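To illustrate the idea, the following is a minimal sketch of a randomized hyperparameter search. The search-space names and ranges are illustrative assumptions (the paper does not list its exact bounds), and the toy objective stands in for training a DNN and returning its validation error:

```python
import random

# Hypothetical search space; the exact names and ranges are assumptions,
# not taken from the paper.
SEARCH_SPACE = {
    "hidden_units": [32, 64, 128, 256],
    "num_layers": [1, 2, 3],
    "dropout": [0.1, 0.2, 0.3, 0.5],
    "learning_rate": [1e-2, 5e-3, 1e-3],
}

def sample_config(space, rng):
    """Draw one random hyperparameter configuration from the space."""
    return {name: rng.choice(values) for name, values in space.items()}

def random_search(space, evaluate, n_trials=20, seed=0):
    """Evaluate n_trials random configurations and keep the lowest score."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("inf")
    for _ in range(n_trials):
        cfg = sample_config(space, rng)
        score = evaluate(cfg)  # e.g. validation RMSE of a trained model
        if score < best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score

# Toy objective standing in for "train a model and return validation RMSE".
toy = lambda cfg: abs(cfg["hidden_units"] - 128) / 128 + cfg["dropout"]
best, score = random_search(SEARCH_SPACE, toy, n_trials=50)
```

Each trial is independent, so this search parallelizes trivially, which is what keeps the computational effort limited compared with an exhaustive grid.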
Specically , we randomly sampled h yperparameters, created models using these param- eters, and e v aluated their performance. Subsequent subsections will pro vide concise descriptions of each DNN architecture used in this study for the R UL prediction of turbof an engines. 3.1.1. RNNs T raditional DNNs ha v e a limitation in that the indi vidual neuron weights cannot identify e xact rep- resentations of features for the corresponding R UL due to the comple x system structure. T o o v ercome this limitation, RNNs address this issue by incorporating a loop mechanism that operates o v er time steps. Speci- cally , a sequence v ector { x 1 , . . . , x n } through a recurrence formul a r t = f α ( r t 1 , x t ) , where f represents the acti v ation function, α represents a set of parameters used at each time step t , and x t is the input at timestep t [37]. This research e xplores three types of recurrent neurons for de v eloping candidate RNN-based models: a basic RNN unit, GR U, and LSTM unit. The parameters controlling the connections between the hidden layers and input, as well as the connections between acti v ations starting from the hidden layer and e xtending to the output layer , remain constant throughout each time step in a v anilla recurrent neuron. The operation of a fun- damental recurrent neuron during the forw ard passing can be formulated in a specic manner , which will be Indonesian J Elec Eng & Comp Sci, V ol. 41, No. 1, January 2026: 283–299 Evaluation Warning : The document was created with Spire.PDF for Python.
elaborated on in the subsequent sections.

$a_t = g(W_a[a_{t-1}, X_t] + b_a)$   (1)

$y_t = f(W_y a_t + b_y)$   (2)

At each timestep t, g denotes the activation function and X_t represents the input at that timestep. The bias is represented by b_a, and W_a represents the weights applied at timestep t to produce the activation output a_t. The activation output a_t can be utilized to generate a forecast y_t at time t, if required. The model employs an embedding layer to map the RUL into a vector space of dimension R^20, transforming semantic relationships into geometric ones. The successive layers of the DNN examine these geometric representations in order to identify and understand complex feature representations, which are then evaluated by the output layer to make predictions, employing a single sigmoid unit.
Despite the effectiveness of DNNs using basic RNN neurons in several domains, these models encounter challenges pertaining to the vanishing gradient problem and their limited capacity to capture long-term relationships. In order to address these obstacles, the scholarly community has suggested alternative designs for recurrent neurons, namely the GRU [38] and the LSTM [39], which have shown improved performance in mitigating the vanishing gradient problem and aiding the acquisition of long-term dependencies [40]. Nascer et al. [41] presented a GRU model that demonstrates enhanced efficacy in the task of long-term relationship learning within time-series datasets.
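As a concrete illustration, the forward pass of Eqs. (1)-(2) can be sketched in NumPy as follows; the layer sizes, the choice of tanh for g, and the random initialization are illustrative assumptions, not values taken from the paper (the 14 inputs mirror the 14 selected sensors and the sigmoid output mirrors the single sigmoid unit described above):

```python
import numpy as np

def rnn_step(a_prev, x_t, W_a, b_a, W_y, b_y):
    """One forward step of a vanilla recurrent neuron:
    a_t = g(W_a [a_{t-1}, x_t] + b_a),  y_t = f(W_y a_t + b_y),
    with g = tanh and f = sigmoid."""
    concat = np.concatenate([a_prev, x_t])            # [a_{t-1}, x_t]
    a_t = np.tanh(W_a @ concat + b_a)                 # Eq. (1)
    y_t = 1.0 / (1.0 + np.exp(-(W_y @ a_t + b_y)))    # Eq. (2)
    return a_t, y_t

# Illustrative dimensions: 14 sensor inputs, 8 hidden units.
rng = np.random.default_rng(0)
n_in, n_hid = 14, 8
W_a = rng.standard_normal((n_hid, n_hid + n_in)) * 0.1
b_a = np.zeros(n_hid)
W_y = rng.standard_normal((1, n_hid)) * 0.1
b_y = np.zeros(1)

a = np.zeros(n_hid)                                   # initial hidden state
for x in rng.standard_normal((30, n_in)):             # a 30-step time window
    a, y = rnn_step(a, x, W_a, b_a, W_y, b_y)
```

Note that the same W_a and W_y are reused at every timestep, which is exactly the parameter sharing the text describes.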
The operational characteristics of the GRU may be mathematically described using the following set of equations:

$\tilde{H}_t = \tanh(W_c[\Gamma_r \cdot H_{t-1}, X_t] + b_c)$   (3)

$\Gamma_r = \sigma(W_r[H_{t-1}, X_t] + b_r)$   (4)

$\Gamma_u = \sigma(W_u[H_{t-1}, X_t] + b_u)$   (5)

$H_t = \Gamma_u \cdot \tilde{H}_t + (1 - \Gamma_u) \cdot H_{t-1}$   (6)

$a_t = H_t$   (7)

W_r, W_c, and W_u are the weight matrices, while b_r, b_c, and b_u are the bias terms for the input X_t at each time step t; σ denotes the logistic (sigmoid) function, and a_t is the activation value at time step t. Except for the GRU neurons themselves, the RNN model employing GRUs is identical to the one using plain RNN neurons. Table 2 shows the GRU-based RNN model architecture for RUL estimation.
Hochreiter and Schmidhuber [39] introduced the LSTM neuron, which improves on the basic RNN unit and is more robust than the GRU. The differences between GRU and LSTM cells are as follows: in standard LSTM units, there is no reset gate like Γ_r used in the computation of the candidate state, and LSTM units utilize two distinct gates, namely the output gate Γ_o and the forget gate Γ_f, instead of relying on a single update gate Γ_u.
The output gate is responsible for regulating the visibility of the memory cell content H_t when calculating the activation outputs of the LSTM unit for the other hidden units in the network. The forget gate, on the other hand, manages the degree to which the previous memory content H_{t-1} is overwritten to produce H_t; this involves determining the extent to which information in the memory cell should be disregarded to maintain effective functioning. A key difference between the LSTM and GRU architectures is that in the LSTM, the content of the memory cell H_t might not be the same as the activation value a_t at time t.
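To make the gating concrete, here is a minimal NumPy sketch of one GRU step implementing Eqs. (3)-(7); the layer sizes and random initialization are illustrative assumptions, not taken from the paper:

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

def gru_step(H_prev, x_t, W_r, b_r, W_u, b_u, W_c, b_c):
    """One GRU step following Eqs. (3)-(7): reset gate Gr, update gate Gu,
    candidate state H_cand, then interpolation between H_cand and H_{t-1}."""
    Gr = sigmoid(W_r @ np.concatenate([H_prev, x_t]) + b_r)           # Eq. (4)
    Gu = sigmoid(W_u @ np.concatenate([H_prev, x_t]) + b_u)           # Eq. (5)
    H_cand = np.tanh(W_c @ np.concatenate([Gr * H_prev, x_t]) + b_c)  # Eq. (3)
    H_t = Gu * H_cand + (1.0 - Gu) * H_prev                           # Eq. (6)
    return H_t                                                        # a_t = H_t, Eq. (7)

# Illustrative sizes only: 14 inputs, 8 hidden units.
rng = np.random.default_rng(1)
n_in, n_hid = 14, 8
shape = (n_hid, n_hid + n_in)
W_r, W_u, W_c = (rng.standard_normal(shape) * 0.1 for _ in range(3))
b_r = np.zeros(n_hid)
b_u = np.zeros(n_hid)
b_c = np.zeros(n_hid)

H = np.zeros(n_hid)
for x in rng.standard_normal((30, n_in)):
    H = gru_step(H, x, W_r, b_r, W_u, b_u, W_c, b_c)
```

Because Eq. (6) is a convex combination of the bounded candidate state and the previous state, the hidden state stays bounded, which is part of what mitigates the vanishing-gradient problem described above.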
Furthermore, the LSTM-based RNN model was developed with an architectural design that strongly resembles both the GRU and basic RNN models; the only difference is the use of LSTM units inside the recurrent layers. Figure 1 shows the structure of the deep RNN-based model proposed for RUL estimation.

Figure 1. Structure of a deep RNN-based model proposed for RUL estimation

3.1.2. CNNs
CNNs are particularly effective at learning tasks that entail complex spatial patterns in high-dimensional input data. Such challenges are prevalent in various domains including, but not limited to, image processing [42], video analysis [43], analysis of amino acid sequences [41], [44], and the examination of time-series failure signals. The primary objective of CNNs is to learn hierarchical filters capable of transforming large input data into precise class labels while employing a minimal number of trainable parameters. This transformation is accomplished through sparse interactions between the input data and trainable parameters, facilitated by a mechanism known as parameter sharing. This method allows CNNs to acquire equivariant representations, also known as feature maps, of the intricate and spatially organized input data [45]. In a deep CNN, the units in the deep layers can indirectly interact with a significant percentage of the input data. This is achieved through the use of pooling operations. Pooling operations streamline the output at a particular point by employing a statistical summary, enabling the network to acquire intricate properties from this compacted representation [10]. The topmost section of the CNN typically includes several fully connected layers (FCL), including the output layer, leveraging the intricate information acquired by the preceding layers to make predictions.
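The two mechanisms named above, parameter sharing via a sliding kernel and pooling via a statistical summary, can be sketched in a few lines of NumPy; the signal and kernel values are purely illustrative:

```python
import numpy as np

def conv1d(x, kernel):
    """Valid 1-D convolution with parameter sharing: the same small
    kernel slides over the whole sequence, producing a feature map."""
    n, k = len(x), len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(n - k + 1)])

def max_pool(feature_map, size=2):
    """Summarize each non-overlapping window by its maximum value."""
    trimmed = feature_map[:len(feature_map) // size * size]
    return trimmed.reshape(-1, size).max(axis=1)

signal = np.array([0.0, 1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0])
edge_kernel = np.array([1.0, -1.0])   # responds to downward steps in the signal
fmap = conv1d(signal, edge_kernel)    # one shared kernel -> one feature map
pooled = max_pool(fmap)               # compacted summary of the feature map
```

The single two-weight kernel is applied at every position, which is why a CNN needs far fewer trainable parameters than a fully connected layer over the same input.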
The CNN-based architecture used for RUL prediction consists of two convolution-maxpool blocks in the embedding stage, a global average pooling layer, and an output layer with a single sigmoid neuron. The learning efficiency of the CNN model is significantly improved through the use of multiple non-linear feature extractions, which enables the model to autonomously learn hierarchical data representations. Consequently, the size of the convolution kernel and the number of convolution layers greatly influence the model's predictive capabilities. Figure 2 illustrates the CNN architecture designed for RUL estimation in this study. The initial input data are in a two-dimensional (2D) format, where one dimension represents the feature number and the other corresponds to the sensor's time sequence.

Figure 2. The proposed architecture for RUL prediction using a deep CNN
After that, the CNN model processes the input data through four convolutional layers, each with a similar structure, to extract features. These extracted features are then integrated by a convolutional layer equipped with a single filter of size 3 × 1. After the feature maps are flattened, they are connected to a fully connected layer. To mitigate overfitting, dropout is applied. The activation function for each layer is the ReLU. In this research, the optimization of the model is handled by the stochastic gradient descent (SGD) algorithm. Considering the characteristics of the turbofan aero-engine datasets, our models have been adjusted to impose a higher penalty on delayed (lagging) predictions. The loss is formulated as follows:

$\mathrm{loss} = \frac{1}{N}\sum_{i=1}^{N} \omega\,(y_i - \hat{y}_i)^2$   (8)

where y_i is the actual value, ŷ_i is the predicted value, and N is the validation set sample count. The penalty coefficient ω is set to 1 if the real value y_i exceeds the predicted value ŷ_i, and to 2 if the actual value is less than the predicted value.
3.2. Data pre-processing and normalization
In practical scenarios, raw data from sensors, operational parameters, and run-to-failure information are typically accessible. To prepare the data for training and testing, it is necessary to standardize the values of each sensor, as the scales may differ. In the experiment conducted, data from 21 sensors were utilized, and any anomalous or unvarying data were excluded. The normalization technique used, the min-max scaler, was applied to each feature to scale the data into a range between 0 and 1. In addition, for systems where the health decay is not linear from the beginning of operations, piece-wise functions can be used to enhance the precision of the estimated RUL.
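The asymmetric loss of Eq. (8) can be sketched directly; this is a minimal NumPy version of the weighting rule stated above (ω = 1 when the true RUL exceeds the prediction, ω = 2 otherwise), not the paper's training code:

```python
import numpy as np

def weighted_mse(y_true, y_pred):
    """Eq. (8): mean of omega * (y_i - y_hat_i)^2, where omega = 1 for
    early (safe) predictions and omega = 2 for late (risky) predictions,
    i.e. when the predicted RUL exceeds the actual RUL."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    omega = np.where(y_true > y_pred, 1.0, 2.0)  # penalize late predictions
    return np.mean(omega * (y_true - y_pred) ** 2)
```

Doubling the penalty on over-estimates reflects the safety asymmetry in prognostics: predicting more remaining life than an engine actually has is more dangerous than predicting less.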
Also, if information about varying workloads, operational environments, and specific modes of deterioration is available, it can be integrated into the RUL estimation model to further refine its accuracy in certain applications. Figure 3 shows the raw sensor measurements of the FD001 dataset used as input for RUL estimation.

Figure 3. RUL display plot for each sensor measurement and the raw input of the FD001 dataset

The second method is min-max normalization, which involves scaling the raw sensor data to fit within the range of 0 and 1. To achieve this, the sensor's minimum and maximum readings are identified, and these values are used to map the data onto the target range. The normalized sensor output x̂_i is calculated by taking the ratio of the difference between the original sensor output x_i and the minimum value to the range, which is the difference between the maximum and minimum values. It is important to note that normalization
is necessary because different sensors may have different value scales, and normalizing the data allows for fair comparison and accurate training and testing of the models. Furthermore, in certain applications, such as those with non-linear RUL decay, piece-wise functions can be used to adjust the estimated RUL targets. Incorporating knowledge of different workloads, operational environments, and deterioration modes into the RUL estimation model can also improve its accuracy if such information is available.

$\hat{x}_i = \frac{x_i - \min(x_i)}{\max(x_i) - \min(x_i)}$   (9)

Additionally, to incorporate multivariate temporal information, a time window (TW) approach is adopted, as previously done in [10]. For the training dataset FD001, a TW length of 30 was selected, and all historical data within the TW were extracted to form a high-dimensional input vector. This vector has a length of 14 × 30, using 14 of the 21 available sensors as raw input features. In this study, the developed DNN-based models were specifically intended to forecast the RUL of aero-engines operating under a single condition. Consequently, the FD001 dataset, which comprises data collected under a single operating condition, was chosen for experimental analysis. The structure of the network used for feature extraction was adapted to align with the dynamic qualities of the operational data of an aero-engine, which can vary across different operating conditions.
3.3. Performance metrics
In this study, prognostic performance was assessed using three metrics: R-squared (R²), mean absolute error (MAE), and RMSE. The rationale behind the selection of these three indicators is their extensive application in state-of-the-art model performance assessment.
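Before turning to the metrics, the pre-processing pipeline of section 3.2, min-max scaling per Eq. (9) followed by sliding-window sample preparation with TW = 30 over 14 sensors, can be sketched as follows. The synthetic degradation signals are illustrative, and the piece-wise RUL cap of 125 cycles is a common choice in the C-MAPSS literature assumed here, not a value stated in this paper:

```python
import numpy as np

def min_max_scale(X):
    """Eq. (9): scale each sensor column to [0, 1] by its min and max."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / (hi - lo)

def sliding_windows(X, rul, tw=30):
    """Cut one engine's run-to-failure series into overlapping time-window
    samples of shape (tw, n_sensors), each labeled with the RUL at the
    window's last cycle."""
    samples, labels = [], []
    for end in range(tw, len(X) + 1):
        samples.append(X[end - tw:end])
        labels.append(rul[end - 1])
    return np.array(samples), np.array(labels)

# Toy engine with 200 cycles and 14 selected sensors (FD001 uses 14 of 21).
rng = np.random.default_rng(2)
X = rng.standard_normal((200, 14)).cumsum(axis=0)  # drifting "degradation" signals
rul = np.arange(200)[::-1].astype(float)           # linear RUL: 199, 198, ..., 0
rul = np.minimum(rul, 125.0)                       # piece-wise cap (assumed value)
windows, labels = sliding_windows(min_max_scale(X), rul, tw=30)
```

A 200-cycle engine yields 171 overlapping samples of shape 30 × 14, matching the 14 × 30 input vector described above.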
The rst e v aluation metric, RMSE, is presented: RMSE = v u u t 1 N N X i =1 d 2 i (10) MAE is the sum of anticipated errors or the mean of all absolute errors: MAE = 1 n X n | X P X | (11) Thus, X P is estimated data, X is the ground truth data, and n is the number of samples. Statistical measure is R 2 sho ws ho w much of the v ariation of a dependent v ariable can be accounted for by an independent v ariable: R 2 = 1 R S S T S S (12) where TSS is the total sum of squares, RSS is the sum of residual squares, and R 2 is the determination coef - cient. 3.3.1. Pr ognostic pr ocedur e Figure 4 sho ws the multi-phase prognostic e xperimental strate gy . Preprocessing be g an with the e x- traction of 14 ra w sensor v alues and normalization to scale the FD001 dataset inside the [-1, 1] r ange. W e then produced training and testing datasets with time sequence information limited to Ntw . DNN models used pre-pro vided 2D standardized data. It w as unnecessary to manually construct signal processing features lik e sk e wness and kurtosis. Thus, no prognostics or signal processing kno wledge is required. This w as fol lo wed by b uilding the proposed deep neural netw ork models for life R UL prediction and specifying their hidden layer count, con v olution lter size, and other parameters. The DNN models were trained using normalized train- ing data and labeled R UL v alues for training samples. Back-propag ation learning and mini-batches in SGD updated the netw ork’ s weight. T o train each epoch, the data were randomly di vided into se v eral tin y batches of 512 samples. Use the micro batch mean loss function to tweak each layer’ s weights in the training deep neural net w ork model. Experi mental e xperiments determined the best batch size of 512 samples, which w as emplo yed in all case studies. T o assure con v er gence, a v ariable learning rate w as used, starting at 0.005 for the rst 25 optimization epochs and then progressing to 0.001. 
By default, DNN candidate models were trained for at most 250 epochs.
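The training schedule described above (mini-batches of 512, a step learning rate of 0.005 for the first 25 epochs then 0.001, and a 250-epoch cap) can be sketched framework-agnostically. This is a hedged illustration of the stated hyperparameters, not the authors' training code; the helper names and the 2,000-sample figure are assumptions.

```python
import numpy as np

def learning_rate(epoch):
    """Step schedule from the text: 0.005 for the first 25 epochs, then 0.001."""
    return 0.005 if epoch < 25 else 0.001

BATCH_SIZE = 512
MAX_EPOCHS = 250  # candidate models are trained for at most 250 epochs

def iterate_minibatches(n_samples, batch_size=BATCH_SIZE, seed=0):
    """Shuffle sample indices each epoch and yield mini-batches of up to batch_size."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        yield idx[start:start + batch_size]

# With a hypothetical 2,000 training windows: three full batches of 512, one of 464.
batches = list(iterate_minibatches(2000))
print(len(batches), learning_rate(0), learning_rate(25))
```

In practice the same schedule would be attached to an SGD optimizer via a callback (e.g. a step-decay scheduler), with the loss averaged over each mini-batch before back-propagation, as the text describes.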
Figure 4. Prediction process of our proposed approach

4. EXPERIMENTAL RESULTS

This section presents a summary of the experimental findings and discusses their significance. Firstly, the C-MAPSS benchmark dataset is introduced in the first subsection. Secondly, the experimental results and performance analysis are presented in the second subsection. Finally, the last subsection provides a comparative analysis with existing literature.

4.1. Benchmark dataset for C-MAPSS

The C-MAPSS dataset serves as a widely utilized resource in advanced prognostic research, comprising four sub-datasets that depict the engine's behavior under diverse operational conditions and failure mechanisms [46]. Each subset includes both training and testing sets, accompanied by actual RUL values. These subsets are characterized by 21 sensors and three operational settings [47]. Each engine unit undergoes distinct levels of deterioration, gradually degrading over time until it reaches a point of system failure, marking the culmination of an unhealthy operational cycle. As a result, sensor recordings in the testing set cease before the occurrence of the system fault. The dataset is presented in a compressed text format, where individual rows signify data snapshots taken within a single operational cycle, and each column corresponds to a distinct variable. Table 1 provides comprehensive details about the dataset. The objective of the experiment was to predict the RUL of the engine units in the testing set, as well as that of a single engine unit. For the purposes of this research, only the first subset of data, labeled FD001, was utilized for the verification of the DNN models. Consequently, this data subset consisted of 100 training samples and 100 test samples. Table 1.
Description of C-MAPSS benchmark dataset

    C-MAPSS Dataset              FD001
    Engine units for training    100
    Engine units for testing     100
    Operating conditions         1
    Fault modes                  1
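Because each training engine in FD001 runs to failure, the training RUL label for every cycle can be derived as the distance to that engine's last recorded cycle. The pandas sketch below shows this label construction on a tiny synthetic frame; the column names `unit` and `cycle` are illustrative, not the dataset's official headers.

```python
import pandas as pd

# Each row is one operational cycle of one engine; the engine fails at its
# last recorded cycle, so RUL = (max cycle for that unit) - (current cycle).
df = pd.DataFrame({
    "unit":  [1, 1, 1, 2, 2],
    "cycle": [1, 2, 3, 1, 2],
})
df["RUL"] = df.groupby("unit")["cycle"].transform("max") - df["cycle"]
print(df["RUL"].tolist())  # [2, 1, 0, 1, 0]
```

For the 100 test engines, by contrast, the histories are truncated before failure and the true RUL values are supplied separately, which is why the testing set requires no such derivation.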
4.2. Analysis of candidate model performance and experimental results for 100 testing engines

This subsection discusses the prognostic performance of the suggested DNN-based models for RUL estimation. An analysis was conducted to investigate the effects of various factors on the outcomes, such as the number of hidden layers and the residual scatter plots for each model. The comparison of the deep structure of the proposed four models with that of other prominent NN architectures demonstrated the proposed DNN-based models' effectiveness. Additionally, the proposed approach's superiority was proven by comparison against the most recent state-of-the-art prognostic results on the same C-MAPSS dataset. Figure 5 shows the RNN-based model prediction for 100 engine units in the FD001 dataset. The graph's X-axis represents the actual RUL values, while the Y-axis denotes the predicted RUL values across the whole testing dataset. Figure 6 shows the FD001 test dataset residual analysis of the LSTM-based model (best model).

Figure 5. Sorting prediction for the 100 testing engine units in FD001 using the LSTM-based model

Figure 6. FD001 test dataset residual analysis of the LSTM-based model (best model)

5. RESULTS AND ANALYSIS

After analysing the evaluation metrics, it becomes evident that the decision tree classifier (DTC) and XGBoost classifier (XGBC) models display the highest accuracy scores in comparison to the other models. Nonetheless, when scrutinizing the precision and recall scores, it is clear that the DTC exhibits the lowest