Abstract
Human activity recognition (HAR) has wide-ranging applications because acquisition devices such as video cameras and smartphones are ubiquitous and can readily capture human activity data. HAR has become an active research topic in computer vision. It underpins numerous significant applications, including virtual reality, human-computer interaction, video surveillance, home monitoring, and security. Accordingly, a broad range of activity recognition models has been developed for disabled individuals. HAR is the task of identifying and labelling activities using artificial intelligence-based deep learning and machine learning methods. In this manuscript, an Enhanced Activity Recognition for Disability People Using a Deep Learning Model and Nature-Inspired Optimization Algorithms (EARDP-DLMNOA) model is proposed. The EARDP-DLMNOA model focuses on improving activity recognition using advanced optimization algorithms. Initially, the data normalization stage applies min-max normalization to convert the input data into a consistent format. Next, the EARDP-DLMNOA model employs the adaptive chimp optimization (AdCO) technique for feature subset selection. The deep convolutional auto-encoder (DCAE) technique then classifies the data into predefined activity classes based on its features. Finally, the hyperparameters of the DCAE model are selected using the zebra optimization algorithm (ZOA). Extensive experiments are carried out to validate the performance of the EARDP-DLMNOA approach on a smartphone-based HAR dataset. The experimental validation of the EARDP-DLMNOA approach showed a superior accuracy of 97.58% over existing methods.
Introduction
Recently, human activity recognition (HAR) has become one of the most popular research areas1. Owing to the availability of low-cost, low-power sensors such as accelerometers, and to progress in the Internet of Things (IoT), machine learning (ML), artificial intelligence (AI), and computer vision (CV), many applications have been built around human-centred monitoring to identify, categorize, and recognize human behaviour. Researchers have proposed several approaches to this problem2. HAR is a vital tool for observing a person's activity, and it can be achieved using ML methodologies3. HAR automatically recognizes and analyses human activities from data collected by wearable devices and smartphone sensors, such as gyroscopes, accelerometers, and location, time, and other geographical sensors4. When combined with other technologies, such as the IoT, it is employed in application fields such as sports, industry, and medical care. With developments in the IoT field, smart environments are populated by a variety of sensor-equipped IoT devices, which have become an essential part of everyday life and consequently make it possible to monitor the activities of older people5. Everyday routines are the daily self-care actions of individuals, such as bathing, feeding, grooming, cleaning, and dressing. For elderly people or people with impairments, the ability to carry out these activities is frequently seen as a sign of their well-being. Examining the everyday activities of an individual is therefore a promising way of observing their general health and well-being6.
Such analysis can help identify unusual activities that indicate a developing medical concern or the occurrence of a dangerous incident. In such cases, acting promptly is paramount for limiting the effect of life-threatening or life-changing incidents7. To protect elderly and disabled individuals, they should be supervised 24 h a day. AI has become more prevalent for HAR in recent years owing to its self-learning nature and strong classification techniques. Recently, various surveys have been conducted on HAR using deep learning (DL) and ML8. DL- and ML-based methodologies are vital to activity recognition for disabled and elderly individuals. The growing need for personalized care and support for individuals with disabilities has emphasized the importance of developing innovative technologies to assist in daily activities. One promising approach is to use advanced recognition systems that monitor and interpret human movements9. By using emerging technologies such as sensors and intelligent algorithms, it is possible to create more effective solutions that improve independence and quality of life. This is particularly critical for those with mobility limitations, as real-time activity recognition can enable timely interventions and personalized feedback. Furthermore, integrating nature-inspired optimization techniques offers a way to fine-tune these systems for more accurate and adaptive performance10.
In this manuscript, an Enhanced Activity Recognition for Disability People Using a Deep Learning Model and Nature-Inspired Optimization Algorithms (EARDP-DLMNOA) model is proposed. The EARDP-DLMNOA model focuses on improving activity recognition using advanced optimization algorithms. Initially, the data normalization stage applies min-max normalization to convert the input data into a consistent format. Next, the EARDP-DLMNOA model employs the adaptive chimp optimization (AdCO) technique for feature subset selection. The deep convolutional auto-encoder (DCAE) technique then classifies the data into predefined activity classes based on its features. Finally, the hyperparameters of the DCAE model are selected using the zebra optimization algorithm (ZOA). Extensive experiments are carried out to validate the performance of the EARDP-DLMNOA approach on a smartphone-based HAR dataset. The key contributions of the EARDP-DLMNOA approach are listed below.
- The EARDP-DLMNOA model utilizes min-max normalization to scale input data, ensuring uniformity across diverse feature ranges. This approach improves the model’s capability to process data consistently, resulting in improved performance in activity recognition tasks. Standardizing the data values improves the overall accuracy and stability of the system.
- The EARDP-DLMNOA approach employs the AdCO technique to select the most relevant feature subsets, improving efficiency by mitigating the data dimensionality. This process assists in focusing on the key activity patterns, improving the model’s ability to recognize human activities accurately. As a result, it enhances both the system’s performance and computational efficiency.
- The EARDP-DLMNOA methodology utilizes the DCAE model to automatically learn complex features from raw sensor data, enabling accurate HAR. This approach improves the system’s detection of subtle patterns and activity discrepancies. Learning robust features significantly enhances the recognition process’s accuracy and reliability.
- The EARDP-DLMNOA method applies the ZOA model to fine-tune parameters, optimizing the recognition process for improved performance. This adaptive learning approach ensures the system can effectively adjust to various activities. By optimizing the model parameters, ZOA enhances accuracy and overall system flexibility.
- The novelty of the EARDP-DLMNOA model is in integrating nature-inspired optimization techniques, namely AdCO and ZOA, with the DCAE method for HAR. This integration presents a unique solution that enhances efficiency and accuracy and adapts dynamically to diverse activity patterns. Using these advanced techniques, the model gives real-time, robust recognition of complex human behaviours.
The article’s structure is as follows: Sect. 2 reviews the literature, Sect. 3 describes the proposed method, Sect. 4 presents the evaluation of results, and Sect. 5 offers the study’s conclusions.
Literature survey
Takawale and Paithane11 proposed a new method for brain activity recognition from EEG signals, developing a hybrid classification with combined coot blue monkey optimizer (HC-CCBO) approach. First, enhanced Z-score normalization was employed to pre-process the EEG signal. Discrete wavelet transform (DWT), enhanced correlation, and statistical features were then extracted. The hybrid classifier combines deep maxout (DMO) and bidirectional gated recurrent unit (Bi-GRU) techniques, whose weights were tuned by the combined coot blue monkey optimization (CCBO) optimizer. Finally, the output scores for rest, left fist, fists, right fist, and feet are obtained from the hybrid brain activity detection technique. In12, recent advances in multi-view HAR for AAL are reviewed, with a focus on the value of lightweight DL techniques. The review covers the progress of HAR methods, the development of multi-view databases, and efficient DL techniques designed specifically for AAL settings. Moreover, data protection, adaptability, and synchronization are addressed, with possible remedies such as sensor fusion and TL. In13, an LSTM neural network (NN) classifies human activity. Depending on the result, alerts are triggered when a fall is recognized; otherwise, the data are archived for future reference. Yi and Hwang14 proposed a novel data-centric method for HAR using a semi-supervised generative adversarial network (SGAN). Rather than refining the model, the method enhances HAR precision through a data supplementation approach that systematically improves data quality by means of data refinement and data-driven feature extraction. The proposed HAR model uses a simple SGAN to attain significantly higher precision with only a small fraction of the labelled data.
Thus, the proposed HAR technique could reduce excessive data labelling, a time-consuming and labour-intensive process for many HAR tasks. Chen et al.15 developed a technique based on the self-supervised learning framework SimCLR for daily activity detection using ambient sensor data. The encoder module, which is the central component, contains two convolutional layers followed by an LSTM layer. This structure allows the method to capture both spatial and temporal dependencies in the sensor data, enabling the extraction of informative features for downstream tasks. Ye et al.16 developed a DL architecture based on graph attention networks (GATs) and embedding technology, namely the time- and location-oriented graph attention (TLGAT) model. The embedding technology converts sensor observations into corresponding feature vectors. TLGAT then represents a sequence of sensor observations as a fully connected graph in order to model the temporal correlations and the sensor-location correlations between observations.
Mekruksavanich and Jitpattanakul17 review existing research on HAR using DL models and discuss suitable recognition approaches. First, several CNNs are used to establish an effective structure for HAR. Then, a hybrid CNN that integrates a channel attention mechanism is applied. This mechanism allows the system to capture deep spatio-temporal features hierarchically and to discriminate between diverse human actions in daily life. Jiang et al.18 propose the Narrow Kernel and Dual-view Feature Fusion Convolutional NN (NKDFF-CNN) technique for improving the accuracy of gesture recognition based on multi-channel surface electromyography (sEMG) signals. Hoang, Matrella, and Ciampolini19 classify sleep postures using ML methods applied to acceleration data from a smart bed. The study evaluates Linear Discriminant Analysis (LDA), K-Nearest Neighbors (KNN), Classification and Regression Trees (CART), and Naive Bayes (NB) models, which achieve high accuracy and stable results for sleep posture classification. Rezaee et al.20 developed an optimized DL method using BiLSTM networks and the Grey Wolf Optimizer (GWO) for real-time student activity classification and health monitoring with accelerometer data. Arasi et al.21 propose the metaheuristic optimization-driven ensemble model for smart monitoring of indoor activities for disabled persons (MOEM-SMIADP), which detects indoor activity in disabled individuals using data pre-processing, feature selection with the marine predator algorithm (MPA), and classifiers such as GCN, LSTM-seq2seq, and convolutional autoencoder (CAE). Hyperparameters are tuned using an improved coati optimization algorithm (ICOA) to enhance accuracy. Prabagaran et al.22 developed a Hybrid Siamese Top-Down Neural Network optimized with the Poplar Optimization Algorithm (Hyb-STDNN-POA) methodology.
The model utilizes Quantized Discrete Haar Wavelet Transform (QDHWT) for pre-processing, Shape-Aware Mesh Normal Filtering (SAMNF) for noise removal, Hybrid Siamese Top-Down Neural Networks (Hyb-STDNN) for segmentation and classification, and Poplar Optimization Algorithm (POA) for tuning. Jin et al.23 explore the role of finite element analysis (FEA), ML, and digital twins (DT) in advancing soft robotics, covering material selection, design, control, and maintenance while addressing current challenges and future directions.
Kumar, Singh, and Ojha24 propose a modified Ant Colony Optimization (ACO) method for effectual metro network routes, comparing it with recent methods and highlighting its productivity, planning time, and cost-efficiency advantages. Jawad et al.25 optimize energy efficiency and plant comfort in greenhouse systems utilizing Artificial Bee Colony (ABC), Fuzzy Controller (FC), Genetic Algorithm (GA), Firefly Algorithm (FA), and ACO. Tahsin et al.26 develop a stroke rehabilitation system utilizing Kinect sensor data, eXtreme Gradient Boosting (XGB)-based classification, Gray Wolf Optimizer (GWO)-based hyperparameter tuning, and feature pre-processing for improved accuracy. Rathi et al.27 predict learners’ learning styles based on interaction behaviour using a NN, optimized by a hybrid Squirrel Search and Rider Optimization Algorithm (ROA), and mapped using the Felder-Silverman Learning Style Model. Chakraborty, Rathi, and Singh28 design a robotic exoskeleton for index finger rehabilitation utilizing a Stephenson III mechanism optimized for flexion/extension motion through trajectory-based synthesis. Li et al.29 develop a highly sensitive, stretchable, and skin-conformable photonic artificial throat for accurate bilingual speech recognition and silent communication. Thapar et al.30 identify skin cancer using dermoscopy images by segmenting lesions with grasshopper optimization (GO) based on swarm intelligence (SI). Thakur, Dangi, and Lalwani31 develop a robust activity classification model using convolutional neural network (CNN) and recurrent neural network (RNN) for learning sequential and spatial patterns, optimized by whale optimization algorithm (WOA) and GWO through hybrid learning algorithms (HLA). Subha, Jeyakumar, and Deepa32 present a Gaussian Aquila Optimizer (GAO)-based Dual CNN (DCNN) method for detecting and classifying osteoarthritis from X-ray images. 
Sivasakthivel et al.33 present a four-state Brain-Computer Interface (BCI) model utilizing Welch Power Spectral Density (W-PSD) and a hybrid Feed Forward Neural Network–Cheetah Optimization Algorithm (FFNN–COA) approaches for improved communication in paralyzed individuals. Purni and Vedhapriyavadhana34 develop a DL-based model for multi-class skin cancer detection using improved Canny edge detection and optimized CNN approach. Yedukondalu et al.35 compute cognitive load by extracting and classifying EEG features utilizing robust local mean decomposition (R-LMD), binary arithmetic optimization (BAO), and optimized ensemble learning classifiers. Table 1 summarises existing studies on activity recognition and optimization techniques.
Despite significant advances in HAR, several limitations remain in existing models. Many techniques have difficulty handling noisy, incomplete, or unbalanced data, which reduces robustness and accuracy. Integrating multiple sensors and modalities often introduces data fusion and synchronization complexities, resulting in inefficiencies. Furthermore, many methods lack real-time processing capabilities, which limits their practical deployment. While DL models such as CNN, LSTM, and BiLSTM show promise, they require extensive labelled data, making them resource-intensive. Additionally, optimization methods such as ACO, GWO, and MPA do not guarantee global optimality and may therefore return suboptimal results. There is also a lack of generalizability across different datasets, particularly for applications involving diverse human behaviours or environmental conditions.
Materials and methods
This manuscript proposes the EARDP-DLMNOA model, which focuses on improving activity recognition using advanced optimization approaches. To this end, the EARDP-DLMNOA model comprises four stages: data normalization, dimensionality reduction, activity recognition, and parameter selection, as shown in Fig. 1.
Min-max normalization
Initially, the data normalization stage applies min-max normalization to convert the input data into a consistent format36. This method is chosen because it scales the input data to a fixed range, typically [0, 1], ensuring consistency across all features. It is particularly advantageous when features have different scales, as it removes biases caused by differences in magnitude. By bringing the data into a common range, min-max normalization lets the model process all features uniformly, which improves the convergence speed and accuracy of the ML model. Compared to other techniques such as Z-score normalization, which assumes a Gaussian distribution, min-max normalization is more appropriate where preserving the original relationships between data values is important. Its simplicity and efficiency make it a practical choice for the large-scale datasets commonly encountered in human activity recognition tasks.
Data standardization simplifies the subsequent calculations and is achieved through data normalization. Min-max normalization is applied to carry out the data normalization and is specified as:
\[S{t}_{B}=\frac{B-{B}_{l}}{{B}_{h}-{B}_{l}}\]
The input data are denoted \(B\), and the normalized data are denoted \(S{t}_{B}\). The lowest value is denoted \({B}_{l}\), and the highest value is denoted \({B}_{h}\). Next, the essential features are selected from the normalized data using the optimization model presented in the following section.
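The normalization step can be illustrated with a minimal sketch, assuming each feature is scaled independently as \((B-{B}_{l})/({B}_{h}-{B}_{l})\). The function name, the handling of a constant feature, and the sample readings are illustrative assumptions, not part of the proposed model.

```python
def min_max_normalize(values):
    """Scale one feature column into [0, 1] with min-max normalization."""
    b_low = min(values)    # B_l: lowest value of the feature
    b_high = max(values)   # B_h: highest value of the feature
    span = b_high - b_low
    if span == 0:
        # Constant feature: map every value to 0.0 (an assumed convention).
        return [0.0 for _ in values]
    return [(b - b_low) / span for b in values]


# Made-up accelerometer readings for a single feature.
readings = [-1.5, 0.0, 0.5, 2.5]
print(min_max_normalize(readings))  # prints [0.0, 0.375, 0.5, 1.0]
```

In practice, \({B}_{l}\) and \({B}_{h}\) would normally be computed on the training split only and reused for the test split, so that no test information leaks into the scaling.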
Dimensionality reduction using AdCO
Next, the proposed EARDP-DLMNOA model utilizes the AdCO approach for feature subset selection36. This approach is chosen because it can efficiently explore and exploit large search spaces while adapting to dynamic environments. AdCO mimics the behaviour of chimpanzees, which allows it to adjust its search strategy adaptively, balancing exploration and exploitation for optimal feature selection. This improves the efficiency and accuracy with which the most relevant features are detected in a large dataset, ultimately reducing dimensionality and computational cost. Compared to conventional methods such as genetic algorithms (GAs) or sequential feature selection, AdCO offers faster convergence and better handling of complex, nonlinear relationships within the data. Its flexibility and robustness make it particularly appropriate for real-world applications where data characteristics may vary. Additionally, because the AdCO model adapts during the feature selection process, it can continue to improve as more data is introduced. Figure 2 depicts the steps involved in the AdCO methodology.
The AdCO is developed by incorporating an adaptive weighting strategy into the traditional chimp optimization model to improve the convergence rate. The brain-to-body ratio (BBR), a measure of brain size relative to body size, is high for both dolphins and chimpanzees, and comparatively higher for chimpanzees. Animals with a high BBR exhibit intelligent behaviour; the chimp's searching behaviour is therefore mimicked to solve the optimization problem. However, the chimp optimization model is prone to becoming trapped in local optima, which is addressed by combining the adaptive weighting approach with the traditional model. The presented AdCO model thereby helps to find the globally best solution when selecting the essential features that improve classification precision. Moreover, the model's balance between exploitation and exploration effectively resolves the optimization problem.
Mathematical modelling. To design the optimization model, four types of chimps are considered: driver, barrier, chaser, and attacker.
Attacker: Attackers are experienced in predicting the prey’s possible escaping paths and acting to influence or block these paths. They apply their capability to predict the movements of the prey and tactically place themselves on separate escape routes. Therefore, the chance of the prey escaping is minimized, making capturing easier.
Barrier: Barriers position themselves in a tree to create an obstruction that physically blocks the prey’s route.
Chaser: The chaser is a chimpanzee that pursues the prey, observing and following the target in order to catch it.
Driver: The driver is in charge of guiding and coordinating the optimization method, much like a coordinator or leader within the chimpanzee troop. Drivers are in charge of selecting the optimal solutions or leading the search toward promising regions of the search space.
Initialization: The chimp’s population (search agents) and the maximum iteration counts are initialized.
Fitness estimation: Estimating fitness lets the model run more efficiently by avoiding unnecessary evaluations and accelerating convergence. It is expressed as follows:
where \(Fit\) denotes the fitness.
Target chasing: The target is captured in either the exploration or the exploitation stage using different attacking approaches. The driving and chasing behaviour of the search agent is modelled mathematically as follows:
where \(g\) is the current iteration, \({C}_{tar}\) is the target’s location, and \({C}_{chimp}\) is the search agent’s location. The coefficient vectors are denoted \(r\), \(f\), and \(e\) and are defined as:
Here, \(k\) decreases from 2.5 to \(0\) over the local search or randomization stages. \({m}_{1}\) and \({m}_{2}\) are random numbers in the interval \(\left[0,1\right]\).
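The driving and chasing step can be sketched as follows. The equations themselves are not reproduced above, so this sketch assumes the standard chimp optimization formulation mapped onto the symbols used here: a distance \(d=\left|e\cdot {C}_{tar}-f\cdot {C}_{chimp}\right|\), an update \({C}_{chimp}(g+1)={C}_{tar}-r\cdot d\), and coefficients \(r=2k{m}_{1}-k\) and \(e=2{m}_{2}\), with \(f\) a chaotic value. The linear decay of \(k\), the fixed default for \(f\), and all function and parameter names are assumptions made for illustration.

```python
import random


def chase_step(c_tar, c_chimp, g, g_max, rng, chaotic_value=1.0):
    """One driving/chasing update of a chimp toward the target (assumed form)."""
    # Assumption: k decays linearly from 2.5 (g = 0) to 0 (g = g_max).
    k = 2.5 * (1.0 - g / g_max)
    new_position = []
    for tar, pos in zip(c_tar, c_chimp):
        m1, m2 = rng.random(), rng.random()   # random numbers in [0, 1)
        r = 2.0 * k * m1 - k                  # divergence coefficient in [-k, k)
        e = 2.0 * m2                          # random weight on the target, in [0, 2)
        f = chaotic_value                     # chaotic value (held fixed in this sketch)
        d = abs(e * tar - f * pos)            # distance between target and chimp
        new_position.append(tar - r * d)
    return new_position


rng = random.Random(0)
target = [1.0, 2.0]
chimp = [5.0, -3.0]
# At the final iteration k = 0, so r = 0 and the chimp lands on the target.
print(chase_step(target, chimp, g=10, g_max=10, rng=rng))  # prints [1.0, 2.0]
```

Early in the run \(\left|r\right|\) can exceed 1, which pushes agents away from the target (exploration); as \(k\) shrinks, updates concentrate around the target (exploitation).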
Exploitation stage: While the search space is being explored, the search agents other than the attacker also participate in capturing the prey. The solution obtained by the search agents in this stage is given as:
Here, the chaser, driver, barrier, and attacker solutions are denoted \({C}_{O}\), \({C}_{P}\), \({C}_{N}\), and \({C}_{M}\), and the updated solution is denoted \(C\left(g+1\right)\). The location update by an individual chimp is stated as:
Here, \({a}_{M}\), \({a}_{N}\), \({a}_{O}\), and \({a}_{P}\) are the distances between the target and the attacker, barrier, chaser, and driver chimps, respectively. The variables \({r}_{1}\), \({r}_{2}\), \({r}_{3}\), and \({r}_{4}\) are coefficients that vary within \(\left[0,1\right]\). The adaptive weighting approach is now incorporated into the chimp model to improve the convergence rate without trapping the solution at local optima. The equation for the adaptive weighting approach is specified as:
The current and maximum iterations are denoted \(g\) and \({g}_{\max}\), and \(L\) denotes the adaptive weight. The control factor is denoted \(b\); when the location of the chimp is not updated, \(b\) is included with the area. Otherwise, \(b\) is divided by 2 when the location of the chimp search agent is updated. The control factor thus moves the chimp when it is trapped at a local best solution. The chimp update with the adaptive weight, which improves the exploration ability, is expressed as:
Exploration stage: In the exploration stage, the search agents diverge to search for the best solution (the target) and then converge to attack it. For mathematical modelling, the divergence is governed by \(r\), which takes a random value in the interval between \(-1\) and \(1\). The vector \(e\) takes values in the range \((0, 2)\) and provides random weights for the location of the target when computing the distance according to Eq. (5). Chaotic-based solution update: Chaotic behaviour appears unpredictable and random but follows deterministic rules. The chaotic behaviour-based solution update is stated as:
The random number \(\epsilon\) takes a value in the range \((0, 1)\).
Fitness re-evaluation: The fitness is re-evaluated to check the feasibility of the solution obtained by the model. Termination: The model terminates when an improved solution for selecting the best features is attained or the maximum number of iterations is reached.
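The exploitation-stage update described earlier, in which the attacker, barrier, chaser, and driver each propose a location and the chimp moves to their combination, can be sketched as below. This assumes the standard chimp optimization rule of averaging the four leader-guided candidates; it deliberately omits the adaptive weight \(L\), the control factor \(b\), and the chaotic update, whose exact forms are not reproduced here. All function and argument names are hypothetical.

```python
def leader_guided_update(chimp, leaders, coeffs):
    """Average of the candidate positions proposed by the leader chimps.

    chimp   : current position of the chimp being updated
    leaders : positions of the attacker, barrier, chaser, and driver
    coeffs  : one (r, e, f) coefficient triple per leader
    """
    dim = len(chimp)
    candidates = []
    for leader, (r, e, f) in zip(leaders, coeffs):
        candidate = []
        for j in range(dim):
            dist = abs(e * leader[j] - f * chimp[j])   # distance to this leader
            candidate.append(leader[j] - r * dist)     # leader-guided position
        candidates.append(candidate)
    # Assumed combination rule: plain average of the leader-guided candidates.
    return [sum(c[j] for c in candidates) / len(candidates) for j in range(dim)]


leaders = [[0.0, 0.0], [2.0, 2.0], [4.0, 4.0], [6.0, 6.0]]
# With r = 0 every candidate equals its leader, so the result is their mean.
print(leader_guided_update([1.0, 1.0], leaders, [(0.0, 1.0, 1.0)] * 4))  # prints [3.0, 3.0]
```

An adaptive weighting scheme such as the one used in AdCO would replace the plain average with a weighted update, which is one way to steer agents away from local optima.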
The fitness function (FF) considers both the classification accuracy and the number of selected features. It maximizes the classification accuracy and minimizes the size of the selected feature set. The following FF is applied to assess individual solutions, as given in Eq. (16):
\[Fitness=\alpha \cdot ErrorRate+\left(1-\alpha \right)\cdot \frac{\#SF}{\#{All}_{F}}\tag{16}\]
where ErrorRate is the classification error rate obtained with the selected features, computed as the ratio of incorrect classifications to the total number of classifications and expressed as a value between 0 and 1; \(\#SF\) is the number of selected features; and \(\#{All}_{F}\) is the total number of attributes in the original dataset. \(\alpha\) controls the relative importance of classification quality and subset length. In this experiment, \(\alpha\) is set to 0.9.
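A minimal sketch of this fitness function, assuming the usual weighted-sum form \(\alpha \cdot ErrorRate+(1-\alpha )\cdot \#SF/\#{All}_{F}\) with lower values being better. The function name and the example numbers are illustrative only.

```python
def feature_subset_fitness(error_rate, n_selected, n_total, alpha=0.9):
    """Weighted sum of classification error and relative subset size (lower is better)."""
    if not 0.0 <= error_rate <= 1.0:
        raise ValueError("error_rate must lie in [0, 1]")
    if not 0 < n_selected <= n_total:
        raise ValueError("need 0 < n_selected <= n_total")
    return alpha * error_rate + (1.0 - alpha) * (n_selected / n_total)


# Hypothetical candidates: same error rate, different subset sizes.
small_subset = feature_subset_fitness(error_rate=0.05, n_selected=50, n_total=200)
large_subset = feature_subset_fitness(error_rate=0.05, n_selected=150, n_total=200)
print(small_subset < large_subset)  # prints True
```

With \(\alpha =0.9\), the error rate dominates the score, and the subset size mainly acts as a tie-breaker between candidates of similar accuracy.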
DCAE-based activity recognition
For the activity recognition process, the DCAE technique is utilized37. This technique is chosen because it can automatically learn hierarchical features from raw sensor data without manual feature extraction. DCAE excels at capturing spatial and temporal patterns in complex datasets, which is crucial for accurately recognizing human activities. Its convolutional layers extract relevant features effectively, while the auto-encoder structure enables efficient dimensionality reduction that preserves the information needed for recognition. Compared to conventional ML models, DCAE can handle high-dimensional sensor data more effectively and is less prone to overfitting owing to its unsupervised learning nature. Additionally, the deep learning capacity of the DCAE model allows it to adapt to new activities, making it scalable for real-time applications. This adaptability and robustness make DCAE appropriate for dynamic, real-world environments where human activities vary widely. Figure 3 shows the DCAE framework.
The DCAE uses convolutional and deconvolutional layers in place of the fully connected layers of a DAE. Because it exploits CNN features, DCAE may be better suited to such applications. CNNs are distinguished by two properties that support translation-invariant latent features: local connections and parameter sharing. In the encoder, the convolutional layers map the input to an internal layer, acting as feature extractors. The hidden representation of the \({n}^{th}\) feature map of the current layer is given below.
\(W\) denotes the filters, \(b\) is the bias of the \({n}^{th}\) feature map, \(\sigma\) is the activation function (such as ReLU or sigmoid), and \(*\) denotes the \(2D\) convolution operation.
The resulting features are then passed to the deconvolutional layers, which perform the opposite task: they reconstruct the input from the latent representation, returning it to its original form.
Here, \(*\) denotes the 2D convolution operation, \(c\) is the corresponding bias, \(\sigma\) is the activation function, \(\widetilde {WT}_{n}\) denotes the flip operation over both weight dimensions, and \(H\) is the group of latent feature maps.
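The encoder feature map described above can be illustrated with a small pure-Python sketch. It assumes a single-channel 2D input, a single filter, a "valid" sliding window without kernel flipping (the cross-correlation convention used by most CNN libraries), and a sigmoid activation. The function names and the toy input are assumptions for illustration, not the architecture used in the experiments.

```python
import math


def conv2d_valid(x, w):
    """Slide filter w over input x (no padding, stride 1, no kernel flip)."""
    kh, kw = len(w), len(w[0])
    out_h = len(x) - kh + 1
    out_w = len(x[0]) - kw + 1
    return [
        [
            sum(x[i + a][j + b] * w[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def encode_feature_map(x, w, bias):
    """Hidden feature map: sigmoid(x * w + bias) for one filter."""
    return [
        [1.0 / (1.0 + math.exp(-(v + bias))) for v in row]
        for row in conv2d_valid(x, w)
    ]


x = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]   # toy 3x3 input window
w = [[1, 0], [0, -1]]                   # toy 2x2 filter
print(conv2d_valid(x, w))               # prints [[-4, -4], [-4, -4]]
print(encode_feature_map(x, w, 4.0))    # prints [[0.5, 0.5], [0.5, 0.5]]
```

The decoder would apply the mirrored operation, convolving each latent feature map with a flipped filter and summing over the maps, so that the output has the shape of the original input.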
The DCAE is trained to minimize the reconstruction error so that its inner layer exposes latent representations. The cross-entropy (logistic) loss function is a reliable choice, with studies showing that networks trained with the logistic loss perform considerably better than those trained with the Euclidean (L2) loss function. The latter is often less robust, particularly in convolutional NNs with deconvolutional layers. The backpropagation (BP) method, as in conventional networks, calculates the error gradient with respect to all parameters.
Here, the target value is denoted by \(\widehat{y}\), and the reconstructed value is denoted by \(y\).
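As a sketch of the reconstruction loss, the following assumes the standard binary cross-entropy between the target \(\widehat{y}\) and the reconstruction \(y\), averaged over elements, with inputs scaled to \([0,1]\) (as min-max normalization would give). The clipping constant `eps` and the function name are assumptions added for numerical safety and illustration.

```python
import math


def reconstruction_cross_entropy(target, reconstructed, eps=1e-12):
    """Mean binary cross-entropy between target values and reconstructions."""
    total = 0.0
    for y_hat, y in zip(target, reconstructed):
        y = min(max(y, eps), 1.0 - eps)   # keep log() away from 0
        total -= y_hat * math.log(y) + (1.0 - y_hat) * math.log(1.0 - y)
    return total / len(target)


target = [1.0, 0.0, 1.0]
good = reconstruction_cross_entropy(target, [0.9, 0.1, 0.8])
poor = reconstruction_cross_entropy(target, [0.4, 0.6, 0.5])
print(good < poor)  # prints True: a closer reconstruction gives a lower loss
```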
Parameter selection using ZOA model
Finally, the hyperparameters of the DCAE model are selected using the ZOA method38. This method is chosen because it can effectively optimize complex, high-dimensional problems by mimicking the social behaviour and movement patterns of zebras in nature. ZOA balances exploration and exploitation, which helps it avoid local minima and find good solutions in challenging search spaces. Unlike gradient-based methods, ZOA does not require derivative information, making it appropriate for non-differentiable and highly nonlinear problems. Its population-based search strategy also provides robustness against noise and uncertainty in the data. Additionally, the flexibility and adaptability of the ZOA model allow it to fine-tune model parameters in real time, improving the overall performance of activity recognition systems. Given its simplicity and efficiency, ZOA outperforms other optimization approaches in computational cost and convergence speed for dynamic and complex datasets. Figure 4 shows the steps involved in the ZOA model.
Zebras belong to the horse family and usually live in southern and eastern Africa. Their coats are black and white striped, and this striping is their most striking feature. They show two kinds of behaviour patterns in social life: foraging and defence against predators. Lead zebras guide the other zebras in the group toward food sources. Zebras show two behaviours to escape predators. The first is to flee in a zigzag pattern. The second is to gather together and attempt to scare or confuse the predator. ZOA is inspired by this social behaviour.
Initialization: In ZOA, each member of the zebra population is modelled mathematically as a candidate solution in the search space. Zebras are initially positioned at random in the search space, which corresponds to the plain where the food sources are located. The location of each zebra is a vector of decision variables, so the population forms a matrix. The number of decision variables depends on the problem dimension. The population matrix in ZOA is generated randomly according to Eq. (20).
\(X\) is the zebra population; \({X}_{i}\) is the \({i}^{th}\) zebra, \({x}_{i,j}\) is the value of the \({j}^{th}\) dimension of the \({i}^{th}\) zebra, \(pop\) is the population size, and \(dim\) is the problem dimension. Each zebra represents one candidate solution. The objective function value of each zebra is computed from its decision variable values. The objective function values of the zebra population are stored in a matrix, whose structure is presented in Eq. (21).
where \(Fitness\) denotes the matrix of objective function values.
The objective function values of the individuals in the population are compared, and the zebra in the best position is designated the leader. Depending on the problem type, this is the zebra with the lowest fitness value (minimization) or the highest fitness value (maximization). In every iteration, the positions of the zebras and the fitness values at their new positions are updated. Two kinds of zebra behaviour are used to determine the new positions of the population: (a) searching for food and (b) defending against predators.
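The initialization and leader selection described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the search bounds, and the `sphere` objective are hypothetical stand-ins, and minimization is assumed.

```python
import random

def init_population(pop, dim, lower, upper, rng):
    """Place `pop` zebras uniformly at random inside [lower, upper]^dim."""
    return [[lower + rng.random() * (upper - lower) for _ in range(dim)]
            for _ in range(pop)]

def evaluate(population, objective):
    """Return the fitness vector: one objective value per zebra."""
    return [objective(x) for x in population]

def sphere(x):
    # Hypothetical stand-in objective (to be minimised); the paper's own
    # fitness function is based on classification precision.
    return sum(v * v for v in x)

rng = random.Random(0)
X = init_population(pop=6, dim=3, lower=-5.0, upper=5.0, rng=rng)
fitness = evaluate(X, sphere)

# Leader zebra: the individual with the lowest fitness (minimisation assumed).
best_index = min(range(len(X)), key=lambda i: fitness[i])
```

For a maximization problem, `min` would be replaced by `max` when choosing the leader.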
Foraging behavior
Zebras spend most of their time feeding. Their food resources are usually sedges and grasses. One type of zebra, the plains zebra, acts as a pioneer grazer and guides the population. In ZOA, the best member of the population is treated as the lead zebra and guides the other population members toward its position in the search space. The mathematical model of this phase is presented in Eqs. (22) and (23).
Here, \(\:{X}_{i}^{new1}\) refers to the new position of the \(\:{i}^{th}\) zebra derived from foraging behaviour, \(\:{x}_{i,j}^{new1}\) represents its position in the \(\:{j}^{th}\) dimension, \(\:Fi{t}_{i}^{new1}\) represents the fitness value at this new position, \(\:Zebr{a}_{j}^{Best}\) denotes the \(\:{j}^{th}\) dimension of the pioneer (best) zebra, \(\:rand\) denotes a random number in the interval \(\:\left[\text{0,1}\right]\), and \(\:I=\)round (\(\:1+rand\)), so that \(\:I\) takes the value 1 or 2.
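A minimal sketch of the foraging phase follows, written from the symbol definitions above and the commonly published form of ZOA. The update rule x + rand · (best − I · x) and the greedy acceptance (keep the new position only if its fitness improves) are assumptions, since Eqs. (22) and (23) are not reproduced in this text; `sphere` is a hypothetical objective and minimization is assumed.

```python
import random

def sphere(x):
    # Hypothetical stand-in objective to minimise.
    return sum(v * v for v in x)

def foraging_step(X, fitness, objective, rng):
    """Move every zebra toward the pioneer (best) zebra; keep improvements only."""
    leader = list(X[min(range(len(X)), key=lambda i: fitness[i])])
    for i in range(len(X)):
        I = round(1 + rng.random())  # evaluates to 1 or 2
        candidate = [xj + rng.random() * (lj - I * xj)
                     for xj, lj in zip(X[i], leader)]
        new_fit = objective(candidate)
        if new_fit < fitness[i]:  # greedy acceptance (assumed form of Eq. 23)
            X[i], fitness[i] = candidate, new_fit
    return X, fitness

rng = random.Random(1)
X = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(6)]
before = [sphere(x) for x in X]
X, after = foraging_step(X, list(before), sphere, rng)
```

Because of the greedy acceptance, no zebra's fitness can get worse during this phase. Clipping candidates back into the search bounds is omitted here for brevity.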
Defense strategies against predators
In this phase, the defence tactics of zebras against predators are modelled mathematically to update the positions of the zebra population in the search space. The defence strategy depends on the kind of predator. Zebras escape from their main predators, lions, in a zigzag pattern with random sideways turns. Against other predators, they gather together to frighten and confuse the attacker. Either of these two strategies may be selected for a given zebra. In Eq. (24), the defence strategy against lions is denoted M1, and the defence strategy against other predators is denoted M2. The zebra's position is updated according to Eq. (25).
Here, \(\:{X}_{i}^{new2}\) denotes the new position of the \(\:{i}^{th}\) zebra according to the defence strategy, \(\:{x}_{i,j}^{new2}\) denotes its position in the \(\:{j}^{th}\) dimension, \(\:Fi{t}_{i}^{new2}\) represents the fitness value at this new position, \(\:Zebr{a}_{j}^{Attack}\) denotes the \(\:{j}^{th}\) dimension of the attacked zebra, \(\:rand\) stands for a randomly generated number in the interval \(\:\left[\text{0,1}\right]\), \(\:I=\)round (\(\:1+rand\)), \(\:Iter\) is the current iteration number, \(\:Ite{r}_{\text{m}\text{a}\text{x}}\) is the maximum number of iterations, and \(\:R\) is a constant \(\:(R=0.01)\). \(\:S\) is a random number in the interval \(\:\left[\text{0,1}\right]\) that determines which of the two defence strategies is selected. Fitness selection is a significant aspect that affects the performance of the ZOA. The hyperparameter selection procedure uses a solution encoding scheme to assess the quality of the candidate solutions. Algorithm 1 describes the ZOA technique.
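The defence phase can be sketched in the same way. Eqs. (24) and (25) are not reproduced in this text, so the update rules below follow the commonly published form of ZOA and should be read as assumptions: the 0.5 threshold on S, the (1 − Iter/Iter_max) damping factor in M1, the random choice of the attacked zebra, and the greedy acceptance are not confirmed by the source. `sphere` is again a hypothetical objective to minimise.

```python
import random

R = 0.01  # constant given in the text

def sphere(x):
    # Hypothetical stand-in objective to minimise.
    return sum(v * v for v in x)

def defence_step(X, fitness, objective, iteration, max_iter, rng):
    """Apply strategy M1 (zigzag escape) or M2 (gather around the attacked zebra)."""
    for i in range(len(X)):
        S = rng.random()  # selects the strategy; 0.5 threshold is assumed
        if S <= 0.5:
            # M1: small random perturbation that shrinks as iterations proceed.
            scale = 1 - iteration / max_iter
            candidate = [xj + R * (2 * rng.random() - 1) * scale * xj
                         for xj in X[i]]
        else:
            # M2: move toward a randomly chosen attacked zebra.
            attacked = X[rng.randrange(len(X))]
            I = round(1 + rng.random())  # evaluates to 1 or 2
            candidate = [xj + rng.random() * (aj - I * xj)
                         for xj, aj in zip(X[i], attacked)]
        new_fit = objective(candidate)
        if new_fit < fitness[i]:  # greedy acceptance (assumed form of Eq. 25)
            X[i], fitness[i] = candidate, new_fit
    return X, fitness

rng = random.Random(2)
herd = [[rng.uniform(-5, 5) for _ in range(3)] for _ in range(6)]
fit_before = [sphere(x) for x in herd]
herd, fit_after = defence_step(herd, list(fit_before), sphere,
                               iteration=1, max_iter=50, rng=rng)
```

M1 acts as a local search around the current position, while M2 moves zebras across the search space, which is how the two strategies trade off exploitation and exploration.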
In this paper, the ZOA uses precision as the main criterion for designing the fitness function (FF), which is expressed as follows.
Here, TP and FP represent the true and false positive values.
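As a hedged illustration of a precision-based fitness, the sketch below computes precision as TP / (TP + FP) per class and averages over classes. The macro averaging, the function names, and the toy labels are assumptions made for the example; the source only states that the fitness is built from TP and FP.

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP); defined as 0.0 when nothing was predicted."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def macro_precision(y_true, y_pred):
    """Average the per-class precision over all classes (assumed averaging)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if p == c and t == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if p == c and t != c)
        scores.append(precision(tp, fp))
    return sum(scores) / len(scores)

# Toy labels, not taken from the dataset.
y_true = ["walk", "walk", "sit", "sit", "stand"]
y_pred = ["walk", "sit", "sit", "sit", "stand"]
score = macro_precision(y_true, y_pred)
```

If the optimizer is written as a minimizer, a candidate hyperparameter set could be scored as `1 - score` so that higher precision corresponds to lower fitness.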
Experimental results and discussion
The performance of the EARDP-DLMNOA method is evaluated on the HAR through the smartphone dataset [39]. The suggested technique is implemented in Python 3.6.5 on a PC with an i5-8600K CPU, a GeForce 1050 Ti 4 GB GPU, 16 GB RAM, a 250 GB SSD, and a 1 TB HDD. The parameter settings are as follows: learning rate: 0.01, activation: ReLU, epoch count: 50, dropout: 0.5, and batch size: 5. The dataset includes 7352 records across six activities, as illustrated in Table 2. The dataset consists of 561 attributes, of which only 355 are selected.
Figure 5 shows the correlation matrix of the features used by the EARDP-DLMNOA technique. The correlation matrix provides a detailed view of the relationships between variables, covering both positive and negative correlations. Each entry represents the strength and direction of the relationship between two variables: values close to 1 or -1 indicate strong correlations, while values near 0 indicate a weak or absent relationship. The matrix helps in understanding how the factors in the dataset influence one another, which supports feature selection and model optimization. It can reveal patterns, trends, or dependencies that are not immediately visible, and it serves as a basic tool for detecting redundant or highly correlated variables. Such insights help improve the efficiency and accuracy of predictive models and inform decisions about data processing and analysis strategies.
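To make the use of a correlation matrix for spotting redundant features concrete, the following sketch computes Pearson correlations for three small toy columns and lists the pairs whose absolute correlation exceeds a threshold. The data, the 0.95 threshold, and the function names are illustrative assumptions and are not taken from the paper.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def correlation_matrix(columns):
    """Square matrix of pairwise Pearson correlations between feature columns."""
    return [[pearson(a, b) for b in columns] for a in columns]

def redundant_pairs(matrix, threshold=0.95):
    """Index pairs (i, j), i < j, whose absolute correlation exceeds the threshold."""
    n = len(matrix)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if abs(matrix[i][j]) > threshold]

# Toy feature columns: f1 is almost a multiple of f0, f2 is unrelated.
f0 = [1.0, 2.0, 3.0, 4.0, 5.0]
f1 = [2.0, 4.0, 6.0, 8.0, 10.5]
f2 = [5.0, 3.0, 4.0, 1.0, 2.0]
corr = correlation_matrix([f0, f1, f2])
pairs = redundant_pairs(corr)
```

Here only the pair (f0, f1) is flagged, so one of the two could be dropped without losing much information.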
Figure 6 illustrates the confusion matrices generated by the EARDP-DLMNOA model under 80:20 and 70:30 of TRPH/TSPH. The results indicate that the EARDP-DLMNOA approach identifies all classes effectively. In the TRPH, the model performs well, with most predicted values matching the actual activities, indicating high accuracy across all categories. The number of misclassifications is relatively low, with a few instances where sitting and laying are confused with each other, or walking is confused with walking downstairs or walking upstairs. In the TSPH, specifically in the 20% and 30% subsets, some confusion remains between closely related activities such as standing and sitting, or walking and walking downstairs. However, the overall accuracy remains high, and the model distinguishes between most activities. This shows the robustness of the model but also points to areas where additional fine-tuning might be required for activities with similar characteristics.
Table 3 and Fig. 7 present the activity recognition results of the EARDP-DLMNOA method under 80:20 and 70:30 of TRPH/TSPH. The results show that the EARDP-DLMNOA method identifies all classes effectively. With 80%TRPH, the EARDP-DLMNOA method attains an \(\:acc{u}_{y}\) of 97.31%, \(\:pre{c}_{n}\) of 91.97%, \(\:rec{a}_{l}\) of 91.55%, \(\:F{1}_{score}\:\)of 91.73%, and \(\:{G}_{Measure}\) of 91.74%. With 20%TSPH, the method attains an \(\:acc{u}_{y}\) of 97.58%, \(\:pre{c}_{n}\) of 92.75%, \(\:rec{a}_{l}\) of 92.47%, \(\:F{1}_{score}\:\)of 92.58%, and \(\:{G}_{Measure}\) of 92.60%. With 70%TRPH, the method obtains an \(\:acc{u}_{y}\) of 96.15%, \(\:pre{c}_{n}\) of 88.20%, \(\:rec{a}_{l}\) of 88.13%, \(\:F{1}_{score}\:\)of 88.16%, and \(\:{G}_{Measure}\) of 88.16%. Finally, with 30%TSPH, the method obtains an \(\:acc{u}_{y}\) of 96.24%, \(\:pre{c}_{n}\) of 88.59%, \(\:rec{a}_{l}\) of 88.47%, \(\:F{1}_{score}\:\)of 88.47%, and \(\:{G}_{Measure}\) of 88.50%.
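For readers who want to relate these measures to one another, the sketch below computes the F1 score as the harmonic mean of precision and recall and the G-measure as their geometric mean. The geometric-mean definition of the G-measure is an assumption, since this section does not define it, and the function names are illustrative.

```python
import math

def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0

def g_measure(precision, recall):
    """Geometric mean of precision and recall (assumed definition)."""
    return math.sqrt(precision * recall)

# Precision and recall reported for 20%TSPH under the 80:20 split.
p, r = 0.9275, 0.9247
f1 = f1_score(p, r)
g = g_measure(p, r)
```

Both values come out at roughly 0.926, close to the reported 92.58% and 92.60%. Small differences are expected if the reported scores are averaged per class rather than computed from the averaged precision and recall.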
Fig. 8 shows the training (TRA) \(\:acc{u}_{y}\) and validation (VAL) \(\:acc{u}_{y}\) of the EARDP-DLMNOA technique under 80:20. The values of \(\:acc{u}_{y}\:\)are computed over 0–30 epochs. The figure shows that the TRA and VAL \(\:acc{u}_{y}\) values follow an increasing trend, indicating that the performance of the EARDP-DLMNOA approach improves over successive epochs. In addition, the TRA and VAL \(\:acc{u}_{y}\) values remain close throughout the epochs, indicating limited overfitting and suggesting consistent prediction on unseen samples.
Fig. 9 shows the TRA loss (TRALOS) and VAL loss (VALLOS) curves of the EARDP-DLMNOA approach under 80:20. The loss values are computed over 0–25 epochs. The TRALOS and VALLOS values follow a declining trend, which indicates that the EARDP-DLMNOA method balances the trade-off between generalization and data fitting. The continual reduction in the loss values further supports the performance of the EARDP-DLMNOA method and refines the prediction results over time.
The comparative study of the EARDP-DLMNOA methodology against existing models is shown in Table 4 and Fig. 10 [18,19,40,41,42]. The results indicate that the EARDP-DLMNOA methodology outperformed the existing approaches. In terms of \(\:acc{u}_{y}\), the EARDP-DLMNOA methodology attained the highest value of 97.58%, whereas the Autoencoders (VideoMAE), VGGNet + ConvNets, ANFIS, SVM, Random Forest (RF), 1dCNN, LightGBM, NKDFF-CNN, CART, and LDA models attained lower \(\:acc{u}_{y}\) of 93.07%, 90.23%, 88.10%, 96.68%, 89.99%, 91.12%, 87.93%, 88.17%, 96.76%, and 90.04%, respectively. In terms of \(\:pre{c}_{n}\), the EARDP-DLMNOA technique has the maximum value of 92.75%. In contrast, the Autoencoders (VideoMAE), VGGNet + ConvNets, ANFIS, SVM, RF, 1dCNN, LightGBM, NKDFF-CNN, CART, and LDA models reached lower \(\:pre{c}_{n}\) of 86.76%, 85.55%, 88.62%, 88.63%, 85.65%, 87.49%, 90.59%, 88.68%, 88.71%, and 85.70%, respectively. Furthermore, in terms of \(\:rec{a}_{l}\), the EARDP-DLMNOA technique attained the highest value of 92.47%, whereas the Autoencoders (VideoMAE), VGGNet + ConvNets, ANFIS, SVM, RF, 1dCNN, LightGBM, NKDFF-CNN, CART, and LDA models attained lower values of 88.93%, 89.45%, 85.26%, 86.71%, 91.49%, 85.34%, 83.72%, 85.33%, 86.79%, and 91.55%, respectively. Moreover, in terms of \(\:F{1}_{score}\), the EARDP-DLMNOA technique attained the highest value of 92.58%, whereas the Autoencoders (VideoMAE), VGGNet + ConvNets, ANFIS, SVM, RF, 1dCNN, LightGBM, NKDFF-CNN, CART, and LDA models obtained lower \(\:F{1}_{score}\) of 91.83%, 84.55%, 87.50%, 88.42%, 89.70%, 86.93%, 88.69%, 87.57%, 88.49%, and 89.76%, respectively.
Table 5 and Fig. 11 illustrate the computational time (CT) analysis of the EARDP-DLMNOA technique against existing methods. Among the models tested, the EARDP-DLMNOA technique demonstrated the fastest CT at 7.48 s. Of the existing methods quoted here, LDA recorded the lowest CT at 10.20 s, followed by the Autoencoders (VideoMAE) model at 10.84 s and NKDFF-CNN at 14.07 s. The RF, 1dCNN, and CART models were slower, at 19.19 s, 20.68 s, and 21.62 s, respectively, which indicates higher resource consumption. Overall, the EARDP-DLMNOA model combined the lowest CT with the highest accuracy in the analysis.
Conclusion
In this manuscript, an EARDP-DLMNOA model is proposed. The EARDP-DLMNOA model mainly relies on improving the activity recognition model using advanced optimization algorithms. Initially, the data normalization stage is executed using min-max normalization to convert the input data into a suitable format. Furthermore, the EARDP-DLMNOA model utilizes the AdCO technique for the feature subset selection process. For the activity recognition process, the DCAE technique is employed to categorize data into predefined classes based on its features. Finally, the DCAE model's hyperparameter selection is implemented using the ZOA technique. Extensive experimentation is carried out to validate the performance of the EARDP-DLMNOA approach on the HAR through the smartphone dataset. The experimental validation of the EARDP-DLMNOA approach showed a superior accuracy value of 97.58% over existing methods. The limitations of the EARDP-DLMNOA approach include the reliance on a specific dataset, which may restrict the generalizability of the results across diverse real-world scenarios. Such models are often sensitive to noise and discrepancies in input data, which can affect their overall robustness. Moreover, the computational complexity of some optimization techniques can result in increased processing time and resource consumption, making real-time implementation difficult. Furthermore, many methods lack adaptability to dynamic environments or changes in user behaviour, affecting their scalability. Data fusion from diverse sensors remains a challenge due to synchronization issues and inconsistencies in data quality. There is also limited exploration of cross-domain applications, specifically when incorporating multiple data sources. Future work may improve model robustness, enhance real-time performance, and develop more flexible, adaptive solutions to address these issues across varied conditions.
Data availability
The data supporting this study’s findings are openly available at https://www.kaggle.com/datasets/uciml/human-activity-recognition-with-smartphones, reference number [39].
References
Liaqat, S. et al. Novel ensemble algorithm for multiple activity recognition in elderly people exploiting ubiquitous sensing devices. IEEE Sens. J. 21(16), 18214–18221 (2021).
Najim, A. H., Elkhediri, S., Alrashidi, M. & Nasri, N. The impact of using IoT for elderly and disabled peoples healthcare: An overview. In 2022 2nd International Conference on Computing and Information Technology (ICCIT), 394–398 (IEEE, 2022).
Alam, M. A. U. Ai-fairness towards activity recognition of older adults. In MobiQuitous 2020-17th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services, 108–117 (2020).
Yazici, A. et al. A smart e-health framework for monitoring the health of the elderly and disabled. Internet Things. 24, 100971 (2023).
Zanella, A. et al. Internet of Things for elderly and fragile people. arXiv preprint arXiv:2006.05709. (2020).
Franco, P., Condon, F., Martínez, J. M. & Ahmed, M. A. Enabling remote elderly care: design and implementation of a smart energy data system with activity recognition. Sensors 23(18), 7936 (2023).
N, N. & Mala, R. Enhancing adverse drug reaction classification of attention deficit hyperactivity disorder diagnosis data using deep learning with optimization algorithm. Fusion Pract. Appl. 19(1), 38–49. https://doi.org/10.54216/fpa.190104 (2025).
Saile, K. N. D. & Navatha, K. Smart IoT devices for the elderly and people with disabilities. Advanced Healthcare Systems: Empowering Physicians with IoT-Enabled Technologies, 101–114 (2022).
Mustafa, M. A., Konios, A. & Garcia-Constantino, M. IoT-based activities of daily living for abnormal behavior detection: privacy issues and potential countermeasures. IEEE Internet Things Magazine. 4(3), 90–95 (2021).
Arias, E. J., Paz, L. M. A. & Chalacan, L. M. Multi-Sensor data fusion for accurate human activity recognition with deep learning. Full Length Article. 13(2), 62–62 (2023).
Takawale, A. J. & Paithane, A. N. Metaheuristic-assisted hybrid recognition model for brain activity detection. Biomed. Eng. Appl. Basis Commun. 37(01), 2450039 (2025).
Bari, A. et al. Advancements in multi-view human activity recognition for ambient assisted living. In 2024 Multimedia University Engineering Conference (MECON), 1–6 (IEEE, 2024).
Durga, Y. V., Venkatramaphanikumar, S. & Kishore, K. K. LSTM-based statistical framework for human activity recognition using mobile sensor data. Int. J. Adv. Intell. Paradigms. 29(1), 86–99 (2024).
Yi, M. K. & Hwang, S. O. A data-driven feature extraction method based on data supplement for human activity recognition. IEEE Sens. J. (2024).
Chen, H. et al. Leveraging self-supervised learning for human activity recognition with ambient sensors. In Proceedings of the 2023 ACM Conference on Information Technology for Social Good, 324–332 (2023).
Ye, J., Jiang, H. & Zhong, J. A graph-attention-based method for single-resident daily activity recognition in smart homes. Sensors 23(3), 1626 (2023).
Mekruksavanich, S. & Jitpattanakul, A. Hybrid convolution neural network with channel attention mechanism for sensor-based human activity recognition. Sci. Rep. 13(1), 12067 (2023).
Jiang, B. et al. NKDFF-CNN: A convolutional neural network with narrow kernel and dual-view feature fusion for multitype gesture recognition based on sEMG. Digit. Signal Proc. 156, 104772 (2025).
Hoang, M. L., Matrella, G. & Ciampolini, P. Metrological evaluation of contactless sleep position recognition using an accelerometric smart bed and machine learning. Sens. Actuators A Phys. 116309 (2025).
Rezaee, K. An advanced deep learning structure for accurate student activity recognition and health monitoring using smartphone accelerometer data. Health Manage. Inform. Sci. 11(2), 85–97 (2024).
Arasi, M. A., AlEisa, H. N., Alneil, A. A. & Marzouk, R. Artificial intelligence-driven ensemble deep learning models for smart monitoring of indoor activities in IoT environment for people with disabilities. Sci. Rep. 15(1), 4337 (2025).
Prabagaran, S., Bandla, A. K., Venkatesh, R. J. & Malar, M. J. Poplar optimization algorithm-driven hybrid Siamese top-down neural networks for accurate human activity recognition in IoT networks. In 2024 Second International Conference on Intelligent Cyber Physical Systems and Internet of Things (ICoICI), 392–398 (IEEE, 2024).
Jin, L. et al. Finite element analysis, machine learning, and digital twins for soft robots: state-of-arts and perspectives. Smart Mater. Struct., 34(3) (2025).
Kumar, H. S., Singh, A. & Ojha, M. K. Advanced Artificial Intelligence Strategy for Optimizing Urban Rail Network Design using Nature-Inspired Algorithms. arXiv preprint arXiv:2407.04087 (2024).
Jawad, M. et al. Energy optimization and plant comfort management in smart greenhouses using the artificial bee colony algorithm. Sci. Rep. 15(1), 1752 (2025).
Tahsin, T., Mumenin, K. M., Akter, H., Tiang, J. J. & Nahid, A. A. Machine learning-based stroke patient rehabilitation stage classification using kinect data. Appl. Sci. 14(15), 6700 (2024).
Rathi, S. et al. Learning style prediction of e-learner using hybrid optimizer-based neural network. J. Integr. Sci. Technol. 13(1), 1007–1007 (2025).
Chakraborty, D., Rathi, A. & Singh, R. Design and evaluation of exoskeleton device for rehabilitation of index finger using nature-inspired algorithms. Appl. Intell. 54(20), 10206–10223 (2024).
Li, W. et al. Wearable photonic artificial throat for silent communication and speech recognition. ACS Appl. Mater. Interfaces. (2025).
Thapar, P. et al. A hybrid grasshopper optimization algorithm for skin lesion segmentation and melanoma classification using deep learning. Healthc. Analytics. 5, 100326 (2024).
Thakur, D., Dangi, S. & Lalwani, P. A novel hybrid deep learning approach with GWO–WOA optimization technique for human activity recognition. Biomed. Signal Process. Control. 99, 106870 (2025).
Subha, B., Jeyakumar, V. & Deepa, S. N. Gaussian aquila optimizer based dual convolutional neural networks for identification and grading of osteoarthritis using knee joint images. Sci. Rep. 14(1), 7225 (2024).
Sivasakthivel, R. et al. Simulating online and offline tasks using hybrid cheetah optimization algorithm for patients affected by neurodegenerative diseases. Sci. Rep. 15(1), 8951 (2025).
Purni, J. T. & Vedhapriyavadhana, R. Eosa-net: A deep learning framework for enhanced multi-class skin cancer classification using optimized convolutional neural networks. J. King Saud University-Computer Inform. Sci. 36(3), 102007 (2024).
Yedukondalu, J. et al. Cognitive load detection through EEG lead wise feature optimization and ensemble classification. Sci. Rep. 15(1), 842 (2025).
Kaleeswari, P. et al. DABiG: breath pattern classification using the hybrid deep learning with optimal feature selection. Technol. Health Care. 09287329241303368 (2025).
Ponraj, A. et al. A multi-patch-based deep learning model with VGG19 for breast cancer classifications in the pathology images. Digit. HEALTH. 11, 20552076241313161 (2025).
Baş, E. & Baş, Ş. An example of classification using a neural network trained by the Zebra optimization algorithm. Sinop Üniversitesi Fen Bilimleri Dergisi. 9(2), 388–420 (2024).
http://www.kaggle.com/datasets/uciml/human-activity-recognition-with-smartphones
Bukht, T. F. N. et al. Robust human interaction recognition using extended Kalman filter. Comput. Mater. Continua. 81(2). (2024).
Lai, Y. C., Chiang, S. Y., Kan, Y. C. & Lin, H. C. Coupling analysis of multiple machine learning models for human activity recognition. Comput. Mater. Continua. 79(3). (2024).
Misaki, S. et al. Location-independent Doppler sensing system for device-free daily living activity recognition. IEEE Access. (2023).
Acknowledgements
The author extends his appreciation to the King Salman Center for Disability Research for funding this work through Research Group no. KSRG-2024-269.
Author information
Contributions
The whole manuscript was prepared by Dr. Mohammed Maray.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Maray, M. Advanced internet of things enhanced activity recognition for disability people using deep learning model with nature-inspired optimization algorithms. Sci Rep 15, 16809 (2025). https://doi.org/10.1038/s41598-025-00379-7
DOI: https://doi.org/10.1038/s41598-025-00379-7