Abstract
In radio frequency identification (RFID) systems, anti-counterfeiting technologies for combating counterfeit tags have attracted significant attention, particularly physical layer identification methods, which are valued for their low cost and simple deployment. However, the performance of physical layer identification methods depends strongly on the conditions of the tag detection environment, and classification accuracy tends to suffer when the signal-to-noise ratio (SNR) is low. To address this problem, this study proposes a cognitive risk control strategy that adjusts the tag distance during the classification process to raise the SNR and improve recognition accuracy. In addition to cognitive risk control, the paper extracts an enriched set of 104 time-domain and frequency-domain features to further improve classification performance. Using software-defined radio devices, classification experiments were conducted on seven popular tag types from three manufacturers. The results show that integrating the cognitive risk control strategy increases the average tag classification accuracy by approximately 11%. In addition, compared with the traditional 28-feature and 7-feature sets, the 104-feature set improves classification accuracy by roughly 4.3% and 5.3%, respectively. These findings demonstrate both the effectiveness of cognitive risk control in raising tag classification accuracy in low-SNR environments and the potential for improving classification performance with a larger feature set.
Introduction
Radio frequency identification (RFID) technology, as an advanced automatic identification solution1, has realized the function of identifying electronic tags with distinct identities without manual operation or physical interaction in many industries. The technology is increasingly used in fields such as logistics, retail, medical and smart transportation. Especially with the promotion of 5G technology and the widespread use of smartphones, the application prospects of RFID have been further broadened. However, the security issue of RFID tags has become a pressing issue that requires resolution. Due to the lack of effective protection measures, tag information may face the risk of illegal access and data leakage, which not only threatens information security, but may also provide opportunities for criminals2.
To strengthen the protection of RFID systems, industry experts have proposed a range of encryption technologies and security protocols3,4. These schemes usually require a shared key between the tag and the reader to secure data transmission. However, they increase the difficulty of key management and are complex to implement, especially for passive tags with limited computing power. In addition, although hashing algorithms5,6 provide a solution without shared keys, their security still needs to be improved. Although the access control mechanism7 can restrict unauthorized access, its complex verification process may increase the complexity and cost of the system. In addition to these software solutions, some hardware-based technologies have been proposed to prevent tag counterfeiting, for example adding special materials or unique packaging designs during the production process8,9. Although these measures are effective, they often increase production costs and degrade reading and writing performance. Beyond encryption technology and security protocols, deep learning technology10,11 has also been applied to distinguishing authentic from counterfeit signals. It analyzes and classifies signals by building complex neural network models. Although deep learning excels at processing large amounts of data and identifying patterns, it depends heavily on data volume and its model parameters are difficult to tune.
Recent research has found that the response signal of RFID tags has a unique physical property, namely a physically unclonable function (PUF)12,13,14. This characteristic originates from the tag’s hardware makeup and physical properties, such as antenna design, energy distribution, and amplitude and phase response. These hardware features arise during the production process and are difficult to replicate. Based on this, the physical layer anti-counterfeiting solution extracts the PUF features from the signal and combines signal processing with pattern recognition to verify the authenticity of the tag. Compared with traditional access control and encryption technologies, physical layer identification requires few hardware changes and costs less, so it is suitable for large-scale RFID deployments. However, under strong noise or weak signal strength, the extracted features may not be accurate enough, which can lead to erroneous classification results.
To tackle these issues, this paper proposes a physical layer identification technology aimed at improving the ability to distinguish genuine from counterfeit tags. The primary contributions are as follows. First, by introducing the concept of cognitive risk control (CRC)15,16, a framework is constructed that modifies the signal-to-noise ratio (SNR) of tag signals by dynamically adjusting the distance between tags and antennas, notably improving classification accuracy in low-SNR environments. Generally, the closer the tag is to the reader, the stronger the signal and, accordingly, the higher the SNR. However, the relationship between distance and SNR is not always linear or monotonic. Near-field effects and nonlinear effects may distort the signal and thereby reduce the SNR. In addition, variations in tag sensitivity mean that a single reading distance may not suit all tags. Therefore, dynamically adjusting the distance is a feasible way to improve SNR. Second, the technique mines the time-domain and frequency-domain features of the tag response signal in depth to improve classification accuracy and reduce the effect of feature selection on the model’s ability to generalize. A Universal Software Radio Peripheral (USRP) is used to classify the signals of seven types of UHF RFID tags produced by three distinct manufacturers, and the experimental findings show that, by improving the SNR of the signals, the proposed method improves classification accuracy by an average of 11%. Furthermore, after the new time- and frequency-based features are introduced, classification accuracy increases by 5% on average.
Related work
RFID anti-counterfeiting
Enhancing the protective capabilities of RFID systems commonly involves the adoption of security protocols. Some RFID security protocols achieve this by integrating existing ones such as Transport Layer Security (TLS), Secure Socket Layer (SSL)3, and Internet Protocol Security (IPSec)4. However, these protocols often entail higher energy consumption and demand more stable communication links, which may not suit specific RFID application scenarios. Presently, RFID standards such as the EPC Global Class 1 Generation 2 standard17 offer security protocols featuring basic password authentication. While straightforward to implement, this method poses a security risk if password information is compromised. Conversely, the ISO/IEC 29167-10 standard18 offers a more comprehensive security protocol encompassing multiple encryption technologies and varying security levels to meet various security requirements. Despite its robust features, this standard has seen slower adoption than the EPC standard because it was introduced more recently. Additionally, more advanced RFID security protocols such as the Verifiable Anonymous RFID protocol7, Hash-Lock19, Robust Security Network20 and Blocker Tag21 significantly enhance system security through technical means such as cryptographic algorithms, digital signatures, and the management of encryption keys22,23,24. However, these advanced security protocols also bring higher computational and storage burdens, posing deployment and operational challenges for RFID systems. Moreover, imperfect key management may weaken the protective efficacy of the protocols. Simpler methods such as the one-way hash lock protocol5,25 and the random value generation protocol6,26, though they slightly compromise security, impose lower computational and storage requirements and are therefore easier to deploy. Achieving a balance between security and implementation complexity is thus crucial when selecting security algorithms or protocols to meet specific application needs.
Along with software protocols, hardware methods make use of the physical features of RFID tags to improve security. For instance, a Faraday cage8,27 surrounds the device with a metallic mesh or conductor, shielding it against external electromagnetic interference and preventing unauthorized signal interception. Though more expensive and complex to implement, the approach is highly effective against physical attacks. Reflection shielding technology28, by contrast, defends against attackers by reflecting signals, providing adaptable protection at the cost of extra hardware and algorithm support. To safeguard against physical tampering, the tag can be embedded in a fragile casing9 so that the tag information becomes unreadable once the package is damaged, although this cannot protect against threats that target the tag’s internal metadata. Temperature-sensitive tags29 detect unauthorized reading through temperature changes, though they require precise temperature calibration to avoid false alarms. While hardware methods offer superior physical protection, they must be considered early in system design and are difficult to add through later software updates. Therefore, hardware security measures need to be planned during the system design and manufacturing phases to ensure system versatility and portability.
In anti-counterfeiting, using the physical layer signal characteristics of RFID tags to authenticate them is a common technical approach. One method directly retrieves physical attributes from the signal, including the reflection coefficient28 and the measured distance30. Alternatively, the signal’s time and frequency properties can be exploited to extract features such as signal fingerprints12,13, phase changes14, frequency changes31, and higher-order statistics32. These features aim to reveal subtle distinctions between genuine and counterfeit tags. However, strong noise interference may lead to inaccuracies in feature extraction, thereby affecting classification accuracy.
Management of cognitive risks and selection of features
Initially utilized in cognitive radio, cognitive radar, and driverless vehicle technology15, CRC technology has demonstrated notable effectiveness in improving system robustness and communication quality. In our previous work16, we explored the application of CRC technology to RFID communication security. Conventional CRC estimates hidden signals using the Kalman filter, calculates the waveform parameter entropy to derive rewards, and employs the Q-learning algorithm to maximize these rewards by solving the Bellman optimization problem. The primary objective of CRC is to minimize signal estimation errors, thereby enhancing tracking accuracy. In this paper, CRC technology is used to strengthen the anti-counterfeiting capabilities of RFID tags. First, the transmitted signal and the noise signal are separated through IQ demodulation and a cluster decoding algorithm. Subsequently, the Signal-to-Noise Ratio (SNR) is computed from the extracted noise signal, and a search strategy is employed to ensure that the SNR meets or surpasses the predefined threshold. Unlike the work in16, the search strategy in this paper adjusts distance, which requires fewer adjustment steps and therefore searches faster. To further enhance anti-counterfeiting efficacy, the CRC technology in this study encompasses classification, feature extraction, model training, and testing. Traditional physical layer recognition methods often focus on a limited set of tag classification features and disregard the diverse characteristics that tag signals may exhibit. Previous research on feature selection13,32,33,34 has shown that relying on only a few fixed features for classification can reduce recognition accuracy. Hence, this study incorporates a wide array of features, more than one hundred across the time and frequency domains, to capture the variations and unique characteristics of tags more comprehensively and thereby achieve more precise authenticity identification.
Problem description
This paper addresses the low classification accuracy of RFID tags in low-SNR environments, as depicted in Fig. 1. The core principle is to perform classification only when the SNR meets or exceeds a threshold. When the SNR falls below the threshold, the system improves signal quality by adjusting the distance between the tag and the reader. First, the tag’s response signal is preprocessed to separate the baseband signal and the desired signal. The noise signal is obtained by subtracting the desired (reference) signal from the normalized baseband signal, and the SNR is derived as the ratio of the desired signal’s power to the noise signal’s power. Afterward, the risk control unit is activated, and the measured SNR is compared with a defined threshold. If the SNR is lower than the threshold, the switch is directed to the CRC unit; if the SNR meets or surpasses the threshold, the system proceeds to the feature and classification unit. Within the CRC unit, an algorithm based on random search adjusts the tag-reader distance to improve the SNR. This unit forms a closed loop with the preprocessing unit, which terminates once the SNR meets or exceeds the threshold. Inside the classification unit, features are extracted and selected from the baseband, desired, noise, and normalized (standard) signals. The selected features are then input into a classifier to determine authenticity. Furthermore, the risk control unit’s threshold is updated according to the outcomes of the classification training phase. The preprocessing, classification, and risk control units thus form an outer closed loop, which is completed when the refined threshold yields a satisfactory level of classification accuracy.
In the implementation of CRCs, two primary problems require attention. Firstly, the structural configuration of the CRC module holds significance. Multiple components contribute to tag SNR, including the electromagnetic surroundings and the hardware of the tag. While this study enhances SNR by adjusting the tag-reader distance, it’s crucial to note that the relationship between distance and SNR is not always linear or monotonic. Near-field effects and nonlinear effects may induce signal distortion, potentially reducing SNR. Additionally, variations in tag sensitivity imply that a single read distance may not be optimal for all tags. Hence, dynamically adjusting distance proves to be an effective strategy to enhance SNR. As shown in the upper left section of Fig. 1, the non-linear or non-monotonic relationship between distance and SNR necessitates a meticulously designed distance search strategy to ascertain the optimal SNR.
Secondly, the development of feature and classification modules poses another challenge. Conventional techniques often depend on a restricted set of predetermined features for classification, which may inadequately capture the diversity of tags. To more effectively capture differences between tags, this study not only captures features from the time domain but also integrates those from the frequency domain. Through the feature selection process, the effectiveness of newly extracted features is evaluated, determining their role in classification. By using this thorough method for extracting and selecting features, a more comprehensive representation of tag characteristics is obtained, leading to improved classification accuracy. As depicted at the bottom of Fig. 1, this feature and classification module aims to maximize the utilization of information from tag signals to achieve more accurate authenticity identification.
CRC for tag classification
Preprocessing
The CRC framework for classifying RFID tags is depicted in Fig. 1, and this section provides an in-depth exploration of every unit. Initially, the tag’s response signal detected by the reader undergoes preprocessing. The central duty of this module is twofold: the primary role is to determine the Signal-to-Noise Ratio (SNR) of the signal for managing the switch network, and its secondary role is to perform initial processing on the tag signal in anticipation of future feature extraction.
The preprocessing unit, illustrated in Fig. 2, begins by IQ demodulating35 the response signal to obtain the I and Q channel signals, and then takes the modulus to obtain the baseband signal \(\:a\left(n\right)\), where \(\:n\)=1, 2, …, \(\:N\) indexes the sampling points. Subsequently, the desired signal is derived by making decisions on the baseband signal, denoted as
The decision is denoted as
where \(\:{v}_{0}\) and \(\:{v}_{1}\) denote the two cluster centroids of the baseband signal \(\:a\left(n\right)\), associated with bits 0 and 1, respectively. Notably, \(\:{v}_{0}\) is identified as the centroid closer to the cluster centroid of the silent-period signal30, as also observed in Fig. 7 in the Experiment section. To standardize the signal power, the baseband signal is normalized to
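The noise signal is extracted by subtracting the desired signal from the normalized signal:
\[\:{a}_{{\upeta\:}}\left(n\right)={a}_{\text{n}}\left(n\right)-{a}_{\text{e}}\left(n\right)\tag{4}\]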
Following this processing, four signal groups are obtained: the desired signal \(\:{a}_{\text{e}}\left(n\right)\), normalized (standard) signal \(\:{a}_{\text{n}}\left(n\right)\), noise signal \(\:{a}_{{\upeta\:}}\left(n\right)\), and baseband signal \(\:a\left(n\right)\), from which features will be derived. Finally, the SNR can be computed as
in which \(\:{P}_{\text{e}}\) and \(\:{P}_{{\upeta\:}}\) are the mean powers of the signals obtained from Eqs. (1) and (4), respectively.
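The preprocessing chain can be illustrated with a minimal sketch. It is not the authors’ implementation: the nearest-centroid decision rule, the normalization to unit mean power, and the SNR expressed in decibels are assumptions made for illustration, and the function name `preprocess` and the synthetic signal values are hypothetical. The input is assumed to be non-zero.

```python
import math
import random


def preprocess(a, v0, v1):
    """Return (a_e, a_n, a_eta, snr_db) for baseband magnitudes a(n).

    Assumed, not taken from the paper: nearest-centroid decisions,
    normalization to unit mean power, and an SNR reported in dB.
    """
    n = len(a)
    # Scale factor that gives the baseband signal unit mean power.
    scale = math.sqrt(sum(x * x for x in a) / n)
    a_n = [x / scale for x in a]
    c0, c1 = v0 / scale, v1 / scale
    # Desired signal: each sample is replaced by its nearer centroid.
    a_e = [c0 if abs(x - c0) <= abs(x - c1) else c1 for x in a_n]
    # Noise signal: normalized signal minus desired signal.
    a_eta = [x - e for x, e in zip(a_n, a_e)]
    p_e = sum(e * e for e in a_e) / n
    p_eta = sum(w * w for w in a_eta) / n
    snr_db = 10.0 * math.log10(p_e / p_eta) if p_eta > 0 else math.inf
    return a_e, a_n, a_eta, snr_db


# Synthetic two-level signal with Gaussian noise (illustrative values only).
random.seed(0)
levels = [0.2 if random.random() < 0.5 else 1.0 for _ in range(2000)]
quiet = [x + random.gauss(0.0, 0.02) for x in levels]
noisy = [x + random.gauss(0.0, 0.10) for x in levels]
snr_quiet = preprocess(quiet, 0.2, 1.0)[3]
snr_noisy = preprocess(noisy, 0.2, 1.0)[3]
print(f"quiet: {snr_quiet:.1f} dB, noisy: {snr_noisy:.1f} dB")
```

In this sketch the same two centroids yield a lower SNR for the noisier recording, which is the quantity the switch network acts on.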
Management of risk and network switch
The main purpose of the risk management block is to manage the switch network, as illustrated in Fig. 3. If the SNR fails to meet the requirement, the CRC system switches to the cognitive control block, which forms a self-regulating feedback loop in which the SNR is adjusted until it equals or exceeds the threshold. Once the SNR is sufficient, the system proceeds to the feature extraction and classification stage. This procedure is outlined below:
\[\:\left\{\begin{array}{ll}{k}_{1}=\text{on},\:{k}_{2}=\text{off},&SNR<{V}_{\text{t}\text{h}}\\{k}_{1}=\text{off},\:{k}_{2}=\text{on},&SNR\ge\:{V}_{\text{t}\text{h}}\end{array}\right.\tag{6}\]
where the \(\:{k}_{1}\) and \(\:{k}_{2}\) switches activate the cognitive control module and the feature and classification module of the system, respectively. A switch set to ‘on’ is closed, and a switch set to ‘off’ is open. \(\:{V}_{\text{t}\text{h}}\:\)denotes the SNR threshold, chosen to maximize the tag classification accuracy of the classification module, and is represented as follows:
\[\:{V}_{\text{t}\text{h}}=\underset{SNR}{\arg\max}\:{f}_{c}\left(SNR\right)\tag{7}\]
where \(\:{f}_{c}\left(SNR\right)\) represents a classification accuracy metric dependent on SNR. This implies that when the classification accuracy is at its peak, the SNR value that corresponds to it becomes the essential threshold. The objective of the external feedback loop, as represented by Eq. (7), can be achieved using the training data.
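The switching rule and the threshold choice described above can be sketched as follows. The function names and the accuracy values are hypothetical and serve only as an illustration; breaking ties in favour of the lowest SNR is an assumption, not something stated in the text.

```python
def switch_state(snr, v_th):
    """Return (k1, k2): k1 enables cognitive control, k2 enables classification."""
    if snr < v_th:
        return ("on", "off")
    return ("off", "on")


def select_threshold(accuracy_by_snr):
    """Pick the SNR whose training accuracy f_c(SNR) is highest.

    Ties go to the first (here: lowest) SNR, an assumed choice that keeps
    the subsequent distance search as short as possible.
    """
    return max(accuracy_by_snr, key=accuracy_by_snr.get)


# Hypothetical training accuracies per candidate SNR in dB (illustrative only).
accuracy_by_snr = {5.0: 0.80, 10.0: 0.90, 15.0: 0.93, 20.0: 0.93}
v_th = select_threshold(accuracy_by_snr)
print(v_th, switch_state(12.0, v_th), switch_state(15.0, v_th))
```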
Cognitive control
If the SNR fails to meet the threshold, cognitive control is engaged, and switch \(\:{k}_{1}\:\)closes, as illustrated in Fig. 4. The aim of cognitive control is to adjust the distance from the tag to the reader antenna until the SNR meets or exceeds the threshold. Figure 5 shows a schematic representation of the search for the desired SNR, using a grid to represent the SNR values attainable at various distances. Green squares indicate coordinates that meet the required SNR, white squares indicate those that do not, and blue squares indicate the present coordinates. Although the distance variable in this illustration is one-dimensional, reflecting the distance from the tag to the reader, the actual position of the tag within the reader’s electromagnetic field is three-dimensional, so the distance variable would ideally have the same dimensionality. To streamline the search, however, the tag distance is adjusted along a single fixed axis within a two-dimensional plane. In this setup, distance adjustment has two directions: ‘up’ (moving the tag farther away) and ‘down’ (moving it closer). Because the SNR is not clearly monotonic in distance, as depicted in problem 1 in Fig. 1, a random search method is employed to reach the target threshold. The details of this algorithm are given next.
Upon application of the \(\:t\)-th action \(\:{a}_{t}\), the initial distance state \(\:{d}_{t}\) undergoes updating to \(\:{d}_{t+1}\). As the process is Markovian, with \(\:{d}_{t+1}\) determined solely by \(\:{d}_{t}\), it can be expressed as
where \(\:\mathcal{A}\) represents the action set, encompassing adjustments such as increasing or decreasing distance, articulated as
in which \(\:\delta\:\) represents the increment by which the distance is adjusted.
Let the updated SNR be denoted by \(\:{SNR}_{t+1}=Q({d}_{t},{\mathfrak{F}}_{t})\), which depends not only on the distance \(\:{d}_{t}\) but also on external factors \(\:{\mathfrak{F}}_{t}\), such as transmit power and electromagnetic conditions. Assuming that these factors do not change during the adjustment period \(\:t=1,\:2,\:\dots\:T\), \(\:{\mathfrak{F}}_{t}\) can be treated as time-invariant. After removing the subscript \(\:t\), this becomes
This random search algorithm follows a specific sequence of actions: if the revised SNR fails to meet the threshold, the same action is selected. If the revised SNR is equal to or higher than the threshold, then no further action is necessary, and the search halts. This search process is represented as
Moreover, as depicted in Fig. 5, the efficiency of the random search hinges on the choice of the initial point: a point nearer to the target can significantly reduce the number of search iterations. To determine this starting point, the algorithm employs cross-validation. It accumulates SNRs over the various tag-reader distance combinations within the training set. The combination that yields the highest accumulated SNR is then used as the starting point for the test set, represented as
where \(\:i\) is the label index for the test set. Additionally, the environmental conditions \(\:\mathfrak{F}\) influence the final SNR, so the test and training sets must be consistent to avoid model generalization issues. Hence, training should encompass various environments \(\:\mathfrak{F}\). For instance, training can provide initial points for high, medium, and low transmit power, and the corresponding point is selected according to the actual power during testing.
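One way to read the starting-point rule is as an accumulation of training SNRs per candidate distance, followed by an argmax. The sketch below follows that reading; the table layout, the function name `best_start_distance`, and all numerical values are hypothetical.

```python
def best_start_distance(snr_table):
    """Return the distance with the largest SNR summed over training tags.

    snr_table maps tag id -> {distance: SNR measured at that distance}.
    """
    totals = {}
    for per_distance in snr_table.values():
        for distance, snr in per_distance.items():
            totals[distance] = totals.get(distance, 0.0) + snr
    return max(totals, key=totals.get)


# Hypothetical training measurements (distance in m, SNR in dB).
snr_table = {
    "tag_a": {0.5: 8.0, 1.0: 14.0, 1.5: 11.0},
    "tag_b": {0.5: 12.0, 1.0: 13.0, 1.5: 9.0},
    "tag_c": {0.5: 10.0, 1.0: 9.0, 1.5: 15.0},
}
start = best_start_distance(snr_table)

# One table per environment (e.g. transmit power level), as suggested above.
start_by_power = {"high": best_start_distance(snr_table)}
print(start, start_by_power)
```

Keeping one table per transmit power level lets the test phase pick the starting distance that matches the power actually in use.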
Table 1 outlines the sequential procedure of this search algorithm.
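Since Table 1 is not reproduced here, the search can be illustrated with a minimal sketch under stated assumptions: candidate distances lie on a fixed grid with spacing \(\:\delta\:\), the initial direction is drawn at random, the same action is repeated while the SNR stays below the threshold, and the direction is reversed at either end of the range. The reversal rule, the step limit, and all names and values are assumptions rather than the authors’ procedure.

```python
import random


def search_distance(measure_snr, distances, start_idx, v_th, max_steps=50, rng=None):
    """Step along a grid of distances until the measured SNR reaches v_th.

    Returns (distance, snr, steps); the SNR may still be below v_th if
    max_steps is exhausted.
    """
    if len(distances) < 2:
        raise ValueError("need at least two candidate distances")
    if rng is None:
        rng = random.Random()
    i = start_idx
    snr = measure_snr(distances[i])
    step = rng.choice([+1, -1])  # 'up' (farther) or 'down' (closer), drawn at random
    steps = 0
    while snr < v_th and steps < max_steps:
        if not 0 <= i + step < len(distances):
            step = -step  # assumed behaviour: reverse at the end of the range
        i += step  # otherwise the same action is repeated
        snr = measure_snr(distances[i])
        steps += 1
    return distances[i], snr, steps


# Hypothetical, non-monotonic SNR profile in dB (illustrative values only).
distances = [0.2, 0.4, 0.6, 0.8, 1.0, 1.2]
profile = dict(zip(distances, [6.0, 9.0, 16.0, 11.0, 8.0, 13.0]))
d, snr, steps = search_distance(profile.get, distances, start_idx=4, v_th=15.0,
                                rng=random.Random(1))
print(d, snr, steps)
```

With this profile the search ends at the only grid point that meets the threshold, after two steps if the first direction is ‘down’ and four if it is ‘up’, which illustrates why a good starting point shortens the search.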
Feature and classification
The raw tag response signal is processed to generate four groups of signals: the desired signal, standard (normalized) signal, noise signal, and baseband signal. These signals serve as the basis for feature extraction, including traditional time-domain statistics such as mean, variance, maximum autocorrelation, skewness, Shannon entropy, second central moment and kurtosis13,36. Frequency-domain characteristics such as the gravity (centroid) frequency, mean square frequency, root-mean-square frequency, frequency standard deviation and spectral kurtosis36 are also extracted. Moreover, other time-domain features such as form factor, margin factor, standard deviation, pulse factor, root mean square and crest factor are taken into account. Tables 2 and 3 provide a detailed breakdown of these characteristics. In total, twenty-six characteristics are extracted for every signal group, resulting in one hundred and four characteristics from the four groups.
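A few of the time-domain features named above can be computed as in the following sketch. The formulas are the conventional textbook definitions and may differ in detail from those in Tables 2 and 3; only ten of the twenty-six features per group are shown, the signal values are hypothetical, and each signal is assumed to be non-constant.

```python
import math


def time_features(x):
    """Conventional time-domain statistics of a non-constant signal x."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n  # population variance
    std = math.sqrt(var)
    rms = math.sqrt(sum(v * v for v in x) / n)
    abs_mean = sum(abs(v) for v in x) / n
    peak = max(abs(v) for v in x)
    sqrt_mean = sum(math.sqrt(abs(v)) for v in x) / n
    return {
        "mean": mean,
        "variance": var,
        "std": std,
        "skewness": sum((v - mean) ** 3 for v in x) / (n * std ** 3),
        "kurtosis": sum((v - mean) ** 4 for v in x) / (n * std ** 4),
        "rms": rms,
        "form_factor": rms / abs_mean,
        "crest_factor": peak / rms,
        "pulse_factor": peak / abs_mean,
        "margin_factor": peak / sqrt_mean ** 2,
    }


# Hypothetical signal groups (illustrative values only).
baseband = [0.21, 0.98, 0.19, 1.03, 0.22, 0.97]
desired = [0.2, 1.0, 0.2, 1.0, 0.2, 1.0]
groups = {
    "baseband": baseband,
    "desired": desired,
    "normalized": [v / 0.7 for v in baseband],
    "noise": [b - e for b, e in zip(baseband, desired)],
}
# One flat feature vector: every feature of every signal group.
feature_vector = {
    f"{name}_{key}": value
    for name, signal in groups.items()
    for key, value in time_features(signal).items()
}
print(len(feature_vector))
```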
Following feature extraction, the effectiveness of each feature for group classification needs to be assessed, which calls for a feature selection method. Feature selection methods are typically divided into filter, embedded, and wrapper methods. Because the performance of filter methods is independent of the classifier, this study adopts filter-based feature selection.
Consider a sample (cell) in the training dataset, \(\:{\upchi\:}\)=\(\:\langle\mathbf{S},y\rangle\), where \(\:\mathbf{S}\) is the signal feature vector containing elements \(\:{s}^{\left(1\right)},{s}^{\left(2\right)},\:\dots\:{s}^{\left(N\right)}\), and \(\:y\) is its classification category. Compute the weight \(\:{\omega\:}^{\left(m\right)}\) of each feature \(\:{s}^{\left(m\right)}\) in the training set and sort the weights in descending order. Subsequently, choose the \(\:W\) features that have the highest weights, defined as
Organize the chosen characteristics into a new cell \(\:{{\upchi\:}}^{\text{K}}\)=\(\:\langle{\mathbf{X}}^{\text{K}},y\rangle\), thereby generating a new training set \(\:\mathcal{K}\), satisfying
where \(\:{\mathbf{X}}^{\text{K}}\:\)= [\(\:{x}^{\left({q}_{1}\right)},{x}^{\left({q}_{2}\right)},\:\dots\:{x}^{\left({q}_{W}\right)}\)]. Similarly, for a cell within a testing dataset denoted as \(\:{{\upchi\:}}^{\text{T}}\)=\(\:\langle{\mathbf{X}}^{\text{T}},y\rangle\), ensure that the test set \(\:\mathcal{T}\) fulfills
where \(\:{\mathbf{X}}^{\text{T}}\) is the vector comprising the top \(\:W\) weighted features. Once feature selection has been completed, cross-validation can be carried out. Once the weight \(\:w\) of the classifier \(\:{f}_{\text{c}\text{l}\text{a}\text{s}}(\bullet\:)\) satisfies
the training phase concludes, and the test outcome is obtained from
The classification accuracy can be calculated by comparing the predicted category \(\:\widehat{y}\) with the expected category \(\:y\).
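The filter step, ranking features by weight and keeping the top \(\:W\), can be sketched as follows. The weights would come from a filter method such as ReliefF or the chi-square test computed on the training set; here they are hypothetical numbers, as are the function names and sample values. The same indices are applied to the training and test samples.

```python
def top_w_indices(weights, w):
    """Indices of the w largest feature weights, in descending weight order."""
    order = sorted(range(len(weights)), key=lambda m: weights[m], reverse=True)
    return order[:w]


def project(samples, indices):
    """Keep only the selected feature columns of every sample."""
    return [[row[q] for q in indices] for row in samples]


# Hypothetical weights for five features, computed on the training set only.
weights = [0.1, 0.7, 0.0, 0.4, 0.9]
selected = top_w_indices(weights, 3)

train = [[1.0, 2.0, 3.0, 4.0, 5.0], [6.0, 7.0, 8.0, 9.0, 10.0]]
test = [[11.0, 12.0, 13.0, 14.0, 15.0]]
train_sel = project(train, selected)
test_sel = project(test, selected)  # same columns as the training set
print(selected, train_sel, test_sel)
```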
Table 4 displays the instructions for carrying out the CRC algorithm.
Experiment setup
Data generation
The experimental setup uses passive UHF tags that comply with the EPC C1 Gen2 specification. A total of 210 tags representing seven common types available on the market are employed. These tags, manufactured by three distinct companies, are detailed in Table 5. Prior to data collection, all 210 tags are programmed with identical EPC codes. The writing process is executed using the UHF100U writer manufactured by Guangzhou Wangyuan Electronics. Refer to Table 6 for the parameters of the writer. Data are collected using a UHF RFID system37 operated by a USRP software-defined radio. The system adheres to the EPC C1 Gen2 standard, and its software is implemented in GNU Radio. Comprehensive parameters of the system are outlined in Table 7. The source code is available at https://github.com/nkargas/Gen2-UHF-RFID-Reader.
During each data collection session, only a single tag is placed inside the reader’s electromagnetic field to avoid collisions between multiple tags. Notably, data collection occurs in a non-isolated environment, potentially susceptible to thermal noise, cellular device interference, wireless communication signals, and RF interference, among other sources. Tags are positioned at random along the bisector of the angle formed between the two antennas, as illustrated in Fig. 6. Data are recorded for each tag for 10 s, during which the EPC signal is randomly interleaved with silent periods, as depicted in Fig. 7.
Algorithm and classification techniques
In this experimental setup, the CRC unit regulates the SNR of the received signal so that it meets the classification criterion. The parameters of this module are listed in Table 8, and the distance search algorithm is detailed in Table 1.
This experimentation employs a pair of distinct cross-validation methods to derive classifying outcomes. One approach involves tags sourced from various types or manufacturers, while the other focuses on tags originating from the same type and manufacturer. Below are the specifics of each method.
Firstly, cross-validation involving distinct tag categories or manufacturers (CrossVal I) employs a 5-fold approach, as depicted in Fig. 8. There are 30 tags in each tag category or manufacturer, with a total of \(\:L\) = 7 categories of tags or manufacturers. The \(\:l\)-th type is designated as the true set, while the \(\:m\)-th category (where \(\:l\ne\:m\)) is the pseudo set. Training sets \(\:{\mathcal{S}}_{l}\), \(\:{\mathcal{S}}_{m}\) and test sets \(\:{\mathcal{T}}_{l}\), \(\:{\mathcal{T}}_{m}\) are then formed. Subsequently, the accuracy of classifying the \(\:l\)-th type is determined by averaging the outcomes of binary classification obtained from the test set \(\:{\mathcal{T}}_{l}\) and every \(\:{\mathcal{T}}_{m}\), with \(\:m\) ranging from 1 to \(\:L\) and \(\:l\ne\:m\).
Secondly, cross-validation involving identical tag categories and manufacturers (CrossVal II) also adopts a 5-fold methodology, as illustrated in Fig. 8. Every tag category comprises 30 tags. Initially, a tag is randomly selected from the \(\:l\)-th category, and 29 data samples from this tag serve as the true set, labeled ‘1’. Subsequently, 29 data samples are collected from the remaining 29 tags of the same type to form the false set, labeled ‘0’. These sets are partitioned into training sets \(\:{\mathcal{S}}_{l}^{1}\), \(\:{\mathcal{S}}_{l}^{0}\) and test sets \(\:{\mathcal{T}}_{l}^{1}\), \(\:{\mathcal{T}}_{l}^{0}\), respectively. The classification accuracy for the \(\:l\)-th tag type is determined from the binary classification results obtained on the test sets \(\:{\mathcal{T}}_{l}^{1}\) and \(\:{\mathcal{T}}_{l}^{0}\).
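The 5-fold partition and the averaging used in CrossVal I can be sketched as follows. Splitting by tag index with a seeded shuffle is an assumption, since the text does not state how the folds are drawn, and the function names and accuracy values are hypothetical.

```python
import random


def k_fold_splits(n_items, k=5, seed=0):
    """Yield (train, test) index lists; every item is tested exactly once."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i, test in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


def crossval1_accuracy(pairwise_acc):
    """Average the binary accuracies of one true type against each other type."""
    return sum(pairwise_acc.values()) / len(pairwise_acc)


splits = list(k_fold_splits(30))  # 30 tags per type, as in the text
print([len(test) for _, test in splits])

# Hypothetical accuracies of type l = 1 against types m = 2 and m = 3.
acc_l = crossval1_accuracy({2: 0.9, 3: 0.7})
```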
It is crucial to analyze various aspects that could alter the classification outcomes in this experiment, including feature selection, the quantity of selected features, the recently appended time- and frequency-based features, and diverse classifiers, such as SVM, RF, KNN and Vgg16, as outlined below:
- CRC: the approach proposed in this paper;
- 7 with SVM: the classic 7 time-based features13 are extracted from the baseband signal of EPC tags, without feature selection, and a Support Vector Machine (SVM) classifier is employed;
- 28 with SVM: the entire set of twenty-eight features is extracted from the tag’s EPC signal, including baseband, normalization, expected, and noise components32, without feature selection, and an SVM classifier is employed;
- 104 with SVM: additional frequency- and time-based features are incorporated, increasing the feature count to 104 (see Tables 2 and 3), and an SVM classifier is employed without feature selection;
- 104 with ReliefF: ReliefF feature selection33 is applied to the 104 features, selecting those with weights exceeding 0, and an SVM classifier is employed;
- 21/41/61/81/101 with chi2: the chi-square test for feature selection34 is applied to the 104 features, selecting \(\:W\) = 21, 41, 61, 81, and 101 features, and an SVM classifier is employed;
- 21/41/61/81/101 with fsulaplacian: Laplacian-based feature selection38 is applied to the 104 features, selecting \(\:W\) = 21, 41, 61, 81, and 101 features, and an SVM classifier is employed;
- 7/28/104 with RF: the 7-, 28-, and 104-feature sets are used with a random forest classifier39 of 50 trees, with no feature selection;
- 7/28/104 with KNN: the 7-, 28-, and 104-feature sets are used with a K-nearest neighbors classifier40 set to consider 3 neighbors, with no feature selection;
- VGG16: a deep neural network that takes as input the time-frequency distribution obtained through the wavelet transform of the tag signal10,11,13.
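To make the KNN configuration listed above concrete, the following is a minimal sketch of a majority-vote nearest-neighbor rule with 3 neighbors. It assumes Euclidean distance and unweighted voting, neither of which is stated in the text, and the function name `knn_predict` is hypothetical. It is not the implementation used in the experiments.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest
    training samples (Euclidean distance, unweighted votes: both assumed)."""
    # Pair every training sample's distance to the query with its label.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

if __name__ == "__main__":
    # Toy two-dimensional feature vectors: label 1 = true set, 0 = pseudo set.
    train_x = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
    train_y = [1, 1, 1, 0, 0, 0]
    print(knn_predict(train_x, train_y, (0.2, 0.1)))
    print(knn_predict(train_x, train_y, (5.5, 5.5)))
```

In the experiments the feature vectors would have 7, 28, or 104 components rather than two; the voting rule is unchanged.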
In this experiment, the classification performance is assessed using the classification accuracy, denoted as \(\:acc\), defined by the formula
$$acc=\frac{TP+TN}{TP+TN+FP+FN},$$
where \(\:TP\) represents the count of true positives, \(\:TN\) is the count of true negatives, \(\:FP\) is the count of false positives, and \(\:FN\) is the count of false negatives.
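As a small worked example, the accuracy can be computed directly from the four counts. The sketch below assumes the standard definition of accuracy as the fraction of correctly classified test samples; the helper names `confusion_counts` and `accuracy` are illustrative only.

```python
def confusion_counts(y_true, y_pred):
    """Count TP, TN, FP, FN for binary labels (1 = true set, 0 = pseudo set)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn

def accuracy(tp, tn, fp, fn):
    """Fraction of correctly classified samples among all test samples."""
    total = tp + tn + fp + fn
    if total == 0:
        raise ValueError("no test samples")
    return (tp + tn) / total

if __name__ == "__main__":
    y_true = [1, 1, 1, 0, 0, 0]
    y_pred = [1, 1, 0, 0, 0, 1]
    print(accuracy(*confusion_counts(y_true, y_pred)))
```

For instance, with \(\:TP\) = 5, \(\:TN\) = 4, \(\:FP\) = 1 and \(\:FN\) = 0, the accuracy is 9/10 = 0.9.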
Experimental findings
Preprocessing outcomes
Figure 9 contrasts the response signals from the same tag before and after adjustment with CRC: the signal on the left has more noise, while the signal on the right has less noise. It reveals that when distance values are chosen randomly, the upper boundary of the response signal appears jagged, indicating a relatively low SNR. However, once CRC adjustments are implemented, the received signal boundary becomes regular and orderly, resulting in a significantly improved SNR. In Fig. 10, seven samples of SNR heatmaps for different random tags are presented. These samples demonstrate noticeable variations in SNR across different distances and tags. The main goal of CRC is to reach the target SNR value, which is based on the noise signal power and desired signal power as per Eq. (5). As can be seen from the figure, SNR does not vary monotonically with distance, so dynamically adjusting the distance can obtain the desired SNR and better classification performance. Figure 11 displays the preprocessed desired signal and interference signal. The features used for classification will be extracted from these four groups of preprocessed signals.
An example of SNR heatmaps for seven random tags, where the types of tags can be seen in Table 5.
Furthermore, Fig. 12 illustrates the number of distance adjustments performed on seven random tags, indicating the number of searches needed to achieve the target SNR. The data indicate that the search method employed requires fewer than 0.5 searches per tag on average. It is important to note that the results in Fig. 12 represent averages, and due to differences in the sensitivity of various tags, their search outcomes will vary. The fewer the searches, the shorter the computing time. For instance, if a tag belongs to type 7 in Fig. 10, it may require fewer searches because more grids meet the condition; conversely, if the tag is type 2, it may require more searches to reach the same SNR target.
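The distance adjustment can be pictured as a search over candidate positions, as in the following sketch. It rests on several assumptions that are not taken from the paper: SNR is expressed in decibels as \(10\log_{10}\) of the ratio of signal power to noise power (Eq. (5) is not reproduced here), candidate positions are visited in order of increasing index distance from a starting position, and a "search" is counted as one move away from that start. The names `snr_db`, `search_distance`, and `measure_snr_db` are hypothetical.

```python
import math

def snr_db(signal_power, noise_power):
    """SNR in decibels from desired-signal power and noise power (assumed form)."""
    return 10.0 * math.log10(signal_power / noise_power)

def search_distance(candidates, measure_snr_db, target_db, start_index=0):
    """Visit candidate positions nearest-first from start_index.

    Returns (position, searches), where `searches` counts the positions
    tried after the starting one; it is 0 when the start already meets
    the target. Returns (None, tried) when no position meets the target.
    """
    order = sorted(range(len(candidates)), key=lambda i: abs(i - start_index))
    for searches, i in enumerate(order):
        if measure_snr_db(candidates[i]) >= target_db:
            return candidates[i], searches
    return None, max(len(candidates) - 1, 0)

if __name__ == "__main__":
    # Made-up SNR values in dB per distance in metres, for illustration only.
    table = {0.2: 5.0, 0.4: 12.0, 0.6: 20.0, 0.8: 8.0}
    distances = sorted(table)
    print(search_distance(distances, table.get, target_db=15.0, start_index=1))
    print(search_distance(distances, table.get, target_db=15.0, start_index=2))
```

Under this counting convention, a starting position that is already close to the target, for example one obtained from pre-training, keeps the average number of searches low.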
CrossVal I
In this sub-section, we examine the outcomes of CrossVal I cross-validation, which assesses the classification performance across various categories and manufacturers. Figure 13 shows how the SVM classifier’s accuracy varies with the number of features used (7, 28, and 104), both with and without CRC. The data in Fig. 13 make it evident that the introduction of CRC consistently enhances classification accuracy, regardless of the number of features. This result compares the performance of the method with and without CRC, mainly showing the classification performance for tags of different types or different manufacturers. Furthermore, as the number of features increases, with 104 features outperforming twenty-eight, and twenty-eight outperforming seven, there is a notable improvement in classification accuracy, underscoring the positive impact of feature augmentation. Figure 14 depicts the average classification accuracy, that is, the average of the results in Fig. 13, and echoes the same trends: the approach incorporating CRC outperforms the one without in terms of accuracy.
Subsequently, Figs. 15, 16, 17 and 18 display the classification outcomes achieved with RF and KNN classifiers, respectively. Figure 15’s result is similar to Fig. 13, but the classifier uses RF. Figure 16’s result is the average of the results in Fig. 15. Figure 17’s result is similar to Figs. 13 and 15, but the classifier uses KNN. Figure 18’s result is the average of the results in Fig. 17. These results indicate a significant enhancement in classification accuracy after integrating CRC, irrespective of the classifier used. This underscores the universal efficacy of the proposed CRC method in bolstering classification performance, independent of the classifier employed. Furthermore, increasing the number of features is consistently associated with higher classification accuracy across all classifiers.
Lastly, Figs. 19, 20 and 21 compare the classification accuracy with and without feature selection when utilizing 104 features as input for the SVM model. Figure 19’s result is used to compare the performance of the method with or without ReliefF feature selection. It shows the classification performance for tags of different types or different manufacturers, and also considers the performance of the method with or without CRC. Figure 20’s result is the average of the results in Fig. 19. Figure 21’s result is used to compare the performance of the method with or without fsulaplacian feature selection, again for tags of different types or different manufacturers and with or without CRC. Feature selection techniques such as ReliefF, fsulaplacian and chi2 are evaluated. Notably, regardless of whether feature selection is employed or the specific technique used, the classification accuracy of the 104 features is not significantly affected. Additionally, the inclusion of CRC does not bring about a significant change in how feature selection affects the 104 features. These findings diverge from those reported in the literature41, where the study suggests that when employing twenty-eight features, feature selection methods can yield better performance compared to scenarios without feature selection. Moreover, the number of selected features also influences feature selection outcomes, thereby affecting classification performance differently.
CrossVal II
In this sub-section, we examine the results of CrossVal II cross-validation, which specifically assesses the classification performance for tags originating from the same type and manufacturer. Figures 22, 23, 24, 25, 26 and 27 present the classification accuracy outcomes obtained using three classifiers: SVM, RF and KNN. Similar to the trends observed in CrossVal I, the implementation of CRC consistently leads to an increase in average classification accuracy across all classifiers, albeit with a less pronounced improvement compared to CrossVal I. Additionally, the classification accuracy remains closely linked to the number of features utilized. The highest classification accuracy is achieved when 104 features are employed as input to the classifier, surpassing scenarios with only seven or twenty-eight features. Notably, when RF serves as the classifier, the average classification accuracy can reach 95.8%. It is crucial to emphasize that, as illustrated in Fig. 22, without employing CRC, the classification accuracy of Type 6 occasionally surpasses the accuracy achieved with CRC. The underlying reasons for this phenomenon will be explored in the subsequent discussion section. Figure 22’s result is used to compare the performance of the method with or without CRC, mainly showing the classification performance for tags of the same type and manufacturer. Figure 23’s result is the average of the results in Fig. 22. Figure 24’s result is similar to Fig. 22, but the classifier uses RF. Figure 25’s result is the average of the results in Fig. 24. Figure 26’s result is similar to Figs. 22 and 24, but the classifier uses KNN. Figure 27’s result is the average of the results in Fig. 26.
Figures 28, 29 and 30 compare the classification accuracy with and without feature selection when using 104 features as input to the SVM under the CrossVal II condition. Different techniques for selecting features, such as ReliefF, chi2 and fsulaplacian, are examined. Upon analyzing the data in these figures, it becomes apparent that irrespective of whether feature selection is applied or the specific technique employed, there is no notable effect on the classification accuracy obtained with the 104 features. Furthermore, consistent with the observations in Figs. 19, 20 and 21, the introduction of feature selection has an insignificant effect on the 104 features, regardless of the presence of CRC. Hence, it can be inferred that feature selection is not imperative here, although in general it can potentially mitigate model overfitting and bolster generalization ability. Figure 28’s result is used to compare the performance of the method with or without ReliefF feature selection. It shows the classification performance for tags of the same type and the same manufacturer, and also considers the performance of the method with or without CRC. Figure 29’s result is similar to Fig. 28, but the feature selection method uses chi2. Figure 30’s result is similar to Figs. 28 and 29, but the feature selection method uses fsulaplacian. Figure 31’s result is used to judge the performance of VGG16 combined with CRC, and the result is the average of the classification performance of seven categories of tags.
Moreover, we present the classification results using the VGG16 deep learning model, as depicted in Fig. 31. The mean accuracy of this model is not exceptionally high, possibly because deep learning typically requires large data volumes while the tag data utilized in this study is relatively limited. Nevertheless, the classification accuracy of the VGG16 model with CRC exceeds that of the model without CRC. This serves as further confirmation of the efficacy of CRC in enhancing classification performance.
Other results
Because the copper sheet is a metal conductor, it couples with the tag and causes interference. In addition, the copper sheet reflects part of the carrier emitted by the reader, which also interferes with the signal. We therefore placed a copper sheet within the reader antenna range to generate interference. The copper sheet is 10 cm long and 5 cm wide, and it is placed about 20 cm and 40 cm away from the antenna to test the algorithm in this paper. As the tag moves to different positions, the interference of the copper sheet on the tag, such as coupling interference, will change. Therefore, the algorithm in this paper can find an appropriate position to obtain a larger signal-to-noise ratio or signal-to-interference ratio. Figure 32 shows the experimental results under the interference of the copper sheet. The results show that when there is interference, the algorithm in this paper can still obtain a classification accuracy of more than 90%, while the classification accuracy of the traditional method is less than 90%.
In addition, we used ReliefF to calculate the weights of the 104 features, as shown in Fig. 33. In the figure, positive weights indicate positive contributions to the classifier, and negative weights indicate negative contributions. As can be seen from the figure, all but one of the features have positive weights, which is consistent with the results mentioned before, that is, the classification accuracy is highest when all 104 features are selected. This result shows that most of the features are meaningful, so there is no need to remove redundant features through feature selection methods.
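The weight-based selection rule used with ReliefF (keep the features whose weight exceeds 0) can be sketched as follows. The weights in the example are made-up numbers, not the values in Fig. 33, and the function names `select_positive_weight` and `keep_columns` are hypothetical. Computing the ReliefF weights themselves is outside the scope of this sketch.

```python
def select_positive_weight(weights):
    """Return the indices of features whose weight is strictly positive."""
    return [i for i, w in enumerate(weights) if w > 0]

def keep_columns(samples, indices):
    """Keep only the selected feature columns of each sample (row)."""
    return [[row[i] for i in indices] for row in samples]

if __name__ == "__main__":
    # Illustrative weights for five features; one is negative.
    weights = [0.12, 0.03, -0.01, 0.08, 0.20]
    selected = select_positive_weight(weights)
    print(selected)
    samples = [[1.0, 2.0, 3.0, 4.0, 5.0], [6.0, 7.0, 8.0, 9.0, 10.0]]
    print(keep_columns(samples, selected))
```

When nearly every weight is positive, this rule removes very few features, which fits the observation that selection changes the accuracy only marginally.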
Discussion and conclusion
This paper investigates the application of RFID tag technology for anti-counterfeiting purposes, particularly focusing on physical layer identification technology. Renowned for its affordability and simplicity in deployment, the technology traditionally operates effectively in environments with robust signal-to-noise ratios, accurately classifying tags according to extracted features. However, challenges arise in environments with weak signals, potentially compromising SNR and thereby impacting tag classification accuracy. In order to tackle this issue, the study proposes integrating CRC into the tag classification process, aiming to optimize SNR by adjusting reader-tag distances, thus bolstering classification accuracy across diverse environmental conditions.
In the experimental phase, we utilize the Universal Software Radio Peripheral (USRP) to validate the efficacy of the proposed approach. Initially, we assess the number of searches required to achieve the desired SNR, calculating the SNR of the tag’s response signal using the USRP. If the desired target is not attained, adjustments to the tag’s position are made. Unlike conventional CRC methods, we forego the use of the Q algorithm for adjustment, as we find an abundance of locations meeting the target SNR at a given distance. Consequently, the adjustment process more closely resembles a search problem. To minimize the search effort, we establish an initial power value close to the target through pre-training. Results indicate an average number of searches of less than 0.5.
During the assessment of CRC options, special emphasis is placed on the enhancement of tag classification accuracy following SNR improvement. Combining SNR enhancement with feature extraction accuracy, we observe a notable improvement in classification accuracy. For tags of varying types or manufacturers, i.e. the results in CrossVal I, CRC leads to an increase in the average classification accuracy of approximately 15%. This enhancement is consistent across three types of classifiers. However, when dealing with tags of the identical type and manufacturer, i.e. the results in CrossVal II, the average classification accuracy improvement is modest, around 8%. This could be explained by the relatively high accuracy in classifying tags from the same type and brand without CRC, as noted in existing literature13,42,43. Even in the absence of CRC, the average accuracy of classification is over 90%.
Furthermore, the study presents a detailed analysis of extracting tag signal features. Our analysis expands from the time domain to incorporate frequency domain characteristics like spectral kurtosis. Additionally, a total of 104 features are derived by extracting characteristics such as peaks and pulses from four kinds of EPC signal. Theoretically, a higher number of features is expected to provide a more detailed depiction of the tag signal, which can lead to improved classification accuracy for various tags. Empirical findings validate a roughly 2.4% enhancement in classification accuracy when categorizing various types of tags using the proposed 104 features as opposed to the conventional 28 features, and a 6.6% improvement compared to utilizing only 7 features. Similar improvements are observed when classifying similar tags.
Another notable finding is the reduced dependency of tag classification on feature selection with an increasing number of features. In experiments involving the classification of different tag types, classification accuracy remains largely unaffected regardless of feature selection or the amount of selected features. This observation may be attributed to potential correlations or overlaps between different features as the number of features increases from 7 to 28 to 104. Consequently, even without selecting some features, the remaining ones can still provide similar information, resulting in minimal changes in classification accuracy. This finding offers an advantage by mitigating potential issues associated with determining the optimal number of features during feature selection, thereby circumventing overfitting or underfitting problems.
However, the experimental results also entail some limitations. Firstly, the assumption of static magnetic fields between the tag and the reader overlooks the potential influence of external environmental factors on the SNR. Consequently, adjustments solely based on changing the tag distance may not adequately address fluctuations in the electromagnetic environment. Additionally, the study primarily focuses on Alien-type tags that are typically available for purchase. To enhance the applicability of physical layer technology across a broader spectrum of classification scenarios, future research should involve testing a wider variety of tags and manufacturers. Subsequent endeavors will prioritize constructing a more extensive tag training dataset to enhance the algorithm’s functionality and adaptability.
Data availability
Sequence data that support the findings of this study have been deposited at https://github.com/monk5469/Distance-based-CRC.
References
Nayak, R. Radio Frequency Identification (RFID): Technology and Application in Garment Manufacturing and Supply Chain (Chapman and Hall/CRC., 2019).
Ahson, S. A. & Ilyas, M. RFID Handbook: Applications, Technology, Security, and Privacy (CRC, 2017).
Kaur, J. & Kumar, A. Enhanced security in RFID using TLS protocol. Int. J. Eng. Technol. 6 (4), 1391–1396 (2014).
Huang, S., Wu, T. & Lai, C. An EPCglobal compliant RFID security scheme based on IPsec protocol. J. Netw. Comput. Appl. 34 (2), 556–565 (2011).
Rostami, M., Suriadi, S. & Barker, K. Towards secure RFID systems: a survey on attacks and challenges. J. Netw. Comput. Appl. 95, 42–54 (2017).
Tan, X., Li, F. & Zhu, H. RFID tag security enhancement with permutation-based hash function. Int. J. Distrib. Sens. Netw. 15 (4), 1550147719837713 (2019).
Rieback, M. R., Crispo, B. & Tanenbaum, A. S. RFID Guardian: a battery-powered mobile device for RFID privacy management. Proceedings of the 2006 ACM Symposium on Information, Computer and Communications Security, 220–230. (2006).
Chen, L., Zheng, Z. & Guan, X. Anti-collision protocol for passive RFID system using multiple faraday cages. Inf. Sci. 300, 1–10 (2015).
Lee, W., Lee, J. & Chang, K. Physical unclonable function-based low-power RFID authentication protocol for anti-counterfeiting. IEEE Trans. Industr. Electron. 61 (6), 2856–2865 (2014).
O’Shea, T. J., Roy, T. & Clancy, T. C. Over-the-air deep learning based radio signal classification. IEEE J. Selec. Topics Signal Process. 12 (1), 168–179 (2018).
Wei, S., Qu, Q., Zeng, X. & Liang, J. Self-attention bi-lstm networks for radar signal modulation recognition. IEEE Trans. Microwave Theory Tech. 69 (11), 5160–5172 (2021).
Huang, H. & Zhu, H. RFID security and privacy: a research survey. J. Netw. Comput. Appl. 52, 1–10 (2015).
Bertoncini, C. et al. Wavelet fingerprinting of radio-frequency identification (RFID) tags. IEEE Trans. Industr. Electron. 59 (12), 4843–4850 (2012).
Yan, J., Liu, L., Huang, J. & Li, Y. A survey of RFID anti-counterfeiting technologies. J. Intell. Manuf. 28 (2), 259–268 (2017).
Feng, S. & Haykin, S. Cognitive risk control for transmit-waveform selection in vehicular radar systems. IEEE Trans. Veh. Technol. 67 (10), 9542–9556 (2018).
Wu, H., Pu, C., Gao, W. & Zeng, Y. Cognitive risk control for physical-layer RFID counterfeit tag identification. IEEE Trans. Instrum. Meas. 72, 1–15 (2023).
EPC radio frequency identity protocols class-1 generation-2 UHF RFID protocol for communications at 860 MHz-960 MHz, Version 2.0. 1, EPCglobal, G. S. Inc., Brussels, BE, 2015. (2015).
Technology – Automatic Identification and Data Capture Techniques – Part 10: Crypto Suite AES-128 Security Services for Air Interface Communications, ISO/IEC 29167-10:2017, Sept. (2017).
Avoine, G., Castelluccia, C. & Laszka, A. RFID security and privacy: a research survey. Proc. IEEE. 107 (2), 227–282 (2019).
Peris-Lopez, P., Tapiador, J. E. & Li, T. RFID security: a survey. J. Netw. Comput. Appl. 59, 403–422 (2016).
Poon, K. K. & Domingo, M. C. A survey on RFID security and privacy-relevant issues. IEEE Commun. Surv. Tutorials. 13 (4), 559–581 (2011).
Guo, Y., Yang, J. & Liu, B. Application of chaotic encryption algorithm based on variable parameters in RFID security. EURASIP J. Wirel. Commun. Netw. 2021(1), 1–17 (2021).
El Abkari et al. RFID system for hospital monitoring and medication tracking using digital signature. Digital Technologies and Applications: Proceedings of ICDTA 21, Fez, Morocco. Cham: Springer International Publishing, 1051–1060. (2021).
Elngar, A. A. et al. Augmenting security for electronic patient health record (ePHR) monitoring system using cryptographic key management schemes. Fusion: Pract. Appl. 5 (2), 42–52 (2021).
Yang, S. S. et al. Design and implementation of active access control system by using nfc-based eap-aka protocol. Wireless Pers. Commun. 118, 2487–2503 (2021).
Gao, M. & Lu, Y. URAP: a new ultra-lightweight RFID authentication protocol in passive RFID system. J. Supercomputing. 78 (8), 10893–10905 (2022).
Dobrykh, D. et al. Hardware RFID security for preventing far-field attacks. IEEE Trans. Antennas Propag. 70 (3), 2199–2204 (2021).
Kwon, O., Lee, S. J. & Lee, H. J. Reflective jamming attack and countermeasure in RFID systems. J. Netw. Comput. Appl. 36 (2), 633–643 (2013).
Han, B., Xu, H. & Xu, L. A temperature-sensitive RFID tag for anti-counterfeiting application. IEEE Trans. Industr. Electron. 65 (2), 1501–1509 (2018).
Eom, T., Shin, D. & Kim, K. An efficient RFID mutual authentication protocol based on distance bounding. In Proceedings of the 2014 International Conference on Information and Communication Technology Convergence, pp. 200–204, (2014).
Liu, Z., Xu, X., Zeng, Q. & Lu, J. An anti-counterfeit RFID system based on the harmonic analysis of tag signal. IEEE Trans. Instrum. Meas. 66 (6), 1039–1049 (2017).
Wu, H. et al. Feature selection and cross validation for physical-layer RFID counterfeit tag identification. IEEE Trans. Instrum. Meas. 71, 1–14 (2022).
Robnik-Šikonja, M. & Kononenko, I. Theoretical and empirical analysis of ReliefF and RReliefF. Mach. Learn. 53, 23–69 (2003).
Satorra, A. & Bentler, P. M. A scaled difference chi-square test statistic for moment structure analysis. Psychometrika, 66(4), 507–514 (2001).
Wu, H., Wu, X., Li, Y. & Zeng, Y. Collision resolution with FM0 signal separation for short-range random multi-access wireless network, in IEEE Transactions on Signal and Information Processing over Networks, vol. 7, pp. 438–450, (2021). https://doi.org/10.1109/TSIPN.2021.3093000
Antoni, J. The spectral kurtosis: a useful tool for characterising non-stationary signals. Mech. Syst. Signal Process. 20(2), 282–307 (2006).
Kargas, N., Mavromatis, F. & Bletsas, A. Fully-coherent reader with commodity SDR for Gen2 FM0 and computational RFID. IEEE Wirel. Commun. Lett. 4 (6), 617–620 (2015).
He, X., Cai, D. & Niyogi, P. Laplacian Score for Feature Selection. NIPS Proceedings. (2005).
Breiman, L. Random forests. Mach. Learn. 45, 5–32, (2001).
Zhang, H., Wang, Z., Xia, W., Ni, Y. & Zhao, H. Weighted adaptive KNN algorithm with historical information fusion for fingerprint positioning. IEEE Wirel. Commun. Lett. 11 (5), 1002–1006. https://doi.org/10.1109/LWC.2022.3152610 (2022).
Costa, F. et al. A robust differential-amplitude codification for chipless RFID. IEEE Microw. Wirel. Compon. Lett. 25 (12), 832–834 (2015).
Romero, H. P., Remley, K. A., Williams, D. F. & Wang, C. Electromagnetic measurements for counterfeit detection of radio frequency identification cards. IEEE Trans. Microw. Theory Tech. 57 (5), 1383–1387 (2009).
Danev, B., Heydt-Benjamin, T. S. & Capkun, S. Physical-layer identification of RFID devices. In Proc. 18th USENIX Secur. Symp., San Jose, CA, 199–214 (2009).
Acknowledgements
This work was supported in part by the Natural Science Foundation of China under Grant 62161052 and in part by the Yunnan Key Laboratory of Unmanned Autonomous System.
Author information
Contributions
Wu is the originator of the concept and the guiding force behind the research idea. He provided the theoretical framework and oversaw the overall direction of the project. Wang was the principal experimentalist responsible for conducting the majority of the experiments and data collection. He played a pivotal role in the execution of the research and the analysis of the results. Pu provided critical guidance and assistance to Wang throughout the experimental process. He contributed to the experimental design and helped troubleshoot issues that arose during the research. Ma filled in gaps by completing additional work necessary for the completion of the research. His contributions were essential for ensuring the comprehensiveness and accuracy of the study. Zeng provided valuable insights and feedback on the research findings. She contributed to the interpretation of the data and the refinement of the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.
About this article
Cite this article
Wu, H., Wang, S., Pu, C. et al. Enhancing counterfeit RFID tag classification through distance based cognitive risk control. Sci Rep 15, 4150 (2025). https://doi.org/10.1038/s41598-025-87809-8
DOI: https://doi.org/10.1038/s41598-025-87809-8