Efficient IP address retrieval using a novel octet based encoding technique for high speed lookup to improve network performance

Sonai, Veeramani; Bharathi, Indira; Jamal, Sajjad Shaukat; Bassfar, Zaid; Nooh, Sameer Abdullah

doi:10.1038/s41598-024-84221-6

Download PDF

Article
Open access
Published: 17 January 2025

Efficient IP address retrieval using a novel octet based encoding technique for high speed lookup to improve network performance

Veeramani Sonai¹^na1,
Indira Bharathi²^na1,
Sajjad Shaukat Jamal³^na1,
Zaid Bassfar⁴^na1 &
…
Sameer Abdullah Nooh⁵^na1

Scientific Reports volume 15, Article number: 2254 (2025) Cite this article

809 Accesses
Metrics details

Subjects

Abstract

As the Internet becomes increasingly popular, the number of users connected to it grows significantly. Consequently, the packet processing speed of network systems, such as routers, must be enhanced. IP lookup is a critical task used to find the next hop address by searching for the longest prefix match in the forwarding information base (FIB). The binary trie is one of the most popular software-based approaches for IP lookup. Prefix compression techniques can improve both the time and space complexity of IP lookups, thereby enhancing overall system performance. This paper proposes an efficient deterministic approach to encoding IP prefixes that reduces storage complexity. The proposed technique generates a unique index for each prefix using an encoder, enabling searches to be performed in constant time. Experimental results show that the proposed method improves lookup time by 73%, 65%, and 66% compared to the existing binary trie, path-compressed trie, and multibit trie, respectively. Additionally, it achieves approximately 63% memory savings over these existing techniques.

IPv6 addressing strategy with improved secure duplicate address detection to overcome denial of service and reconnaissance attacks

Article Open access 24 October 2024

Oblivious network intrusion detection systems

Article Open access 15 December 2023

An adaptive detection model for IPv6 extension header threats based on deterministic decision automaton

Article Open access 25 April 2024

Introduction

As network user grows, the demand for efficient management of switch forwarding tables increases.The packet processing speed need to be improved to meet the rapid growth in the number of users, link speed and traffic of the Internet. Packet forwading is one of the important packet processing task in routers. The incoming packet’s destination IP address is searched against the stored forwarding information base (FIB) to determine the next hop information. This FIB data is constructucted using rounting information protocol for a given network. Longest prefix match based searching is performed on the stored destination IP addresses (32-bit for IPv4 and 128-bit for IPv6) of FIB to find the appropriate next hop information. Improvement in search time complexity also increases the throughput¹.

Software based IP lookup techniques are cost effective and scalable compared with the hardware based techniques. Typically, it uses hashing or hierarchical tree based data structure to perform IP lookup. A separate hashing table is maintained for each IP prefix length and search can be performed in constant time. Though hashing delivers a good IP lookup performance it consumes more memory and suffers with collision problem. The tree based IP lookup can be done either by using binary search tree or binary trie. The lookup time complexity of the BST is depends on the number of entries stored in FIB. BST based solution delivers good performence when the number of entries stored in the FIB is small. Binary Trie data structure is popularly used IP lookup technique and it delivers better performance than other tree based lookup techniques. The lookup time complexity of the binary trie is depends on the maximum length IP prefix (worst case 32 in IPv4 and 128 in IPv6) not on the number of entires stored in the FIB.

To address this challenge and reduce memory consumption, various data compression techniques can be applied. These techniques generally fall into two categories: dictionary-based schemes and statistical schemes. Both approaches transform sequences of binary data into a more compact binary representation, thereby reducing the space required for storage. Among the popular methods, the Huffman coding scheme is notable for its efficiency, requiring $O(\log n)$ operations in the worst case when updating its tree structure. In the literature, many search algorithms utilize trie-based data structures. The search time in these structures is primarily influenced by the height of the trie, making it a critical factor in performance optimization. While state-of-the-art studies have primarily focused on improving either time or space efficiency, the proposed approach in this study aims to enhance both. This work introduces a statistical encoding scheme that minimizes the storage requirements of IPv4 and IPv6 addresses in forwarding tables. Specifically, an octet-based encoding strategy is employed, which significantly compresses the address data, leading to more space-efficient storage and faster retrieval. By balancing space and time optimizations, the proposed method addresses key performance bottlenecks in modern network environments.

The height of the trie structures can be reduced by using the statistical or deterministic encoding techniques. This technique will improve the lookup performance but will have small preprocessing overhead. The compression based IP lookup technique will improve time and space complexity. This paper proposes a novel deterministic compression and collision free IP lookup technique, where lookup is performed in a constant time. The proposed technique will improve space and time complexity of the IP lookup technique there by it improves the throughput.

Contribution of this work:

This article proposes an efficient encoding to compress IP informations stored in the forwaring table
The Encoding aims to reduce the storage and increase the lookup speed
The proposed encoding algorithm perform octect based encoding and stored in an array index
The Update operation is performed on-the-fly and unique octet values are stored in the table, hence, update can be viewed as storing the next hop value in the SRAM memory
The proposed IP lookup is compared with state-of-the-art technique
The proposed lookup approach is implemeted with the real-time Router dataset

The proposed system achieves constant-time IP lookup through a unique prefix encoding technique. It supports growing FIB sizes, addressing challenges posed by increasing IPv6 adoption and larger address spaces. The proposed system performance remains stable under heavy network traffic. Offers a compact, high-performance solution that balances speed and memory usage. Also, the proposed system is ideal for resource-constrained environments like edge devices and IoT networks. These benefits collectively position the proposed approach as a comprehensive, scalable, and efficient solution for modern and future network infrastructures.

The rest of the paper is organized as follows. Section "Related works" discuss about related works, Section "Proposed encoding based IP lookup" brief about proposed novel IP lookup technique, Section "Implementation details" describe about the implementation details, Section "Results and discussion" discuss about results and analysis, followed by Section "Conclusion" is conclusion.

Related works

The software-based implementation follows either 1) Hash-based approach 2) Tree-based approach. Due to high false positive results, hash collisions and more memory requirement of hash based approach do not suitable for practical implementations. An alternative implementation of hashing is tree based implementation like trie data structure.

Hash based approaches

There are several hash based approaches are proposed using traditional as well as bloom filters^2,3,4 and⁵. All these approaches can perform better performance for a smaller table size, but it shows non-deterministic performance when a collision occurs.

Tree based approaches

The study⁶ looked at and experimentally assessed technique for trie-based ways that diminish memory accessing, memory consumption, and IP Lookup operations. This paper proposes CBF-HT, an efficient name lookups method that combines counted Blimp fitter and hash tabby to facilitate contents forwarders⁷. CBF-HT can achieves high-speed lookup and fast updates, and significant reduction in memory consumptions. An approach to construction of FIB and route table in Information Centric Network using a special Bloom filter (BF) is proposed to improve the search performance in the network^8,9,10,11,12. The dynamic name based lookup method is proposed for the vehicular named data network¹³. The most popular implementation of IP lookup operation uses tree-like structure called trie data structure is proposed in^14,15,16,17 and¹⁸. In this approach, the lookup operation takes place based on the length of the IP prefixes. The search starts from the root node and continues either left or right based on the prefix value 0 and 1 respectively. The search will terminate when a leaf node is reached. Binary trie consumes more memory due to existance of empty nodes. The worst case complexity of the tire based approach is O(w), where w is the maximum length of IP prefix. To reduce the memory usage in trie data structure, a leaf pushed trie is proposed, where all the prefix information is pushed to the leaf. There are several variants of trie based approach are available in the literature such as multibit trie, level compressed (LC) trie, Patrica, Lulea trie and bitmap trie¹⁹. Level compressed trie uses a combination of binary trie and path compressed trie. This approach replace a complete level binary trie with a single degree node. This algorithm delivers a better performance when all the levels present in the trie. The space complexity of this approach is O(nw). Path compressed trie is a varient of binary trie. This removes an internal single child or a path of single child nodes, which is not a part of valid prefix. As a result entire trie prefix path is compressed and it also save memory storage. If the trie structure is full, then the compression is not possible and it is equivalent to simple binary trie structure. Lulea compression is a variant of multibit trie data structure. The idea behind this approach is that that, consecutive elements with the same binary value (0 or 1) is replaced with a single value. This approach uses a bit map to represent omitted values. Every node will maintain bitmap as well as compressed value. The number of pointer usage in this approach is high. Tree bitmap approach reduces the pointer usage in the lulea compression by having one child pointer instead of all the child node’s. The problem with this approach is that, a large variable sized memory should be allocated²⁰.

Statistical compression based IP lookup

The author²¹ proposed a scheme reduces the NDN memory access by using Huffman coding. The encoded names get stored in the EHNRT for easier lookup in a way that reduces the cost of access operations. The method EHNRT will result in 47.30% reduction in memory usage. To faster the search algorithm, it takes advantage of the regional identity encoding to rapidly adapt search operations ²². The author proposed compressed FIB data frameworks that mighty cut down on the storage space needed within them NDN routers²³. They briefed two improvements to FCTree, a statistically compressed FCTree (StFCTree) and a dictionary compressed FCTree (DiFCTree). Normally, 32 comparisons typically are taken to find the next hop for IPv4 in the worst case scenario. To reduces the number of comparisons, this paper opts for effective compression of the IP prefixes and conducting IP lookups on the forwarding table²⁴. This study create an equilibrium between fast data access and good storage²⁵. To achieve this, compression must be coupled with a speedy search method. This approach aims to meet the needs of space and velocity for the OpenFlow switch.

This paper²⁶ uses a statistical based encoding scheme called Huffman encoding scheme, which uses a frequency information about the data to construct Huffman tree. It uses the tree to find the binary equivalent code for each data. The worst case time complexity of tree construction is O(nlogn) where n is the number of nodes in the tree. The Bonny et al.²⁷ used a canonical Huffman compression to reduce the routing table size drastically. Variable length decoding is one of the limitation of the Huffman encoding scheme. To overcome this problem this paper²⁷ uses a canonical compression scheme. In this approach code word with the same length are represented in consecutive integers. Group the whole dataset into three categories like the alphabet, numbers, operators and rest of all as one group. Then it arranges the entire symbol, into each group in non-increasing order based on their frequency and assigns a code for each symbol in each group. These methods increase the compression ratio by 2-3% than the adaptive Huffman coding technique.

Proposed encoding based IP lookup

Problem definition

The problem of normalizing and performing an IP lookup over the forwarding table is defined as follows. Given the forwarding table T with n number of IP address, find:

A method to encode each an IP address efficiently to minimize the memory requirement
An efficient method to perform an IP lookup and update operation which support high throughput.

The software-based IP lookup approach mostly uses trie or hash based approach to perform an IP lookup operation over the forwarding table. This paper proposes a method which is completely different from the existing solutions. The proposed solution completely eliminates the storage of forwarding information base (FIB) by devising a deterministic mathematical function. By using this mathematical function one can map an incoming packet destination IP address as an uniquely encoded binary index that directly points the next hop SRAM memory as shown in Fig. 1.

Proposed novel encoding of destination IP address

A novel encoding of destination IP address block diagram is shown in Fig. 2. An incoming packet destination IP address of each 8-bit value is given to 8-bit input integer encoding block, where the given each 8-bits of the destination IP address is encoded as a floating point number. All the generated floating point numbers are added using adder and the resultant floating point number is given to output encoder block to generate next hop address.

A given 8-bit input integer encoding is done using a mathematical function f(v), where $f(v) = \frac{s(v)-min(s)}{max(s)-min(s)}\times r$, where v is a given 8-bit integer, s is a set of integer values, min(s) is a minimal value in a set s, max(s) is a maximum value in a set s, and r is user specified range. In case of IPv4 there are four bytes in a given destination IP address. Hence, one need to have four encoded block which will work in parallel. This approach will follow an novel encoding for each byte, where each byte is encoded into floating point number by considering different user specified range r. Consider an example, for most significant byte (MSB) r value is in the range of 1 and 2. For second byte from MSB r value is in the range of 2 and 3, and for third byte from MSB will have r value in the range of 3 and 4. Finally, least significant byte is in the range of 4 and 5.

Let us consider the stored destination 32-bit IPv4 destination IP address be $a_i.b_i.c_i.d_i$. The set a will have set of elements belongs to the first octet of the all the stored IPs ($a = \{a_0, a_1, \ldots a_n\}$). Similarly, $b = \{b_0, b_1, \ldots b_n\}$, $c = \{c_0, c_1, \ldots c_n\}$, and $d = \{d_0, d_1, \ldots d_n\}$. The 8-bit input integer encoding will give values related to f(a), f(b), f(c), and f(d). These encoded values are added using adder unit. The adder will produce the result which is a floating point number. This number is in the form of x.y. The output encoding block will generate next hop memory table index by converting x and y in binary format and concatenating this binary values $\{x, y\}$. For example $8.7_d$ is output encoded as 10000111, where first 4-bits are $8_d = 1000$ and rest 4-bits are $7_d = 0111$. The total storage that is used in the proposed approach is summation of storage of each octet value (i.e., set a, b, c and d) and next hop storage. This will clearly indicate that memory requirement for storage is significantly reduced compared to the existing compression based and other technique from the literature. The proposed approach does not use any search technique for IP lookup. The on-the-fly encoding will produce the index or address of the next hop directly. Hence, search complexity is reduced significantly.

Implementation details

The proposed encoding based IP lookup technique can be easily implemented in the software. The proposed technique will completely eliminates the storage of the forwarding information base (FIB). This FIB is generated by the routing information (RIP) protocol for a given network. The proposed technique will store each octet value in a table and next hop information. The total size required to store each octet value is 256 in the worst case. The total storage required for IPv4 is $4 \times 256$ and IPv6 is $16 \times 256$ and storage for next hop. The generated FIB table is parsed using Algorithm 1 for each octet set (a, b, c and d) storage. The parsing complexity of the FIB table is O(n), where n is the number of entries in the FIB.

Theorem 1 show the complexity of reading each IP address from the forwarding table. Assume the forwarding table has n number of entries, in the worst case it will take O(n) time complexity to read the entire entries.

Theorem 1

Parsing complexity of the forwarding table is O(n)

Proof

Assume that the forwarding table consists of $\{p_1,p_2 \ldots p_n\}$ entries. Each stored destination IP address of the FIB is parsed to generate the octet sets. It takes O(n) time complexity to parse n entries. $\square$

Once the octet storage is generated. In the next step the incoming packets destination IP address is parsed to map each octet to the stored octet set. The encoding of the incoming destination IP address is done as shown in Fig. 3. After mapping the incoming IPs on to the stored sets, encoding is done on the each octet as explained in Algorithm 4. The encoded output e1, e2, e3 and e4 is generated. In the next step these values are added and added sum will be $s = e1 + e2 + e3 + e4$. This added sum is encoded using output encoder to generate the next hop address. From the forwarding table, octet values from each of the IP addresses of the form a.b.c.d can be extracted and store it in an array namely {a[255], b[255], c[255], d[255]}, here the maximum size of the array is 255 because each octet value takes 256 value ranges from 0 to 255. Algorithm 1 is called FIB_Parsing() to perform splitting the octet values from each of the IP addresses in the forwarding table. Then, the normalization for each of the array is performed separately and stored in the arrays namely $\{n_a[255], n_b[255], n_c[255], n_d[255]\}$ using the following relations and it is shown in Fig. 3

$$\begin{aligned} n_a[i]= (a[i]-min(a))/ (max(a)- min(a)) * range_1 \end{aligned}$$

where $0 \le i \le 255$ and

$$\begin{aligned} n_b[i]= (b[i]-min(b))/ (max(b)- min(b)) * range_2 \end{aligned}$$

where $0 \le i \le 255$ and

$$\begin{aligned} n_c[i]= (c[i]-min(c)) / (max(c)- min(c)) * range_3 \end{aligned}$$

where $0 \le i \le 255$ and

$$\begin{aligned} n_d[i]= (d[i]-min(d)) / (max(d)- min(d)) * range_4 \end{aligned}$$

where $0 \le i \le 255$.

The main objective of the proposed approach is to reduce storage requirements while also improving lookup performance. To find the unique index value for each octet, the minimum and maximum values of the octet set are required. This step is essential for determining the normalized value for each integer present in the octet set. The computation time is negligible, as it only occurs when a new octet is added to the octet set. This ensures that inserting new entries, even as the dataset grows, does not lead to significant performance degradation. Each octet in an IP address holds an integer ranging from 0 to 255. Even as the number of IP addresses grows, this range remains constant. As a result, the computational complexity of determining the minimum and maximum values within this fixed range is negligible. This approach also supports scalability as the number of IP addresses increases in real time. Algorithm 2 passes each octet array to be normalized.

Algorithm 3 receives each octet array and encode all the values and store it in a different array.

Theorem 2 describe about the uniqueness of normalized value for each octet value. Since each octet value has non overlapping ranges, the normalized value generated will not overlap each other.

An injective mapping between the encoded index set and the next hop set ensures that each index corresponds to a unique next hop, preventing collisions and enabling efficient data retrieval. Since the mapping is injective, the function avoids collisions, i.e., no two indices share the same next hop. Let $f:S \rightarrow T$ be a function representing the mapping between an encoded index set S and a next hop set T. For every $x_1$, $x_2$ $\in$ S, it must hold: $f(x_1)=f(x_2)$ Consider two distinct indices $x_1$ and $x_2$ in S. If $f(x_1)$ $\ne$ $f(x_2)$, this is consistent with injective. If $f(x_1)$ $=$ $f(x_2)$, it must hold $x_1$ equal to $x_2$. Thus, no two distinct indices map to the same next hop. Since the mapping satisfies $f(x_1) = f(x_2)$ $\Rightarrow$ $x_1 = x_2$, the function f is injective. Therefore, each encoded index corresponds to a unique next hop, ensuring no collisions in the mapping and enabling efficient data retrieval. Thus, the injective nature of the function guarantees that every index in the encoded set uniquely maps to a next hop, avoiding conflicts and supporting efficient lookup operations.

Theorem 2

Uniqueness of encoded values is also unique

Proof

Assume that the forwarding table consists of $\{p_1,p_2 \ldots p_n\}$ entries. For a given IP address say $p_i$ where $0< i < n-1$, it can take the form of k1.k2.k3.k4 where k1, k2, k3 and k4 belongs to values of first, second, third and fourth octet respectively.

For each of the IP address $p_i$, where $0< i < n-1$, it partition each octet value into 4 groups $o_i$, where $0< i < 4$. For each octet $o_i$, where $0< i < 4$, a normalized values are generated based on different ranges which are unique as shown in Fig. 4. The ranges for each octet is defined as: for the first octet $o_1$ and it lies in the range of $0< o_1 < k$ and k is some constant, for the second octet $o_2$ and it lies in the range of $k< o_1 < k+1$, for the third octet $o_3$ and it lies in the range of $k+1< o_1 < k+2$, for the fourth octet $o_4$ and it lies in the range of $k+2< o_1 < k+3$. Due to range for each $o_i$, where $0< i < 4$ is different, there is no overlapping between its normalized values. Using the normalized value for each value, the summation can be performed and it gives the results of some constant say s. Since the summation values are not lies within the range of all octets and it gives the values which is also unique. Then the binary representation for s is stored in the forwarding table. This process is repeated for each IP address and the corresponding binary values are stored into the forwarding table say $\{b_1,b_2 \ldots b_n\}$. One can observe that $length (b_i) < length (p_i)$ where $0< i < n-1$. For a given prefix with the length say l, where $1 < l \le 32$, the normalized binary values always take the maximum of l/2 bits. Because of unique value is generated from the normalization, the resulting binary representation also gives the uniquness. $\square$

In order to perform an IP lookup for a give destination IP address o1.o2.o3.o4, the normalize values for o1, o2, o3 and o4 accessed from the normalized arrays $n_a$, $n_b$, $n_c$ and $n_d$ respectively. Then the simple arithmetic operation is performed as follows using Algorithm 4:

$$\begin{aligned} s=sum(n_a[o1], n_b[o2], n_c[o3], n_d[o4]) \end{aligned}$$

The resultant arithmetic operation s is represented with its equivalent binary value (b) using the Algorithm 5. These s can be used as index say b ( By Algorithm 6) to point out the corresponding next hop value of SRAM memory. Extracting the next hop from SRAM memory is shown in Fig. 5. One can observe that the length of the encoded IP address b can take lesser number of bits than an original IP address.

IP lookup algorithm

This section discuss about the lookup operation carried out in the forwarding information base. With the given forwarding table, destination IP address (d) o1.o2.o3.o4 and the normalized arrays $n_a$, $n_b$, $n_c$, $n_d$, the next hop adddress is returned as shown in Fig. 5. One should extract normalized values for each octet o1, o2, o3 and o4 using the normalized arrays $n_a$, $n_b$, $n_c$, $n_d$. As like encoding process, the same arithmetic operation is applied over the destination IP address as shown in Algorithm 7. Then the resultant value is used to locate next hop from SRAM memory. Extracting the next hop for the index b is shown in Fig. 5.

Theorem 3 shows the lookup or update complexity is $O(\log k)$

Theorem 3

The complexity of the IP lookup/update takes $O(\log k)$, where k is the size of the digit.

Proof

Assume that IP address to be searched in the forwarding table be d. Since the extraction of normalized value and performing arithmetic operation takes only a constant time effort. The resultant arithmetic operation gives a constant which is converted into equivalent binary representation. During the conversion process for a given digit with the size k, in the worst case it takes $\left\lfloor {\log k}\right\rfloor +1$ bits. Retrieving the next hop using the index takes only O(1) effort. The time complexity of IP lookup can be written as $O(\log k)$. $\square$

The proposed technique offers promising results in terms of storage reduction and improved lookup performance. However, its practical implementation in environments with limited computational resources could face challenges. This can be addressed by simplifying the computational steps in the encoding and lookup processes. Additionally, incorporating a Bloom filter to store the index values along with the next-hop information could further optimize performance. These enhancements would make the technique more suitable for real-time applications.

The proposed method encodes each IP prefix into a unique index, enabling efficient lookups. The lookup time is claimed to be $O(\log k)$, where k represents the length (in bits) the encoded prefix. Each octet is normalized as an integer, which takes constant time. The conversion of the integer into a binary string requires $O(\log k)$ bits in the worst case. Even as the dataset size increases, the conversion time remains $O(\log k)$. Consequently, the overall lookup time is determined by $O(\log k)$.

Update complexity

The update represents that adding or deleting a new and/or old entry from the forwarding table. The insertion and deletion of an item can also be done easily in our proposed approach. The insertion and deletion can carry out independently without affecting each value in the memory.

The algorithm for insertion/delete is shown in Algorithm 8. In order to add a new IP address say d, the normalized value for each octet is extracted from the normalized table and a simple arithmetic operation is performed for those values. Both of the operations can be performed at a constant time. Since the arithmetic operation gives a constant number, again, it is converted into equivalent binary representation (by Algorithm 8) which takes $\log k$ effort where k is the size of the digit. Finally, the table value is updated according to the index value. In order to retrieve the next hop for the given IP address correctly, the same procedure is repeated as like insertion. Since the insertion operations are infrequent than lookup operation, the time complexity is justified using the same lookup algorithm. The deletion operation in our proposed work is designed in an simpler manner, to remove an item from the forwarding table there is no need to do anything at the index location of IP prefixes. After an item has been removed, its next hop value in the memory will not be used anyway; during the deletion operation, all the remaining items do not have to bother because it does not alter other location values. Normalizing the octet value takes constant time, O(1). When a new IP address is to be inserted into the routing table, the address is parsed and added to the respective octet. If the octet is already present, no action is taken. However, if the octet is not present, it will be added. In the case where a new octet is added, it takes constant time, O(1), to obtain the normalized value for that octet. Each octet in an IP address holds an integer ranging from 0 to 255. Even as the number of update operations grows, this range remains constant.

Results and discussion

The simulation of the proposed method implemented on an Intel i5 processor with 2.900 GHz clock frequency PC. Initially, the experiment is performed with various trie based algorithms implemented. The routing dataset named AS which can be downloaded from Netbench²⁸ and RIPE²⁹. The experiments were done with the various sizes of the forwarding table. Figure 6 shows the results of applying the binary trie, path-compressed trie, multi-bit trie with the stride=3 and proposed approach. It is clearly shows that IP lookup using the proposed approach works better than existing approaches. Figure 7 shows the results of applying the proposed encoding and the Huffman compression for the IP prefixes table. It is clearly shows that the number of bits taken from the proposed encoding scheme has been always lesser than the Huffman compression scheme. Figure 8 shows the improvement of IP lookup time for the proposed against the binary trie, path-compressed trie, multibit trie with stride=3. It shows that the memory reduction proposed encoding for the real time AS dataset. The Huffman based compression approach has the overhead of table compression, the proposed encoding gives better lookup time for the various dataset is considered without the compression overhead. Since the proposed approach uses the encoded value as an index, which point directly the corresponding next hop. Due to this, the proposed approach greatly reduces the lookup time in the forwarding table. Theorem 4 shows retrieving normalized values from the normalized table. Since the normalized values are generated for each octet value and it is stored in a separate table. Give an octet value, it uses the same value as an index and retrieves the normalized value.

Theorem 4

Retrieving normalized values from the normalized table takes the complexity of O(1)

Proof

The normalized value for each IP prefix is pre-processed and it is stored in a separate table. The normalized value of each octet is retrieved using the corresponding index, which takes the complexity of O(1). $\square$

Theorem 5 assures that the decimal to binary conversion, in the worst case takes $O(\log k)$ complexity where k is the size of the digit.

Theorem 5

The complexity of decimal to binary conversion takes $O(\log k)$, where k is the size of the digit.

Proof

Assume that the size of the digit to be converted into equivalent binary representation is k. During the conversion process for a given digit with the size k, in the worst case it takes $\left\lfloor {\log k}\right\rfloor +1$ bits. The time complexity can be written as $O(\log k)$. $\square$

Theorem 6 shows that the overall encoding complexity in the worst case is $O(n*\log k)$. In the case of Huffman based scheme, it takes pre-processing overhead of $O(n*\log n)$ which is larger than proposed encoding.

Theorem 6

The overall encoding complexity takes $O(n*\log k)$

Proof

By Theorem 1 and Theorem 5 scanning of each IP address and its binary conversion takes the complexity of $n*\log k$. Since the normalized binary values are pre-processed and stored in a separate table, to retrieve any values from the table takes a constant effort i.e) O(1). The overall complexity of the encoding operation takes only $n*\log k$. $\square$

Theorem 7 shows that there are no collision from encoded index to next hop mapping.

Theorem 7

Encoded index values maps to a unique next hop value with no collision.

Proof

Assume that there are two sets namely encoded index (I) and next hop (N). To show an element in a index set (I) is related to different element in next hop (N). Since each index in a set I has at most one element in a domain (I) is mapped to a particular element in a co-domain (N). $\forall$ a,b $\in$ I, if $f(a)=f(b)$, then $a=b$ so the mapping does not collide with any other value in the next hop. $\square$

Memory usage

For the validation purpose, AS dataset and RIPE dataset²⁹ are used. Figures 9 and 10 shows the percentage of compression for each AS dataset and RIPE data set respectively. The amount of compression obtained is computed using the Equation (1)

$$\begin{aligned} ratio =\frac{(Original~bits - Compressed~bits)}{Original~bits} \times 100 \end{aligned}$$

(1)

The experimental results shown in Figs. 9 and 10 clearly shows that the percentage of compression has increased significantly in each of the benchmark dataset. Theorem 8 show the total memory usage takes the complexity of $O(w*2^b)$ where $b<w$. When a new octet is added, it only requires a small computation time for normalization, which is O(1). This ensures that inserting new entries, even as the dataset grows, does not lead to significant performance degradation. As the number of prefixes increases, the method can scale efficiently by only updating or adding new octets when necessary. This results in minimal disruption to the system’s overall performance. The storage reduction achieved by normalizing and indexing the octets allows the system to scale without requiring exponential growth in memory usage. This makes it feasible to scale the routing table to handle large datasets.

Theorem 8

The space complexity of proposed approach takes $w*2^{b}$ bits where $b<w$.

Proof

Assume that the forwarding table uses prefix length $w=32$. The total number of bits it consumes in the memory is $(w*2^{b})$ and $b < w$. $\square$

The experimental results shown in Figs. 11,12, 13 and 14 clearly shows that the proposed approach perform better than existing approaches with respect to lookup, insertion, deletion and updation time. The lookup, insertion, deletion, and update times are measured across various datasets with entry sizes of 100, 200, 300, 400, and 500, respectively. The Figures 15 and 16 show the experimental results for the sample AS dataset (AS8452) taken. For the range of value from 0 to 255, distinct value are generated. The result shows that for each of the octet value, the proposed encoding approach generate a unique value and their summation for the given IP address generate unique index to find the next hop address.

Conclusion

An efficient method to encode IP prefixes of the forwarding table using simple normalization and perform an IP lookup over the table is proposed in this work. The value of octet is normalized and a simple arithmetic operation is performed to encode each IP address. Then the entire table is replaced with the encoded binary value. On average this scheme saves 63% of memory for the above taken IPv4 dataset. The proposed approach shows 73%, 65% and 66% lookup improvement than an existing binary trie, path-compressed trie, and multibit trie implementations respectively. Results from the observation of the above taken AS dataset, the number of bits is not exceeding the maximum width of 8 bits of each octet. IP lookup is performed over the encoded IP prefixes of the table with the complexity of $O(\log k)$. The space complexity after the compression takes $O(2^{b}*w)$ where $b < w$. Because of the reduction in the number of bits, lookup efficiency can also be improved further.

Data availibility

The dataset utilized in this research article is publicly available at www.fit.vutbr.cz/netbench and http://www.ripe.net/.

References

Sankaran, G. C. & Sivalingam, K. M. Design and analysis of fast ip address$-$lookup schemes based on cooperation among routers. 2020 International Conference on COMmunication Systems & NETworkS (COMSNETS), 330–339 (2020).
Lee, J. & Lim, H. Binary search on trie levels with a bloom filter for longest prefix match. IEEE 15th International Conference on High Performance Switching and Routing (HPSR), 38–43 (2014).
Mun, J. H. & Lim, H. New approach for efficient ip address lookup using a bloom filter in trie$-$based algorithms. IEEE Trans. Comput. 65, 1558–1565 (2015).
Article MathSciNet MATH Google Scholar
Liu, S. et al. Obf$:$ a guaranteed ip lookup performance scheme for flexible ip using one bloom filter. 2021 IEEE 29th International Conference on Network Protocols (ICNP), 1–6 (2021).
Lim, H., Lim, K., Lee, N. & Park, K.-H. On adding bloom filters to longest prefix matching algorithms. IEEE Trans. Comput. 63, 411–423 (2014).
Article MathSciNet MATH Google Scholar
Lin, Y.-H. & Hsieh, S.-Y. Improved ip lookup technology for trie$-$based data structures. J. Comput. Syst. Sci. 133, 41–55. https://doi.org/10.1016/j.jcss.2022.10.003 (2023).
Article MathSciNet MATH Google Scholar
Wang, Y., Huang, T., Wei, G., Li, H. & Zhang, H. Scalable name identifier lookup for industrial internet. Comput. Commun. 186, 102–109. https://doi.org/10.1016/j.comcom.2022.01.017 (2022).
Article MATH Google Scholar
Ueno, Y., Nakamura, R., Kuga, Y. & Esaki, H. Fast longest prefix matching by exploiting simd instructions. IEEE Access 8, 167027–167041 (2020).
Article Google Scholar
Roy, A., Sarkar, T., Singh, P. K. & Maity, R. A hybrid approach to header size and forwarding table optimization in segment routing. IEEE Netw. Lett. (2023).
Li, Z. & Hu, Y. Pasr$:$ an efficient flow forwarding scheme based on segment routing in software-defined networking. IEEE Access 8, 10907–10914 (2020).
Article MATH Google Scholar
Chunduri, U., Clemm, A. & Li, R. Preferred path routing$-$a next$-$generation routing framework beyond segment routing. 2018 IEEE Global Communications Conference (GLOBECOM), 1–7 (2018).
Dutta, N. A bargain game theory assisted interest packet forwarding strategy for information centric network. J. Netw. Comput. Appl. 209, 103546 (2023).
Article MATH Google Scholar
Ashraf, M. W. A. et al. Dynamic naming scheme and lookup method based on trie for vehicular named data network. Wirel. Commun. Mob. Comput. 2022, 6539532 (2022).
MATH Google Scholar
Baboescu, F., Tullsen, D. M., Rosu, G. & Singh, S. A tree based router search engine architecture with single port memories. 32nd International Symposium on Computer Architecture (ISCA’05), 123–133 (2005).
Lim, H., Yim, C. & Swartzlander, E. E. Jr. Priority tries for IP address lookup. IEEE Trans. Comput. 59, 784–794 (2010).
Article MathSciNet MATH Google Scholar
Le, H. & Prasanna, V. K. Scalable tree based architectures for ipv4 and ipv6 lookup using prefix partitioning. IEEE Trans. Comput. 61, 1026–1039 (2012).
Article MathSciNet MATH Google Scholar
Kwon, M., Neupane, K. P., Marshall, J. & Rafique, M. M. CuVPP$:$ filter$-$based longest prefix matching in software data planes. 2020 IEEE International Conference on Cluster Computing (CLUSTER), 12–22 (2020).
Lim, H. & Lee, N. Survey and proposal on binary search algorithms for longest prefix match. IEEE Commun. Surv. Tutor. 14, 681–697. https://doi.org/10.1109/SURV.2011.061411.00095 (2012).
Article MATH Google Scholar
Chen, W., Liu, D., Wang, J. & Tang, X. LPR$-$Trie$:$ a fast ipv6 routing lookup algorithm with virtual nodes. China Commun. 19, 1–11 (2022).
Article MATH Google Scholar
Zhang, W., Gong, X., Tian, Y. & Tang, J. High speed route lookup for variable length ip address. 2020 IEEE 28th International Conference on Network Protocols (ICNP), 1–6 (2020).
Kala, D., Kumar, A., Arunekumar, N. & Suresh Joseph, K. Design and implementation of an efficient scalable forwarding in named data networking using huffman coding. Proc. of IOT with Smart Systems, 337–346 (2022).
Hou, B., Cai, Z., Wu, K., Yang, T. & Zhou, T. 6Scan: a high efficiency dynamic internet wide ipv6 scanner with regional encoding. IEEE/ACM Trans. Netw. 5, 1–16. https://doi.org/10.1109/TNET.2023.3233953 (2023).
Article Google Scholar
Karrakchou, O., Samaan, N. & Karmouch, A. Fctrees: A front-coded family of compressed tree-based fib structures for ndn routers. IEEE Trans. Netw. Serv. Manag. 17, 1167–1180. https://doi.org/10.1109/TNSM.2020.2969172 (2020).
Article Google Scholar
Indira, B., Valarmathi, K. & Devaraj, D. A trie based ip lookup approach for high performance router$/$switch. 2019 IEEE International Conference on Intelligent Techniques in Control, Optimization and Signal Processing (INCOS), 1–6. https://doi.org/10.1109/INCOS45849.2019.8951425 (2019).
Sonai, V., Bharathi, I., Uchimucthu, M., Sountharrajan, S. & Bavirisetti, D. P. Ctla$:$ compressed table look up algorithm for open flow switch. IEEE Open J. Comput. Soc.[SPACE]https://doi.org/10.1109/OJCS.2024.3361710 (2024).
Article Google Scholar
Huffman, D. A. A method for the construction of minimum redundancy codes. Proc. IRE 40, 1098–1101 (1952).
Article MATH Google Scholar
Bonny, T. & Henkel, J. Efficient code density through look up table compression. In Proc. of the Conference on Design, Automation and Test in Europe, 809–814 (2007).
www.fit.vutbr.cz/netbench .
http://www.ripe.net/analyse/internet-measurements/routing-information-service-ris/ris-raw-data .

Download references

Acknowledgements

The authors extend their appreciation to the Deanship of Research and Graduate Studies at King Khalid University for funding this work through the Large Research Project under grant number RGP2/30/45.

Author information

Veeramani Sonai, Indira Bharathi, Sajjad Shaukat Jamal, Zaid Bassfar and Sameer Abdullah Nooh contributed equally to this work.

Authors and Affiliations

Department of Computer Science and Engineering, School of Engineering, Shiv Nadar University Chennai, Chennai, Tamilnadu, India
Veeramani Sonai
School of Computer Science and Engineering, Vellore Institute of Technology, Chennai, Tamilnadu, India
Indira Bharathi
Department of Mathematics, College of Science, King Khalid University, 61413, Abha, Saudi Arabia
Sajjad Shaukat Jamal
Faculty of Computing and Information Technology, University of Tabuk, Tabuk, Saudi Arabia
Zaid Bassfar
Department of Information Systems, Faculty of Computing and Information Technology, King Abdulaziz University, 21589, Jeddah, Saudi Arabia
Sameer Abdullah Nooh

Authors

Veeramani Sonai
View author publications
You can also search for this author inPubMed Google Scholar
Indira Bharathi
View author publications
You can also search for this author inPubMed Google Scholar
Sajjad Shaukat Jamal
View author publications
You can also search for this author inPubMed Google Scholar
Zaid Bassfar
View author publications
You can also search for this author inPubMed Google Scholar
Sameer Abdullah Nooh
View author publications
You can also search for this author inPubMed Google Scholar

Contributions

Veeramani Sonai identified problems and idea formulation. Indira Bharathi implemented the idea. Sajjad Shaukat Jamal testing and validation of results. Zaid Bassfar involded in drafting and compiling. Sameer Abdullah Nooh involed in review and verification

Corresponding author

Correspondence to Indira Bharathi.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Reprints and permissions

About this article

Cite this article

Sonai, V., Bharathi, I., Jamal, S.S. et al. Efficient IP address retrieval using a novel octet based encoding technique for high speed lookup to improve network performance. Sci Rep 15, 2254 (2025). https://doi.org/10.1038/s41598-024-84221-6

Download citation

Received: 17 July 2024
Accepted: 20 December 2024
Published: 17 January 2025
DOI: https://doi.org/10.1038/s41598-024-84221-6

Subjects

Abstract

Similar content being viewed by others

IPv6 addressing strategy with improved secure duplicate address detection to overcome denial of service and reconnaissance attacks

Oblivious network intrusion detection systems

An adaptive detection model for IPv6 extension header threats based on deterministic decision automaton

Introduction

Related works

Hash based approaches

Tree based approaches

Statistical compression based IP lookup

Proposed encoding based IP lookup

Problem definition

Proposed novel encoding of destination IP address

Implementation details

Theorem 1

Proof

Theorem 2

Proof

IP lookup algorithm

Theorem 3

Proof

Update complexity

Results and discussion

Theorem 4

Proof

Theorem 5

Proof

Theorem 6

Proof

Theorem 7

Proof

Memory usage

Theorem 8

Proof

Conclusion

Data availibility

References

Acknowledgements

Author information

Authors and Affiliations

Contributions

Corresponding author

Ethics declarations

Competing interests

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Search

Quick links