This article gives a basic overview of the journey of a data packet across the Internet, from its creation through switches, firewalls, NAT, and routers to its destination. It is aimed at readers who are new to networking and security, and at anyone with only limited knowledge of how data is processed on the Internet.
Introduction
Several earlier articles have stressed the importance of two areas of computer security for new users: programming and networking. The two are distinct, but both should be regarded as equally important. Without the programming of network protocols, there would be no networks. The question arises: must a programmer fully grasp networking concepts and low-level network theory? In many cases, no. Still, some curiosity on the reader's part is valuable, and it may lead them to write code that experiments with different protocols and networking ideas.
For newcomers to this field, the first encounter with a computer is hard to forget. When someone discovers the Internet, the wealth of information inspires awe and excitement about how it all works inside. Using a computer to connect to systems on the other side of the globe feels like entering a completely new world, and people become curious about how computers and networks perform these tasks. How does information travel from one computer to another, passing through all the devices in between to reach its destination?
The Journeys
When an Internet application is launched, a series of events occurs. This article simply introduces how a packet is created and which devices forward it along the way to its destination. Understanding what happens between point A and point Z is a useful foundation for approaching this field.
Let us now follow what happens from the moment an application is launched until the packets it creates reach their destination. Assume you are using Firefox to read the news on your favorite website. This sets in motion a series of events that is entirely transparent to you. After the initial TCP three-way handshake, your browser sends a request to the web server for its home page. That HTTP GET request now has to be delivered to the web server. To send it, Firefox makes a system call to the operating system. The system call copies the data Firefox wants to send from the application's memory (user space) into an internal buffer in kernel space.
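The hand-off from the application to the operating system can be sketched in Python. This is only an illustration, not how Firefox itself is written: `build_get_request` and the host name `www.example.com` are made up for the example, and `socket.socketpair()` stands in for a real connection so the code runs without network access.

```python
import socket

def build_get_request(host: str, path: str = "/") -> bytes:
    """Return the bytes of a minimal HTTP/1.1 GET request (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line that ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

# Two connected local sockets: one plays the browser, the other the server.
browser_side, server_side = socket.socketpair()
request = build_get_request("www.example.com")

# sendall() ends up in a system call: the operating system copies `request`
# out of this process's memory into its own socket buffer.
browser_side.sendall(request)

# The other end reads the same bytes back out of the operating system.
received = server_side.recv(4096)

browser_side.close()
server_side.close()
```

The application only ever sees the request as a block of bytes; everything that happens to those bytes after `sendall()` is the operating system's job.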
Depending on which transport protocol the application uses, the socket layer will call either UDP or TCP. It is important to remember that many applications do not use TCP as a transport protocol. DNS uses both UDP and TCP, while other applications like TFTP only use UDP. The socket layer calls the appropriate transport protocol, at which point the data will be copied into the socket buffer.
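From the application's point of view, choosing the transport protocol comes down to the type of socket it asks for. A minimal sketch, where `open_socket` is a name invented for this example:

```python
import socket

def open_socket(transport: str) -> socket.socket:
    """Create an IPv4 socket for the requested transport (hypothetical helper)."""
    kinds = {
        "tcp": socket.SOCK_STREAM,  # connection-oriented byte stream
        "udp": socket.SOCK_DGRAM,   # connectionless datagrams
    }
    return socket.socket(socket.AF_INET, kinds[transport.lower()])

tcp_sock = open_socket("tcp")  # what a web browser would ask for
udp_sock = open_socket("udp")  # what a TFTP client would ask for

tcp_is_stream = tcp_sock.type == socket.SOCK_STREAM
udp_is_datagram = udp_sock.type == socket.SOCK_DGRAM

tcp_sock.close()
udp_sock.close()
```

No packet is sent here; the sockets are only created and closed to show where the choice is made.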
Data Fragmentation
When the data of the GET request is copied into the socket buffer, TCP will split it into smaller pieces if necessary. Strictly speaking, TCP segments the data; "fragmentation" is the term for splitting done at the IP layer. A GET request is usually small enough to fit in a single packet within the Ethernet MTU, but what happens if the browser's request exceeds the MTU? In that case, TCP splits the data so that each piece, together with its headers, fits within the Ethernet MTU of 1500 bytes. With the usual 20-byte IP header and 20-byte TCP header, that leaves 1460 bytes of data per packet. The key point is that this splitting takes place at the TCP layer when the application uses TCP as its transport protocol.
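The splitting step above can be sketched in a few lines. The sketch assumes the common 20-byte IP and TCP headers with no options, so 1460 bytes of data fit inside a 1500-byte MTU; `segment` and the constant names are invented for this illustration, and a real TCP stack also negotiates the segment size and tracks sequence numbers.

```python
MTU = 1500         # Ethernet MTU in bytes
IP_HEADER = 20     # assumed: IPv4 header without options
TCP_HEADER = 20    # assumed: TCP header without options
MSS = MTU - IP_HEADER - TCP_HEADER  # data bytes that fit in one packet

def segment(payload: bytes, mss: int = MSS) -> list:
    """Split payload into consecutive chunks of at most `mss` bytes."""
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]

# A 4000-byte request does not fit in one packet and is split into three.
segments = segment(b"x" * 4000)
sizes = [len(s) for s in segments]  # [1460, 1460, 1080]

# A short GET request fits in a single chunk and is left as it is.
small = segment(b"GET / HTTP/1.1\r\n\r\n")
```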
Data Transmission in the Network Environment
Once the transport layer has done its work, the data is passed down to the IP layer. Here the IP header is built, and the all-important source and destination IP addresses are filled in. Next, the data passes through the data link layer, where logical link control (LLC) and media access control (MAC) do their part. Finally, the data is ready to be transmitted over the physical layer by the system's network interface card (NIC).
For most home users, a SOHO router combines switching and routing in one device. In a corporate network, the switch is usually a separate piece of hardware from the router, and computers connect to the switch by cable. Unless its CAM table has been hard-coded, the switch learns the MAC address (unique to each Ethernet card) of the computer attached to each port. When the packets carrying the web page requested by the GET come back, the switch uses this learned mapping to send them to the port where the client is connected.
How does the client know its default gateway? Whether on a corporate or a home network, a system typically sends a DHCP request when it boots and obtains its essential settings from the DHCP server. These settings include the client's own IP address, the IP address of the DNS server to use, and the IP address of the default gateway. Not every system uses DHCP; some have statically configured IP addresses and gateways. Where DHCP is not used, the system administrator must enter all of this information by hand on each machine. That does not scale well, which is why DHCP is enabled on most networks.
With the default gateway known, the computer knows where to send packets bound for the Internet in order to retrieve the web page Firefox requested. After passing through the switch, the packets go through the firewall on their way to the router. The firewall, assuming its rules allow the packets through, performs several essential tasks. A stateful firewall records the source IP address and port along with the destination IP address and port. It keeps this information in its state table, which it uses to control access to the internal network. An incoming packet that does not match an entry in the state table is not allowed in. A future article will look at the role firewalls play in protecting your computer.
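How a switch builds its table of MAC addresses can be sketched as follows. This is a simplified model, not vendor firmware: the `LearningSwitch` class, the port numbers, and the MAC addresses are all invented for the example, and real switches also age out old entries.

```python
class LearningSwitch:
    """Toy switch that learns which MAC address sits behind which port."""

    def __init__(self, num_ports: int):
        self.ports = list(range(1, num_ports + 1))
        self.cam_table = {}  # MAC address -> port number

    def handle_frame(self, in_port: int, src_mac: str, dst_mac: str) -> list:
        """Learn the sender's port, then return the ports to forward to."""
        self.cam_table[src_mac] = in_port
        if dst_mac in self.cam_table:
            return [self.cam_table[dst_mac]]
        # Unknown destination: flood to every port except the incoming one.
        return [p for p in self.ports if p != in_port]

CLIENT = "aa:aa:aa:aa:aa:01"   # hypothetical client NIC
GATEWAY = "aa:aa:aa:aa:aa:fe"  # hypothetical default gateway

switch = LearningSwitch(num_ports=4)

# The client on port 1 sends toward the gateway; the switch has not seen
# the gateway yet, so it floods the frame and remembers the client's port.
outbound = switch.handle_frame(1, CLIENT, GATEWAY)

# The reply arrives on port 4; the client's port is already known.
inbound = switch.handle_frame(4, GATEWAY, CLIENT)
```

After these two frames, `outbound` is `[2, 3, 4]` and `inbound` is `[1]`: the reply goes only to the client's port.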
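Once the client has its settings, deciding where to send a packet is a simple comparison: if the destination is on the local network it is sent directly, otherwise it goes to the default gateway. The sketch below uses Python's standard `ipaddress` module; the `Lease` class, the `next_hop` function, and all addresses are assumptions made up for the example.

```python
import ipaddress
from dataclasses import dataclass

@dataclass
class Lease:
    """Settings a client might hold after DHCP (hypothetical structure)."""
    address: ipaddress.IPv4Interface   # own IP address plus subnet mask
    gateway: ipaddress.IPv4Address     # default gateway
    dns_server: ipaddress.IPv4Address  # DNS server to query

def next_hop(lease: Lease, destination: str):
    """Return the address the packet should be handed to first."""
    dest = ipaddress.ip_address(destination)
    if dest in lease.address.network:
        return dest          # same subnet: deliver directly
    return lease.gateway     # anywhere else: send to the default gateway

lease = Lease(
    address=ipaddress.ip_interface("192.168.1.23/24"),
    gateway=ipaddress.ip_address("192.168.1.1"),
    dns_server=ipaddress.ip_address("192.168.1.1"),
)

local = next_hop(lease, "192.168.1.50")    # another machine on the LAN
remote = next_hop(lease, "93.184.216.34")  # a server on the Internet
```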
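The firewall's bookkeeping can be sketched as a set of connection tuples. This is a deliberately simplified model under stated assumptions: the `StatefulFirewall` class is invented for the example, every outbound connection is allowed, and real firewalls also track protocol, TCP state, and timeouts.

```python
class StatefulFirewall:
    """Toy firewall that only admits replies to connections it has seen."""

    def __init__(self):
        # Each entry: (inside IP, inside port, outside IP, outside port)
        self.state_table = set()

    def outbound(self, src_ip, src_port, dst_ip, dst_port) -> bool:
        """Record an outgoing connection and let it through."""
        self.state_table.add((src_ip, src_port, dst_ip, dst_port))
        return True

    def inbound(self, src_ip, src_port, dst_ip, dst_port) -> bool:
        """Admit an incoming packet only if it answers a recorded connection."""
        # A reply has source and destination swapped relative to the request.
        return (dst_ip, dst_port, src_ip, src_port) in self.state_table

fw = StatefulFirewall()
fw.outbound("192.168.1.23", 51000, "93.184.216.34", 80)

reply_ok = fw.inbound("93.184.216.34", 80, "192.168.1.23", 51000)
stranger_ok = fw.inbound("198.51.100.9", 4444, "192.168.1.23", 51000)
```

The web server's reply matches the recorded entry and passes; the unsolicited packet from another address does not.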
Routers and NAT
Once past the firewall, the packets head to the router. The private source IP address they carry (assume it comes from the 192.168.0.0/16 range) is translated into a publicly routable IP address assigned by your ISP; this is network address translation (NAT). That public address is the one assigned to your router. The packets now begin their journey across the Internet, passing from one router to the next along the way. So what happens to the packets themselves at each hop?
Let's start with the first router. It forwards the packets based on the information in its routing table. When the next router receives them, it consults its own routing table to determine the best next hop toward the destination. One of the few fields a router modifies is the TTL, or “time to live,” which it decrements by one. Because the IP header has changed, the router must also compute a new header checksum. This continues hop by hop until the packets reach their destination.
When the packets arrive at the web server, the network card raises an interrupt (IRQ) to tell the CPU that there is data to be processed. The data then moves up to the data link layer, where the web server recognizes its own MAC address, on to the IP layer, and then to the transport layer, where it is buffered. From there, the data is handed to the application, here the web server software, which processes the GET request and sends back the requested information. The same series of events occurs for every new packet, including the response packets on their way back to the client.
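The address translation can be modeled as a pair of lookup tables. The sketch assumes the port-based form of NAT that home routers commonly use; the `Nat` class, the starting port 40000, and the addresses (taken from documentation ranges) are all invented for the example.

```python
import itertools

class Nat:
    """Toy port-based NAT: many private hosts share one public address."""

    def __init__(self, public_ip: str, first_port: int = 40000):
        self.public_ip = public_ip
        self._ports = itertools.count(first_port)  # next free public port
        self.out_map = {}  # (private IP, private port) -> public port
        self.in_map = {}   # public port -> (private IP, private port)

    def translate_outbound(self, private_ip: str, private_port: int):
        """Rewrite a private source address to the shared public one."""
        key = (private_ip, private_port)
        if key not in self.out_map:
            public_port = next(self._ports)
            self.out_map[key] = public_port
            self.in_map[public_port] = key
        return self.public_ip, self.out_map[key]

    def translate_inbound(self, public_port: int):
        """Find the private host a reply belongs to, or None if unknown."""
        return self.in_map.get(public_port)

nat = Nat("203.0.113.7")
public_src = nat.translate_outbound("192.168.1.23", 51000)
private_dst = nat.translate_inbound(public_src[1])
unknown = nat.translate_inbound(40001)
```

The reverse table is what lets the router deliver the web server's reply to the right machine inside the private network.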
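What a router does to each packet can be sketched with the standard `struct` and `ipaddress` modules: look up the next hop, lower the TTL by one, and recompute the header value that depends on it (the IPv4 header checksum). The function names, the routing table, and the addresses are assumptions for the example; the lookup shown picks the most specific matching route, which is how IP routers commonly choose.

```python
import ipaddress
import struct

def checksum(header: bytes) -> int:
    """Internet checksum over an even-length header."""
    total = sum(struct.unpack(f"!{len(header) // 2}H", header))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_header(src: str, dst: str, ttl: int = 64) -> bytes:
    """Build a 20-byte IPv4 header (no options, no payload) for the demo."""
    raw = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20, 0, 0, ttl, 6, 0,
                      ipaddress.ip_address(src).packed,
                      ipaddress.ip_address(dst).packed)
    return raw[:10] + struct.pack("!H", checksum(raw)) + raw[12:]

def lookup(routing_table, dst: str):
    """Return the next hop of the most specific route containing dst."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, hop) for net, hop in routing_table if addr in net]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

def forward(header: bytes):
    """Decrement the TTL and recompute the checksum; None if TTL expires."""
    ttl = header[8]
    if ttl <= 1:
        return None  # a real router would drop the packet here
    zeroed = (header[:8] + bytes([ttl - 1]) + header[9:10]
              + b"\x00\x00" + header[12:])
    return zeroed[:10] + struct.pack("!H", checksum(zeroed)) + zeroed[12:]

table = [
    (ipaddress.ip_network("0.0.0.0/0"), "10.0.0.1"),        # default route
    (ipaddress.ip_network("93.184.216.0/24"), "10.0.2.1"),  # more specific
]
header = build_header("203.0.113.7", "93.184.216.34", ttl=64)
hop = lookup(table, "93.184.216.34")
forwarded = forward(header)
```

A header with a correct checksum sums to zero when checked, so `checksum(header)` and `checksum(forwarded)` both return 0, while the TTL byte drops from 64 to 63.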
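The climb up the layers on the receiving side can be sketched as peeling off one header at a time. This is an illustration under stated assumptions, not an operating system's actual code: `decapsulate`, the MAC addresses, and the hand-built frame are invented for the example, and checksums are left at zero for brevity.

```python
import struct

def decapsulate(frame: bytes, my_mac: bytes):
    """Strip Ethernet, IPv4, and TCP headers; return (dst_port, data) or None."""
    # Data link layer: is this frame addressed to us, and does it carry IPv4?
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    if dst_mac != my_mac or ethertype != 0x0800:
        return None
    packet = frame[14:]

    # IP layer: header length is in the low 4 bits of the first byte.
    ip_header_len = (packet[0] & 0x0F) * 4
    if packet[9] != 6:  # protocol 6 is TCP
        return None
    segment = packet[ip_header_len:]

    # Transport layer: the destination port selects the application.
    dst_port = struct.unpack("!H", segment[2:4])[0]
    tcp_header_len = (segment[12] >> 4) * 4
    return dst_port, segment[tcp_header_len:]

SERVER_MAC = bytes.fromhex("aabbccddee01")  # hypothetical web server NIC
ROUTER_MAC = bytes.fromhex("aabbccddeeff")  # hypothetical last-hop router

data = b"GET / HTTP/1.1\r\nHost: www.example.com\r\n\r\n"
tcp = struct.pack("!HHIIBBHHH", 51000, 80, 0, 0, 5 << 4, 0x18, 65535, 0, 0) + data
ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(tcp), 0, 0, 57, 6, 0,
                 bytes([203, 0, 113, 7]), bytes([93, 184, 216, 34]))
frame = struct.pack("!6s6sH", SERVER_MAC, ROUTER_MAC, 0x0800) + ip + tcp

result = decapsulate(frame, SERVER_MAC)    # (80, the GET request bytes)
ignored = decapsulate(frame, ROUTER_MAC)   # wrong MAC: frame is not for us
```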
Conclusion
Overall, this article has aimed to provide you with a basic understanding of networking and the general concepts of routing, switching, and NAT. We hope you can continue further research to gain deeper insights into this subject, and we wish you success.
Pham Van Linh