The Datalink Layer
The datalink layer (the link layer, Layer 2 or L2) also has one purpose: send and receive IP datagrams. L2 doesn’t maintain a connection with the target host; it’s intentionally “connectionless”, and it doesn’t guarantee delivery or integrity of packets. It just ships them between source and destination. Give it a message, give it a MAC address, and it sends it; that’s all. At first, this message-agnostic approach may seem a little weird. L2 is not without error-checking and recovery code, but it functions efficiently precisely because it isn’t concerned with the data, or even the message containing the data.
That might surprise you, especially since the word “datagram” is sometimes used a little too freely with respect to L2. A datagram is just a basic network transfer unit – the indivisible unit for a given layer – any given layer. If we’re talking about L2, it’s an IEEE 802.xx frame. At the network layer (L3, we’ll come to that in a minute), it’s a data packet. For the transport layer (L4), it would be called a segment. By now, you’re probably wondering what the indivisible units in the physical layer are called.
Chips; they’re called chips. Beats me. But I do know that they are the spread-spectrum pulses in CDMA, the noise-utilising transmission scheme that operates at that layer. My advice? Unless you’re an EE with some communications training, you might not need to go there. Since “datagram” isn’t carefully used by everyone (think of User Datagram Protocol), we’ll agree to call these indivisible layer units PDUs (protocol data units). This avoids conflation with other uses and reminds you that it’s the atomic unit at the current network layer. Just remember that, at the link layer (L2), it’s a frame.
MAC Frames
A MAC frame, or just “frame”, encapsulates the packets from the network layer so that they can be transmitted on the physical layer. A frame can be small or big, anywhere from a few bytes up to the kilobyte range. The upper limit on the payload a frame can carry is called the maximum transmission unit (MTU); add the frame’s own header and trailer and you get the maximum frame size. Both are set by the particular network technology in use.
This last observation brings up a good point: in order to talk sensibly about frames, we’d need to say what kind of frame. We’re almost always talking about packet-switched networks, so there are potentially four frame types to consider: Ethernet, Fibre Channel, V.42, and PPP (point-to-point protocol). Happily, Internet networks almost exclusively use Ethernet, as defined in the IEEE 802.3 standards, so we’ll stick to that particular frame type for this discussion. Where other frame types may come into play, we’ll discuss those as special cases.
Ethernet
Before explaining an Ethernet frame, we need to give a little background information about how Ethernet works; otherwise a lot of the frame components either won’t make sense, or you’ll wonder how it works at all. Remember earlier, when we talked about voice radio, and the need to say “over”? Well, Ethernet at the link layer is all about controlling the conversation, so that computers don’t “talk over each other”. Ethernet implements an algorithm called CSMA/CD, which stands for “carrier sense multiple access with collision detection.” This algorithm controls which computers can access the shared medium (an Ethernet cable) without any special synchronisation requirements.
“Carrier sense” means that every NIC does what humans (should) do when we’re talking: it waits for quiet. In this case, it’s waiting for the network to be quiet, that is, for no signal to be on the wire. “Collision detection” means that, should two NICs both start to send on a shared network at the same time (because the network seemed quiet to both), each one detects the overlap and sends a brief jam signal so every station knows the frame was garbled. Each sender then waits a randomly-chosen amount of time before trying again, and every time subsequent attempts collide, the range from which that random wait is picked doubles (this is called binary exponential backoff). After some maximum number of attempts (16, in classic Ethernet), the NIC declares a failure and reports that the message didn’t go. The goal is that only one frame ends up traversing the shared network at any given time.
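If you like to see things spelled out, here’s a minimal sketch of that retry dance in Python. The `medium` object is a purely hypothetical stand-in for the shared cable and the NIC hardware; the limit of 16 attempts and the cap of 10 doublings are the values classic 802.3 used.

```python
import random

ATTEMPT_LIMIT = 16    # classic Ethernet gives up after 16 tries
BACKOFF_LIMIT = 10    # the random range stops doubling after 10 collisions
SLOT_TIME_US = 51.2   # one "slot" is 512 bit times: 51.2 microseconds at 10 Mb/s


def send_frame(frame, medium):
    """Sketch of CSMA/CD: listen for quiet, transmit, back off on collisions."""
    for attempt in range(1, ATTEMPT_LIMIT + 1):
        medium.wait_until_quiet()                  # "carrier sense": wait for silence
        if medium.transmit(frame):                 # True means no collision was detected
            return True
        # Collision: a jam signal went out; wait a random number of slot times,
        # drawn from a range that doubles with each collision (exponential backoff).
        slots = random.randint(0, 2 ** min(attempt, BACKOFF_LIMIT) - 1)
        medium.wait(slots * SLOT_TIME_US)
    return False                                   # too many collisions: report failure
```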
Media Access Control (MAC)
Systems like CSMA/CD belong to the Media Access Control (MAC) family of protocols. MAC is one-half of the link layer, with Logical Link Control (LLC) being the other half – though these are more properly called sub-layers. LLC mostly just defines the frame format for the 802.xx protocols, like WiFi, so we can safely ignore it for the moment. If you’ve worked with networks at all, you’ve heard of MAC addresses. Those are basically unique serial numbers assigned to network interface devices (like NICs) at the time of manufacture.
Theoretically, they are unique in the world, not counting virtual NICs in virtual machine environments. MAC address collisions do happen when using VMs, and there are ways to fix them, assuming that your VMs are confined to a subnet. The MAC sub-layer is connected to the physical layer by a media-independent interface (MII), independent of the actual link protocol (e.g., cellular broadband, Wi-Fi radio, Bluetooth, Cat5e, T1, …). You can learn more about the MII if you’re so inclined, but we won’t address it again in the context of this tutorial.
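Two bits in the first octet of a MAC address are worth knowing about, by the way: bit 0 flags multicast (group) addresses, and bit 1 flags “locally administered” addresses, which is how most hypervisors mint MACs for virtual NICs. Here’s a quick sketch (the 52:54:00 prefix in the example is the one QEMU/KVM guests commonly use):

```python
def describe_mac(mac: str) -> dict:
    """Decode the flag bits in the first octet of a MAC address."""
    first_octet = int(mac.split(":")[0], 16)
    return {
        "multicast": bool(first_octet & 0x01),             # group/broadcast bit
        "locally_administered": bool(first_octet & 0x02),  # set by software, e.g. many VMs
        "vendor_prefix": mac.upper()[:8],                  # first three octets (the OUI)
    }


print(describe_mac("52:54:00:12:34:56"))
# {'multicast': False, 'locally_administered': True, 'vendor_prefix': '52:54:00'}
```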
Essentially, the MAC sub-layer takes what the higher layers hand it and makes it digestible by the physical layer, encoding it into an MII format. It adds some synchronisation info and pads the message (if needed). The MAC sub-layer also adds a frame check sequence that makes it easier to identify errors. In conventional Ethernet networks, the result is a frame built from a handful of well-defined blocks of bits: a Preamble, a Start Frame Delimiter, the Destination and Source Addresses, a length/type field, a few optional control bytes, the payload, and a trailing checksum. Let’s decode those blocks of bits:
- The Preamble is 7 bytes of clock sync, basically just alternating ones and zeroes like this: …0101010101… This gives the receiving station a chance to catch up and sync its clock, so the following data isn’t misinterpreted.
To delve just a little deeper, the Preamble helps the receiving NIC figure out the exact amount of time between encoded bits (it’s called clock recovery). NTP is nice, but Ethernet is an asynchronous LAN protocol, meaning it doesn’t expect perfectly synchronised clocks between any two NICs. The Preamble is similar to the way an orchestra conductor might “count the players in” so they all get the same rhythm to start. Before clock recovery, there was MPE. Clock recovery is much more reliable than trying to get computers all over the world synced up to the same clock frequency and the same downbeat (starting point).
Ethernet actually started out that way with something called Manchester Encoding or Manchester Phase Encoding (MPE). This was important because electrical frequency varies not only across the world, but also from moment to moment when the power is slightly “dirty”. MPE involved bouncing each bit between two voltage levels, using a 20 MHz oscillator to generate a reference square wave. It works, but it’s not very efficient, so later flavours of Ethernet dropped MPE in favour of more efficient encodings, leaning on the Preamble for clock recovery the way that projectionists use alignment marks on reels of movie film.
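To see why an alternating Preamble makes such a good clock reference, here’s a toy sketch of the Manchester idea, using 1 for the high level and 0 for the low. It’s an illustration of the concept, not a signal-accurate model:

```python
def manchester_encode(bits):
    """IEEE 802.3 Manchester: every bit becomes two half-bit levels with a
    transition in the middle -- a 0 is high-then-low, a 1 is low-then-high."""
    levels = []
    for bit in bits:
        levels.extend([0, 1] if bit else [1, 0])
    return levels


# The alternating preamble (1010...) comes out as a perfectly regular square
# wave, which is exactly what the receiving NIC needs to lock onto the
# sender's bit timing.
print(manchester_encode([1, 0, 1, 0, 1, 0]))
# [0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1, 0]
```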
With that bit of history out of the way, here’s a look at the rest of the pieces:
- The Start Frame Delimiter (SFD) is the last chance for clock sync. It is exactly 10101011, where the “11” tells the receiving station that the real header content starts next. The receiving NIC has to recover the clock by the time it hits the SFD, or abandon the frame to the bit bucket.
- The Destination Address (DAddr) is six bytes long, and gives the physical address – the MAC address – of the next hop. Be aware that the next hop might be the final destination, but it might also be a NAP, MAE, NSP, or intermediate ISP; it’s basically the next address in the direction of the destination that the sender knows about. Unlike the Source Address, the Destination Address can be a broadcast address (all ones, written FF:FF:FF:FF:FF:FF), which plays roughly the same role at L2 that a broadcast IP address plays within a subnet.
- The Source Address (SAddr) is also a six-byte MAC address, this time the MAC address of the sender. It does not change as long as the message is traversing only layer-2 devices (Ethernet switches and bridges); once the frame crosses a router, both MAC addresses are rewritten for the next hop.
- The PDU Length (PDULen) field does double duty. If its value is 1500 or less, it gives the byte length of the payload. If it’s larger than that (0x0600, or 1536, and up), it instead identifies a message type (an EtherType), like IPv4, ARP, or a “q-tagged” frame, which carries a VLAN ID.
- The DSAP, SSAP, and Control elements are each one byte in length; they come from the LLC sub-layer and identify which protocol entities are talking across the link. For the most part, we won’t be worried about these on typical networks. Just know that as more and more 802 point-standards come out (e.g., 802.11, WiFi), these elements get longer and more complex.
- The Data or “Payload” is the actual packet being sent; in the case of TCP/IP, that’s an IP packet – an IP header, a TCP header, and a chunk of the application’s data. It’s passed down from the layer above. It cannot be less than 46 bytes, and in conventional Ethernet, it cannot be larger than 1500 bytes. If the actual data is too small, it’s padded out to 46 bytes.
- The CRC or Frame Check Sequence (FCS) is a standard cyclic redundancy check, used to verify that the frame hasn’t been corrupted along the way.
The Preamble and SFD are often considered to be part of the IEEE “packet”, so some people start counting the “frame” at the Destination Address. That distinction shouldn’t affect anything meaningful that we do with networks, but it’s nice to keep in mind, in case you run into someone who groups packets differently than you do.
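If you want to poke at real frames, here’s a hedged sketch of how you might pull those fields apart in Python. It assumes the captured bytes start at the Destination Address (the NIC normally strips the Preamble and SFD for you), and the FCS check only makes sense if your capture tool kept the trailing 4 bytes, which many don’t:

```python
import struct
import zlib

ETHERTYPES = {0x0800: "IPv4", 0x0806: "ARP", 0x8100: "802.1Q tag", 0x86DD: "IPv6"}


def parse_frame(frame: bytes) -> dict:
    """Split a raw Ethernet frame into its header fields, payload, and FCS."""
    dst, src, length_or_type = struct.unpack("!6s6sH", frame[:14])
    info = {
        "dst": dst.hex(":"),
        "src": src.hex(":"),
        "payload": frame[14:-4],     # everything between the header and the FCS
    }
    if length_or_type <= 1500:       # small values are the payload length...
        info["payload_len"] = length_or_type
    else:                            # ...values of 0x0600 and up name a protocol
        info["ethertype"] = ETHERTYPES.get(length_or_type, hex(length_or_type))
    # The Ethernet FCS is an ordinary CRC-32, stored least-significant byte first
    # (this check assumes the capture kept those 4 bytes).
    info["fcs_ok"] = zlib.crc32(frame[:-4]) == int.from_bytes(frame[-4:], "little")
    return info
```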
Trunking VLANs
There is a crucial modification to the basic frame format called the 802.1Q (or “P/Q”) VLAN tag. This allows something called VLAN trunking, which means sending the traffic for many VLANs over the same wire and port, while giving the NICs and switches a field (the tag) they can use to sort the traffic and control access. On paper, it looks something like this:
- Sixteen bits of tag protocol identifier (TPID), which marks the frame as q-tagged.
- Three bits representing a priority.
- One bit used as a Canonical Format Indicator (CFI), which is 0 if the following VLAN ID is in Ethernet format, or 1 if it’s in Token Ring format.
- Twelve bits of VLAN ID.

This matters when we’re building complex networks with lots of VLANs that probably cross over switches. After all, VLANs were initially controlled with ports and switches, although they more commonly use tags now. When more than one VLAN spans multiple switches, frames need to carry VLAN information that switches can use to sort or “trunk” the traffic.
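In code, unpacking those 32 bits from the four tag bytes that follow the Source Address might look like the following sketch (the example tag value is made up):

```python
import struct


def parse_vlan_tag(tag: bytes) -> dict:
    """Unpack the 802.1Q fields from the 4-byte tag following the Source Address."""
    tpid, tci = struct.unpack("!HH", tag)
    return {
        "tpid": hex(tpid),          # 0x8100 marks the frame as q-tagged
        "priority": tci >> 13,      # 3 bits of priority
        "cfi": (tci >> 12) & 0x1,   # canonical format indicator (0 = Ethernet)
        "vlan_id": tci & 0x0FFF,    # 12 bits of VLAN ID
    }


print(parse_vlan_tag(bytes.fromhex("8100a00a")))
# {'tpid': '0x8100', 'priority': 5, 'cfi': 0, 'vlan_id': 10}
```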
The Origin of “Trunking”
The word “trunking” is derived from the telephone network term trunk lines, which are lines connecting switchboards. In the original telephone company model, each telephone had a subscriber line, which was a wire that went straight from the local Central Office (CO) to that subscriber’s telephone. Each CO had one switchboard, though it might have many seats.
Connections between Central Offices were handled by trunk lines, because they ran between phone company facilities in enclosures called cable trunks. You’d have a thick cable with lots of pairs running from CO to CO, basically enough wires to handle something like 35% of the possible calls. If you ever got the message, “All circuits are busy now; please try your call again later”, you’ve heard what happens when the system is “trunking above capacity” or “TAC’d”, as it was called.
At the CO, the wires would “branch” and run all over the place: First to junction points (those five-foot-tall metal boxes you see from time to time on the road), then to interface points (the square cans beside the road every half mile or so, also called “pedestals”) and from there to subscriber homes. When you draw out this network, it looks like a tree, where the bundles of cables between COs look like the trunks of trees.
Multiplexing LAN Channels, Actually
With VLAN trunking, by the way, we’re not just multiplexing packets, we’re actually multiplexing LAN channels, so to speak. In the parlance of networks, especially VLANs, the term “trunking” indicates the sharing of network routes. This sharing is made possible by the Ethernet VLAN tags, which make VLAN-bound messages less dependent on switches and routers to get the traffic to the right place. Otherwise, you’d have to maintain complicated per-port switch configurations, which are particularly easy to get wrong.
Note that the MAC sub-layer is responsible for managing CSMA/CD, that is, detecting a clear line, initiating re-transmission if there’s a jam signal, and so on. On the way in, the MAC sub-layer verifies the CRC/FCS and removes frame elements before passing the data up to the higher layers. Basically, anything that some other MAC layer did to encapsulate the message for sending, the receiving MAC layer undoes on the way in.
VLANs, Subnets, and Fabrics
When working with networks, you will frequently be concerned with VLANs, subnets, and fabrics, which are all network groupings:

- Subnets define (group) a range of IP addresses.
- VLANs group subnets.
- Fabrics group VLANs.

Let’s give each of these terms their due.
Subnets
A subnet is a range or collection of IP addresses. A subnet just means “sub-network,” and that’s exactly what it is: a subset of IP addresses that can be treated like a single block for some operations. Subnets are defined in CIDR (Classless Inter-Domain Routing) notation. If you want to use the addresses from 192.168.13.0 to 192.168.13.255 in a subnet, you can specify that with 192.168.13.0/24. The “24” is the number of leading bits that identify the network, leaving the remainder of the 32 bits free to address hosts. Since 8 bits can represent 256 things, a /24 gives you the last octet: 256 addresses, of which 254 are usable for hosts (one is the network address and one is the broadcast address).
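Python’s standard ipaddress module will do this arithmetic for you, which is handy for checking your work:

```python
import ipaddress

net = ipaddress.ip_network("192.168.13.0/24")
print(net.netmask)              # 255.255.255.0
print(net.num_addresses)        # 256 addresses in the block...
print(len(list(net.hosts())))   # ...254 of which are usable for hosts
print(net.broadcast_address)    # 192.168.13.255
```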
Whatever happened to subnet classes? Subnets used to be defined in terms of address classes, like A, B, and C. That got to be a limitation, because each class fixed where the IP address splits between the network portion and the host portion. In other words, the class defined how many hosts could be in the network, and three sizes weren’t really adequate to cover all the possible permutations that network architects needed. The change to CIDR notation made subnets more granular, allowing many more subnets to be carved out of the same network.
VLANs
A VLAN used to be a series of IP addresses that could access a given port on a specific switch, generally the switch that gated some protected resource. With the advent of VLAN trunking (see above), VLANs are marked with the 802.1Q (P/Q) bits in the MAC frame. In theory, any set of addresses can be associated with any VLAN.
Let me strongly encourage a clean correspondence of subnets to VLANs: every IP should be in exactly one subnet, and every subnet should be part of exactly one VLAN. You don’t have to do that: you could, for example, have two different subnets that overlap, like 192.168.43.0/24 and 192.168.43.0/26. The “/26” subnet uses fewer bits for host addresses, so it covers only a quarter of the /24’s range (192.168.43.0 through .63) – and every address in that quarter belongs to both subnets.
A decent network designer generally avoids this kind of address overlap. Likewise, putting one subnet in two different VLANs might be possible, but it isn’t practical or easy to debug when conflicts happen. You should endeavour to enforce a clean “fan-out” across the network, with no possibility of conflicting IP addresses.
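The same ipaddress module can also flag overlaps for you, which makes for a cheap sanity check when you’re planning subnets:

```python
import ipaddress

big = ipaddress.ip_network("192.168.43.0/24")
small = ipaddress.ip_network("192.168.43.0/26")

print(small.subnet_of(big))   # True: the /26 sits entirely inside the /24
print(big.overlaps(small))    # True: their address ranges collide
print(small.num_addresses)    # 64: addresses .0 through .63
```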
Fabrics
A fabric just collects VLANs together. If you stick to the clean fan-out, that also means that a fabric collects subnets. A fabric provides a higher level grouping. Consider an example: Suppose you have one VLAN for HR, and one VLAN for payroll, so that nobody else can see HR’s private files, and likewise you’ve got payroll data limited to just those people who should see it.
Some executives are entitled to see anything and everything about the corporation. An “executive” fabric would group all VLANs together, so that people admitted to that fabric can access the VLANs without having to be explicitly added to each one. That’s very handy in really large organisations, saving a lot of time and effort.
Visualising the link layer
Let’s start with a message coming in on Layer 1 from San Diego to Bangor. When the message comes in, the link layer does the following things:
- It synchronises the NIC, so that bits will indeed be recognised as bits and the message can be properly decoded.
- It handles the source and destination addresses, using ARP as necessary.
- It interprets the length/type bytes and uses them, which means it must work out the length of the data in a frame (and therefore of the frame itself), or, alternatively, decide whether a frame is IPv4, ARP, or VLAN (“q-tagged”).
- It processes VLAN tags, which means, at the very least, dealing with the message priority, deciding whether the VLAN frame is Ethernet or Token Ring, and capturing and using the VLAN ID. The layer handles messages by priority, knows how and when to send Ethernet or Token Ring frames, and knows how to steer traffic to a specific VLAN.
- It recomputes the CRC and compares it with the frame check sequence, to make sure the message arrived intact.
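Strung together, the receive path looks roughly like this sketch. The handle_* functions are hypothetical stand-ins for whatever the driver or OS does next, and clock sync plus the FCS check are assumed to have already happened in the NIC:

```python
import struct


def receive(frame: bytes):
    """Rough shape of the link-layer receive path described above."""
    dst, src, length_or_type = struct.unpack("!6s6sH", frame[:14])
    # dst/src would be checked against our own MAC address and the ARP tables here.
    offset, vlan_id, priority = 14, None, 0
    if length_or_type == 0x8100:                   # q-tagged: peel off the VLAN tag
        tci, length_or_type = struct.unpack("!HH", frame[14:18])
        vlan_id, priority, offset = tci & 0x0FFF, tci >> 13, 18
    payload = frame[offset:]
    if length_or_type == 0x0800:
        handle_ipv4(payload, vlan_id, priority)    # hypothetical helpers from here on
    elif length_or_type == 0x0806:
        handle_arp(payload, vlan_id)
    elif length_or_type <= 1500:                   # plain length: strip any padding
        handle_llc(payload[:length_or_type])
```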
Next, we’ll take a look at the network layer, where most of the message transactions take place, and where most of our debugging will be done.
Copyright (C) 2024 by Bill Wear; All Rights Reserved