billwear.github.io

Intro to Networking

From switchboards to spine‑leaf, from ARP to BGP — with field‑ready troubleshooting at every layer.

Skimmable sections. Copy‑pasteable commands. No fluff.

Welcome. I write about networks for a living; this is the friendly version I wish more people had stumbled into early on. If you have fuzzy areas or blind spots, you’re in the right place. We’ll keep jargon minimal, diagrams mental, and the troubleshooting practical.

How We Got Here

How we got here (long arc)

1870s–1930s: Telephony emerges with local exchanges and manual switchboards, then electromechanical step‑by‑step and crossbar switches. The core idea is the circuit: a dedicated path held for the duration of a call.

1960s–1970s: Packet switching is proposed (Baran, Davies). ARPANET links UCLA, SRI, Utah, UCSB. Circuits are reliable, but packets are flexible and resilient.

1980s: TCP/IP standardizes across ARPANET (1983). Ethernet becomes the dominant LAN. The OSI model appears (a teaching model), but the pragmatic TCP/IP stack wins deployment.

1990s: The NSFNET backbone is decommissioned (1995) and the commercial Internet takes over. Early exchange points such as FIX‑East/West and CIX, followed by the NSF‑sponsored “NAPs” (Network Access Points) and the MAEs (MAE‑East/West), facilitate interconnection. Today we call these IXPs (Internet Exchange Points). The web arrives; BGP‑4 becomes the Internet’s inter‑domain routing glue.

2000s–present: Content delivery networks (CDNs), massive data centers, merchant‑silicon Ethernet switches, and Clos spine‑leaf fabrics replace tall, bespoke hierarchies. Cloud scales by repetition, not snowflakes.

Correcting an old myth

There’s no rule that a provider “must connect to three NAPs.” That phrasing came from 1990s interconnect policy and procurement checklists. In practice, networks interconnect at many IXPs and private interconnects based on cost, performance, and geography — not a magic number.

Modern Network Architecture (Why Clos Took Over)

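Imagine two hosts: SanDiego and Bangor. Early on you could rent a private line, but one cut and you’re dark. The Internet’s genius was to forward packets hop‑by‑hop until they find a working path. Inside data centers, the most economical way to scale that forwarding is a Clos (spine‑leaf) fabric: every leaf switch connects to every spine, so any two hosts on different leaves are one leaf‑spine‑leaf trip apart, and you add capacity by adding more identical switches.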
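As a rough sizing sketch for a two‑tier fabric, the arithmetic fits in a few lines of shell. The function names are made up for illustration, and the model assumes each leaf has exactly one uplink to each spine and that all uplinks run at the same speed.

```shell
# Number of leaf-to-spine links when every leaf has one uplink to every spine.
# usage: clos_links LEAVES SPINES
clos_links() { echo $(( $1 * $2 )); }

# Oversubscription of one leaf: host-facing bandwidth vs. spine-facing
# bandwidth, printed as a reduced ratio such as 3:1.
# usage: leaf_oversub HOST_PORTS HOST_GBPS UPLINKS UPLINK_GBPS
# The ( ) body runs in a subshell, so the working variables stay local.
leaf_oversub() (
  down=$(( $1 * $2 ))
  up=$(( $3 * $4 ))
  # Euclid's algorithm: a ends up as gcd(down, up)
  a=$down; b=$up
  while [ "$b" -ne 0 ]; do
    t=$(( a % b )); a=$b; b=$t
  done
  echo "$(( down / a )):$(( up / a ))"
)
```

For example, `clos_links 8 4` counts the cables for 8 leaves and 4 spines, and `leaf_oversub 48 10 4 40` describes a leaf with 48 × 10G host ports and 4 × 40G uplinks. In this model the number of equal‑cost paths between two leaves is simply the number of spines.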

Merchant silicon + NOS

Modern switches often use “merchant” ASICs, such as Broadcom’s Tomahawk and Trident families or Intel’s Tofino, sold under many different brands. A Network Operating System (NOS) provides BGP/OSPF/IS‑IS, telemetry, and automation hooks. Interop has improved; lock‑in is less absolute than it was.

Internet Infrastructure: ASes, BGP, IXPs (and myths)

At Internet scale we speak of autonomous systems (ASes), networks under one administrative policy, stitched together with BGP. Interconnection happens at IXPs and over private interconnects.

Peering vs. transit

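Peering is usually a settlement‑free exchange of traffic between networks that get roughly equal value from it; transit is a paid service in which a provider carries your traffic to and from the rest of the Internet. CDNs and hyperscalers peer broadly to reduce latency and cost.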

Layers Without the Hand‑waving (OSI vs. TCP/IP)

The OSI 7‑layer model is a teaching aid; the deployed Internet stack is simpler, but OSI gives us a shared vocabulary:

Layer | What to know | Troubleshooting tools
1 Physical | Bits on a wire/fiber; power; optics; RF. | ethtool, link LEDs, SFP diagnostics, cable testers; Wi‑Fi analyzers.
2 Data Link | Ethernet, MAC addresses, VLANs, STP. | ip link, brctl/bridge, tcpdump -e, switch MAC tables.
3 Network | IP addressing, routing, ARP/ND. | ip addr, ip route, arp/ip neigh, ping, traceroute/mtr.
4 Transport | TCP/UDP, ports, congestion control, MTU. | ss/netstat, iperf3, tracepath, tcpdump captures, ping with DF set.
5–7 | Sessions, TLS, HTTP/DNS/SSH, apps. | curl, dig/nslookup, openssl s_client, ssh -v, browser devtools.

A Practical Toolbelt (CLI & GUI)

Your everyday toolbelt

  • Link & address: ip link, ip addr, nmcli (NetworkManager), ethtool for speed/duplex.
  • Reachability: ping (ICMP), traceroute/mtr (path), tracepath (PMTU).
  • Name resolution: dig (dig +trace, dig @resolver), resolvectl.
  • Ports & sockets: ss -ltnp, lsof -i, host firewalls (ufw, nft).
  • Traffic capture: tcpdump (CLI), Wireshark (GUI). Start narrow: host/port/proto.
  • Throughput: iperf3 (end‑to‑end); check duplex and MTU first.
  • HTTP/S: curl -v, curl --resolve (DNS override), openssl s_client (TLS handshake).
  • Service pokes: nc (netcat), telnet (legacy), nmap (scan carefully).
  • ARP/ND: ip neigh, arping, switch CAM tables.
  • BGP looking glasses: public LGs, RouteViews; whois for ASNs and prefixes.

MTU pain, quick test

# find max payload before fragmentation (Linux)
ping -M do -s 1472 8.8.8.8   # 1472 + 28 = 1500
# if this fails but -s 1464 works, the path MTU is probably 1492 (1464 + 28), which is typical of PPPoE
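If you would rather not guess payload sizes by hand, the same test can be wrapped in a binary search. This is a minimal sketch, assuming Linux iputils ping and IPv4 without header options; the function names are made up here, and the probe is passed in as a parameter so the search can be tried against a stand‑in before touching the network.

```shell
# ICMP payload that exactly fills an IPv4 packet of the given MTU:
# MTU minus 20 bytes of IPv4 header minus 8 bytes of ICMP header.
payload_for_mtu() { echo $(( $1 - 28 )); }

# Succeeds if one DF-flagged echo with a payload of SIZE bytes gets a reply.
# usage: probe_ping HOST SIZE
probe_ping() { ping -M do -c 1 -W 1 -s "$2" "$1" >/dev/null 2>&1; }

# Largest payload in [LO, HI] that the probe accepts, by binary search.
# Fails if even LO does not get through.
# usage: max_payload PROBE HOST LO HI
# The ( ) body runs in a subshell: variables stay local, exit leaves only this function.
max_payload() (
  probe=$1 host=$2 lo=$3 hi=$4
  "$probe" "$host" "$lo" || exit 1
  while [ "$lo" -lt "$hi" ]; do
    mid=$(( (lo + hi + 1) / 2 ))
    if "$probe" "$host" "$mid"; then
      lo=$mid            # mid fits, search upward
    else
      hi=$(( mid - 1 ))  # mid is too big, search downward
    fi
  done
  echo "$lo"
)
```

A typical call would be `max_payload probe_ping 8.8.8.8 1200 1472`; add 28 to the result to get the path MTU. A lost reply looks the same as "too big" to this probe, so rerun it before trusting an odd answer.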

Step‑by‑step Troubleshooting

First principles troubleshooting (outside‑in)

  1. Power & link: Is the NIC up? LEDs? ip link shows state UP; ethtool shows speed/duplex.
  2. Addressing: Do you have an IP, mask, gateway? ip addr, ip route.
  3. Local reach: ping your gateway; if ARP fails, check switch port/VLAN, ip neigh, tcpdump -e arp.
  4. DNS: dig example.com, then dig @resolver example.com, then dig +trace.
  5. Path: traceroute/mtr to the target and to a known good (e.g., 1.1.1.1). Compare.
  6. Ports: Is the service listening? ss -ltnp on the server; ACL/firewall rules; cloud SGs.
  7. Throughput: iperf3 client↔server. If slow, check duplex, MTU, CPU offload, and queue drops.
  8. Capture, then hypothesize: tcpdump with a narrow filter. Confirm, don’t assume.
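The first few steps above can be strung together so that the script stops at the first layer that fails. This is a sketch rather than a finished tool: the check_* function names are made up for illustration, the gateway check assumes a "default via <gateway>" route, and example.com stands in for whatever name you actually care about.

```shell
# Run the named checks in order and stop at the first failure.
# The ( ) body runs in a subshell, so "exit" only leaves this function.
run_checks() (
  for check in "$@"; do
    if "$check" >/dev/null 2>&1; then
      echo "ok   $check"
    else
      echo "FAIL $check"
      exit 1
    fi
  done
)

# Step 1: some non-loopback interface has carrier.
check_link()  { ip -br link show | grep -v '^lo ' | grep -q 'LOWER_UP'; }

# Step 2: there is an IPv4 default route.
check_route() { ip -4 route show default | grep -q '^default'; }

# Step 3: the default gateway answers a ping.
check_gateway() {
  gw=$(ip -4 route show default | awk '/via/ {print $3; exit}')
  [ -n "$gw" ] && ping -c 1 -W 1 "$gw"
}

# Step 4: the configured resolver returns at least one record.
check_dns()   { dig +short +tries=1 +time=2 example.com | grep -q .; }
```

Run it as `run_checks check_link check_route check_gateway check_dns`. The order matters: a failure at one step makes the later results meaningless, which is the point of working outside‑in.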

Wi‑Fi specifics

Two Sticky Topics: ARP and DNS

ARP in plain English

ARP maps IP→MAC on IPv4 LANs. When host A wants to send to host B on the same subnet, it broadcasts “Who has 10.0.0.42?” and B replies “10.0.0.42 is at 00:11:22:33:44:55.” For destinations outside the subnet, A ARPs for its default gateway instead and hands the packet to the router. Common failures: wrong VLAN or a stale cache entry.

# watch ARP while you try to reach the gateway
sudo tcpdump -n -e arp or icmp
ip neigh show
sudo arping -I eth0 10.0.0.1
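When you read an ARP capture, it helps to know which address a host should be asking for. The decision is a mask comparison, sketched below with made‑up function names. It assumes plain dotted‑quad IPv4 addresses without leading zeros and a shell with 64‑bit arithmetic.

```shell
# Dotted-quad IPv4 address to a 32-bit integer.
# The ( ) body runs in a subshell, so the IFS change does not leak out.
ip_to_int() (
  IFS=.
  set -- $1   # split on dots into $1..$4
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Which address would a host ARP for when sending to DST?
# usage: arp_target SRC_IP PREFIX_LEN DST_IP GATEWAY_IP
arp_target() (
  mask=$(( (0xFFFFFFFF << (32 - $2)) & 0xFFFFFFFF ))
  src=$(ip_to_int "$1")
  dst=$(ip_to_int "$3")
  if [ $(( src & mask )) -eq $(( dst & mask )) ]; then
    echo "$3"   # same subnet: ARP for the destination itself
  else
    echo "$4"   # different subnet: ARP for the gateway
  fi
)
```

For a host at 10.0.0.7/24 with gateway 10.0.0.1, `arp_target 10.0.0.7 24 10.0.0.42 10.0.0.1` names the destination, while a target of 10.0.1.42 names the gateway. If the capture shows ARP requests for an off‑subnet address, suspect a wrong netmask on the sender.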

DNS in plain English

Names resolve to IPs through recursive resolvers. Separate “can I reach the resolver?” from “can the resolver answer for this name?”

# does your resolver respond?
dig @192.0.2.53 example.com
# follow the chain yourself
dig +trace example.com
# HTTPS reachability with DNS override (SNI/Host)
curl -sv --resolve example.com:443:203.0.113.10 https://example.com/
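To keep "resolver unreachable" apart from "resolver reachable but unable to answer", you can look at dig's exit code and the status field in its header. The sketch below relies on the exit codes documented in the dig man page (0 for any response received, 9 for no reply from the server); the function names are made up for illustration.

```shell
# Classify one dig run from its exit code and its output text.
# usage: classify_dig EXIT_CODE OUTPUT
# The ( ) body runs in a subshell, so "exit" only leaves this function.
classify_dig() (
  code=$1 out=$2
  if [ "$code" -eq 9 ]; then echo "resolver unreachable"; exit; fi
  if [ "$code" -ne 0 ]; then echo "dig error $code"; exit; fi
  # Pull NOERROR / NXDOMAIN / SERVFAIL / ... out of the ->>HEADER<<- line.
  status=$(printf '%s\n' "$out" | sed -n 's/.*status: \([A-Z]*\),.*/\1/p' | head -n 1)
  case $status in
    NOERROR)          echo "resolver answered" ;;
    NXDOMAIN)         echo "name does not exist" ;;
    SERVFAIL|REFUSED) echo "resolver reachable but cannot answer ($status)" ;;
    *)                echo "unexpected status: ${status:-none}" ;;
  esac
)

# Query NAME at RESOLVER once, with a short timeout, and classify the result.
# usage: check_name RESOLVER NAME
check_name() {
  out=$(dig +tries=1 +time=2 @"$1" "$2")
  classify_dig "$?" "$out"
}
```

For example, `check_name 192.0.2.53 example.com` reports which of the two questions failed. NOERROR with an empty answer section still counts as "resolver answered" here, so check the answer count yourself if that distinction matters.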

Keep it simple

The Internet grew not because every path was optimal, but because any one path being broken didn’t matter. Prefer simple, repeated designs; let redundancy and good telemetry carry the weight.

Last updated 2025‑09‑22.