Building an OpenBSD Home Router, Part 4: SSH, Hardening, and Monitoring

Last July, my firewall rebooted itself at 2pm on a Tuesday. No warning, no panic log, just a clean reboot. I was in a call, so I didn’t even notice until my VPN dropped and I found myself staring at a spinning reconnect icon.

Turned out the CPU had hit 92 degrees. In a fanless box. In a house in Larnaca. In July. The ACPI firmware did exactly what it should do and yanked the power. But 92 degrees means the silicon had been cooking for a while before the hardware killed it, and I hadn’t set up a single layer of monitoring to catch it on the way up.

That was the afternoon I sat down and wrote proper hardening for this thing, because a firewall that reboots itself when it gets warm isn’t a firewall. It’s a liability.

This is Part 4 of my OpenBSD home router series. Parts 1, 2, and 3 covered hardware, installation, and PF configuration. Today we’re doing the stuff that turns a working router into something I actually trust: SSH hardening with post-quantum key exchange, kernel hardening through sysctl, three-layer thermal monitoring for a fanless appliance in a Mediterranean climate, and file integrity checking with mtree. Plus sshguard, because brute-forcers don’t take holidays.

tl;dr: Post-quantum cryptography isn’t a future thing. OpenSSH 9.9 shipped ML-KEM key exchange in late 2024. If you’re running OpenBSD 7.8, you have it right now. This post covers how to use it, along with everything else I do to harden a headless firewall appliance.

SSH Hardening: Post-Quantum Key Exchange Is Here

Let’s start with the fun bit. Post-quantum key exchange isn’t theoretical any more. It’s not a conference slide deck. It’s not “coming soon.” OpenSSH 9.9 [1] shipped mlkem768x25519-sha256 as a default key exchange algorithm, based on NIST’s ML-KEM standard (FIPS 203) [2]. If you installed OpenBSD 7.8, you already have it. You’ve probably already been using it without realising.

I find this genuinely exciting. The cryptography community spent years arguing about lattice-based schemes, and now one just… landed in my SSH daemon. On a router I built from a single-board computer in my living room. The future arrives quietly sometimes.

Here’s my full sshd_config. I’ll walk through every decision.

Host Keys: Ed25519 Only

HostKey /etc/ssh/ssh_host_ed25519_key

During installation I deleted the ECDSA and RSA host keys. Ed25519 gives you 128-bit security with tiny keys and fast signatures. There’s no reason to keep RSA keys around on a fresh system unless you need to support ancient clients, and I don’t. If something can’t speak Ed25519 in 2026, it doesn’t get to talk to my firewall.

Key Exchange: The Post-Quantum Stack

KexAlgorithms mlkem768x25519-sha256,sntrup761x25519-sha512,curve25519-sha256

Three algorithms, in priority order. Each one is there for a reason.

mlkem768x25519-sha256 is the headliner. It’s a hybrid key exchange that combines ML-KEM-768 (a lattice-based key encapsulation mechanism standardised by NIST in FIPS 203 [2]) with X25519 elliptic-curve Diffie-Hellman. The “hybrid” part is crucial: even if ML-KEM turns out to have a weakness we haven’t found yet, the X25519 component means your session is still protected by conventional elliptic-curve security. Belt and braces.
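The belt-and-braces property is easy to see in miniature. This toy sketch (my own illustration, not OpenSSH’s actual key derivation, which feeds the concatenated secrets into the SSH KDF) shows why a hybrid scheme only falls if both components fall:

```python
import hashlib

def hybrid_secret(pq_secret: bytes, ec_secret: bytes) -> bytes:
    """Toy model: the session key depends on BOTH components,
    so breaking only one of them yields nothing useful."""
    return hashlib.sha256(pq_secret + ec_secret).digest()

pq = b"\x01" * 32   # stand-in for the ML-KEM-768 shared secret
ec = b"\x02" * 32   # stand-in for the X25519 shared secret

session = hybrid_secret(pq, ec)
# An attacker who recovers only one component derives a different key entirely
assert session != hybrid_secret(pq, b"\x00" * 32)
assert session != hybrid_secret(b"\x00" * 32, ec)
```

The hash binds both inputs, so a future break of the lattice half still leaves the X25519 half standing, and vice versa.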

Why does this matter for a home router? Because of “harvest now, decrypt later.” If someone, let’s be honest, probably a state actor, is recording encrypted traffic today, a future quantum computer could break pure-ECDH sessions retroactively. ML-KEM is designed to resist that. Is someone targeting my home network traffic? Almost certainly not. But the cost of enabling this is literally zero, it’s the default in OpenSSH 9.9, and the protection is real.

sntrup761x25519-sha512 is the fallback post-quantum option. It combines Streamlined NTRU Prime with X25519. OpenSSH made this the default back in version 9.0 [3], before ML-KEM was standardised. It’s well-tested, it’s conservative, and it provides the same harvest-now-decrypt-later protection via a different mathematical hardness assumption (lattice-based, but a different lattice construction than ML-KEM). Having both means we’re not betting everything on one post-quantum algorithm.

curve25519-sha256 is the classical fallback. No post-quantum protection, but it’s the best conventional key exchange available. If a client can’t speak either post-quantum algorithm, this is where we land. Still better than anything Diffie-Hellman-group based.

Notice what’s NOT in the list: no diffie-hellman-group14-sha256, no NIST P-curves, no SHA-1 based exchanges. If you care enough to run a dedicated firewall, your key exchange list shouldn’t include algorithms from the early 2000s.

Ciphers: AEAD Only

Ciphers chacha20-poly1305@openssh.com,aes256-gcm@openssh.com,aes128-gcm@openssh.com

All three are AEAD ciphers, meaning they provide authenticated encryption in a single pass. No separate MAC step, no composition bugs, no encrypt-and-hope-the-MAC-catches-it.

ChaCha20-Poly1305 is first because it’s constant-time on every architecture. The AMD GX-412TC in this box does have AES-NI, but I still prefer ChaCha’s timing properties as the default. AES-256-GCM and AES-128-GCM are there for compatibility.

What I’ve removed: every CTR-mode cipher. CTR mode requires a separate MAC, which means you’re relying on the encrypt-then-MAC composition to be correct. AEAD ciphers eliminate that entire class of bugs. If you’re building a new config from scratch, there’s no reason to include aes256-ctr or aes128-ctr.

MACs: Encrypt-then-MAC Only

MACs hmac-sha2-512-etm@openssh.com,hmac-sha2-256-etm@openssh.com,umac-128-etm@openssh.com

“But you just said AEAD ciphers don’t need MACs?” Right. These are here as a safety net in case cipher negotiation somehow falls through to a non-AEAD cipher (it shouldn’t with my config, but defence in depth). The -etm suffix means encrypt-then-MAC, which is the only correct composition order. Encrypt-and-MAC and MAC-then-encrypt have both led to real attacks in TLS and SSH. The etm variants authenticate the ciphertext, not the plaintext, which is what you want.
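To make the composition order concrete, here’s a stdlib-only toy (the “cipher” is a deliberately insecure stand-in; the point is the order of operations, not the primitives): encrypt-then-MAC authenticates the ciphertext, so tampered data is rejected before the cipher ever touches it.

```python
import hashlib
import hmac

KEY_ENC = b"k" * 32
KEY_MAC = b"m" * 32

def xor_cipher(data: bytes) -> bytes:
    # Toy keystream cipher (SHA-256 of the key, repeated). NOT secure.
    stream = (hashlib.sha256(KEY_ENC).digest() * (len(data) // 32 + 1))[:len(data)]
    return bytes(a ^ b for a, b in zip(data, stream))

def seal(plaintext: bytes) -> bytes:
    ct = xor_cipher(plaintext)
    tag = hmac.new(KEY_MAC, ct, hashlib.sha256).digest()  # MAC over the CIPHERTEXT
    return ct + tag

def unseal(msg: bytes) -> bytes:
    ct, tag = msg[:-32], msg[-32:]
    # Verify BEFORE decrypting: forged ciphertext never reaches the cipher
    if not hmac.compare_digest(tag, hmac.new(KEY_MAC, ct, hashlib.sha256).digest()):
        raise ValueError("MAC mismatch")
    return xor_cipher(ct)

sealed = seal(b"hello")
assert unseal(sealed) == b"hello"

tampered = bytes([sealed[0] ^ 1]) + sealed[1:]
try:
    unseal(tampered)
except ValueError:
    pass  # rejected without decrypting a single byte
```

MAC-then-encrypt forces you to decrypt before you can verify, which is exactly where padding-oracle-style attacks have lived. Authenticating the ciphertext sidesteps the whole class.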

No hmac-sha2-256 (without -etm). No hmac-sha1. No umac-64. If it doesn’t do encrypt-then-MAC, it’s out.

Authentication: Keys Only

AuthenticationMethods publickey
PasswordAuthentication no
KbdInteractiveAuthentication no

Public key authentication only. No passwords, no keyboard-interactive, no exceptions. This is a headless firewall appliance. The only people SSH-ing into it are me and, if things have gone properly wrong, a colleague I trust with the keys. Password auth on a firewall is asking for trouble.

Access Control: Group-Based

AllowGroups sshusers

Rather than listing individual usernames in AllowUsers, I use a group. Add a user to the sshusers group and they can connect. Remove them and they can’t. It scales better than maintaining a username list in the SSH config, especially if you ever need to grant temporary access during an incident.

Listening Interfaces: LAN Only

ListenAddress 10.20.10.1
ListenAddress 10.20.20.1

The SSH daemon listens on the LAN interfaces only. Not on the WAN. Not on localhost. This means SSH is physically unreachable from the internet, full stop. PF would block it anyway (there’s no pass in on egress rule for port 22), but not listening at all is better than listening and filtering. The attack surface doesn’t just get smaller. It disappears entirely.

10.20.10.1 is the trusted LAN. 10.20.20.1 is the IoT VLAN. Both are behind the firewall, both require physical or VPN presence to reach.

Rate Limiting and Timeouts

LoginGraceTime 30s
MaxAuthTries 3
MaxStartups 3:50:10
ClientAliveInterval 15
ClientAliveCountMax 3

LoginGraceTime 30s gives unauthenticated connections 30 seconds to complete the handshake. Default is 120 seconds, which is absurdly generous for a machine where all clients use key auth.

MaxAuthTries 3 disconnects after three failed authentication attempts. Combined with key-only auth, this should never trigger for legitimate users. If it does, something is very wrong.

MaxStartups 3:50:10 is the interesting one. It means: allow 3 unauthenticated connections before rate-limiting kicks in. Between 3 and 10 pending connections, drop new ones with a probability that starts at 50% and ramps linearly towards 100%. At 10 pending connections, drop everything. This stops brute-force tools that open dozens of parallel connections.
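The ramp can be modelled in a few lines of integer arithmetic. This is a sketch of the behaviour as sshd_config(5) describes it; the begin/rate/full parameter names are mine, mapped onto the 3:50:10 triple:

```python
def drop_probability(pending: int, begin: int = 3, rate: int = 50, full: int = 10) -> int:
    """Percent chance an unauthenticated connection is dropped,
    given `pending` connections already in the pre-auth phase."""
    if pending < begin:
        return 0      # under the threshold: always accept
    if pending >= full:
        return 100    # at the ceiling: always drop
    # Linear ramp from rate% at `begin` up to 100% at `full`
    return rate + (100 - rate) * (pending - begin) // (full - begin)

# 0-2 pending: accept; 3 pending: 50%; climbing to 100% at 10
for n in range(11):
    print(n, drop_probability(n))
```

A parallel brute-forcer that opens ten connections at once hits the ceiling immediately, while a single legitimate admin never notices the mechanism exists.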

ClientAliveInterval 15 with ClientAliveCountMax 3 means idle sessions get killed after 45 seconds of silence. I’m not running a shell hotel. If you’re connected, be doing something.

Disable Everything Else

AllowTcpForwarding no
AllowAgentForwarding no
AllowStreamLocalForwarding no
PermitTunnel no
X11Forwarding no

This is a firewall. It’s not a jump box. It’s not a bastion host. It’s not a tunnel endpoint. TCP forwarding, agent forwarding, Unix-domain socket forwarding, tunnelling, X11 forwarding, all off. Every forwarding feature is an opportunity for a compromised client to pivot through the firewall. Since I never need any of them on this box, they stay disabled.

Kernel Hardening: sysctl.conf

OpenBSD’s defaults are already more secure than most operating systems’ hardened configurations. That’s not flattery, it’s just true. So my /etc/sysctl.conf is short, which is the point. I’m only setting things that change behaviour from the default, not cargo-culting a list of sysctls I found on a blog somewhere.

# Routing - required for NAT gateway
net.inet.ip.forwarding=1

# No ICMP redirects - single gateway, no reason to accept route changes
net.inet.ip.redirect=0

# Disable unused tunneling protocols
net.inet.esp.enable=0
net.inet.ah.enable=0
net.inet.ipip.allow=0
net.inet.gre.allow=0
net.inet.etherip.allow=0

# Hardened malloc - detect use-after-free, heap overflow
vm.malloc_conf=S

# W^X enforcement - SIGABRT on violation instead of logging and continuing
kern.wxabort=1

# Headless appliance - reboot on panic, no debugger console
ddb.panic=0
ddb.console=0

# Encrypt swap - protects keys and cached data at rest
vm.swapencrypt.enable=1

Let me walk through the reasoning.

net.inet.ip.forwarding=1 is the one setting that’s here because we NEED it, not because it hardens anything. Without this, the kernel won’t route packets between interfaces, and our NAT gateway doesn’t work. It’s off by default in OpenBSD because most machines aren’t routers. This one is.

net.inet.ip.redirect=0 disables ICMP redirect processing. ICMP redirects tell a host “hey, there’s a better route to that destination, go via this other gateway instead.” On a network with a single gateway (which this is), there’s never a legitimate reason for a redirect. But an attacker on the LAN could use crafted ICMP redirects to reroute traffic through a malicious host. Disable it.

The tunnelling protocols (ESP, AH, IPIP, GRE, EtherIP) are all disabled because this firewall doesn’t terminate VPNs or tunnels. Every enabled protocol is attack surface. If I’m not using it, it’s off. Should I ever need WireGuard or IPsec, I’ll enable the specific protocols I need.

vm.malloc_conf=S enables OpenBSD’s hardened malloc with junking. Freed memory gets overwritten, which turns use-after-free vulnerabilities into immediate crashes instead of exploitable conditions. The S flag enables all the security features: guard pages, random junk filling, delayed frees. There’s a slight performance cost. On a router with 2GB of RAM, I’ve never noticed it.

kern.wxabort=1 is one of my favourites. OpenBSD enforces W^X (write XOR execute), meaning memory pages can’t be both writable and executable simultaneously. With wxabort=1, any W^X violation kills the offending process with SIGABRT instead of just logging a warning. Fail-closed, not fail-open. If something on this firewall is trying to execute writable memory, I want it dead immediately, not continuing to run while I maybe notice a log entry.

ddb.panic=0 and ddb.console=0 are specific to headless appliances. By default, a kernel panic on OpenBSD drops you into the DDB debugger, which is useful on a workstation where you’re sitting at the console and useless on a fanless box under a desk. With ddb.panic=0, the kernel reboots immediately on panic. With ddb.console=0, there’s no way to break into the debugger via the serial console. Since this box runs headless with serial console access, disabling the debugger removes one more attack vector.

vm.swapencrypt.enable=1 encrypts swap with a random key generated at boot. If the box loses power (or, say, reboots because it overheated), anything that was swapped out, potentially including cryptographic keys, session data, or cached credentials, can’t be recovered from the swap partition. The key exists only in RAM and vanishes with a reboot.

Thermal Monitoring: Three Layers Deep

Right. Here’s the bit that I learned the hard way.

The PC Engines APU3D2 uses an AMD GX-412TC, a quad-core embedded processor [5] with a Tj max (maximum junction temperature) of 90 degrees Celsius. The board is passively cooled via a 3mm aluminium heat spreader that conducts heat to the enclosure. No fan. No heatsink fins. Just a slab of aluminium and the hope that convection does its job.

In a Larnaca summer, ambient air temperature inside a house without aircon can hit 35-38 degrees. My router sits on a shelf with reasonable airflow, but even so, idle temperatures run 45-55 degrees and a sustained load can push it to 70+. That’s fine. The silicon can take it. But “fine” and “monitored” are different things, and I want to know about thermal problems before the hardware starts making autonomous decisions.

So I built three layers of monitoring, each covering a different failure mode.

Layer 1: sensorsd (Warning at 75 degrees)

The km(4) driver [6] exposes the AMD die temperature as hw.sensors.km0.temp0. OpenBSD’s sensorsd daemon [7] monitors hardware sensors and runs commands when thresholds are breached.

# /etc/sensorsd.conf
hw.sensors.km0.temp0:high=75C:command=/usr/bin/logger -t sensorsd \
  "CPU temp %2 exceeded threshold %4"

This logs a warning via syslog when the die temperature crosses 75 degrees. The key detail is that sensorsd fires on state changes, not continuously. It won’t spam your logs with “still hot, still hot, still hot” every polling cycle. It fires once when the sensor transitions to the warning state, and again when it transitions back to normal. I’ve also configured it to require two consecutive breaches before triggering, which acts as a debounce against momentary spikes from short CPU bursts.

75 degrees is my “pay attention” threshold. The CPU can handle it, but on a fanless box, 75 degrees means something is driving sustained load and ambient temps are probably high. It’s a signal to investigate, not to panic.

Layer 2: Cron Job (Shutdown at 85 degrees)

Here’s the edge case sensorsd can’t catch: a slow, steady temperature climb that crosses the warning threshold once (triggering the alert) and then just… keeps going. Sensorsd fires on state transitions. If the temperature goes from 74 to 76 and stays at 76, you get one alert. If it then creeps to 85 over the next hour, sensorsd won’t fire again because there’s no new state transition.
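The two behaviours are easy to simulate. In this sketch (sample values are illustrative), an edge-triggered monitor alerts once at the crossing, while a level-triggered check fires on every poll above its limit, which is why the slow climb still gets caught:

```python
def edge_alerts(samples, threshold):
    """Fire only on a below->above transition (sensorsd-style)."""
    alerts, above = [], False
    for i, temp in enumerate(samples):
        now_above = temp >= threshold
        if now_above and not above:
            alerts.append(i)
        above = now_above
    return alerts

def level_alerts(samples, threshold):
    """Fire on every sample at or above the limit (cron-check-style)."""
    return [i for i, temp in enumerate(samples) if temp >= threshold]

# A slow climb: crosses 75 once, then keeps creeping towards 85 and beyond
climb = [70, 74, 76, 78, 80, 83, 86, 88]
print(edge_alerts(climb, 75))   # a single alert at the first crossing
print(level_alerts(climb, 85))  # keeps firing once it's critical
```

The edge-triggered monitor goes silent after the first crossing even as the temperature keeps rising; the level-triggered check is what actually pulls the trigger at 85.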

So I have a cron job that runs every 5 minutes and checks the absolute temperature:

# Root's crontab entry, added with crontab -e (OpenBSD has no /etc/cron.d)
*/5 * * * * /usr/local/bin/thermal-check.sh

#!/bin/sh
# /usr/local/bin/thermal-check.sh
TEMP=$(sysctl -n hw.sensors.km0.temp0 | sed 's/ degC//')
LIMIT=85

if [ "$(echo "$TEMP > $LIMIT" | bc)" -eq 1 ]; then
    logger -t thermal "CRITICAL: CPU temp ${TEMP}C exceeds ${LIMIT}C, shutting down"
    /sbin/shutdown -hp now "Thermal emergency: CPU at ${TEMP}C"
fi

85 degrees is 5 below Tj max. At that point I’d rather have a cleanly shut-down router than a running router that’s about to hit the thermal wall and reboot hard. A clean shutdown means filesystems are synced, logs are flushed, and when I power it back on after addressing the cooling situation, everything comes up cleanly.

Why not 90? Because thermal inertia is real. If the die is at 85 and climbing, it’s going to overshoot before shutdown completes. A 5-degree margin gives the shutdown process time to finish while the silicon is still within spec.

Layer 3: ACPI Firmware (Emergency at 115 degrees)

The last resort. If both software layers fail, the ACPI firmware on the APU3D2 will trigger a hard power-off at 115 degrees. This is the factory setting and I haven’t changed it.

At 115 degrees, we’re 25 degrees above Tj max. The silicon is already in damage territory. This is the “everything else failed” layer, and if it triggers, something has gone badly wrong: a stuck process, a daemon that ignored the shutdown command, or a kernel bug that prevented a clean shutdown.

The three layers together look like this:

Layer      Trigger       Action            Failure Mode Covered
sensorsd   75 degrees    Log warning       Momentary spikes, short bursts
Cron       85 degrees    Clean shutdown    Slow sustained climb
ACPI       115 degrees   Hard power-off    Software failure, stuck shutdown

In normal operation, the APU3D2 runs 40-55 degrees idle and 65-75 degrees under load. I’ve never had Layer 2 trigger since I set it up. Layer 1 fires occasionally during the hottest weeks of summer, which is fine. That’s what it’s there for.

File Integrity Monitoring with mtree

OpenBSD has mtree(8) [8] built into the base system. It generates cryptographic baselines of file hierarchies, recording the SHA-256 hash, permissions, ownership, and timestamps of every file in a directory tree. The daily security(8) check [9] automatically compares the system against these baselines and mails the admin if anything has changed.

This is how you detect compromised binaries. If someone or something modifies /usr/sbin/sshd, the next daily security check will scream about the hash mismatch. It’s not fancy. It doesn’t have a web dashboard. It doesn’t need one.
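The detection logic itself is conceptually tiny. Here’s a stdlib-only model of the baseline-and-compare cycle (mtree also records permissions, ownership, and timestamps; this sketch tracks hashes only and is not a substitute for it):

```python
import hashlib
from pathlib import Path

def baseline(root: str) -> dict:
    """Map each regular file under `root` to its SHA-256 hex digest."""
    return {
        str(p): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(Path(root).rglob("*"))
        if p.is_file()
    }

def compare(old: dict, new: dict) -> list:
    """Report files that changed, appeared, or vanished since the baseline."""
    report = []
    for path in sorted(old.keys() | new.keys()):
        if path not in new:
            report.append(f"missing: {path}")
        elif path not in old:
            report.append(f"added: {path}")
        elif old[path] != new[path]:
            report.append(f"modified: {path}")
    return report
```

Generate a baseline after a known-good install, re-run the scan daily, and mail yourself the diff. That, plus the extra metadata, is essentially what mtree and security(8) are doing for you.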

Why mtree over AIDE, or Tripwire, or OSSEC? Because it’s already there. It’s part of OpenBSD’s base system. No packages to install. No dependencies to maintain. No third-party code running with root privileges on my firewall. The best security tool is the one that ships with your OS and gets maintained by the same team that maintains the kernel.

Generating Baselines

I wrote a small Python script that generates SHA-256 baselines for all the directories that contain binaries:

#!/usr/bin/env python3
"""Generate mtree baselines for critical system directories."""

import subprocess
import sys

DIRS = [
    "/bin",
    "/sbin",
    "/usr/bin",
    "/usr/sbin",
    "/usr/libexec",
    "/usr/lib",
    "/usr/local/bin",
    "/usr/local/sbin",
]

MTREE_DIR = "/etc/mtree"

for d in DIRS:
    name = d.strip("/").replace("/", "_")
    outfile = f"{MTREE_DIR}/{name}.secure"
    cmd = ["mtree", "-cx", "-K", "sha256digest,type", "-p", d]
    try:
        with open(outfile, "w") as f:
            subprocess.run(cmd, stdout=f, check=True)
        print(f"Baseline written: {outfile}")
    except subprocess.CalledProcessError as e:
        print(f"Failed to baseline {d}: {e}", file=sys.stderr)
        sys.exit(1)

Running this produces .secure files in /etc/mtree/ for each directory:

/etc/mtree/bin.secure
/etc/mtree/sbin.secure
/etc/mtree/usr_bin.secure
/etc/mtree/usr_sbin.secure
/etc/mtree/usr_libexec.secure
/etc/mtree/usr_lib.secure
/etc/mtree/usr_local_bin.secure
/etc/mtree/usr_local_sbin.secure

The security(8) script [9] automatically picks up any .secure file in /etc/mtree/ during the daily check. You don’t need to configure anything extra. It just works. I do love OpenBSD sometimes.

Keeping Baselines Current

Baselines need to be regenerated after system updates, otherwise every patched binary will show up as “modified” in the daily report, and you’ll learn to ignore the alerts, which is worse than having no alerts at all.

I run this weekly via cron, after applying patches:

#!/bin/sh
# /usr/local/bin/weekly-baseline-update.sh
/usr/sbin/syspatch -c && /usr/sbin/syspatch
/usr/sbin/pkg_add -u
/usr/local/bin/mtree-baseline.py
logger -t mtree "Baselines regenerated after system update"

The logic is: apply any pending patches, update packages, then regenerate all baselines. This way the baselines always reflect the expected state of a fully-patched system. If something changes between weekly runs and it wasn’t a patch, the daily check will catch it.

The security(8) man page [9] does include a caveat worth mentioning: “These checks do not provide complete protection against Trojan horse binaries, as the miscreant can modify the tree specification to match the replaced binary.” True. If an attacker has root access AND knows where the baselines live, they can update the baseline to match their modified binary. But at that point, they already own the system. The mtree check isn’t there to stop a sophisticated attacker with persistent root access. It’s there to catch the first sign that something changed when it shouldn’t have.

SSHGuard: Automated Brute-Force Blocking

Even though SSH only listens on LAN interfaces, I still run sshguard [10]. Call it paranoia if you like. I call it defence in depth.

SSHGuard monitors authentication logs and automatically adds offending IP addresses to a PF table when it detects brute-force patterns. It’s about 3,000 lines of C, it parses logs with a compiled parser (not regex), and it’s been in the OpenBSD ports tree for years.

The integration with PF is clean. You add a table and a block rule:

# In /etc/pf.conf
table <sshguard> persist

block in quick on egress proto tcp from <sshguard> to any port 22

Then configure sshguard to use PF as its backend:

# /etc/sshguard.conf
BACKEND="/usr/local/libexec/sshg-fw-pf"

When sshguard detects repeated authentication failures from an IP, it adds that IP to the <sshguard> table. PF blocks all further connections from that IP. After a configurable timeout (default 120 seconds, increasing with repeated offences), the IP is removed and can try again.
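The escalation is simple to model. In this sketch the 1.5 growth factor is my assumption for illustration; check sshguard’s man page for the exact multiplier and jitter your version applies:

```python
def block_seconds(offence: int, base: int = 120, factor: float = 1.5) -> int:
    """Block duration for the nth offence (1-indexed), growing
    geometrically from `base`. Factor is an assumed value."""
    return int(base * factor ** (offence - 1))

# Repeat offenders get locked out for progressively longer stretches
for n in range(1, 6):
    print(n, block_seconds(n))
```

The practical effect: a one-off mistake costs two minutes, while a persistent attacker spends almost all of its time sitting in the PF table.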

On a LAN-only SSH setup, sshguard is mostly insurance against a compromised device on the local network attempting lateral movement. If my IoT VLAN somehow spawns something that starts hammering port 22, sshguard will block it before it gets anywhere. Unlikely? Yes. Covered? Also yes.

Putting It All Together

Here’s what the hardened system looks like in practice. SSH accepts connections only from the LAN, authenticates only with public keys, negotiates post-quantum key exchange by default, uses AEAD ciphers exclusively, and kills idle sessions after 45 seconds. The kernel has IP forwarding enabled (because it’s a router), attack surface reduced (no unused protocols, no debugger), and memory hardened (secure malloc, encrypted swap, W^X enforcement). Three layers of thermal monitoring cover everything from “that’s getting warm” to “the software is dead, pull the plug.” File integrity baselines catch unexpected binary modifications in the daily security email. And sshguard watches the auth logs for anything that looks like a brute-force attempt.

None of this is exotic. That’s kind of the point. Every tool I’ve used either ships with OpenBSD or lives in the ports tree. There’s no agent software phoning home, no cloud dashboard, no subscription. Just a well-configured operating system doing what it was designed to do.

The APU3D2 draws about 6-12 watts. It sits on a shelf, it routes my traffic, it doesn’t overheat (any more), and it’s running cryptography that most enterprise firewalls won’t have for another two years. For a box that cost about 100 euros, I’m genuinely pleased with it.

Next up in Part 5: DNS, DHCP, and NTP, because a router that can’t resolve names is just an expensive space heater.

References

  1. OpenSSH 9.9 Release Notes, ML-KEM key exchange
  2. NIST FIPS 203, Module-Lattice-Based Key-Encapsulation Mechanism Standard
  3. OpenSSH 9.0 Release Notes, sntrup761x25519 default
  4. OpenBSD sshd_config(5) man page
  5. PC Engines APU3D2 board specifications
  6. OpenBSD km(4) driver, AMD temperature sensor
  7. OpenBSD sensorsd(8) man page
  8. OpenBSD mtree(8) man page
  9. OpenBSD security(8) man page
  10. SSHGuard, brute-force attack blocker
  11. OpenBSD sysctl(2) man page
  12. OpenBSD sensorsd.conf(5) man page