Free Datadog for the fleet in one Nix module
Netdata + SigNoz on Unraid, host telemetry from every NixOS box. A Sunday morning coffee project; Nix and comin made it absurdly easy.
There’s a pattern in this blog: something that should eat a weekend turns out to eat an afternoon, because Nix and comin do most of the work. This one’s another.
I’d been meaning to put SigNoz on the fleet for months. Fleetwide host telemetry felt like the kind of project that needed a full weekend. Sunday morning, with coffee and Claude, I gave it a shot.
Two pieces. Netdata for the realtime side: “carbon’s CPU is pegged right now, what process?” Its per-process accounting is excellent, and its 18-model ML detector flags anomalies automatically. SigNoz for the historical side: “search journald across every host for that error from Tuesday.” ClickHouse-backed log search, structured fields, OTLP ingest. Both self-hostable. Both with sane Docker compositions.
The stack
Both run as Compose Manager projects on my always-on Unraid box, racer5. The compose files come straight from upstream with two tweaks: ports bind to my Tailscale IP only (no LAN exposure), and bind-mounts point at /mnt/user/appdata/... because Unraid’s /boot is a vfat partition that refuses real Unix permissions. ClickHouse, which runs as UID 101 inside its container, tried to read its config off the vfat mount and got Access to file denied. The fix was moving the configs onto the actual storage array.
The compose configs are checked in alongside everything else in /boot/config/plugins/compose.manager/projects/. Once they’re up, racer5 has a Netdata Parent on :19999 and a SigNoz on :3301, both reachable only over Tailscale.
The Nix module
The host side is one file, modules/nixos/observability.nix, that does the whole thing:
config = lib.mkIf cfg.enable {
  sops.secrets.netdata_stream_api_key = {};
  sops.templates."netdata-stream.conf" = {
    owner = "netdata";
    content = ''
      [stream]
        enabled = yes
        destination = ${cfg.netdataParent}
        api key = ${config.sops.placeholder.netdata_stream_api_key}
    '';
  };
  services.netdata = {
    enable = true;
    configDir."stream.conf" = config.sops.templates."netdata-stream.conf".path;
  };
  environment.etc."otel-collector/config.yaml".text = ''
    receivers:
      hostmetrics: {...}
      journald: {directory: /var/log/journal, all: true}
    exporters:
      otlphttp/signoz:
        endpoint: ${cfg.signozEndpoint}
    ...
  '';
  systemd.services.otel-collector = {
    serviceConfig = {
      ExecStart = "${pkgs.opentelemetry-collector-contrib}/bin/otelcol-contrib --config=/etc/otel-collector/config.yaml";
      # ...sandboxed per my hardening rule
    };
  };
};
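The excerpt uses cfg.enable, cfg.netdataParent, and cfg.signozEndpoint without showing where they are declared. A minimal sketch of how such options could be declared, assuming a hypothetical services.observability namespace (the real option path isn't shown in the post) and placeholder hostnames:

```nix
{ config, lib, ... }:
let
  # Hypothetical option namespace; the post only ever shows `cfg`.
  cfg = config.services.observability;
in
{
  options.services.observability = {
    enable = lib.mkEnableOption "Netdata streaming and OTel log/metric shipping";

    netdataParent = lib.mkOption {
      type = lib.types.str;
      example = "parent.example.ts.net:19999"; # placeholder host:port
      description = "host:port of the Netdata Parent that children stream to.";
    };

    signozEndpoint = lib.mkOption {
      type = lib.types.str;
      example = "http://parent.example.ts.net:4318"; # placeholder; 4318 is the usual OTLP/HTTP port
      description = "OTLP/HTTP endpoint the collector exports to.";
    };
  };

  config = lib.mkIf cfg.enable {
    # The sops, Netdata, and otel-collector settings from the excerpt go here.
  };
}
```

Declaring the endpoints as options rather than hard-coding them keeps the module reusable if the Parent ever moves to another box.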
One import in modules/nixos/common.nix, and every NixOS laptop and workstation in the flake suddenly streams metrics to the Parent and ships journald to SigNoz. No per-host setup, no installer to run, no agent to babysit. The two appliance VMs (hermes and herqules, my agent gateways) are lean one-off nixosSystem entries that skip common.nix, so I added the same import directly to their module lists in flake.nix. Three lines total to bring them in.
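For the appliance VMs, the wiring might look roughly like the flake below. This is a sketch under assumptions: the host configuration path, the system string, the nixpkgs branch, and the services.observability option names are all hypothetical stand-ins, and the sops-nix module the observability module relies on is only noted in a comment.

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { nixpkgs, ... }: {
    nixosConfigurations.hermes = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux"; # assumed architecture
      modules = [
        ./hosts/hermes/configuration.nix # hypothetical path to the lean host config
        ./modules/nixos/observability.nix # the same module common.nix imports
        # The module uses sops.* options, so the sops-nix NixOS module
        # must also be in this list (omitted here).
        {
          # Hypothetical option names and placeholder endpoints.
          services.observability = {
            enable = true;
            netdataParent = "parent.example.ts.net:19999";
            signozEndpoint = "http://parent.example.ts.net:4318";
          };
        }
      ];
    };
  };
}
```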
The fan-out
I push the commit. Every host running comin, my GitOps deploy daemon polling main every 60 seconds, pulls it, rebuilds itself, and starts streaming. From git push to gram, hermes, and herqules all appearing in the Netdata Parent’s host picker took about four minutes, most of which was waiting for comin to poll. Nix and comin made this absurdly easy.
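For readers who haven't met comin, the per-host side is small. A minimal sketch, assuming the comin NixOS module is already imported from its flake and using a placeholder repository URL; the polling interval is left at the module's default rather than set explicitly:

```nix
{ ... }:
{
  # Assumes comin's NixOS module (comin.nixosModules.comin) is imported elsewhere.
  services.comin = {
    enable = true;
    remotes = [
      {
        name = "origin";
        url = "https://example.com/me/nix-config.git"; # placeholder repository
        branches.main.name = "main"; # the branch each host follows
      }
    ];
  };
}
```

Each host then fetches that branch on its own schedule and rebuilds the nixosConfigurations entry matching its hostname, so nothing ever has to push to the machines.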
Making it laptop-friendly
Out of the box, Netdata samples everything at one second and burns about 14% of one core. On the desktop that’s fine. On a laptop already short on battery it’s a problem. Two changes brought it down to under 5%:
- Bump update every from 1 to 5 seconds. Single biggest lever. Applies to the daemon and most plugins. The per-host page still feels live because every metric is fresh within 5 seconds, and the “which process is eating CPU right now” answer is unchanged.
- Drop plugins that earn nothing on a workstation. debugfs reads ZSWAP/BTRFS/intel-rapl out of /sys/kernel/debug; I run ZRAM, have no BTRFS on the laptops, and hostmetrics already covers rapl. go.d ships postgres/redis/nginx/k8s collectors, none of which any laptop runs. systemd-journal is the Netdata UI’s journal tail, fully redundant once SigNoz has the full journald firehose. otel-signal-viewer is for using Netdata as an OTel sink, which I don’t. freeipmi is for server BMCs, and on a laptop it doesn’t just sit idle, it spams internal error into the system journal.
I kept apps.plugin, the per-process accounting, because it’s the killer feature, the reason I’d reach for Netdata over an alternative in the first place. The disables and the polling bump are all in the same services.netdata.config block, so it’s one commit. Comin propagates it everywhere.
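A sketch of what that block could look like. The plugin keys are taken from the names used in this post, and the section holding update every is an assumption (recent Netdata releases read it from [db], older ones from [global]), so both are worth checking against the netdata.conf your version generates:

```nix
{ ... }:
{
  services.netdata.config = {
    # Assumed section: newer Netdata uses [db], older releases used [global].
    db."update every" = 5;

    plugins = {
      # Keep per-process accounting (apps.plugin).
      "apps" = "yes";
      # Keys below mirror the plugin names mentioned in the post;
      # verify the exact spelling on your Netdata version.
      "debugfs" = "no";
      "go.d" = "no";
      "systemd-journal" = "no";
      "otel-signal-viewer" = "no";
      "freeipmi" = "no";
    };
  };
}
```

The NixOS module renders this attribute set into netdata.conf as INI sections, so the values are written as the same yes/no strings Netdata expects.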
Why Nix did most of the work
The bit that always lands twice in my own posts about Nix is that the same module is what wires up a workstation, a Proxmox VM, and an LG Gram. There’s no “configuration management” beyond writing the file and importing it. The whole fleet converges by polling and rebuilding.
The other piece is that the observability module is itself a small file alongside a much larger system config. The hardening section sits next to the Netdata block. The sops secret is declared in three lines. The dedicated user, the systemd unit, the sandboxing, all in the same file. When I want to know what observability does to a host, there’s exactly one place to look.
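The dedicated user and sandboxing aren't spelled out in the post, so the following is only a sketch of what that part of such a module might contain. The user name and the particular set of systemd hardening directives are assumptions, not the actual hardening rule:

```nix
{ pkgs, ... }:
{
  # Hypothetical dedicated user; systemd-journal membership lets it read the journal.
  users.groups.otel-collector = { };
  users.users.otel-collector = {
    isSystemUser = true;
    group = "otel-collector";
    extraGroups = [ "systemd-journal" ];
  };

  systemd.services.otel-collector = {
    wantedBy = [ "multi-user.target" ];
    wants = [ "network-online.target" ];
    after = [ "network-online.target" ];
    # The journald receiver shells out to journalctl, so it must be on PATH.
    path = [ pkgs.systemd ];
    serviceConfig = {
      User = "otel-collector";
      Group = "otel-collector";
      Restart = "on-failure";
      # Assumed sandboxing set; tighten or loosen to taste.
      NoNewPrivileges = true;
      ProtectSystem = "strict";
      ProtectHome = true;
      PrivateTmp = true;
      PrivateDevices = true;
      ProtectKernelTunables = true;
      ProtectControlGroups = true;
      RestrictSUIDSGID = true;
      LockPersonality = true;
    };
  };
}
```

These settings merge with the ExecStart definition from the module, since NixOS combines every declaration of systemd.services.otel-collector. Stricter options can hide things from hostmetrics (ProtectHome, for instance, makes /home inaccessible), so it's worth confirming the metrics you care about still arrive.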
The compose files were written by SigNoz and Netdata. The streaming protocol was written by Netdata. The hostmetrics receiver was written by the OpenTelemetry project. The fan-out was written by comin. All I really did was the small glue module. The Nix part is what makes the glue stick to every machine I own at the same time, over a single cup of coffee.