
InfluxDB 3 Enterprise: Observability in the Homelab

First things first: a massive thank you to InfluxData for listening to the community and offering a proper at-home hobbyist license for InfluxDB 3 Enterprise. Yes, you read that right. Enterprise-grade time-series database. In my basement. For free. Because enough homelab enthusiasts asked for it and InfluxData actually listened. This blog post exists because of their generosity, and I am eternally grateful.

The Dream: One Database to Rule Them All

Before this revamp, my observability was… scattered. Metrics here, logs there, a Grafana instance pointing at something that may or may not still exist. I wanted a single source of truth. One place where I could dump metrics, logs, traces, and Kubernetes events, then query them with actual SQL instead of whatever ancient dialect Prometheus speaks.

Enter InfluxDB 3 and the influxdb-observability ecosystem. The pitch is simple: collect everything, store it as time-series data, query it with SQL. Metrics? Time series. Logs? Time series. Traces? Also time series. It’s time series all the way down.

The Pipeline

Data flows into InfluxDB from three main paths, each more over-engineered than the last.

Path 1: Telegraf on Every Host

Good old Telegraf is deployed on every machine: the Proxmox hypervisor, every VM, every LXC. It collects system metrics (CPU, memory, disk, network, processes), receives syslog over TCP, and on the Proxmox host it even pulls VM and container statistics directly from the Proxmox API.

All of it gets gzipped and shipped to InfluxDB 3 via the native v3 write API. One token, one endpoint, one database. Beautiful.
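For the curious, the per-host setup boils down to something like the Telegraf config below. The plugin names are real Telegraf plugins; the URL, port, and token are placeholders for my setup, so adjust for yours. InfluxDB 3 also accepts the v2-compatible write endpoint, which is what Telegraf's influxdb_v2 output speaks:

```toml
# System metrics on every host
[[inputs.cpu]]
[[inputs.mem]]
[[inputs.disk]]
[[inputs.net]]
[[inputs.processes]]

# Receive syslog over TCP
[[inputs.syslog]]
  server = "tcp://:6514"

# Ship everything, gzipped, to InfluxDB 3
[[outputs.influxdb_v2]]
  urls = ["http://influxdb.lan:8181"]
  token = "$INFLUX_TOKEN"
  organization = ""            # ignored by InfluxDB 3, can stay empty
  bucket = "homelab"
  content_encoding = "gzip"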

Path 2: OpenTelemetry Collector on Kubernetes

This is where it gets spicy. The K3s cluster runs an OpenTelemetry Collector as a DaemonSet on every node. It collects:

- Pod and node metrics via the kubeletstats receiver
- Container logs via the filelog receiver
- Kubernetes events via the k8sobjects receiver
- Application traces via OTLP

The Collector doesn’t write directly to InfluxDB. Instead, it exports OTLP to Telegraf running on the same node via port 14317. Why the extra hop? Because Telegraf’s inputs.opentelemetry plugin with the prometheus-v2 schema handles the translation beautifully, and I already had Telegraf everywhere. Why add another output plugin when you can add another hop?

Path 3: Direct Application Writes

Some applications write directly. The GitHub runner reports build metrics. Custom scripts push data. If it moves, I instrument it. If it doesn’t move, I instrument it anyway.

The Journey: Tried Them All

I’ve been around the observability block. Over the years I’ve spun up ClickHouse, InfluxDB OSS, Grafana Loki, Grafana Tempo, Prometheus, VictoriaMetrics, Elasticsearch, and Jaeger — sometimes all at once in a desperate attempt to find “the one.” They all have strengths. ClickHouse is a beast for analytics. Loki does logs cheaply. Prometheus is the metric standard. Elasticsearch… exists.

But I always came back to InfluxData. The performance of InfluxDB for time-series workloads is genuinely hard to beat. The write throughput, the compression, the query speed — it just feels right. We even run it at work. So when InfluxDB 3 dropped with Apache Arrow under the hood and proper SQL support, the decision practically made itself.

Why InfluxDB 3?

I could have used the open-source InfluxDB 3 Core. It's free, it's fast, it handles time-series data like a champ. But InfluxDB 3 Enterprise has features that matter for a homelab that pretends to be production:

- No recent-data query window: Core limits queries to roughly the last 72 hours, while Enterprise queries full history
- High availability with multi-node read replicas
- Enhanced compaction for long-term storage

The at-home hobbyist license from InfluxData makes this possible. Without it, I’d be on Core, which is still excellent, but Enterprise lets me pretend I’m running a data center. Which, graphically speaking, I basically am.

Grafana: The Pretty Pictures

All this data is useless without visualization. Grafana connects to InfluxDB 3 using the native SQL datasource and pulls from the homelab database. Dashboards live in git, synced to Grafana automatically. If I break a dashboard, I revert the JSON. If I want a new panel, I write SQL, commit it, and it appears.
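A typical panel query looks like plain SQL. This one assumes Telegraf's default cpu measurement and column names (usage_idle, the cpu-total tag); swap in your own schema:

```sql
-- CPU busy % per host over the last hour
SELECT
  time,
  host,
  100 - usage_idle AS cpu_busy_pct
FROM cpu
WHERE cpu = 'cpu-total'
  AND time >= now() - INTERVAL '1 hour'
ORDER BY time;
```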

The dashboards show everything: Proxmox host health, VM resource usage, Kubernetes pod metrics, container log volume, syslog error rates, network throughput, disk I/O. It’s the mission control center I always wanted and absolutely don’t need for three VMs in a mini-PC. But the graphs are so pretty.

What I Actually Collect

| Data Type | Source | How |
| --- | --- | --- |
| System metrics | All hosts | Telegraf cpu, mem, disk, net, processes |
| Proxmox stats | Proxmox host | Telegraf proxmox input |
| Syslog | All hosts | Telegraf syslog receiver |
| K8s pod metrics | K3s nodes | OTel Collector kubeletstats -> Telegraf OTLP |
| K8s events | K3s cluster | OTel Collector k8sobjects -> Telegraf OTLP |
| Container logs | K3s nodes | OTel Collector filelog -> Telegraf OTLP |
| Traces | Apps | OTLP -> OTel Collector -> Telegraf OTLP |

Everything lands in InfluxDB 3 with proper tags and timestamps. I can correlate a CPU spike on a worker node with the exact pod that caused it, the logs it emitted, and the Kubernetes event that spawned it. All in one query.
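That correlation claim, sketched as one SQL query. The measurement and column names here are illustrative rather than my exact schema, and date_bin is used to bucket both streams into the same minute:

```sql
-- Minutes where a node ran hot, with the log volume from that node
-- in the same minute
SELECT
  date_bin(INTERVAL '1 minute', c.time) AS minute,
  c.host,
  max(c.usage_user) AS peak_cpu,
  count(l.body) AS log_lines
FROM cpu AS c
JOIN logs AS l
  ON l.host = c.host
 AND date_bin(INTERVAL '1 minute', l.time) = date_bin(INTERVAL '1 minute', c.time)
GROUP BY 1, 2
HAVING max(c.usage_user) > 90
ORDER BY minute;
```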

Is It Overkill?

Absolutely. For a homelab with three Kubernetes nodes and a handful of LXCs, this is hilariously over-engineered. I could have run docker stats and called it a day. But where’s the fun in that?

InfluxDB 3 Enterprise, OpenTelemetry, Telegraf, and Grafana turn a modest mini-PC into something that feels like real infrastructure. And thanks to InfluxData’s at-home hobbyist license, it didn’t cost me a dime beyond the electricity bill.

If you’re running a homelab and haven’t tried InfluxDB 3 yet, go get a license. It’s worth it. Your graphs will thank you. Your wallet will thank you. And at 2 AM when something breaks, you’ll have the data to figure out exactly what went wrong.

🔗 influxdb-observability on GitHub
🔗 InfluxDB 3 Enterprise Licensing