By Vicente Arteaga Gomez
MisLinux · Last updated: May 5, 2026
This is part of my MisLinux series on Kubernetes on Hetzner and operations. It is based on real operational monitoring design work, not a generic observability checklist.
One of the easiest ways to build a misleading dashboard is to choose only one monitoring scope.
If I watch only the global service, I can miss a saturated origin.
If I watch only the individual host, I can page on a healthy rebalance.
If I watch only the provider slice, I can miss a cross-provider traffic cliff.
That is why I now want three views at the same time:
| View | Main question |
|---|---|
| Host | Is one concrete machine or origin sick, stale, or overloaded? |
| Hosting | Is one provider/location carrying the wrong share or drifting operationally? |
| Global | Is the public service really up, healthy, and serving normal demand overall? |
What each level catches
Host view
The host view is where I want:
- latency
- CPU or saturation signals
- config freshness
- health endpoint state
- reload or restart anomalies
This is how I catch "one box is bad" before it becomes "the system feels random."
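The host-level checks above can be sketched as one small function. This is a minimal illustration, not a production check: the `HostSample` fields and every threshold are hypothetical placeholders that would need tuning for a real origin.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostSample:
    """One scrape of a single host. All field names are hypothetical."""
    host: str
    healthy: bool             # health endpoint state
    p95_latency_ms: float
    cpu_utilization: float    # 0.0 to 1.0
    config_age_s: float       # seconds since the host last applied config
    reloads_last_hour: int


def host_findings(
    s: HostSample,
    max_latency_ms: float = 500.0,    # assumed threshold
    max_cpu: float = 0.85,            # assumed threshold
    max_config_age_s: float = 900.0,  # assumed threshold
    max_reloads: int = 3,             # assumed threshold
) -> list[str]:
    """Return the host-scope problems found in one sample."""
    findings = []
    if not s.healthy:
        findings.append("health endpoint failing")
    if s.p95_latency_ms > max_latency_ms:
        findings.append("latency high")
    if s.cpu_utilization > max_cpu:
        findings.append("cpu saturated")
    if s.config_age_s > max_config_age_s:
        findings.append("config stale")
    if s.reloads_last_hour > max_reloads:
        findings.append("reload anomaly")
    return findings
```

Every finding here stays at host scope. Whether it matters for the provider or for the public service is a separate question for the other two views.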
Hosting view
The hosting view groups hosts by provider or operational domain.
That is where I can see:
- traffic imbalance
- one provider falling behind on config freshness
- one provider serving the wrong share of requests
- one provider drifting in latency
This matters even if there is only one host in that hosting today, because the grouping is an operational concept, not just a host count.
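As a rough sketch of that grouping, the function below rolls per-host samples up into one summary per hosting. The dictionary keys (`hosting`, `requests`, `config_age_s`) are assumptions made for the example, not a real schema.

```python
from collections import defaultdict


def hosting_rollup(hosts):
    """Roll per-host samples up to one summary per hosting group.

    `hosts` is a list of dicts with the hypothetical keys
    "hosting", "requests", and "config_age_s".
    """
    groups = defaultdict(list)
    for h in hosts:
        groups[h["hosting"]].append(h)

    total = sum(h["requests"] for h in hosts)
    summary = {}
    for hosting, members in groups.items():
        requests = sum(m["requests"] for m in members)
        ages = [m["config_age_s"] for m in members]
        summary[hosting] = {
            "host_count": len(members),
            "requests": requests,
            # Share of all requests carried by this hosting.
            "share": requests / total if total else 0.0,
            # Gap between the freshest and the stalest host in the group.
            "freshness_spread_s": max(ages) - min(ages),
        }
    return summary
```

A hosting with a single host still gets its own entry, with a freshness spread of zero, so the grouping keeps working when a second host is added later.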
Global view
The global view answers the question that users care about:
> Is the public service healthy overall?
That is the view where I want:
- total public traffic
- total error rate
- canary health across all active public origins
- global per-UUID or per-surface demand cliffs
Without this layer, it is too easy to confuse a redistribution event with a real outage.
The architecture I prefer
The architecture is three stacked scopes: hosts roll up into hostings, and hostings roll up into the global view. The point is that each level aggregates differently. They are not interchangeable charts with different legends.
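Error rate is a convenient way to show why the aggregation differs by scope. In the sketch below, the host rate is a per-machine ratio, while the hosting and global rates are ratios of summed counters rather than averages of the host rates. The input keys (`host`, `hosting`, `requests`, `errors`) are hypothetical.

```python
from collections import defaultdict


def error_rate(errors, requests):
    """Errors divided by requests, treating an idle series as 0.0."""
    return errors / requests if requests else 0.0


def scoped_error_rates(hosts):
    """Compute the error rate at host, hosting, and global scope.

    `hosts` is a list of dicts with the hypothetical keys
    "host", "hosting", "requests", and "errors".
    """
    per_host = {h["host"]: error_rate(h["errors"], h["requests"]) for h in hosts}

    sums = defaultdict(lambda: [0, 0])  # hosting -> [errors, requests]
    for h in hosts:
        sums[h["hosting"]][0] += h["errors"]
        sums[h["hosting"]][1] += h["requests"]
    per_hosting = {name: error_rate(e, r) for name, (e, r) in sums.items()}

    # Global is a ratio of sums, not an average of the rates above.
    total_errors = sum(h["errors"] for h in hosts)
    total_requests = sum(h["requests"] for h in hosts)
    return {
        "host": per_host,
        "hosting": per_hosting,
        "global": error_rate(total_errors, total_requests),
    }
```

With one busy host at 10 errors in 1000 requests and one nearly idle host at 5 errors in 10 requests, averaging the two host rates gives 25.5%, while the global rate is 15 errors in 1010 requests, roughly 1.5%. The same data tells two different stories depending on how it is rolled up.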
A practical example
Imagine public traffic is split across two providers:
- provider A carries less traffic than usual
- provider B carries more traffic than usual
- the total public request rate is stable
If I alert only on provider A, I page on a healthy balance shift.
If I alert only on global traffic, I miss the provider imbalance.
If I alert only on per-host CPU, I may not notice that one provider is now becoming the hidden bottleneck.
The right answer is to keep all three views and interpret them together.
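One way to interpret the views together is to look at the global total first and only then at the per-hosting movement. The sketch below does that for the two-provider example. The 10% `tolerance` band and the returned labels are assumptions chosen for illustration.

```python
def interpret_shift(baseline, current, tolerance=0.10):
    """Label a traffic change using hosting and global scope together.

    `baseline` and `current` map hosting name -> requests per second.
    `tolerance` is an assumed relative band for normal variation.
    Returns a (label, moved_hostings) tuple.
    """
    def changed(before, after):
        if before == 0:
            return after != 0
        return abs(after - before) / before > tolerance

    global_moved = changed(sum(baseline.values()), sum(current.values()))
    moved = sorted(
        name for name in baseline
        if changed(baseline[name], current.get(name, 0.0))
    )

    if global_moved:
        return "global change: check for an outage or a demand cliff", moved
    if moved:
        return "redistribution: global is stable, watch the busier hosting", moved
    return "steady", moved
```

For example, `interpret_shift({"provider-a": 500, "provider-b": 500}, {"provider-a": 300, "provider-b": 700})` reports a redistribution, because the total is unchanged while both providers moved. The host view is still needed afterwards to confirm that provider B's machines can absorb the extra load.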
Metrics I now want at each level
| Scope | Metrics I care about first |
|---|---|
| Host | health, latency, CPU, config age, recent reload status |
| Hosting | total requests, mean latency, error rate, freshness spread, host count |
| Global | total public requests, total errors, demand cliffs, public canary result |
That table looks obvious after the fact, but it took me longer than it should have to stop flattening these scopes into one graph.
Failure case: the wrong dashboard story
The wrong dashboard story sounds like this:
> "Traffic is down on one origin, so the public service is down."
That sentence mixes scopes. It takes a host observation and turns it into a global conclusion.
Once I noticed that pattern, a lot of monitoring noise suddenly made more sense.
Command trail I want during validation
```bash
# Host-level health
curl -s https://origin-a.example.invalid/health

# Hosting-level request split
python3 inspect-request-split.py --group-by hosting

# Global service trend
python3 inspect-request-split.py --group-by global
```
The exact implementation can vary. The important part is that the queries do not collapse distinct operational questions into one series.
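To make the `--group-by` idea concrete, here is a hypothetical sketch of how such a switch could work. It is not the `inspect-request-split.py` script itself: the in-memory `SAMPLES`, the extra `host` choice, and the tab-separated output are all assumptions, and a real version would query a metrics store instead.

```python
import argparse
from collections import Counter

# Hypothetical in-memory samples; a real script would query a metrics store.
SAMPLES = [
    {"host": "origin-a", "hosting": "provider-a", "requests": 300},
    {"host": "origin-b", "hosting": "provider-b", "requests": 700},
]


def request_split(samples, group_by):
    """Sum requests per host, per hosting, or into one global series."""
    totals = Counter()
    for s in samples:
        key = "global" if group_by == "global" else s[group_by]
        totals[key] += s["requests"]
    return dict(totals)


def main(argv=None):
    parser = argparse.ArgumentParser(description="Request split by scope (sketch)")
    parser.add_argument(
        "--group-by",
        choices=["host", "hosting", "global"],
        default="global",
    )
    args = parser.parse_args(argv)
    for key, total in sorted(request_split(SAMPLES, args.group_by).items()):
        print(f"{key}\t{total}")


if __name__ == "__main__":
    main()
```

The scope is an explicit argument, so each invocation answers exactly one operational question and never mixes series from different levels.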
What I'd do differently now
I used to think multiple scopes were a nice refinement for later. I now think they are the minimum design for any public service that can move traffic across origins or providers. If I were rebuilding the monitoring from scratch, I would model host, hosting, and global identity from day one.
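A minimal sketch of what modeling the three identities could look like: every sample carries all three labels, and the series key is derived from the scope being asked about. The class name, field names, and scope strings are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OriginIdentity:
    """Scope labels attached to every sample. Names are hypothetical."""
    service: str  # global scope: the public service
    hosting: str  # hosting scope: provider or location
    host: str     # host scope: one concrete machine or origin

    def key(self, scope: str) -> tuple[str, ...]:
        """Return the series key for this origin at the given scope."""
        if scope == "global":
            return (self.service,)
        if scope == "hosting":
            return (self.service, self.hosting)
        if scope == "host":
            return (self.service, self.hosting, self.host)
        raise ValueError(f"unknown scope: {scope}")
```

Because the labels exist from the first sample, adding a second provider or a second host later does not require relabeling history. It only adds new keys at the hosting or host scope.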