Wednesday, March 18, 2026

Storage, Ingress, and TLS for Production Services on Hetzner

By Vicente Arteaga Gomez

MisLinux · Last updated: May 5, 2026

This article is part 4 of my MisLinux series about running Kubernetes on Hetzner. The patterns here come from my own production experience and the tradeoffs I have made along the way. I am not affiliated with or sponsored by Hetzner.

After the cluster boots and the network is stable, the next challenge is making services usable. That means deciding how workloads are exposed, how certificates are handled, and how stateful applications keep their data safe.

[Figure: Public entry path for a small Hetzner Kubernetes service]

Ingress should reduce complexity, not add it

I like ingress because it creates one place to reason about public traffic. Instead of every service making its own exposure decisions, ingress gives the cluster a front door.

On a small Hetzner-based cluster, that front door needs to be boring and reliable. I prefer a setup that is easy to audit:

  • one clear ingress controller choice
  • explicit hostnames
  • TLS by default for public services
  • minimal special cases

When ingress becomes a collection of one-off exceptions, future debugging gets painful fast.
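
As a rough sketch of what "explicit hostname, TLS by default, minimal special cases" can look like, the shell function below prints one small Ingress manifest. Several things in it are assumptions rather than anything this article prescribes: the ingress class `nginx`, the cert-manager annotation, the issuer name `letsencrypt-prod`, and the `<service>-tls` secret naming are placeholders to swap for your own choices.

```shell
# Print a minimal Ingress manifest: one hostname, one backend, TLS enabled.
# Usage: ingress_manifest HOST SERVICE NAMESPACE [PORT]
ingress_manifest() {
  local host=$1 service=$2 namespace=$3 port=${4:-80}
  cat <<EOF
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ${service}
  namespace: ${namespace}
  annotations:
    # Assumption: cert-manager is installed and a ClusterIssuer with this name exists.
    cert-manager.io/cluster-issuer: letsencrypt-prod
spec:
  # Assumption: replace "nginx" with the class of the one controller you chose.
  ingressClassName: nginx
  tls:
    - hosts:
        - ${host}
      secretName: ${service}-tls
  rules:
    - host: ${host}
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ${service}
                port:
                  number: ${port}
EOF
}

# Review the output first, then apply it:
#   ingress_manifest app.example.com my-service my-namespace | kubectl apply -f -
```

Generating every public Ingress from one template like this is one way to keep special cases visible: anything that does not fit the template has to be written by hand and justified.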

The command trail I want to stay readable

When I expose a service, these are the first checks I want to be able to run without thinking too hard:

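kubectl get ingress -A                                # every ingress rule, with hosts and addresses
kubectl describe ingress my-service -n my-namespace   # backends, TLS settings, and recent events for one service
kubectl get certificate -A                            # certificates and whether they are ready
kubectl get challenge -A                              # pending ACME challenges, if any
kubectl get pvc -A                                    # volume claims and whether they are bound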

If those five commands already require tribal knowledge to interpret, the public-service path is too clever.

TLS automation is worth doing early

If a service is public, certificate automation should arrive early, not later. Manual certificate handling tends to be forgotten until a certificate expires, usually at the worst possible time.

The ideal state is simple:

  • public DNS points to the ingress entry point
  • certificates are issued automatically
  • renewals happen without emergency maintenance
  • failed renewals are visible before they become outages

For small teams, the real benefit is not elegance. It is reducing the number of recurring tasks that can wake somebody up unexpectedly.
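
The `certificate` and `challenge` resources in the command trail above are the kind cert-manager provides, so here is a minimal sketch of an issuer under that assumption. The choice of Let's Encrypt as the ACME server, the issuer name `letsencrypt-prod`, the HTTP-01 solver, and the default ingress class `nginx` are all illustrative assumptions, and the `ingressClassName` solver field assumes a reasonably recent cert-manager release.

```shell
# Print a cert-manager ClusterIssuer manifest for ACME HTTP-01 validation.
# Usage: cluster_issuer_manifest EMAIL [INGRESS_CLASS]
cluster_issuer_manifest() {
  local email=$1 class=${2:-nginx}
  cat <<EOF
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  # Assumption: the issuer name is a placeholder; Ingress annotations must match it.
  name: letsencrypt-prod
spec:
  acme:
    # Let's Encrypt production endpoint; use a staging endpoint while testing.
    server: https://acme-v02.api.letsencrypt.org/directory
    email: ${email}
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    solvers:
      - http01:
          ingress:
            ingressClassName: ${class}
EOF
}

# Review the output first, then apply it:
#   cluster_issuer_manifest ops@example.com | kubectl apply -f -
```

With a single cluster-wide issuer, certificate ownership stays in one place, which fits the goal of having renewals happen without emergency maintenance.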

A failure case I keep in mind here

One reason I standardize the public entry path early is that routing and certificate failures love ambiguity.

The messy version looks like this:

  • one hostname goes through ingress
  • another uses a direct node path
  • certificate ownership is split across different workflows
  • a stateful app stores data on a volume nobody has actually restored under pressure

When that kind of design fails, the outage investigation turns into "which path did this service use again?" That is exactly the question I do not want to answer in the middle of an incident.

Persistent data deserves extra caution

Stateless services are forgiving. Stateful services are not. When a workload depends on stored data, I want answers to these questions before calling it production-ready:

  • where does the data live?
  • how is it backed up?
  • how is it restored?
  • what happens if the node disappears?
  • what happens during upgrades or rescheduling?

A volume attached to a pod is not the same thing as a backup strategy. That distinction matters a lot.

My bias: standardize the public entry path early

One thing that helps a small cluster stay understandable is standardizing how public traffic enters the platform. I try to avoid a mixed world where one application uses a direct node port, another uses a bespoke reverse proxy, and a third uses ingress with different TLS behavior. That kind of drift turns every outage into detective work.

Even on a small Hetzner footprint, I would rather make these rules explicit early:

  • public HTTP and HTTPS go through the ingress layer
  • certificate ownership is obvious
  • the DNS-to-ingress path is documented
  • exceptions are rare and justified

Keep storage expectations honest

One of the easiest mistakes is to treat Kubernetes storage as if the platform automatically solves durability. It does not. Kubernetes can help schedule and mount storage, but it does not remove the need to understand failure domains, snapshot strategy, and restore procedures.

On Hetzner, I try to keep the storage story clear and modest. If a workload is stateful, its backup and recovery path should be documented in plain language. If that cannot be described clearly, the design is probably not ready.
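
One way to keep that plain-language recovery path attached to the data itself is to record a pointer to it on the volume claim. The sketch below prints a PersistentVolumeClaim manifest under two assumptions: the storage class name `hcloud-volumes` (the class typically created by the Hetzner CSI driver, so check `kubectl get storageclass` on your cluster) and a hypothetical annotation key `example.com/restore-note`, which is only a convention invented for this example and is not interpreted by Kubernetes.

```shell
# Print a PersistentVolumeClaim manifest that carries a pointer to its restore note.
# Usage: pvc_manifest NAME NAMESPACE RESTORE_NOTE [SIZE]
pvc_manifest() {
  local name=$1 namespace=$2 restore_note=$3 size=${4:-10Gi}
  cat <<EOF
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: ${name}
  namespace: ${namespace}
  annotations:
    # Hypothetical annotation key: a team convention, not a Kubernetes feature.
    example.com/restore-note: "${restore_note}"
spec:
  accessModes:
    - ReadWriteOnce
  # Assumption: storage class provided by the Hetzner CSI driver.
  storageClassName: hcloud-volumes
  resources:
    requests:
      storage: ${size}
EOF
}

# Review the output first, then apply it:
#   pvc_manifest pgdata my-namespace docs/restore/pgdata.md 20Gi | kubectl apply -f -
```

The annotation does not back anything up. It only makes it obvious, from the cluster itself, whether somebody has written down how this volume gets restored.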

Questions I ask before exposing a new service

Before I make a production service public, I want specific answers to a few operational questions:

  • if the backend restarts, what will users actually see?
  • if TLS renewal fails, how will I notice before visitors do?
  • if the service depends on storage, what is the restore path and who owns it?
  • if the application starts returning 502 or 504 errors, which logs and dashboards will explain it first?

These questions force me to think about the real service path, not just the YAML objects.
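
For the TLS renewal question in particular, a small expiry check is one possible early-warning signal. The sketch below assumes cert-manager, whose Certificate resources report an RFC 3339 `status.notAfter` timestamp, and GNU `date` for parsing it. The function names and the 21-day threshold in the usage comment are made up for illustration.

```shell
# days_left NOT_AFTER [NOW]
# Whole days from NOW (default: the current time) until NOT_AFTER.
# Both arguments are timestamps such as 2026-06-01T00:00:00Z. Requires GNU date.
days_left() {
  local not_after=$1 now=${2:-} end_s now_s
  end_s=$(date -u -d "$not_after" +%s) || return 1
  if [ -n "$now" ]; then
    now_s=$(date -u -d "$now" +%s) || return 1
  else
    now_s=$(date -u +%s)
  fi
  echo $(( (end_s - now_s) / 86400 ))
}

# expiring_certs THRESHOLD_DAYS [NOW]
# Reads "namespace name notAfter" lines on stdin and prints the certificates
# with fewer than THRESHOLD_DAYS left, or with no expiry date recorded at all.
expiring_certs() {
  local threshold=$1 now=${2:-} ns name not_after left
  while read -r ns name not_after; do
    [ -n "$ns" ] || continue
    if [ -z "$not_after" ]; then
      echo "$ns/$name: no notAfter recorded"
      continue
    fi
    left=$(days_left "$not_after" "$now") || continue
    if [ "$left" -lt "$threshold" ]; then
      echo "$ns/$name: $left days left"
    fi
  done
  return 0
}

# Possible usage against a live cluster:
#   kubectl get certificate -A -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{" "}{.status.notAfter}{"\n"}{end}' \
#     | expiring_certs 21
```

Run from a scheduled job that alerts on any output, a check like this turns a silent renewal failure into something visible while there is still time to react.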

Public exposure checklist

Before I expose a production service, I like to verify:

  • the service has a clear hostname
  • TLS is configured and tested
  • readiness and liveness probes reflect reality
  • ingress rules are minimal and readable
  • logs make routing failures visible
  • backend pods behave correctly when traffic spikes or reloads happen
  • the service can survive a pod restart without manual intervention

This is less glamorous than discussing architecture diagrams, but it is the difference between a demo and a reliable public service.

The operational mindset

Ingress, TLS, and storage are where infrastructure starts becoming user-facing. Once real users depend on the service, small configuration mistakes become customer-visible problems.

That is why I prefer the simplest architecture that still gives me:

  • secure public access
  • predictable certificate management
  • explicit volume ownership
  • documented failure recovery

The value is not only uptime. It is confidence. When you know exactly how traffic reaches the service and exactly how data is protected, production becomes far easier to operate.

What I'd do differently now

If I were setting up the same cluster again, I would write the public-service contract earlier and more bluntly:

  1. every public hostname enters through ingress unless there is a written exception
  2. every TLS path has one obvious owner
  3. every stateful workload gets a restore note before it gets called "production"

That sounds strict, but it removes a surprising amount of future debugging noise.
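
Rule 3 is the easiest one to let slip, so here is a sketch of how it could be checked mechanically. It assumes a team convention in which each PersistentVolumeClaim carries a hypothetical `example.com/restore-note` annotation pointing at its restore documentation. Kubernetes does not define that annotation; both the key and the function name are invented for this example.

```shell
# missing_restore_notes
# Reads "namespace name note" lines on stdin and prints "namespace/name"
# for every volume claim whose note field is empty.
missing_restore_notes() {
  local ns name note
  while read -r ns name note; do
    [ -n "$ns" ] || continue
    if [ -z "$note" ]; then
      echo "$ns/$name"
    fi
  done
  return 0
}

# Possible usage against a live cluster (dots in the annotation key are escaped):
#   kubectl get pvc -A -o jsonpath='{range .items[*]}{.metadata.namespace}{" "}{.metadata.name}{" "}{.metadata.annotations.example\.com/restore-note}{"\n"}{end}' \
#     | missing_restore_notes
```

An empty result means every claim at least points at a restore note. Whether that note actually works is something only a real restore test can confirm.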

Series note

This is part 4 of the series, and it is where the cluster stops being only an internal system and starts behaving like a public platform.

In the next article, I will cover day-2 operations: backups, monitoring, upgrades, and the routines that keep a Hetzner cluster healthy over time. That is where these design choices either pay off or create recurring pain.