
Remote Staging and Production

Every tutorial so far has run on a local k0s cluster. This one moves to a remote server and sets up the two remaining environments — staging and production — on a single cluster. Namespace isolation, network policies, and resource quotas keep them apart. Flux CD reconciles both from the same Git repository.

By the end you will have a complete promotion pipeline: develop locally, verify in local staging, push to remote staging, soak for a week, then promote to production.

Staging and production share one cluster but live in separate namespaces. Each namespace gets its own network policies and resource quotas, so a runaway staging workload cannot starve production and staging traffic cannot reach production pods.

Why one cluster instead of two? Simpler operations, lower cost, and easier shared infrastructure. Traefik, cert-manager, and monitoring run once and serve both namespaces. Split into separate clusters when compliance demands physical isolation (PCI DSS, HIPAA) or a client contract mandates dedicated infrastructure.

Create a namespace for each environment:

apiVersion: v1
kind: Namespace
metadata:
  name: staging
  labels:
    environment: staging
---
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    environment: production

The environment label makes it easy to target each namespace in network policies, monitoring dashboards, and RBAC rules.
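As an illustration, a network policy rule can admit traffic from every namespace carrying the label, rather than naming each namespace individually (a fragment only; the policies later in this tutorial select the traefik namespace by name instead):

```yaml
# Hypothetical "from" clause: allow traffic from any namespace
# labeled environment: staging, whatever its name.
- from:
    - namespaceSelector:
        matchLabels:
          environment: staging
```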

Start with a default deny on ingress in both namespaces, then open a hole for Traefik:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: production
spec:
  podSelector: {}
  policyTypes:
    - Ingress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-traefik-ingress
  namespace: production
spec:
  podSelector: {}
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: traefik
  policyTypes:
    - Ingress

Apply the same pair of policies to the staging namespace. With these in place, a pod in staging cannot reach a service in production — the default deny blocks it, and no rule opens cross-namespace traffic.
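For completeness, the staging pair is the same manifests with the namespace changed:

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: staging
spec:
  podSelector: {}
  policyTypes:
    - Ingress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-traefik-ingress
  namespace: staging
spec:
  podSelector: {}
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: traefik
  policyTypes:
    - Ingress
```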

Quotas prevent staging from consuming resources production needs. Set conservative limits for staging and larger ones for production:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: staging-quota
  namespace: staging
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi
    pods: "20"
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 2Gi
    limits.cpu: "4"
    limits.memory: 4Gi
    pods: "50"

If a staging deployment tries to exceed its quota, the API server rejects the new pod at admission time. Production keeps running.
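One caveat: once a ResourceQuota covers CPU and memory, every pod in the namespace must declare requests and limits, or admission rejects it. A LimitRange can supply defaults for containers that omit them (a sketch — the default values here are illustrative, not from this tutorial):

```yaml
apiVersion: v1
kind: LimitRange
metadata:
  name: staging-defaults
  namespace: staging
spec:
  limits:
    - type: Container
      default:          # used as limits when a container declares none
        cpu: 200m
        memory: 128Mi
      defaultRequest:   # used as requests when a container declares none
        cpu: 100m
        memory: 64Mi
```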

Flux reconciles both environments from the same repository. The directory structure separates base manifests from per-environment overrides:

apps/
  base/
    my-app/
      deployment.yaml
      service.yaml
      ingressroute.yaml
      kustomization.yaml
  overlays/
    staging/
      kustomization.yaml
    production/
      kustomization.yaml

Base manifests define the application once. Overlays patch what differs between environments: namespace, replica count, hostname, and image tag.
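The base kustomization.yaml simply lists the manifests. It sets no namespace, so each overlay can supply its own (a sketch matching the directory layout above):

```yaml
# apps/base/my-app/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
  - ingressroute.yaml
```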

The staging overlay targets the staging namespace, runs a single replica, and routes traffic through a staging hostname:

apps/overlays/staging/kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: staging
resources:
  - ../../base/my-app
patches:
  - target:
      kind: Deployment
      name: my-app
    patch: |
      - op: replace
        path: /spec/replicas
        value: 1
  - target:
      kind: IngressRoute
      name: my-app
    patch: |
      - op: replace
        path: /spec/routes/0/match
        value: "Host(`staging.example.com`)"

The production overlay follows the same pattern with three replicas and the production hostname:

apps/overlays/production/kustomization.yaml

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
namespace: production
resources:
  - ../../base/my-app
patches:
  - target:
      kind: Deployment
      name: my-app
    patch: |
      - op: replace
        path: /spec/replicas
        value: 3
  - target:
      kind: IngressRoute
      name: my-app
    patch: |
      - op: replace
        path: /spec/routes/0/match
        value: "Host(`app.example.com`)"

Flux needs a Kustomization resource for each environment. These live in clusters/remote/:

clusters/remote/apps-staging.yaml

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps-staging
  namespace: flux-system
spec:
  interval: 5m0s
  path: ./apps/overlays/staging
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: infrastructure

clusters/remote/apps-production.yaml

apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: apps-production
  namespace: flux-system
spec:
  interval: 10m0s
  path: ./apps/overlays/production
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
  dependsOn:
    - name: infrastructure
  healthChecks:
    - apiVersion: apps/v1
      kind: Deployment
      name: my-app
      namespace: production
  timeout: 5m

Production reconciles less frequently (every 10 minutes instead of 5) and includes a health check. Flux waits up to 5 minutes for the production deployment to become healthy before marking the reconciliation as successful.

This is the complete path from a developer’s laptop to production. Each step builds on what the previous tutorials set up.

Work on a feature branch against the local k0s cluster. All ports bind to localhost. Iteration is fast — no waiting for CI or remote deployments.

kubectl port-forward svc/my-app 3000:80 -n dev

Merge to main. Apply the staging overlay locally to catch manifest errors before they reach the remote cluster:

kubectl apply -k apps/overlays/staging
curl https://staging.local/healthz

This step catches namespace conflicts, missing patches, and kustomization errors without burning time on a remote deploy.

Push to main. Flux picks up the change and applies the staging overlay on the remote cluster:

git push origin main
# Flux reconciles within 5 minutes
curl https://staging.example.com/healthz

Let staging run for about a week. Monitor logs, error rates, and resource usage. This is where intermittent bugs, memory leaks, and connection pool exhaustion surface.

Update the production overlay — typically an image tag bump — and push:

# Update image tag in production overlay
git add apps/overlays/production/
git commit -m "promote v1.2.0 to production"
git push origin main
# Flux reconciles within 10 minutes
curl https://app.example.com/healthz

Flux applies the change, runs the health check, and reports success or failure. No manual kubectl apply on the remote cluster.

The remote Traefik overlay configures Let’s Encrypt with the HTTP-01 challenge. IngressRoutes reference the cert resolver:

spec:
  tls:
    certResolver: letsencrypt

Certificates are requested automatically when a new hostname appears in an IngressRoute. No manual cert management, no renewal cron jobs.
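In context, a minimal IngressRoute for the staging hostname might look like this (a sketch; the entry point name depends on your Traefik static configuration — websecure is the conventional HTTPS entry point):

```yaml
apiVersion: traefik.io/v1alpha1
kind: IngressRoute
metadata:
  name: my-app
  namespace: staging
spec:
  entryPoints:
    - websecure        # HTTPS entry point, assumed from Traefik defaults
  routes:
    - match: Host(`staging.example.com`)
      kind: Rule
      services:
        - name: my-app
          port: 80
  tls:
    certResolver: letsencrypt   # triggers an ACME request for the host above
```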

After both environments are running, check that everything is in order:

# Pods in both namespaces
kubectl get pods -n staging
kubectl get pods -n production
# Flux reconciliation status
flux get kustomizations
# Certificates
kubectl get certificates -A

Verify that network isolation works by trying to reach production from staging:

kubectl exec -n staging deploy/my-app -- curl --max-time 5 my-app.production.svc:80

This should time out. The default-deny network policy blocks cross-namespace traffic, which is exactly what you want.

Git is the source of truth for both environments. Roll back by reverting the commit:

git revert HEAD
git push origin main
# Flux applies the previous state

If you need an immediate rollback and cannot wait for Flux to reconcile, suspend Flux and roll back manually:

flux suspend kustomization apps-production
kubectl rollout undo deployment/my-app -n production

Resume Flux (flux resume kustomization apps-production) once the situation stabilizes. Flux will reconcile the deployment back to whatever Git says, so make sure the revert commit has landed before resuming.

Namespace isolation handles most cases. Consider separate clusters when:

  • Compliance requires physical isolation (PCI DSS, HIPAA)
  • A client contract mandates dedicated infrastructure
  • Staging workloads are heavy enough to affect production performance
  • Different geographic regions are needed

For most teams, one cluster with network policies and resource quotas provides sufficient separation at a fraction of the cost.

This tutorial completes the promotion pipeline from Development Workflow. You started with a local k0s cluster and a handful of manifests. You now have:

  • A remote cluster provisioned with OpenTofu
  • Two isolated environments on that cluster (staging and production)
  • Flux CD reconciling both from Git
  • Network policies blocking cross-namespace traffic
  • Resource quotas preventing resource starvation
  • TLS certificates via Let’s Encrypt
  • SOPS-encrypted secrets decrypted automatically by Flux
  • Security linting validating manifests before they reach the cluster
  • Production hardening with RBAC and security contexts

Push a commit. Flux deploys it. That is the workflow.