fix(catalyst-platform): hoist parent_domains_listeners YAML out of cloud-init (Closes #2118) (#2119)

The Cilium Gateway listener block (cilium-gateway.yaml `spec.listeners:
${PARENT_DOMAINS_LISTENERS_YAML}`) was materialised in
infra/hetzner/main.tf (locals.parent_domains_listeners_yaml) and inlined
as a postBuild.substitute value on the sovereign-tls Kustomization in
cloud-init. That value scaled O(N) with parent-zone count and pushed
4-zone SME-pool Sovereigns over Hetzner's 32,256-byte user_data
guardrail. t39 audit (agent-a2c1647c, 2026-05-20): omantel.biz +
.omani.{homes,rest,trade} cloud-init rendered to 33,656 bytes (+1,400
overshoot) and the create call failed at the tofu precondition.

Fix: render the listener YAML inside the bp-catalyst-platform chart at
templates/sovereign-tls-vars-cm.yaml from .Values.parentZones (same
input the chart's per-zone Certificate render already consumes). The
template emits a ConfigMap `flux-system/sovereign-tls-vars` whose key
PARENT_DOMAINS_LISTENERS_YAML carries the JSON-flow listener array.
Cloud-init's sovereign-tls Kustomization reads it via Flux
`postBuild.substituteFrom: [{kind: ConfigMap, name: sovereign-tls-vars}]`.
Ordering is preserved: sovereign-tls `dependsOn: bootstrap-kit Ready`
and bp-catalyst-platform is inside bootstrap-kit, so the ConfigMap
exists in etcd by the time Flux reconciles sovereign-tls.
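
As a rough illustration of what the `substituteFrom` step does with the ConfigMap value, the sketch below approximates the substitution with Go's `os.Expand`. It is not Flux's actual implementation; the `substitute` helper, the shortened listener value, and the choice to leave unknown variables untouched are all assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
)

// substitute approximates post-build variable substitution: every
// ${NAME} in the manifest is replaced by vars[NAME]. Unknown names are
// left as-is here, which is a simplification chosen for this sketch.
func substitute(manifest string, vars map[string]string) string {
	return os.Expand(manifest, func(name string) string {
		if v, ok := vars[name]; ok {
			return v
		}
		return "${" + name + "}"
	})
}

func main() {
	// Stand-in for the .data of ConfigMap flux-system/sovereign-tls-vars.
	// The value is shortened; the real one carries full listener objects.
	configMapData := map[string]string{
		"PARENT_DOMAINS_LISTENERS_YAML": `[{"name":"https","port":30443}]`,
	}
	manifest := "spec:\n  listeners: ${PARENT_DOMAINS_LISTENERS_YAML}\n"
	fmt.Print(substitute(manifest, configMapData))
}
```

Because the placeholder is a plain scalar before substitution and a JSON-flow array afterwards, the manifest is parseable YAML at both stages.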
Synthetic render evidence (standalone Tofu harness, t39's 4-zone input
plus a realistic 5,468-byte worker_cloud_init_b64):
- BEFORE (origin/main 6c1444b4c): cloud-init stripped 30,748 bytes
- AFTER (this commit): cloud-init stripped 28,619 bytes
- savings: 2,129 bytes (more than the 1,400-byte overshoot)
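
The byte budget can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in this message; the projected size of the real t39 body assumes the synthetic saving carries over unchanged, which the harness numbers suggest but do not prove.

```go
package main

import "fmt"

// userDataGuardrail is the 32,256-byte tofu precondition cited above.
const userDataGuardrail = 32256

// overshoot returns how many bytes size exceeds the guardrail by,
// or 0 when the payload fits.
func overshoot(size int) int {
	if size > userDataGuardrail {
		return size - userDataGuardrail
	}
	return 0
}

func main() {
	const t39Rendered = 33656          // audited t39 cloud-init size
	const before, after = 30748, 28619 // synthetic harness sizes
	savings := before - after

	fmt.Println("overshoot:", overshoot(t39Rendered))
	fmt.Println("savings:", savings)

	// Assumption: the real body shrinks by the same amount as the harness.
	projected := t39Rendered - savings
	fmt.Println("projected:", projected, "headroom:", userDataGuardrail-projected)
}
```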
Helm-template covers every historical path preserved by the removed
Tofu locals:
- Single-zone fallback (parentZones empty → list with sovereign FQDN)
  emits bare `https`/`http` listener names; every catalyst-system
  HTTPRoute hardcodes `sectionName: https` on single-zone Sovereigns.
- Multi-zone (SME pool) emits unique `https-<sanitised>` /
  `http-<sanitised>` names per parent zone (otherwise the Gateway
  controller raises a `Conflicted` status condition on duplicate
  listener names).
- TBD-A32 #1886 per-prov 2-label wildcard listener pair
  (`*.<sovereignFQDN>` with per-prov cert) appended when
  sovereignFQDN ∉ parentZones; skipped on the legacy single-zone-
  on-apex case to avoid a duplicate-name conflict.
- Catalyst-Zero (contabo, empty global.sovereignFQDN) skips the
  template via top-level guard, so the Kustomize build is untouched.
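
The four paths above can be sketched outside Helm as follows. This is a Go restatement for illustration only, not the chart's code: the type and function names are hypothetical, and `allowedRoutes` is omitted for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

type certRef struct {
	Kind string `json:"kind"`
	Name string `json:"name"`
}

type tlsConfig struct {
	Mode            string    `json:"mode"`
	CertificateRefs []certRef `json:"certificateRefs"`
}

type listener struct {
	Name     string     `json:"name"`
	Port     int        `json:"port"`
	Protocol string     `json:"protocol"`
	Hostname string     `json:"hostname"`
	TLS      *tlsConfig `json:"tls,omitempty"`
}

func dashed(s string) string { return strings.ReplaceAll(s, ".", "-") }

// pair builds the HTTPS (30443) and HTTP (30080) listeners for one
// wildcard hostname. An empty suffix yields the bare names https/http.
func pair(suffix, host, secret string) []listener {
	httpsName, httpName := "https", "http"
	if suffix != "" {
		httpsName, httpName = "https-"+suffix, "http-"+suffix
	}
	return []listener{
		{Name: httpsName, Port: 30443, Protocol: "HTTPS", Hostname: "*." + host,
			TLS: &tlsConfig{Mode: "Terminate",
				CertificateRefs: []certRef{{Kind: "Secret", Name: secret}}}},
		{Name: httpName, Port: 30080, Protocol: "HTTP", Hostname: "*." + host},
	}
}

// renderListeners returns nil for an empty fqdn (Catalyst-Zero guard),
// falls back to a single zone derived from fqdn, switches naming on
// zone count, and appends the per-prov pair unless fqdn is itself a zone.
func renderListeners(fqdn string, zones []string) []listener {
	if fqdn == "" {
		return nil
	}
	if len(zones) == 0 {
		zones = []string{fqdn}
	}
	single := len(zones) == 1
	secret := "sovereign-wildcard-tls-" + dashed(fqdn)

	var out []listener
	fqdnInZones := false
	for _, z := range zones {
		suffix := dashed(z)
		if single {
			suffix = ""
		}
		out = append(out, pair(suffix, z, secret)...)
		if z == fqdn {
			fqdnInZones = true
		}
	}
	if !fqdnInZones {
		out = append(out, pair(dashed(fqdn), fqdn, secret)...)
	}
	return out
}

func main() {
	ls := renderListeners("t39.omantel.biz", []string{"omantel.biz", "omani.homes"})
	b, _ := json.Marshal(ls)
	fmt.Println(string(b))
}
```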
Cross-region: every region runs the same chart and each peer renders its
own ConfigMap into its own flux-system, so each region's sovereign-tls
Kustomization reads locally.

Lockstep bumps in this commit:
- products/catalyst/chart/Chart.yaml 1.4.231 → 1.4.232
- clusters/_template/bootstrap-kit/13-bp-catalyst-platform.yaml
  HelmRelease spec.chart.spec.version 1.4.231 → 1.4.232

Removed (kept the rationale comments as migration breadcrumbs):
- infra/hetzner/main.tf locals.parent_domains_listeners_yaml
- infra/hetzner/main.tf locals.per_prov_listeners
- infra/hetzner/main.tf locals.parent_domains_includes_sovereign_fqdn
- templatefile() var `parent_domains_listeners_yaml` on both the
  primary CP and each per-secondary-region CP invocation
- PARENT_DOMAINS_LISTENERS_YAML substitute key on the sovereign-tls
  Flux Kustomization (cloud-init), replaced by substituteFrom

Doc-only updates: parent_domains.go log message + cilium-gateway.yaml
header + sandbox manifests.go + values.yaml header point at the new
ConfigMap-vars path for Day-2 add-domain workflows.
`tofu validate` + `helm lint` + `helm template` (single-zone fallback,
multi-zone with per-prov pair, collision case `fqdn ∈ parentZones`, and
Catalyst-Zero empty-fqdn) all clean.

Refs #2118

Co-authored-by: hatiyildiz <269457768+hatiyildiz@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
parent 6c1444b4c1
commit 4be414551d
@@ -840,7 +840,13 @@ spec:
# resource-detail page's k8s SSE subscription to include the
# `event` kind so the EventsPanel surfaces live K8s Events
# instead of perpetually rendering empty-state.
version: 1.4.231
# 1.4.232 — #2118 (TBD-V48): render Cilium Gateway listener YAML
# in the chart (templates/sovereign-tls-vars-cm.yaml) so cloud-init
# no longer carries the O(N)-per-parent-zone listener block.
# Removes ~2.1 KiB from cloud-init on 4-zone SME-pool Sovereigns
# (omantel.biz + .omani.{homes,rest,trade}); brings t39-class
# provisions back under Hetzner's 32 KiB user_data cap.
version: 1.4.232
sourceRef:
  kind: HelmRepository
  name: bp-catalyst-platform
@@ -23,15 +23,24 @@
# NET::ERR_CERT_COMMON_NAME_INVALID, marketplace WordPress tenants on
# omani.homes are unreachable.
#
# Fix: render one listener pair per parent zone. The listener block is
# materialised at Terraform plan time (infra/hetzner/main.tf
# locals.parent_domains_listeners_yaml — jsonencode of the listener
# objects), threaded through Flux postBuild.substitute as
# ${PARENT_DOMAINS_LISTENERS_YAML}, and consumed BELOW as a YAML inline-
# flow array value on `spec.listeners`. Each pair's certificateRefs
# target the per-zone Secret rendered by products/catalyst/chart/
# templates/sovereign-wildcard-certs.yaml (PR #827) so the Gateway
# listener and the cert resource are always in lockstep.
# Fix: render one listener pair per parent zone. As of #2118 (TBD-V48,
# 2026-05-20) the listener block is rendered inside bp-catalyst-platform's
# templates/sovereign-tls-vars-cm.yaml from .Values.parentZones (the
# chart's single source of truth on parent-zone shape). The chart emits
# a ConfigMap flux-system/sovereign-tls-vars whose key
# PARENT_DOMAINS_LISTENERS_YAML carries the JSON-flow listener array.
# Cloud-init's sovereign-tls Kustomization reads it via
# `postBuild.substituteFrom: [{kind: ConfigMap, name: sovereign-tls-vars}]`
# and Flux inlines the value at `${PARENT_DOMAINS_LISTENERS_YAML}` below.
# Each pair's certificateRefs target the per-prov Secret rendered by
# clusters/_template/sovereign-tls/cilium-gateway-cert.yaml (TBD-A29
# #1883) — the listener and the cert resource stay in lockstep.
#
# Why moved out of cloud-init: the inline value scaled O(N) with parent-
# zone count and pushed 4-zone SME-pool Sovereigns over Hetzner's 32 KiB
# user_data cap (t39 audit, 2026-05-20: 33,656 bytes overshot the post-
# #1985 guardrail of 32,256). Render-in-chart drops ~2.1 KiB out of
# cloud-init with safety margin.
#
# Why a scalar placeholder, not a multi-line block:
# - kustomize-build PARSES the YAML before Flux runs envsubst. A
@@ -100,10 +109,13 @@
# sectionName (PR #1888 closing #1884) — Cilium attaches by hostname
# match.
#
# The listener block is rendered by infra/hetzner/main.tf locals.
# parent_domains_listeners_yaml using local.parent_domains_single_zone
# to switch between the two naming schemes (and appending per-prov
# listeners via local.per_prov_listeners).
# The listener block is rendered by bp-catalyst-platform's
# templates/sovereign-tls-vars-cm.yaml (Closes #2118 / TBD-V48), which
# encodes the single-zone vs multi-zone naming switch AND the per-prov
# pair-emission logic directly in Helm template (range over parentZones
# + ternary `single ? "https" : "https-<sanitised>"` + a $fqdnInZones
# collision check that skips the per-prov pair when sovereignFQDN
# already equals a declared parent-zone name).

apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
@@ -765,8 +765,9 @@ spec:
# already use. sectionName is intentionally omitted so the HTTPRoute
# attaches to every listener whose hostname matches "sandbox.<sov-fqdn>"
# — currently the wildcard *.${SOVEREIGN_FQDN} HTTPS listener
# (https-<sov-fqdn-dashed>) per infra/hetzner/main.tf
# locals.parent_domains_listeners_yaml fallback path.
# (https-<sov-fqdn-dashed>) emitted by bp-catalyst-platform's
# templates/sovereign-tls-vars-cm.yaml per-prov listener pair (Closes
# #2118 / TBD-V48; formerly infra/hetzner/main.tf locals.per_prov_listeners).
parentRefs:
  - name: cilium-gateway
    namespace: kube-system
@@ -1314,22 +1314,19 @@ write_files:
# bp-catalyst-platform into clusters/_template/sovereign-tls/
# has access to the parent-zone list without a config copy.
PARENT_DOMAINS_YAML: '${parent_domains_yaml}'
# PARENT_DOMAINS_LISTENERS_YAML (issue #831 follow-on to #827).
# JSON-flow array literal listing one Gateway listener pair
# (HTTPS:30443 + HTTP:30080) per parent zone. Consumed as a
# scalar value at `listeners: $${PARENT_DOMAINS_LISTENERS_YAML}`
# in clusters/_template/sovereign-tls/cilium-gateway.yaml.
# kustomize-build accepts the unsubstituted scalar; Flux's
# postBuild.substitute then swaps it for the materialised
# array, which YAML parses as the actual listener list.
# The double jsonencode is intentional — the inner one
# (locals.parent_domains_listeners_yaml) renders the array;
# the outer one wraps it as a JSON-encoded string so the
# value-in-YAML embedding works regardless of the array's
# internal characters. See infra/hetzner/main.tf
# locals.parent_domains_listeners_yaml for rationale +
# listener-naming convention.
PARENT_DOMAINS_LISTENERS_YAML: ${jsonencode(parent_domains_listeners_yaml)}
# PARENT_DOMAINS_LISTENERS_YAML — historically materialised here
# by infra/hetzner/main.tf locals.parent_domains_listeners_yaml
# and inlined as a substitute value, but that scaled O(N) with
# parent-zone count and overflowed Hetzner's 32 KiB user_data
# cap on 4-zone SME-pool Sovereigns (Closes #2118 — t39 audit,
# 2026-05-20). Now rendered inside bp-catalyst-platform's
# templates/sovereign-tls-vars-cm.yaml from .Values.parentZones
# (single source of truth — same input the chart's per-zone
# Certificate render already consumes). Picked up below via
# `substituteFrom: ConfigMap/sovereign-tls-vars`. Ordering is
# safe: this Kustomization `dependsOn: bootstrap-kit Ready`, and
# bootstrap-kit is Ready only when bp-catalyst-platform's HR
# (which renders the ConfigMap) is Ready.
# WILDCARD_CERT_ISSUER (Fix #176 — qa-loop iter-1 LE
# rate-limit unblock). cilium-gateway-cert.yaml references
# this via $${WILDCARD_CERT_ISSUER}. When
@@ -1359,6 +1356,24 @@ write_files:
SOVEREIGN_FQDN_SLUG: "${sovereign_fqdn_slug}"
SOVEREIGN_REGION_KEY: ${sovereign_region_key}
HCLOUD_LB_LOCATION: "${region}"
# substituteFrom: ConfigMap/sovereign-tls-vars (Closes #2118).
# The bp-catalyst-platform chart's templates/sovereign-tls-vars-cm.yaml
# renders this ConfigMap from .Values.parentZones into flux-system.
# Keys it carries:
# - PARENT_DOMAINS_LISTENERS_YAML: JSON-flow listener array
# consumed by clusters/_template/sovereign-tls/cilium-gateway.yaml
# at `spec.listeners: $${PARENT_DOMAINS_LISTENERS_YAML}`.
# Moved out of the inline `substitute` map above to keep cloud-init
# under Hetzner's 32 KiB user_data cap on multi-zone SME-pool
# Sovereigns (the listener block scales O(N) with parent-zone
# count; 4 zones → ~2.2 KiB → cloud-init at 33.6 KiB before this fix).
# optional: false is correct — bp-catalyst-platform is INSIDE
# bootstrap-kit, and this Kustomization dependsOn bootstrap-kit
# Ready, so the ConfigMap is guaranteed to exist before reconcile.
substituteFrom:
  - kind: ConfigMap
    name: sovereign-tls-vars
    optional: false
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
@@ -420,132 +420,32 @@ locals {
# cert wiring under #831, not parent-zone wildcards.

# ── TBD-A32 (#1886) — per-prov 2-label wildcard listener ─────────────────
# Closes #2118 (2026-05-20): the listener YAML was historically
# materialised here (locals.parent_domains_listeners_yaml,
# locals.per_prov_listeners, locals.parent_domains_includes_sovereign_fqdn)
# and threaded into cloud-init as an inline postBuild.substitute value.
# That scaled O(N) with parent-zone count and pushed cloud-init over
# Hetzner's 32 KiB user_data cap on 4-zone SME-pool Sovereigns (t39
# audit, 2026-05-20: 33,656 bytes for omantel.biz + 3-zone .omani.X).
#
# The parent-zone listeners above declare `hostname: *.<zone>` (e.g.
# `*.omani.works`). Per Gateway-API spec wildcard semantics, that pattern
# matches EXACTLY ONE label depth — `foo.omani.works` ✅ — but NOT
# `console.t28.omani.works` (2-label depth from the apex). On a
# multi-prov shared parent zone like `omani.works`, every per-prov
# operator endpoint (console.<fqdn>, api.<fqdn>, marketplace.<fqdn>, …)
# is 2-label-deep, so the parent-zone wildcard listener never catches
# them and cilium-envoy returns TLS handshake reset / NoMatchingListener.
# The listener block is now rendered inside bp-catalyst-platform's
# templates/sovereign-tls-vars-cm.yaml from .Values.parentZones +
# .Values.global.sovereignFQDN — same Helm template that already owns
# the per-zone Certificate render shape. The chart writes a ConfigMap
# `flux-system/sovereign-tls-vars` whose `PARENT_DOMAINS_LISTENERS_YAML`
# key is read by the sovereign-tls Kustomization via
# `postBuild.substituteFrom` (cloud-init writes that Kustomization).
#
# Note: TBD-A29 (#1883) already pointed the parent-zone listener's
# certificateRefs at the per-prov cert `sovereign-wildcard-tls-
# <fqdn-dashed>`. That fixed the LE-budget burn but NOT the hostname-
# match gap — listener selection happens BEFORE SNI cert dispatch, so
# cilium-envoy never reaches the per-prov cert for a 2-label-deep
# request that the parent-zone listener rejects on hostname.
# The TBD-A32 collision guard (sovereign_fqdn ∈ parent_zone_names →
# skip per-prov pair to avoid duplicate listener-name Conflict) is
# preserved in the Helm template as a `range` over parentZones that
# sets a `$fqdnInZones` boolean.
#
# Fix: emit an ADDITIONAL listener pair hostnamed `*.<sovereign_fqdn>`
# (e.g. `*.t28.omani.works`) bound to the SAME per-prov cert
# `sovereign-wildcard-tls-<fqdn-dashed>` rendered by
# clusters/_template/sovereign-tls/cilium-gateway-cert.yaml. That
# cert already enumerates 13 per-prov SANs (console / auth / gitea /
# harbor / registry / api / bao / grafana / hubble / pdns /
# openova-flow / guacamole / marketplace / sandbox) so every per-prov
# subdomain has both a listener match AND a matching cert SAN.
#
# Collision guard: when sovereign_fqdn is identical to one of the
# declared parent-zone names (legacy single-zone case where the
# operator brings the apex itself, e.g. parent_domains_yaml=
# `[{name: "omani.works"}]` and sovereign_fqdn=`omani.works`), the
# parent-zone listener already covers everything 1-label-deep and
# adding a duplicate `*.<fqdn>` pair would produce a Gateway
# Conflicted condition on duplicate listener names. Skip the per-prov
# pair in that case — `local.parent_domains_includes_sovereign_fqdn`.
#
# Naming: the per-prov listener pair always uses unique names
# `https-<fqdn-dashed>` / `http-<fqdn-dashed>` (e.g. `https-t28-omani-
# works`). This is safe because every catalyst-system HTTPRoute now
# OMITS sectionName (PR #1888 closing #1884) — Cilium attaches each
# route to the listener whose hostname matches via the hostname
# filter, not by sectionName equality.
parent_domains_includes_sovereign_fqdn = contains(
  [for e in local.parent_domains_decoded : e.name],
  var.sovereign_fqdn
)
# NOTE (TBD-A35 hotfix, Closes #1886): the conditional that suppresses
# this pair when sovereign_fqdn collides with a declared parent zone now
# lives on the consumer line in `parent_domains_listeners_yaml` below
# (concat() at line ~503). Keeping the conditional here as
# `... ? [] : [<HTTPS_obj>, <HTTP_obj>]` triggers tofu/terraform
# "Inconsistent conditional result types" — the true arm is an empty
# tuple `tuple([])` while the false arm is `tuple([obj_with_tls,
# obj_without_tls])` and HCL cannot unify the two. Always emit the pair
# at this local; suppress at the consumer.
per_prov_listeners = [
  {
    name = format("https-%s", local.sovereign_fqdn_dashed)
    port = 30443
    protocol = "HTTPS"
    hostname = format("*.%s", var.sovereign_fqdn)
    tls = {
      mode = "Terminate"
      certificateRefs = [
        {
          kind = "Secret"
          name = format("sovereign-wildcard-tls-%s", local.sovereign_fqdn_dashed)
        }
      ]
    }
    allowedRoutes = {
      namespaces = {
        from = "All"
      }
    }
  },
  {
    name = format("http-%s", local.sovereign_fqdn_dashed)
    port = 30080
    protocol = "HTTP"
    hostname = format("*.%s", var.sovereign_fqdn)
    allowedRoutes = {
      namespaces = {
        from = "All"
      }
    }
  },
]

parent_domains_listeners_yaml = jsonencode(concat(
  flatten([
    for entry in local.parent_domains_decoded : [
      {
        name = local.parent_domains_single_zone ? "https" : format("https-%s", replace(entry.name, ".", "-"))
        port = 30443
        protocol = "HTTPS"
        hostname = format("*.%s", entry.name)
        tls = {
          mode = "Terminate"
          certificateRefs = [
            {
              kind = "Secret"
              name = format("sovereign-wildcard-tls-%s", local.sovereign_fqdn_dashed)
            }
          ]
        }
        allowedRoutes = {
          namespaces = {
            from = "All"
          }
        }
      },
      {
        name = local.parent_domains_single_zone ? "http" : format("http-%s", replace(entry.name, ".", "-"))
        port = 30080
        protocol = "HTTP"
        hostname = format("*.%s", entry.name)
        allowedRoutes = {
          namespaces = {
            from = "All"
          }
        }
      },
    ]
  ]),
  [for l in local.per_prov_listeners : l if !local.parent_domains_includes_sovereign_fqdn]
))
# Single-zone fallback (legacy Sovereigns shipping parentZones empty)
# is preserved in the Helm template as a `if eq (len $zones) 0 →
# list (dict "name" $fqdn "role" "primary")` substitution — matches
# the historical `coalesce(var.parent_domains_yaml, format("[{name:
# \"%s\", role: \"primary\"}]", var.sovereign_fqdn))` shape.

# ── Effective singular-path SKU selection (Fix #157) ─────────────────────
# When qa_fixtures_enabled='true', the Sovereign is a QA-loop matrix
@@ -808,12 +708,13 @@ locals {
var.parent_domains_yaml,
format("[{name: \"%s\", role: \"primary\"}]", var.sovereign_fqdn)
)
# Cilium Gateway listeners per parent zone (issue #831). Multi-line
# YAML block iterating local.parent_domains_decoded. Threaded into
# clusters/_template/sovereign-tls/cilium-gateway.yaml via Flux
# postBuild.substitute as ${PARENT_DOMAINS_LISTENERS_YAML}. See
# locals.parent_domains_listeners_yaml above for shape + rationale.
parent_domains_listeners_yaml = local.parent_domains_listeners_yaml
# Cilium Gateway listener YAML is no longer threaded into cloud-init
# (Closes #2118). The bp-catalyst-platform chart's
# templates/sovereign-tls-vars-cm.yaml renders the listener block
# from .Values.parentZones into a flux-system/sovereign-tls-vars
# ConfigMap; the sovereign-tls Kustomization's
# postBuild.substituteFrom picks it up. Keeps cloud-init under
# Hetzner's 32 KiB user_data cap on multi-zone SME-pool Sovereigns.
# sovereign_regions_json — canonical multi-region RegionSpec[]
# JSON literal. Threaded into bp-catalyst-platform's
# .Values.sovereign.regionsJson via the bootstrap-kit slot 13
@@ -1376,12 +1277,12 @@ locals {
var.parent_domains_yaml,
format("[{name: \"%s\", role: \"primary\"}]", var.sovereign_fqdn)
)
# Cilium Gateway listeners per parent zone (issue #831). Same
# rendered multi-line YAML as the primary CP — secondary regions
# also reconcile sovereign-tls into THEIR own cluster, so the
# listeners block must be present there too. See
# locals.parent_domains_listeners_yaml in this file.
parent_domains_listeners_yaml = local.parent_domains_listeners_yaml
# Cilium Gateway listener YAML is no longer threaded into cloud-init
# (Closes #2118). Same bp-catalyst-platform chart runs in every
# region — each peer's chart renders its own
# flux-system/sovereign-tls-vars ConfigMap and sovereign-tls reads
# it locally via postBuild.substituteFrom. See main.tf locals
# comment (~line 422) for full rationale.
# Same JSON-encoded RegionSpec[] as the primary CP — every region's
# bp-catalyst-platform renders the same sovereign.regionsJson value
# (the cluster topology is Sovereign-wide, not per-region).
@@ -577,18 +577,25 @@ func (h *Handler) AddParentDomain(w http.ResponseWriter, r *http.Request) {
// next-prov. For an ALREADY-RUNNING Sovereign, the Hetzner
// hcloud_server resource has no `ignore_changes = [user_data]`
// so a `tofu apply` from changed cloud-init would request a
// destructive server recreate — the operator workaround is to
// `kubectl patch kustomization sovereign-tls -n flux-system`
// on the live Sovereign and append the new zone to the
// `.spec.postBuild.substitute.PARENT_DOMAINS_LISTENERS_YAML`
// value. Long-term: add a Day-2 listener-patch step here that
// reaches into the Sovereign apiserver via the persisted
// kubeconfig (out of scope for the #1772 ship).
h.log.Info("parent-domain post-add: operator must patch live Sovereign Kustomization to surface listener for the new zone",
// destructive server recreate.
//
// Closes #2118 (TBD-V48) changed the Day-2 patch target. The
// listener YAML was historically inlined into cloud-init's
// .spec.postBuild.substitute.PARENT_DOMAINS_LISTENERS_YAML on
// the sovereign-tls Kustomization, so operators patched that
// field on the live Sovereign. The chart now renders the
// listener YAML into ConfigMap/sovereign-tls-vars in flux-system
// and the Kustomization reads via postBuild.substituteFrom; the
// live-Sovereign Day-2 patch target is therefore the ConfigMap's
// data.PARENT_DOMAINS_LISTENERS_YAML key, NOT the Kustomization's
// inline substitute map. Long-term: add a Day-2 ConfigMap-patch
// step here that reaches into the Sovereign apiserver via the
// persisted kubeconfig (out of scope for the #1772 ship).
h.log.Info("parent-domain post-add: operator must patch live Sovereign ConfigMap to surface listener for the new zone",
	"domain", req.Name,
	"target", "Kustomization/sovereign-tls in flux-system on Sovereign",
	"field", ".spec.postBuild.substitute.PARENT_DOMAINS_LISTENERS_YAML",
	"reason", "hcloud_server user_data is not ignored — tofu apply would recreate the server. Fresh provs already render the listener.",
	"target", "ConfigMap/sovereign-tls-vars in flux-system on Sovereign",
	"field", ".data.PARENT_DOMAINS_LISTENERS_YAML",
	"reason", "hcloud_server user_data is not ignored — tofu apply would recreate the server. Fresh provs already render the listener via the chart.",
)
writeJSON(w, http.StatusCreated, ParentDomain{
	Name: name,
@@ -2235,8 +2235,29 @@ name: bp-catalyst-platform
#
# Refs #1099 (NOT Closes — operator walk + screenshot is the DoD per
# CLAUDE.md §0).
version: 1.4.231
appVersion: 1.4.231
version: 1.4.232
appVersion: 1.4.232
# 1.4.232 — fix(sovereign-tls): render Cilium Gateway listener YAML in
# the chart (templates/sovereign-tls-vars-cm.yaml) and feed it into the
# sovereign-tls Kustomization via Flux postBuild.substituteFrom on
# ConfigMap/sovereign-tls-vars. Removes the O(N)-per-parent-zone listener
# block from cloud-init (infra/hetzner/cloudinit-control-plane.tftpl)
# so 4-zone SME-pool Sovereigns (e.g. omantel.biz + .omani.{homes,rest,
# trade}) render under Hetzner's 32 KiB user_data cap with safety margin.
# Closes #2118 (TBD-V48). Synthetic render evidence: t39 4-zone cloud-init
# stripped 30,748 → 28,619 bytes (saves 2,129 bytes; threshold 32,256).
# Helm template covers all three historical paths preserved by the old
# locals.parent_domains_listeners_yaml in infra/hetzner/main.tf:
# - Single-zone fallback (parentZones empty → list with sovereign FQDN)
# emits bare `https`/`http` listener names (every catalyst-system
# HTTPRoute hardcodes sectionName: https on single-zone Sovereigns).
# - Multi-zone (SME pool) emits unique `https-<sanitised>` /
# `http-<sanitised>` names per parent zone.
# - TBD-A32 #1886 per-prov 2-label wildcard listener pair appended when
# sovereignFQDN ∉ parentZones (skipped otherwise to avoid Conflict).
# Cross-region: every region runs the same bp-catalyst-platform chart,
# each peer renders its own ConfigMap into its own flux-system, so
# sovereign-tls Kustomizations in secondary regions read locally.
# 1.4.183 — fix(httproute): omit default sectionName so multi-zone
# Sovereigns attach via Cilium Gateway hostname matcher (Closes #1884,
# TBD-A30). Pre-1.4.183 every catalyst-system HTTPRoute pinned
154
products/catalyst/chart/templates/sovereign-tls-vars-cm.yaml
Normal file
@@ -0,0 +1,154 @@
{{- /*
sovereign-tls-vars ConfigMap — Flux postBuild.substituteFrom source for
the sovereign-tls Kustomization (clusters/_template/sovereign-tls/).

Why this lives in the chart (Closes #2118)
─────────────────────────────────────────
The Cilium Gateway listeners block (cilium-gateway.yaml `spec.listeners:
${PARENT_DOMAINS_LISTENERS_YAML}`) was previously materialised in
infra/hetzner/main.tf (locals.parent_domains_listeners_yaml) and threaded
into cloud-init as an inline `postBuild.substitute` value on the
sovereign-tls Kustomization manifest.

That value scales O(N) with the parent_domains count: ~440 bytes per
parent zone (HTTPS+HTTP listener objects with cert refs + allowedRoutes)
plus the per-prov 2-label wildcard pair (TBD-A32 #1886) when the
sovereign FQDN is not itself one of the parent zones. For a 4-zone
SME-pool Sovereign (primary + 3 sme-pool, e.g. omantel.biz + omani.{homes,
rest,trade}) the value renders to ~2,210 bytes inside cloud-init — and
Hetzner caps user_data at HARD 32 KiB. Cloud-init for t39's exact body
overshot the post-#1985 guardrail (32,256) at 33,656 bytes (audit
agent-a2c1647c, 2026-05-20).

Fix: render the listener YAML inside the chart from .Values.parentZones
(single source of truth — already populated by bootstrap-kit slot 13's
`${PARENT_DOMAINS_YAML}` substitute). Emit it into a ConfigMap in
flux-system/. The sovereign-tls Kustomization adds
`postBuild.substituteFrom: [{kind: ConfigMap, name: sovereign-tls-vars,
namespace: flux-system}]` and reads the value from there instead of an
inline substitute key. The chart is INSIDE bootstrap-kit (slot 13);
bootstrap-kit reaches Ready iff every HR (including this chart) is
Ready; the sovereign-tls Kustomization `dependsOn: bootstrap-kit Ready`,
so by the time sovereign-tls reconciles, this ConfigMap exists in etcd.

Ordering is the same as the legacy cloud-init substitute path — Cilium
Gateway always lands AFTER the chart's per-zone resources are committed.

Catalyst-Zero (contabo, no global.sovereignFQDN, no sovereign-tls
Kustomization) skips this template via the guard below so the contabo
Kustomize build remains untouched.

Listener-shape contract (must match locals.parent_domains_listeners_yaml
in infra/hetzner/main.tf historically):
- SINGLE parent zone → listener names are the bare strings
`https` / `http` (every platform HTTPRoute
hardcodes `parentRefs[0].sectionName: https`
on single-zone Sovereigns).
- MULTIPLE parent zones (SME pool present) → unique names per zone:
`https-<sanitised>` / `http-<sanitised>`
where sanitised = zone.replace(".", "-").
- certificateRefs ALWAYS targets the per-prov per-name TLS Secret
`sovereign-wildcard-tls-<sovereignFQDN-dashed>` (TBD-A29 #1883 —
LE rate-limit bypass via per-prov identifier set).
- PER-PROV 2-label wildcard listener pair (TBD-A32 #1886) appended
when global.sovereignFQDN is NOT identical to any parent-zone name
(i.e. the operator did not bring the apex itself). Pair is hostnamed
`*.<sovereignFQDN>` so 2-label-deep operator endpoints
(`console.t39.omantel.biz`) match. Named `https-<fqdn-dashed>` /
`http-<fqdn-dashed>` so the parent-zone listener doesn't collide.

The output value is a JSON-flow array string (YAML-compatible) consumed
as a YAML scalar at `listeners: ${PARENT_DOMAINS_LISTENERS_YAML}` in
clusters/_template/sovereign-tls/cilium-gateway.yaml. Flux's
postBuild.substituteFrom inlines the value verbatim and the apiserver
parses it as the materialised listener list.
*/}}
{{- if .Values.global.sovereignFQDN }}
{{- $fqdn := .Values.global.sovereignFQDN }}
{{- $fqdnDashed := replace "." "-" $fqdn }}
{{- $secretName := printf "sovereign-wildcard-tls-%s" $fqdnDashed }}
{{- $zones := default (list) .Values.parentZones }}
{{- /* Single-zone fallback so legacy Sovereigns that ship parentZones
empty still produce a valid listener pair. Mirrors the same
fallback infra/hetzner/main.tf locals.parent_domains_decoded used
(single zone derived from sovereign FQDN, role=primary). */}}
{{- if eq (len $zones) 0 }}
{{- $zones = list (dict "name" $fqdn "role" "primary") }}
{{- end }}
{{- $single := eq (len $zones) 1 }}
{{- /* Build the listener array. We assemble a list of dicts then
toJson it; Go-template flow is verbose but unambiguous. */}}
{{- $listeners := list }}
{{- range $z := $zones }}
{{- $sanitised := replace "." "-" $z.name }}
{{- $httpsName := ternary "https" (printf "https-%s" $sanitised) $single }}
{{- $httpName := ternary "http" (printf "http-%s" $sanitised) $single }}
{{- $httpsListener := dict
    "name" $httpsName
    "port" 30443
    "protocol" "HTTPS"
    "hostname" (printf "*.%s" $z.name)
    "tls" (dict
      "mode" "Terminate"
      "certificateRefs" (list (dict "kind" "Secret" "name" $secretName))
    )
    "allowedRoutes" (dict "namespaces" (dict "from" "All"))
}}
{{- $httpListener := dict
    "name" $httpName
    "port" 30080
    "protocol" "HTTP"
    "hostname" (printf "*.%s" $z.name)
    "allowedRoutes" (dict "namespaces" (dict "from" "All"))
}}
{{- $listeners = append $listeners $httpsListener }}
{{- $listeners = append $listeners $httpListener }}
{{- end }}
{{- /* Per-prov 2-label wildcard pair (TBD-A32 #1886). Skipped when
sovereignFQDN is identical to a declared parent-zone name
(legacy single-zone-on-apex case — duplicate listener-name
guard). */}}
{{- $fqdnInZones := false }}
{{- range $z := $zones }}
{{- if eq $z.name $fqdn }}
{{- $fqdnInZones = true }}
{{- end }}
{{- end }}
{{- if not $fqdnInZones }}
{{- $listeners = append $listeners (dict
    "name" (printf "https-%s" $fqdnDashed)
    "port" 30443
    "protocol" "HTTPS"
    "hostname" (printf "*.%s" $fqdn)
    "tls" (dict
      "mode" "Terminate"
      "certificateRefs" (list (dict "kind" "Secret" "name" $secretName))
    )
    "allowedRoutes" (dict "namespaces" (dict "from" "All"))
) }}
{{- $listeners = append $listeners (dict
    "name" (printf "http-%s" $fqdnDashed)
    "port" 30080
    "protocol" "HTTP"
    "hostname" (printf "*.%s" $fqdn)
    "allowedRoutes" (dict "namespaces" (dict "from" "All"))
) }}
{{- end }}
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: sovereign-tls-vars
  namespace: flux-system
  labels:
    app.kubernetes.io/managed-by: helm
    app.kubernetes.io/component: sovereign-tls-vars
    catalyst.openova.io/sovereign: {{ $fqdn | quote }}
data:
  # PARENT_DOMAINS_LISTENERS_YAML — JSON-flow array literal of the
  # Cilium Gateway listener block. Consumed by Flux postBuild.
  # substituteFrom on the sovereign-tls Kustomization (cloud-init
  # writes that Kustomization). See cilium-gateway.yaml `listeners:
  # ${PARENT_DOMAINS_LISTENERS_YAML}` for the consumer.
  PARENT_DOMAINS_LISTENERS_YAML: {{ toJson $listeners | quote }}
{{- end }}
@@ -761,7 +761,9 @@ ingress:
#
# The Cilium Gateway template
# (clusters/_template/sovereign-tls/cilium-gateway.yaml +
# infra/hetzner/main.tf locals.parent_domains_listeners_yaml)
# bp-catalyst-platform's templates/sovereign-tls-vars-cm.yaml
# — formerly infra/hetzner/main.tf locals.parent_domains_listeners_yaml
# before #2118 / TBD-V48 hoisted the render into the chart)
# names HTTPS listeners as follows:
# - SINGLE parent zone → bare `https` / `http`
# - MULTIPLE parent zones (SME pool present) → unique