Founder caught on t144: /settings/marketplace toggle showed disabled
even though the prov body had marketplaceEnabled=true.
Root cause: store.RedactedRequest struct (the on-disk projection)
lacked a MarketplaceEnabled field. Every Save/Load cycle stripped
the bit:
- Mothership Save(rec) → MarketplaceEnabled dropped
- Mothership exportDeploymentToChild → chroot receives record without bit
- Chroot HandleGetMarketplace → reads dep.Request.MarketplaceEnabled
→ zero value (false) → UI toggle defaults to disabled
PR J #1590's GET endpoint was correctly wired but the data was already
gone before it ran.
Fix: add MarketplaceEnabled field to RedactedRequest + carry it
through Redact() + ToProvisionerRequest(). Backward-compat: the field
is tagged `omitempty`, and records persisted before this PR lack the
key, so they deserialize with false, same as the prior behavior.
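A minimal sketch of the Save/Load round trip, assuming the on-disk projection is JSON. The struct is a trimmed, hypothetical stand-in: only the MarketplaceEnabled field comes from this commit, and `save`/`load` are illustrative helpers, not the store's real API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RedactedRequest is a hypothetical, trimmed stand-in for the on-disk
// projection. Only MarketplaceEnabled is taken from the commit message.
type RedactedRequest struct {
	SovereignFQDN      string `json:"sovereignFQDN"`
	MarketplaceEnabled bool   `json:"marketplaceEnabled,omitempty"`
}

// save serialises a record the way a Save call might (assumption: JSON).
func save(rec RedactedRequest) ([]byte, error) {
	return json.Marshal(rec)
}

// load deserialises a record; a missing key leaves the bool at its zero value.
func load(raw []byte) (RedactedRequest, error) {
	var rec RedactedRequest
	err := json.Unmarshal(raw, &rec)
	return rec, err
}

func main() {
	// With the field present on the struct, the bit survives a Save/Load cycle.
	raw, _ := save(RedactedRequest{SovereignFQDN: "sov.example.test", MarketplaceEnabled: true})
	rec, _ := load(raw)
	fmt.Println(rec.MarketplaceEnabled)

	// A record written before the field existed has no key and reads as false.
	legacy, _ := load([]byte(`{"sovereignFQDN":"sov.example.test"}`))
	fmt.Println(legacy.MarketplaceEnabled)
}
```

If the field were missing from the struct, `json.Marshal` would have nothing to write for it, which is the stripping described above.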
Bumps chart 1.4.151 -> 1.4.152 + bootstrap-kit pin so next prov
exercises the full chain.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
t143 hit LE PROD rate limit (50 certs/week on omani.works exhausted)
because TWO cert templates compete for the same parent-domain quota:
1. clusters/_template/sovereign-tls/cilium-gateway-cert.yaml — legacy
SAN cert named `sovereign-wildcard-tls`
2. products/catalyst/chart/templates/sovereign-wildcard-certs.yaml —
chart per-zone cert named `sovereign-wildcard-tls-<sanitised-zone>`
The Cilium Gateway listener hardcoded the legacy name, so when LE 429s
the legacy cert (as happened on t143), HTTPS to console.<fqdn> breaks
even though the per-zone cert is Ready.
Fix: gateway listener now references `sovereign-wildcard-tls-${SOVEREIGN_FQDN_DASHED}`.
Cloud-init substitutes SOVEREIGN_FQDN_DASHED = replace(fqdn, ".", "-")
in the sovereign-tls Kustomization postBuild.substitute. The per-zone
cert from the chart provides the Ready Secret with this exact name.
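The name derivation can be sketched in Go as below. `wildcardSecretName` is a hypothetical helper written for illustration; the real substitution happens in cloud-init and Flux postBuild, not in Go, and the FQDN used is a placeholder.

```go
package main

import (
	"fmt"
	"strings"
)

// wildcardSecretName is a hypothetical helper mirroring
// replace(fqdn, ".", "-") followed by the per-zone prefix.
func wildcardSecretName(fqdn string) string {
	dashed := strings.ReplaceAll(fqdn, ".", "-")
	return "sovereign-wildcard-tls-" + dashed
}

func main() {
	// Placeholder FQDN, not a real Sovereign.
	fmt.Println(wildcardSecretName("sov.example.test"))
}
```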
The legacy cilium-gateway-cert.yaml SAN cert still renders for
backward-compat (some consumers may still reference it), but the
gateway listener no longer depends on it for TLS termination.
Bumps no chart version — the change is at the Flux/Kustomize layer.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
t143 caught the LE PROD rate limit (429: too many certificates (50)
already issued for omani.works in last 168h0m0s, retry after
2026-05-17 10:28:32 UTC). The chart renders TWO cert names:
- sovereign-wildcard-tls (canonical, hit 429)
- sovereign-wildcard-tls-<fqdn> (per-FQDN, was already issued before
rate limit, Ready=True)
waitForWildcardCert only checked the canonical name. With the limit
hit, handover waited the full 10-min budget before firing degraded.
Fix: when the canonical cert is unavailable, list namespace certs
matching `sovereign-wildcard-tls-*` prefix and return Ready=True if
ANY sibling is Ready. The operator's console.<fqdn> TLS handshake
will succeed against either secret since both cover the wildcard *.<fqdn>.
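A sketch of the fallback decision, assuming the namespace's certificates have already been listed into a plain slice. The `cert` type and `wildcardCertReady` are hypothetical; the real waitForWildcardCert works against cert-manager resources.

```go
package main

import (
	"fmt"
	"strings"
)

const canonicalCert = "sovereign-wildcard-tls"

// cert is a hypothetical minimal view of a Certificate: name plus Ready state.
type cert struct {
	Name  string
	Ready bool
}

// wildcardCertReady returns true when the canonical cert is Ready, or,
// failing that, when any sibling with the "sovereign-wildcard-tls-" prefix is.
func wildcardCertReady(certs []cert) bool {
	for _, c := range certs {
		if c.Name == canonicalCert && c.Ready {
			return true
		}
	}
	for _, c := range certs {
		if strings.HasPrefix(c.Name, canonicalCert+"-") && c.Ready {
			return true
		}
	}
	return false
}

func main() {
	// Canonical cert blocked by the rate limit, per-FQDN sibling already issued.
	certs := []cert{
		{Name: "sovereign-wildcard-tls", Ready: false},
		{Name: "sovereign-wildcard-tls-sov-example-test", Ready: true},
	}
	fmt.Println(wildcardCertReady(certs))
}
```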
Bumps chart 1.4.150 -> 1.4.151 + bootstrap-kit pin so the fix lands
on next fresh prov.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Founder follow-up to t142 cycle:
1. "the dashboard is still not showing the clusters properly" — the D16
fan-out CODE works (3 clusters in k8sCache, dashboard handler fans
out) but the OPERATOR-FACING default Layer-1 was 'family' not
'cluster'. Operator opens /dashboard, sees family-grouped bubbles,
thinks the multi-cluster fix is broken. Fix: when SovereignFQDN is
present (Sovereign Console mode), default to ['cluster', 'application']
so the 3-cluster grouping is the first thing the operator sees.
2. "I have no idea where the admin components for billing, order, revenue
etc related BSS are" — exists at marketplace.<sov>/back-office/ but
the Sovereign Console sidebar had no link. Fix: add "Marketplace Admin"
nav link (external, opens in new tab) — uses resolvedFQDN to construct
the URL. data-testid=sov-console-nav-marketplace-admin for matrix.
Also bumps chart 1.4.149 → 1.4.150 + bootstrap-kit pin so the changes
land on next fresh prov.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Founder t140 bug #2: "in the catalog and jobs it shows as installed,
in the application page it shows as provisioning, there is a sync issue".
Root cause: AppDetail reads Application CR via GET /sovereigns/{id}/
applications/{name}. For bootstrap-kit installs (cilium, cert-manager,
gateway-api, alloy, etc.) NO Application CR exists — they ship as
HelmReleases directly with no wizard step to create the CR. The handler
returned 404 → UI showed "App not found" or perpetual "Provisioning",
while /apps (which reads HelmRelease) shows "installed".
Fix: HandleApplicationGet, on Application CR not-found, falls back to a
HelmRelease lookup in h.k8sCache (uses resolveChrootClusterID so it works
post-D16 multi-cluster fan-out). Synthesises an applicationDetailResponse
from HR fields:
- Name/Namespace from HR
- Blueprint from spec.chart.spec.chart
- Version from spec.chart.spec.version (or status.lastAttemptedRevision)
- Phase: Ready (HR Ready=True) / Failed (False) / Provisioning (Unknown)
- Conditions: pass-through HR conditions
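The phase mapping in the list above can be sketched as follows. The `condition` type and `phaseFromConditions` are hypothetical simplifications; the real handler reads HelmRelease status from h.k8sCache.

```go
package main

import "fmt"

// condition is a hypothetical minimal view of a HelmRelease status condition.
type condition struct {
	Type   string
	Status string
}

// phaseFromConditions maps the Ready condition to an Application phase:
// True -> Ready, False -> Failed, anything else (including absent) -> Provisioning.
func phaseFromConditions(conds []condition) string {
	for _, c := range conds {
		if c.Type != "Ready" {
			continue
		}
		switch c.Status {
		case "True":
			return "Ready"
		case "False":
			return "Failed"
		}
	}
	return "Provisioning"
}

func main() {
	fmt.Println(phaseFromConditions([]condition{{Type: "Ready", Status: "True"}}))
	fmt.Println(phaseFromConditions([]condition{{Type: "Ready", Status: "False"}}))
	fmt.Println(phaseFromConditions([]condition{{Type: "Ready", Status: "Unknown"}}))
}
```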
Also bumps chart to 1.4.149 + bootstrap-kit pin so this fix + the
queued PRs #1590 (marketplace GET) + #1591 (publish toggle UI) all
land on the next fresh prov.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Founder caught on t140 bug #4: "I am supposed to mark which applications
are going to be available in the catalog … I am not able to see such
option from the application page".
Fix: PublishToggleChip rendered in the AppDetail hero meta row.
- Reads current state on mount from GET /api/catalog/apps/{slug}
- Click flips via PUT /api/catalog/admin/apps/{slug}/published
- Optimistic update; reverts + tooltip on backend error
- data-testid="app-detail-publish-toggle" for matrix coverage
Backend already shipped — SetAppPublished handler at the catalog
service /catalog/admin/apps/{slug}/published. Gateway routes
admin/* with auth-gating so only Sovereign Console operator can
flip. No backend change needed.
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Founder caught on t140 bug #5: /settings/marketplace shows "disabled"
while the marketplace is actually serving (prov body had
marketplaceEnabled=true). Root cause: MarketplaceSettings UI hardcoded
useState(false) on mount because no GET endpoint existed to read the
current value.
Fix:
- Backend: new GET /api/v1/sovereigns/{id}/marketplace returning
{deploymentId, sovereignFQDN, enabled, brand}. Reads from the
in-memory deployment record (Request.MarketplaceEnabled set at
prov time + mutated by HandleSetMarketplace's commit path).
- UI: MarketplaceSettings useEffect fetches on mount, sets the
toggle to the actual value, hydrates the brand fields. Best-effort
fetch — falls back to defaults on failure.
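A rough sketch of the GET endpoint's response shape using only net/http. The `deployment` type, the handler wiring, and the `map[string]string` type for brand are assumptions made for illustration; only the four JSON keys come from the description above.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// deployment is a hypothetical in-memory record; Brand's type is an assumption.
type deployment struct {
	ID                 string
	SovereignFQDN      string
	MarketplaceEnabled bool
	Brand              map[string]string
}

// marketplaceResponse carries the four keys named in the commit message.
type marketplaceResponse struct {
	DeploymentID  string            `json:"deploymentId"`
	SovereignFQDN string            `json:"sovereignFQDN"`
	Enabled       bool              `json:"enabled"`
	Brand         map[string]string `json:"brand"`
}

// marketplaceHandler serves the current marketplace state for one deployment.
func marketplaceHandler(dep deployment) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if r.Method != http.MethodGet {
			http.Error(w, "method not allowed", http.StatusMethodNotAllowed)
			return
		}
		w.Header().Set("Content-Type", "application/json")
		_ = json.NewEncoder(w).Encode(marketplaceResponse{
			DeploymentID:  dep.ID,
			SovereignFQDN: dep.SovereignFQDN,
			Enabled:       dep.MarketplaceEnabled,
			Brand:         dep.Brand,
		})
	}
}

// fetchEnabled calls the handler in-process and returns the decoded flag,
// roughly what the UI does on mount before setting the toggle.
func fetchEnabled(dep deployment) (bool, error) {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/api/v1/sovereigns/"+dep.ID+"/marketplace", nil)
	marketplaceHandler(dep)(rec, req)
	var resp marketplaceResponse
	err := json.NewDecoder(rec.Body).Decode(&resp)
	return resp.Enabled, err
}

func main() {
	on, err := fetchEnabled(deployment{ID: "dep-1", SovereignFQDN: "sov.example.test", MarketplaceEnabled: true})
	fmt.Println(on, err)
}
```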
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Founder t140 bug #6: /parent-domains shows only primary, not the
sme-pool domains. Chroot's deployment record has parentDomains[]
populated but ListParentDomains uses h.activeDeployment() which
filters to AdoptedAt!=nil. The mothership ships the record before
the chroot's own handover-finalisation, so AdoptedAt is nil →
activeDeployment returns nil → only synth primary row renders.
Fix: HandleDeploymentImport stamps AdoptedAt at import time. The
FQDN-match guard above verifies "this record IS my Sovereign's
record" so the chroot is by definition the operator/owner — no
separate adoption-wizard needed on chroot side.
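The interaction between the AdoptedAt filter and the import-time stamp can be sketched as below. The types and both functions are hypothetical simplifications of activeDeployment and HandleDeploymentImport, and the domain names are placeholders.

```go
package main

import (
	"fmt"
	"time"
)

// deployment is a hypothetical minimal record.
type deployment struct {
	SovereignFQDN string
	ParentDomains []string
	AdoptedAt     *time.Time
}

// activeDeployment mirrors the described filter: records without AdoptedAt
// are invisible to callers such as ListParentDomains.
func activeDeployment(deps []*deployment) *deployment {
	for _, d := range deps {
		if d.AdoptedAt != nil {
			return d
		}
	}
	return nil
}

// importDeployment rejects records for another Sovereign, then stamps
// AdoptedAt so the imported record is active immediately.
func importDeployment(store []*deployment, rec *deployment, ownFQDN string) ([]*deployment, error) {
	if rec.SovereignFQDN != ownFQDN {
		return store, fmt.Errorf("record FQDN %q does not match %q", rec.SovereignFQDN, ownFQDN)
	}
	if rec.AdoptedAt == nil {
		now := time.Now()
		rec.AdoptedAt = &now
	}
	return append(store, rec), nil
}

func main() {
	rec := &deployment{
		SovereignFQDN: "sov.example.test",
		ParentDomains: []string{"primary.example.test", "pool.example.test"},
	}
	store, err := importDeployment(nil, rec, "sov.example.test")
	fmt.Println(err, len(activeDeployment(store).ParentDomains))
}
```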
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go)
PR #1581 introduced an `itoa` helper that collided with the existing
`itoa` in handler/infrastructure.go:1952. Go vet failed:
internal/handler/infrastructure.go:1952:6: itoa redeclared in this block
internal/handler/deployment_handover_export.go:199:6: other declaration of itoa
Rename my helper to `regionSlotIndex` — more descriptive of its actual
use (deriving the per-region slot suffix for the kubeconfig filename).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(catalyst-api): D16/D17 — 3 bugs caught on t138
Founder caught on t136 (now wiped) that /dashboard cluster grouping
still showed 1 region and /cloud nodes showed 1 node despite earlier
D16 PRs shipping. Root cause: 3 bugs in the D16 chain that surfaced
on t138 fresh prov.
1. exportSecondaryKubeconfigsToChild was guarded behind the early
return of exportDeploymentToChild's failed POST. The child's
ingress + cert + gateway are still racing to become reachable
in the seconds after handover fires, so the first POST
gets EOF and the goroutine never fires. Fix: kick off the
D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild
in its own goroutine, BEFORE the deployment-record POST.
2. Both exports now retry with exponential backoff (5s → 60s) for
up to 5 min total. Most handovers will succeed on attempt 2-4.
Was: no retry, single shot, silent failure.
3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the
auth group (rg) into the top-level router (r), alongside
/api/v1/internal/deployments/import. The previous registration
required an operator session that doesn't exist at handover —
mothership POSTs were 401'd silently. Validation is now via
safeIDPattern regex on depID + regionKey (same security model
as the deployments/import companion endpoint).
4. HandleSovereignCloud now fans out across h.k8sCache.Clusters()
instead of using only the in-cluster client. Adds Cluster
field (omitempty) to sovereignNode/LB/SC/PVC so the UI can
group/filter by region. Without this, /cloud?view=list&kind=nodes
shows 1 node even when 3 secondary kubeconfigs are registered.
Together these fix:
- D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1)
- /cloud?view=list&kind=nodes (3+ nodes, not 1)
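The retry policy from item 2 can be sketched as below. This is an illustration under stated assumptions: the delay doubles from 5s, is capped at 60s, and the 5 min budget counts only time spent sleeping. `backoffSchedule`, `retry`, `noSleep`, and `errTransient` are hypothetical names, not the handler's real code.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

const (
	initialDelay = 5 * time.Second
	maxDelay     = 60 * time.Second
	totalBudget  = 5 * time.Minute
)

// errTransient stands in for the EOF seen while the child is still coming up.
var errTransient = errors.New("EOF")

// noSleep lets the sketch run instantly; real code would pass time.Sleep.
func noSleep(time.Duration) {}

// backoffSchedule lists the waits between attempts: doubling from
// initialDelay, capped at maxDelay, until totalBudget would be exceeded.
func backoffSchedule() []time.Duration {
	var out []time.Duration
	var total time.Duration
	for d := initialDelay; total+d <= totalBudget; {
		out = append(out, d)
		total += d
		d *= 2
		if d > maxDelay {
			d = maxDelay
		}
	}
	return out
}

// retry runs op once, then again after each scheduled wait until it succeeds
// or the schedule is exhausted. It returns the attempt count and last error.
func retry(op func() error, sleep func(time.Duration)) (int, error) {
	attempt := 1
	err := op()
	for _, d := range backoffSchedule() {
		if err == nil {
			break
		}
		sleep(d)
		attempt++
		err = op()
	}
	return attempt, err
}

func main() {
	calls := 0
	attempts, err := retry(func() error {
		calls++
		if calls < 3 {
			return errTransient // child not reachable yet
		}
		return nil
	}, noSleep)
	fmt.Println(attempts, err)
}
```

Passing the sleep function in keeps the policy testable without waiting out real delays.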
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(catalog): D27 — fresh-seed apps default Published+Deployable
Founder caught on t136: marketplace.t136/apps shows blank application
grid. Root cause: catalog seed.go calls migrateAppPublished +
migrateAppDeployable ONLY on the "already populated" path. On a fresh
Sovereign install (empty catalog) seedAllData inserts 27 rows with
zero-value bools — Published=false, Deployable=false. The marketplace
storefront filters with `?published=true`, gets [], renders blank.
Fix: after seedAllData also call migrateAppDeployable + migrateAppPublished
+ seedSystemApps. Both migrations are idempotent (skip rows already
true), so re-runs are safe.
Verified the bug live on t138 (eaaee1ea24184c2a):
http://catalog.sme:8082/catalog/apps returns 27 apps
http://catalog.sme:8082/catalog/apps?published=true returns 0
With this fix the latter returns 27.
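A sketch of why running the migrations after a fresh seed is safe, using an in-memory slice in place of the catalog database. The function names echo the ones above, but the bodies are hypothetical, and seedSystemApps is left out.

```go
package main

import "fmt"

// app is a hypothetical minimal catalog row.
type app struct {
	Slug       string
	Published  bool
	Deployable bool
}

// migrateAppPublished flips unpublished rows and reports how many changed.
// Rows that are already true are skipped, which makes re-runs no-ops.
func migrateAppPublished(apps []app) int {
	changed := 0
	for i := range apps {
		if !apps[i].Published {
			apps[i].Published = true
			changed++
		}
	}
	return changed
}

// migrateAppDeployable does the same for the Deployable flag.
func migrateAppDeployable(apps []app) int {
	changed := 0
	for i := range apps {
		if !apps[i].Deployable {
			apps[i].Deployable = true
			changed++
		}
	}
	return changed
}

// seedFresh inserts n zero-value rows, then runs both migrations,
// which is the ordering the fix introduces for an empty catalog.
func seedFresh(n int) []app {
	apps := make([]app, 0, n)
	for i := 0; i < n; i++ {
		apps = append(apps, app{Slug: fmt.Sprintf("app-%d", i)})
	}
	migrateAppDeployable(apps)
	migrateAppPublished(apps)
	return apps
}

// countPublished models the storefront's ?published=true filter.
func countPublished(apps []app) int {
	n := 0
	for _, a := range apps {
		if a.Published {
			n++
		}
	}
	return n
}

func main() {
	apps := seedFresh(27)
	fmt.Println(countPublished(apps))      // rows visible to the storefront
	fmt.Println(migrateAppPublished(apps)) // rows changed by a re-run
}
```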
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(ui): D17 — exclude mother-only /app/$deploymentId routes on Sovereign
Founder caught on t136: console.t136.../app/bp-alloy renders the
catalog grid (AppsPage) instead of AppDetail. Three earlier PRs
(#1572 + chart bumps) flipped the appRoute beforeLoad logic but
the actual route-matching collision was not fixed.
Root cause: appRoute.addChildren registers appDeploymentRoute at
`/$deploymentId` (effective `/app/$deploymentId`, mother-only)
BEFORE consoleLayoutRoute registers consoleAppDetailRoute at
`/app/$componentId`. TanStack Router resolves equally-specific
dynamic routes by declaration order — so on the Sovereign Console
URL `/app/bp-alloy` matches appDeploymentRoute first and renders
AppsPage with deploymentId="bp-alloy".
Fix: at routeTree build time, filter appRoute children to exclude
every mother-only `/$deploymentId/*` route when running on
Sovereign mode. DETECTED_MODE.mode is fixed per-page-load so this
is a one-time check, no runtime overhead. With those routes
absent, consoleAppDetailRoute is the only matcher for
`/app/<componentId>` on Sovereign Console — AppDetail renders.
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(bootstrap-kit): pin bp-catalyst-platform 1.4.147→1.4.148
The t136/t138/t139 verify cycle shipped 3 PRs with founder-flagged
bug fixes; they bumped the catalyst chart's Chart.yaml to 1.4.148
(d985f27c) with new images:
- catalystApi/Ui: 2ab8a0e (PR #1583 D16 fan-out + retry + auth-bypass,
PR #1585 D17 router collision)
- smeTag: 964dc15 (PR #1584 D27 catalog fresh-seed Published)
But bootstrap-kit/13-bp-catalyst-platform.yaml stayed pinned to
1.4.147 — every fresh provision installs the OLDER chart with the
OLDER images, so the founder-flagged bugs persist.
Caught on t139 (b4a7ee052d844da0) post-handover verify: chart
installed = bp-catalyst-platform@1.4.147, catalog returns 0
published apps, /app/bp-alloy renders catalog grid.
Bumping the pin makes fresh provs install 1.4.148 (which has all 3
PRs baked).
Refs: feedback_test_theater_3rd_violation_2026_05_17.md
feedback_overlap_provs_dont_serialize_wait.md
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(catalyst-api): D16 PR H — resolveChrootClusterID multi-cluster + dashboard alias
Founder caught on t140 (29b7e14918178f7e) after D16 fan-out chain shipped:
- /dashboard is empty (no treemap rendered)
- "none of the k8s resources are streaming"
Root cause: after the D16 secondary-kubeconfig export (PR #1579/#1581)
landed, chroot's k8sCache went from 1 cluster (primary self-register)
to 3 clusters (primary + 2 secondaries). Two cascading bugs:
1. resolveChrootClusterID had a `len(clusters) != 1` guard — it only
aliased when chroot had exactly one cluster. After D16 it returned
the URL deployment_id unchanged → has-cluster check failed →
every chroot handler (networking, k8s_search, k8s_resource_metrics,
k8s_exec, dashboard) saw "not found" → returned empty.
2. dashboard.go::GetDashboardTreemap was the one chroot handler that
didn't call resolveChrootClusterID before the has-cluster check —
so even with #1 fixed, the dashboard would still 404.
Fix:
- resolveChrootClusterID: when N>1, prefer the cluster whose id is
prefixed "sovereign-" (the FactoryFromEnv self-registered primary
per buildChrootClusterRef). Falls back to clusters[0] if no match.
- GetDashboardTreemap: call resolveChrootClusterID before has-cluster
check, matching the pattern in every other chroot handler.
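The selection rule in the fix can be sketched as below, with cluster ids reduced to plain strings. The signature is an assumption made for illustration, and the example ids are invented; the real function works against h.k8sCache.

```go
package main

import (
	"fmt"
	"strings"
)

// resolveChrootClusterID is a simplified sketch: with no registered clusters
// the URL id is returned unchanged; otherwise the "sovereign-" prefixed
// primary is preferred, falling back to the first registered cluster.
func resolveChrootClusterID(urlID string, clusters []string) string {
	if len(clusters) == 0 {
		return urlID
	}
	for _, id := range clusters {
		if strings.HasPrefix(id, "sovereign-") {
			return id
		}
	}
	return clusters[0]
}

func main() {
	// Hypothetical ids: one self-registered primary plus two secondaries.
	clusters := []string{"dep-1-region-b", "sovereign-sov-example-test", "dep-1-region-c"}
	fmt.Println(resolveChrootClusterID("dep-1", clusters))
}
```

A caller would run this before its has-cluster check, which is the ordering GetDashboardTreemap was missing.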
Refs: feedback_test_theater_3rd_violation_2026_05_17.md (don't ship
D16 fan-out without verifying every handler that depends on
single-cluster k8sCache assumption).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(cloudinit): escape $${ORG_EMAIL:-}/$${ORG_NAME:-} in comment (D22)
PR #1571 added a comment mentioning the ${ORG_EMAIL:-}/${ORG_NAME:-}
slot-file placeholders WITHOUT the $$ escape. tofu's templatefile()
parses comments too and tried to interpolate ${ORG_EMAIL:-} as a tofu
expression, failing with "Extra characters after interpolation
expression; Template interpolation doesn't expect a colon".
Caught live on t133 fad01d84f5655004: tofu plan failed in 30s.
The escape pattern is documented at main.tf:1029 (the same warning
that caught t127 last week). The $$ prefix tells tofu's templatefile to
emit a literal ${...} to cloud-init for Flux envsubst.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30)
When an sme-pool domain's current NS records already match the expected
[ns1.<primary>, ns2.<primary>] pair (because the operator already
delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip
step is a no-op. Skipping avoids:
1. Burning a Dynadot API credit on a flip that would be idempotent.
2. The D30 blocker — current Dynadot creds return pdm-status-401
even when the desired NS state already exists. Caught on t132
2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body
parentDomains attempt.
Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with
a 5s timeout. False on lookup error or partial match → fall through to
the original PDM pipeline so a misconfigured/partial domain still goes
through the registrar API.
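A minimal sketch of the short-circuit check described above. The names nsMatches, sameNSSet and normalizeHost are hypothetical, and treating "matches" as exact set equality (ignoring order, case and the trailing root dot) is an assumption; the real helper may compare differently. Only net.DefaultResolver.LookupNS and the 5s timeout come from the message.

```go
package main

import (
	"context"
	"net"
	"sort"
	"strings"
	"time"
)

// normalizeHost lower-cases a host name and drops the trailing root dot
// that resolvers usually return ("ns1.example.com." -> "ns1.example.com").
func normalizeHost(h string) string {
	return strings.ToLower(strings.TrimSuffix(strings.TrimSpace(h), "."))
}

// sameNSSet reports whether got and want hold exactly the same hosts.
// An empty or partial result counts as "no match" (assumed semantics).
func sameNSSet(got, want []string) bool {
	if len(got) == 0 || len(got) != len(want) {
		return false
	}
	g := make([]string, len(got))
	w := make([]string, len(want))
	for i := range got {
		g[i] = normalizeHost(got[i])
		w[i] = normalizeHost(want[i])
	}
	sort.Strings(g)
	sort.Strings(w)
	for i := range g {
		if g[i] != w[i] {
			return false
		}
	}
	return true
}

// nsMatches looks up the domain's NS records with a 5s timeout. Any lookup
// error yields false so the caller falls through to the registrar pipeline.
func nsMatches(ctx context.Context, domain string, want []string) bool {
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()
	records, err := net.DefaultResolver.LookupNS(ctx, domain)
	if err != nil {
		return false
	}
	got := make([]string, 0, len(records))
	for _, r := range records {
		got = append(got, r.Host)
	}
	return sameNSSet(got, want)
}
```

Returning false on every error keeps the skip conservative: the flip is only bypassed when the lookup positively confirms the expected delegation.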
This unblocks sme-pool entries for omani.homes (already pointing at
ns1/2/3.openova.io). omani.rest / omani.trades still go through the
full flip path because their NS records don't yet match expected.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(handover): D21 owner seed uses catalyst-system namespace
PR #1564 created the owner UserAccess CR with .Namespace("") — the
apiserver returned "could not find the requested resource" because
useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per
the XRD's claimNames block at platform/crossplane-claims/chart/
templates/xrds/useraccess.yaml).
Pin to catalyst-system (where catalyst-api + every Catalyst-authored
CR lives) and stamp the namespace on the object too. The existing
ListUserAccess handler uses Namespace("") so the entry surfaces on
/users without per-namespace filtering.
Verified the CRD shape on t134 2026-05-17:
    $ kubectl api-resources --api-group=access.openova.io
    useraccesses   access.openova.io/v1alpha1   true   UserAccess
                                                ^^^^
                                                NAMESPACED
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(handover): D21 owner seed uses tierRoleRef not wildcard app
PR #1564 + #1577 created the CR shape with applications=[{app:"*",...}]
but the useraccess XRD schema rejects `app: "*"` (pattern
^[a-z0-9][a-z0-9-]{0,62}$). The seed handler logged
"spec.applications[0].app: Invalid value: \"*\"" on every handover.
The XRD has a `tierRoleRef` field (pattern
^openova:tier-(viewer|developer|operator|admin|owner)$) that's the
canonical owner-tier semantic — when set, useraccess-controller binds
the named ClusterRole on the target via RoleBinding/ClusterRoleBinding.
`openova:tier-owner` is shipped by EPIC-3 (#1098) slice T1's
tier-clusterroles.yaml.
Drop the applications[] block + use tierRoleRef = openova:tier-owner.
Verified live on t135 2026-05-17 — error log showed exact pattern
mismatch before this fix.
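To illustrate why the wildcard was rejected, the two patterns quoted above can be checked directly. This is only a sketch: the regular expressions are copied from the message, while validApp and validTierRoleRef are hypothetical helper names (the actual validation happens in the XRD schema on the apiserver, not in Go).

```go
package main

import "regexp"

var (
	// appPattern is the spec.applications[].app pattern quoted above.
	appPattern = regexp.MustCompile(`^[a-z0-9][a-z0-9-]{0,62}$`)
	// tierRoleRefPattern is the tierRoleRef pattern quoted above.
	tierRoleRefPattern = regexp.MustCompile(`^openova:tier-(viewer|developer|operator|admin|owner)$`)
)

// validApp reports whether s would pass the app pattern.
func validApp(s string) bool { return appPattern.MatchString(s) }

// validTierRoleRef reports whether s names one of the five tier roles.
func validTierRoleRef(s string) bool { return tierRoleRefPattern.MatchString(s) }
```

validApp("*") is false because "*" is outside the [a-z0-9] class, whereas validTierRoleRef("openova:tier-owner") is true.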
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(chroot): POST /api/v1/sovereign/secondary-kubeconfig (D16 PR A)
D16 multi-cluster fan-out requires the chroot's k8sCache.Factory to
have all 3 regions' kubeconfigs registered so dashboard handler's
per-cluster h.k8sCache.List(clusterID, ...) enumerates pods from each.
Today the chroot only auto-registers its own in-cluster apiserver via
FactoryFromEnv's chroot self-registration branch. Secondary
kubeconfigs live on the mothership PVC + aren't replicated.
This handler bridges the gap:
- Accepts JSON {deploymentId, regionKey, kubeconfigYaml}
- Validates ids via ^[a-z0-9][a-z0-9-]{0,62}$ pattern (defense in
depth — filename composed from these)
- Writes kubeconfig 0o600 to /var/lib/catalyst/kubeconfigs/<depID>-<region>.yaml
(canonical FactoryFromEnv path so restart re-registers)
- Calls k8sCache.AddCluster — idempotent per Factory contract
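The validate-then-compose step from the list above could look roughly like this. It is a sketch under assumptions: kubeconfigPath is a hypothetical helper name, and only the pattern, the directory and the <depID>-<region>.yaml naming are taken from the message. The write with mode 0o600 and the AddCluster call are left out.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
)

// safeID is the ID pattern quoted in the list above.
var safeID = regexp.MustCompile(`^[a-z0-9][a-z0-9-]{0,62}$`)

// kubeconfigDir is the directory named in the list above.
const kubeconfigDir = "/var/lib/catalyst/kubeconfigs"

// kubeconfigPath validates both IDs before composing the file name, so
// inputs such as "../etc" can never reach the filesystem layer.
func kubeconfigPath(depID, regionKey string) (string, error) {
	if !safeID.MatchString(depID) {
		return "", fmt.Errorf("invalid deploymentId %q", depID)
	}
	if !safeID.MatchString(regionKey) {
		return "", fmt.Errorf("invalid regionKey %q", regionKey)
	}
	return path.Join(kubeconfigDir, depID+"-"+regionKey+".yaml"), nil
}
```

Because the pattern admits no "/", "." or uppercase characters, the composed path always stays directly inside the kubeconfigs directory.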
PR B (next): mothership-side handover hook iterates secondary regions
and POSTs each kubeconfig to the chroot.
PR C (next): dashboard.go fan-out across all registered cluster IDs
when group_by includes cluster/region.
Per docs/INVIOLABLE-PRINCIPLES.md #10 kubeconfig bytes never enter a
logged struct + are written 0o600.
Memo: feedback_d16_dashboard_multi_cluster_fan_out.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(dashboard): multi-cluster fan-out when group_by=cluster|region (D16 PR C)
When group_by includes "cluster" or "region", enumerate ALL registered
k8sCache clusters (primary + secondaries synced via PR #1579's POST
/api/v1/sovereign/secondary-kubeconfig endpoint) and concatenate
podRows from each before aggregation.
Layer-1=Cluster on /dashboard now renders 3 bubbles on a 3-region
Sovereign (was 1 bubble before).
For group_by that ONLY contains {namespace,family,application,vcluster,
sovereign} the primary clusterID's pods are sufficient and faster — no
fan-out cost.
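The fan-out decision described above reduces to a small predicate. A hedged sketch, with needsFanOut and clustersToQuery as hypothetical names and cluster IDs modeled as plain strings (an assumption about the k8sCache types):

```go
package main

// needsFanOut reports whether any group_by dimension needs pods from every
// registered cluster rather than only the primary one.
func needsFanOut(groupBy []string) bool {
	for _, dim := range groupBy {
		if dim == "cluster" || dim == "region" {
			return true
		}
	}
	return false
}

// clustersToQuery returns every registered cluster when fan-out is needed,
// and only the primary otherwise, which avoids the fan-out cost.
func clustersToQuery(groupBy []string, primary string, registered []string) []string {
	if needsFanOut(groupBy) {
		return registered
	}
	return []string{primary}
}
```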
PR B (mothership-side handover hook to POST each secondary kubeconfig)
will complete the chain. Until then, secondaries don't appear in
k8sCache.Clusters() so this fan-out is a no-op on existing provs — but
the code is in place for when PR B lands.
Memo: feedback_d16_dashboard_multi_cluster_fan_out.md.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(handover): export secondary kubeconfigs to chroot at handover (D16 PR B)
Closes the D16 multi-cluster fan-out chain:
- PR #1579 (PR A): chroot endpoint accepts kubeconfigs
- PR #1580 (PR C): dashboard handler fans out across registered clusters
- This PR (PR B): mothership-side hook iterates secondary regions at
handover, reads each region's kubeconfig from the mothership PVC,
and POSTs to the chroot's endpoint
After handover-fire, exportSecondaryKubeconfigsToChild fires as a
goroutine (alongside exportDeploymentToChild). Best-effort per region:
a failure on region N doesn't abort N+1.
The chroot's k8sCache.Factory.AddCluster runs on every POST so
dashboard /api/v1/dashboard/treemap?group_by=cluster|region now
enumerates pods from all N regions and Layer-1=Cluster renders N
bubbles on an N-region Sovereign.
regionKeysForExport derives the filename convention `<region>-<slot>`
from dep.Request.Regions[1:] (primary is auto-registered by the
chroot's FactoryFromEnv self-registration so we skip index 0).
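A rough sketch of the key derivation described above. The function name secondaryRegionKeys is hypothetical, regions are modeled as plain strings, and taking <slot> to be the region's index in the full Regions list is an assumption; the message only states the `<region>-<slot>` convention and that index 0 is skipped.

```go
package main

import "strconv"

// secondaryRegionKeys builds one "<region>-<slot>" key per secondary region.
// Index 0 (the primary) is skipped because the chroot registers it itself.
func secondaryRegionKeys(regions []string) []string {
	if len(regions) < 2 {
		return nil
	}
	keys := make([]string, 0, len(regions)-1)
	for i, region := range regions[1:] {
		slot := i + 1 // assumed: position in the full regions list
		keys = append(keys, region+"-"+strconv.Itoa(slot))
	}
	return keys
}
```

Under these assumptions, an illustrative list ["hel1", "fsn1", "nbg1"] yields ["fsn1-1", "nbg1-2"], and a single-region deployment yields no keys at all.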
Per docs/INVIOLABLE-PRINCIPLES.md #10 kubeconfig bytes never enter a
logged struct + are read with stdlib os.ReadFile.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22)
Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment
reads to populate the deployment record so Sovereign Console Settings
page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL,
orgName (instead of `—` placeholders).
Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName,
controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with
empty defaults. Per-Sovereign overlays wire actual values from cloud-
init substitute placeholders (mirrors regionsJson pattern).
Catalyst-api Pod now reads them via valueFrom configMapKeyRef +
optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap
so env stays empty there — correct, mothership is signer not validator).
Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP
post-#1568. This PR fills the remaining 3 D22 fields when operator wires
the values.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(slot-13): add D22 sovereign-side identity placeholders
Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} +
${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires
them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569)
→ catalyst-api env → chrootEnsureDeployment populates the deployment
record → Settings page renders real values instead of `—`.
This PR alone is a no-op (placeholders default to empty, same as today).
The cloud-init substitute lines + provisioner.go tfvars need to land in
a companion PR to actually populate the values on next-prov.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22)
Companion to #1567+#1568+#1569+#1570 — the cloud-init substitute block
now emits ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL into the bootstrap-kit
Kustomization's postBuild.substitute env, which the slot-13 placeholders
(#1570) consume via ${ORG_EMAIL:-}/${ORG_NAME:-}/${GITOPS_REPO_URL:-}.
Chain: provisioner.go writeTfvars → tofu vars → cloudinit templatefile
substitute → Flux Kustomization postBuild → sovereign-fqdn ConfigMap
keys (#1569) → catalyst-api env (#1569) → chrootEnsureDeployment
populates the deployment record (#1567 + #1568 fallback).
SOVEREIGN_CONTROL_PLANE_IP omitted intentionally — main.tf:691 notes
the dependency cycle (hcloud_server.cp doesn't exist at cloudinit
render time). Separate PR will source it via metadata-service or
post-create ConfigMap patch.
Next-prov (t133+) Sovereign Console Settings page now renders real
ownerEmail/orgName/gitopsRepoURL instead of `—` placeholders.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(router): chroot /app/<name> only-redirect mothership-only sub-paths (D17/D17b)
PR #1552 stripped the `/app` prefix on Sovereign mode to make
`/app/bp-cnpg` → `/bp-cnpg`, hoping consoleAppDetailRoute would match.
But consoleAppDetailRoute is registered at `/app/$componentId` under
consoleLayoutRoute — no chroot route matches `/<componentId>` directly,
so stripping leaves an empty render path. Playwright walkthrough on
t132 2026-05-17 confirmed: /app/bp-cnpg + /app/bp-coraza both render
body_len=9 (empty).
Invert the logic: only redirect mothership-only sub-paths (/dashboard
Fleet view, /install wizard, /sre, /sec, /blueprints) which have no
Sovereign Console equivalent. For everything else (component names like
`/app/bp-cnpg`, bare `/app`), let TanStack's natural most-specific-match
pick consoleAppDetailRoute / consoleAppsRoute.
Caught live on t132 via Playwright walker3.js — agent a4825c5a.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(handover): re-mint handover JWT on every GetDeployment (D0)
D0 Playwright walkthrough on t132 2026-05-17 caught: handoverURL
persisted at handover-fire time carries a JWT that expires per
DefaultTTL (5min). Operators who click /jobs hours later get the stale
token → Sovereign-side /auth/handover rejects with raw JSON
{"error":"invalid token"} — no UI fallback, no /auth/handover-error,
auto-redirect to /dashboard never fires.
Re-mint the JWT on every GetDeployment when deployment is ready +
handover-fired so the URL returned to the wizard is always
freshly-signed.
Best-effort: on mint failure, leave the existing URL in place so a
transient signer error doesn't break polling. Helper is idempotent +
locked.
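A minimal sketch of the best-effort re-mint, assuming a record with the
three fields below and a signer that throws on failure. `DeploymentView`,
`MintHandoverURL` and `withFreshHandoverURL` are hypothetical names, and
the locking mentioned above is left out.

```typescript
// Hypothetical shapes: the real deployment record and signer are not shown here.
interface DeploymentView {
  ready: boolean;
  handoverFired: boolean;
  handoverURL: string;
}

// Stand-in for the signer: returns a freshly signed URL or throws.
type MintHandoverURL = (dep: DeploymentView) => string;

function withFreshHandoverURL(dep: DeploymentView, mint: MintHandoverURL): DeploymentView {
  if (!dep.ready || !dep.handoverFired) {
    return dep; // nothing to re-mint before handover has fired
  }
  try {
    return { ...dep, handoverURL: mint(dep) };
  } catch {
    return dep; // best-effort: keep the existing URL on signer failure
  }
}
```

Returning the unchanged record on failure is what keeps a transient signer
error from breaking the polling response.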
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix(cloud): hide non-active 0/0 chips (D15)
Playwright walkthrough on t132 2026-05-17 caught D15 PARTIAL: 15 chips
are correct but Bucket+Volume show 0/0. Founder rule (DoD D15):
"No kind chip shows 0/0 for a resource that actually exists in the
cluster". Bucket+Volume genuinely don't exist on this Sovereign so
showing 0/0 is noise.
Hide chips with count exactly 0 unless they're the active selection
(operator who navigated to an empty kind keeps context).
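The visibility rule amounts to a single filter. The sketch below assumes
each chip carries one `count`; `KindChip` and `visibleChips` are
hypothetical names, not the component's actual API.

```typescript
// Hypothetical chip shape: one entry per resource kind.
interface KindChip {
  kind: string;
  count: number;
}

// Keep chips that have resources, plus the active kind even when it is empty.
function visibleChips(chips: KindChip[], activeKind: string | null): KindChip[] {
  return chips.filter((chip) => chip.count !== 0 || chip.kind === activeKind);
}
```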
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22)
Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment
reads to populate the deployment record so Sovereign Console Settings
page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL,
orgName (instead of `—` placeholders).
Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName,
controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with
empty defaults. Per-Sovereign overlays wire actual values from cloud-
init substitute placeholders (mirrors regionsJson pattern).
Catalyst-api Pod now reads them via valueFrom configMapKeyRef +
optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap
so env stays empty there — correct, mothership is signer not validator).
Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP
post-#1568. This PR fills the remaining 3 D22 fields when operator wires
the values.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* chore(slot-13): add D22 sovereign-side identity placeholders
Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} +
${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires
them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569)
→ catalyst-api env → chrootEnsureDeployment populates the deployment
record → Settings page renders real values instead of `—`.
This PR alone is a no-op (placeholders default to empty, same as today).
The cloud-init substitute lines + provisioner.go tfvars need to land in
a companion PR to actually populate the values on next-prov.
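The `${VAR:-}` form follows shell default-value syntax: it expands to the
variable when it is set and non-empty, and otherwise to the text after
`:-` (empty here). The sketch below shows only that rule, not Flux's
substitution code; `substitute` and `PLACEHOLDER` are hypothetical names.

```typescript
// Matches ${NAME} and ${NAME:-default}; other shell expansion forms are ignored.
const PLACEHOLDER = /\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}/g;

function substitute(template: string, vars: Record<string, string | undefined>): string {
  return template.replace(PLACEHOLDER, (_match: string, name: string, fallback?: string) => {
    const value = vars[name];
    // Unset and empty are treated alike, as with the shell's ":-" operator.
    return value !== undefined && value !== "" ? value : (fallback ?? "");
  });
}
```

With no variables wired, `substitute('orgEmail: "${ORG_EMAIL:-}"', {})`
yields `orgEmail: ""`, which is why this change is a no-op on its own.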
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22)
Companion to #1567+#1568+#1569+#1570 — the cloud-init substitute block
now emits ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL into the bootstrap-kit
Kustomization's postBuild.substitute env, which the slot-13 placeholders
(#1570) consume via ${ORG_EMAIL:-}/${ORG_NAME:-}/${GITOPS_REPO_URL:-}.
Chain: provisioner.go writeTfvars → tofu vars → cloudinit templatefile
substitute → Flux Kustomization postBuild → sovereign-fqdn ConfigMap
keys (#1569) → catalyst-api env (#1569) → chrootEnsureDeployment
populates the deployment record (#1567 + #1568 fallback).
SOVEREIGN_CONTROL_PLANE_IP omitted intentionally — main.tf:691 notes
the dependency cycle (hcloud_server.cp doesn't exist at cloudinit
render time). Separate PR will source it via metadata-service or
post-create ConfigMap patch.
Next-prov (t133+) Sovereign Console Settings page now renders real
ownerEmail/orgName/gitopsRepoURL instead of `—` placeholders.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: hatiyildiz <hatice.yildiz@openova.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>