openova

Author	SHA1	Message	Date
hatiyildiz	4b97a50310	fix(store): PR P — preserve MarketplaceEnabled through Redact + ToProvisionerRequest Founder caught on t144: /settings/marketplace toggle showed disabled even though the prov body had marketplaceEnabled=true. Root cause: store.RedactedRequest struct (the on-disk projection) lacked a MarketplaceEnabled field. Every Save/Load cycle stripped the bit: - Mothership Save(rec) → MarketplaceEnabled dropped - Mothership exportDeploymentToChild → chroot receives record without bit - Chroot HandleGetMarketplace → reads dep.Request.MarketplaceEnabled → zero value (false) → UI toggle defaults to disabled PR J #1590's GET endpoint was correctly wired but the data was already gone before it ran. Fix: add MarketplaceEnabled field to RedactedRequest + carry it through Redact() + ToProvisionerRequest(). Backward-compat via `omitempty` — records persisted before this PR deserialize with false, same as the prior behavior. Bumps chart 1.4.151 -> 1.4.152 + bootstrap-kit pin so next prov exercises the full chain. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 12:51:43 +02:00
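A minimal sketch of the round trip PR P describes, assuming a simplified shape for both structs. Only `RedactedRequest`, `Redact`, `ToProvisionerRequest`, the `MarketplaceEnabled` field and the `omitempty` tag come from the commit message; `ProvisionerRequest`, `APIToken`, `roundTrip`, `loadLegacy` and the JSON key names are hypothetical stand-ins for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ProvisionerRequest is a hypothetical stand-in for the full provisioning body.
type ProvisionerRequest struct {
	SovereignFQDN      string
	APIToken           string // secret: must never reach disk
	MarketplaceEnabled bool
}

// RedactedRequest is the on-disk projection. Any field missing here is
// silently dropped on every Save/Load cycle, which is the bug PR P fixes.
type RedactedRequest struct {
	SovereignFQDN      string `json:"sovereignFQDN"`
	MarketplaceEnabled bool   `json:"marketplaceEnabled,omitempty"`
}

// Redact copies only the non-secret fields into the projection.
func Redact(r ProvisionerRequest) RedactedRequest {
	return RedactedRequest{
		SovereignFQDN:      r.SovereignFQDN,
		MarketplaceEnabled: r.MarketplaceEnabled,
	}
}

// ToProvisionerRequest rebuilds a request from the projection; secrets stay empty.
func (r RedactedRequest) ToProvisionerRequest() ProvisionerRequest {
	return ProvisionerRequest{
		SovereignFQDN:      r.SovereignFQDN,
		MarketplaceEnabled: r.MarketplaceEnabled,
	}
}

// roundTrip simulates Save followed by Load.
func roundTrip(r ProvisionerRequest) (ProvisionerRequest, error) {
	raw, err := json.Marshal(Redact(r))
	if err != nil {
		return ProvisionerRequest{}, err
	}
	var loaded RedactedRequest
	if err := json.Unmarshal(raw, &loaded); err != nil {
		return ProvisionerRequest{}, err
	}
	return loaded.ToProvisionerRequest(), nil
}

// loadLegacy decodes a record written before the field existed.
func loadLegacy(raw string) (RedactedRequest, error) {
	var r RedactedRequest
	err := json.Unmarshal([]byte(raw), &r)
	return r, err
}

func main() {
	got, err := roundTrip(ProvisionerRequest{
		SovereignFQDN: "t144.example", APIToken: "s3cret", MarketplaceEnabled: true,
	})
	fmt.Println(got.MarketplaceEnabled, got.APIToken == "", err)

	old, err := loadLegacy(`{"sovereignFQDN":"t144.example"}`)
	fmt.Println(old.MarketplaceEnabled, err)
}
```

A record persisted without the key decodes to the zero value `false`, which matches the backward-compatibility claim in the commit message.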
github-actions[bot]	efd5d60130	deploy: update catalyst images to `0242be5`	2026-05-17 09:21:12 +00:00
e3mrah	0242be5c49	fix(infra): PR O — cilium-gateway TLS references per-zone wildcard cert (#1595 ) t143 hit LE PROD rate limit (50 certs/week on omani.works exhausted) because TWO cert templates compete for the same parent-domain quota: 1. clusters/_template/sovereign-tls/cilium-gateway-cert.yaml — legacy SAN cert named `sovereign-wildcard-tls` 2. products/catalyst/chart/templates/sovereign-wildcard-certs.yaml — chart per-zone cert named `sovereign-wildcard-tls-<sanitised-zone>` The Cilium Gateway listener hardcoded the legacy name, so when LE 429s the legacy cert (as happened on t143), HTTPS to console.<fqdn> breaks even though the per-zone cert is Ready. Fix: gateway listener now references `sovereign-wildcard-tls-${SOVEREIGN_FQDN_DASHED}`. Cloud-init substitutes SOVEREIGN_FQDN_DASHED = replace(fqdn, ".", "-") in the sovereign-tls Kustomization postBuild.substitute. The per-zone cert from the chart provides the Ready Secret with this exact name. The legacy cilium-gateway-cert.yaml SAN cert still renders for backward-compat (some consumers may still reference it), but the gateway listener no longer depends on it for TLS termination. Bumps no chart version — the change is at the Flux/Kustomize layer. Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 13:19:10 +04:00
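The naming rule in PR O can be illustrated with a short sketch. The commit only states that `SOVEREIGN_FQDN_DASHED = replace(fqdn, ".", "-")` and that the listener references `sovereign-wildcard-tls-${SOVEREIGN_FQDN_DASHED}`; the helper name `wildcardSecretName` and the example FQDN below are assumptions, and the real substitution happens in cloud-init and Flux `postBuild.substitute`, not in Go.

```go
package main

import (
	"fmt"
	"strings"
)

// wildcardSecretName mirrors the substitution described in the commit:
// dots in the FQDN become dashes and the result is appended to the
// per-zone certificate prefix. Hypothetical helper, for illustration only.
func wildcardSecretName(fqdn string) string {
	dashed := strings.ReplaceAll(fqdn, ".", "-")
	return "sovereign-wildcard-tls-" + dashed
}

func main() {
	// Example FQDN is made up; only the parent domain appears in the log.
	fmt.Println(wildcardSecretName("t143.omani.works"))
}
```

The gateway listener only finds a Ready Secret if the chart's per-zone sanitisation yields the same string, so the two rules have to stay in step.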
github-actions[bot]	be0874f5e2	deploy: update catalyst images to `b27bdee`	2026-05-17 09:04:11 +00:00
e3mrah	b27bdeee05	fix(handover): PR N — fallback to per-FQDN cert when wildcard 429s (#1594 ) t143 caught the LE PROD rate limit (429: too many certificates (50) already issued for omani.works in last 168h0m0s, retry after 2026-05-17 10:28:32 UTC). The chart renders TWO cert names: - sovereign-wildcard-tls (canonical, hit 429) - sovereign-wildcard-tls-<fqdn> (per-FQDN, was already issued before rate limit, Ready=True) waitForWildcardCert only checked the canonical name. With the limit hit, handover waited the full 10-min budget before firing degraded. Fix: when the canonical cert is unavailable, list namespace certs matching `sovereign-wildcard-tls-` prefix and return Ready=True if ANY sibling is Ready. The operator's console.<fqdn> TLS handshake will succeed against either secret since both wildcard .<fqdn>. Bumps chart 1.4.150 -> 1.4.151 + bootstrap-kit pin so the fix lands on next fresh prov. Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 13:02:17 +04:00
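The fallback in PR N reduces to a two-step readiness check. The sketch below assumes a flattened `Cert` type and a helper named `wildcardReady`; both are hypothetical, and the actual `waitForWildcardCert` presumably lists cert-manager Certificate objects in the namespace instead of taking a slice.

```go
package main

import (
	"fmt"
	"strings"
)

// Cert is a hypothetical, flattened view of a certificate's status.
type Cert struct {
	Name  string
	Ready bool
}

const canonicalName = "sovereign-wildcard-tls"

// wildcardReady reports true when the canonical cert is Ready, or, failing
// that, when ANY sibling whose name starts with "sovereign-wildcard-tls-"
// is Ready.
func wildcardReady(certs []Cert) bool {
	for _, c := range certs {
		if c.Name == canonicalName && c.Ready {
			return true
		}
	}
	for _, c := range certs {
		if strings.HasPrefix(c.Name, canonicalName+"-") && c.Ready {
			return true
		}
	}
	return false
}

func main() {
	certs := []Cert{
		{Name: "sovereign-wildcard-tls", Ready: false},             // rate limited (429)
		{Name: "sovereign-wildcard-tls-t143-example", Ready: true}, // issued earlier
	}
	fmt.Println(wildcardReady(certs))
}
```

With this check, handover no longer has to wait out the full 10-minute budget when only the canonical certificate is blocked by the rate limit.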
github-actions[bot]	13c9684cc1	deploy: update catalyst images to `32c46b8`	2026-05-17 08:39:46 +00:00
e3mrah	32c46b80e1	feat(ui): PR M — dashboard default Layer-1=cluster + Marketplace Admin link + chart 1.4.150 (#1593 ) Founder follow-up to t142 cycle: 1. "the dashboard is still not showing the clusters properly" — the D16 fan-out CODE works (3 clusters in k8sCache, dashboard handler fans out) but the OPERATOR-FACING default Layer-1 was 'family' not 'cluster'. Operator opens /dashboard, sees family-grouped bubbles, thinks the multi-cluster fix is broken. Fix: when SovereignFQDN is present (Sovereign Console mode), default to ['cluster', 'application'] so the 3-cluster grouping is the first thing the operator sees. 2. "I have no idea where the admin components for billing, order, revenue etc related BSS are" — exists at marketplace.<sov>/back-office/ but the Sovereign Console sidebar had no link. Fix: add "Marketplace Admin" nav link (external, opens in new tab) — uses resolvedFQDN to construct the URL. data-testid=sov-console-nav-marketplace-admin for matrix. Also bumps chart 1.4.149 → 1.4.150 + bootstrap-kit pin so the changes land on next fresh prov. Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 12:37:53 +04:00
github-actions[bot]	68fe94b331	deploy: update catalyst images to `86f5331`	2026-05-17 08:02:06 +00:00
e3mrah	86f5331962	fix(catalyst-api): PR L — AppDetail HelmRelease fallback + chart 1.4.149 (#1592 ) Founder t140 bug #2: "in the catalog and jobs it shows as installed, in the application page it shows as provisioning, there is a sync issue". Root cause: AppDetail reads Application CR via GET /sovereigns/{id}/ applications/{name}. For bootstrap-kit installs (cilium, cert-manager, gateway-api, alloy, etc.) NO Application CR exists — they ship as HelmReleases directly with no wizard step to create the CR. The handler returned 404 → UI showed "App not found" or perpetual "Provisioning", while /apps (which reads HelmRelease) shows "installed". Fix: HandleApplicationGet, on Application CR not-found, falls back to a HelmRelease lookup in h.k8sCache (uses resolveChrootClusterID so it works post-D16 multi-cluster fan-out). Synthesises an applicationDetailResponse from HR fields: - Name/Namespace from HR - Blueprint from spec.chart.spec.chart - Version from spec.chart.spec.version (or status.lastAttemptedRevision) - Phase: Ready (HR Ready=True) / Failed (False) / Provisioning (Unknown) - Conditions: pass-through HR conditions Also bumps chart to 1.4.149 + bootstrap-kit pin so this fix + the queued PRs #1590 (marketplace GET) + #1591 (publish toggle UI) all land on the next fresh prov. Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 11:59:59 +04:00
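The HelmRelease-to-AppDetail mapping listed in PR L can be sketched as two small functions. The `Condition` type and the helper names are assumptions for illustration; the phase rules (Ready when `Ready=True`, Failed when `False`, Provisioning when `Unknown`) and the version fallback to `status.lastAttemptedRevision` come from the commit message. Treating a missing Ready condition as Provisioning is an additional assumption.

```go
package main

import "fmt"

// Condition is a hypothetical, trimmed-down status condition.
type Condition struct {
	Type   string
	Status string // "True", "False" or "Unknown"
}

// phaseFromConditions derives the AppDetail phase from the Ready condition.
func phaseFromConditions(conds []Condition) string {
	for _, c := range conds {
		if c.Type != "Ready" {
			continue
		}
		switch c.Status {
		case "True":
			return "Ready"
		case "False":
			return "Failed"
		default:
			return "Provisioning"
		}
	}
	// Assumption: no Ready condition yet is treated like Unknown.
	return "Provisioning"
}

// versionOf prefers spec.chart.spec.version and falls back to
// status.lastAttemptedRevision when the spec value is empty.
func versionOf(specVersion, lastAttemptedRevision string) string {
	if specVersion != "" {
		return specVersion
	}
	return lastAttemptedRevision
}

func main() {
	conds := []Condition{{Type: "Reconciling", Status: "False"}, {Type: "Ready", Status: "True"}}
	fmt.Println(phaseFromConditions(conds), versionOf("", "1.4.149"))
}
```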
github-actions[bot]	b0c0f91604	deploy: update catalyst images to `df150fd`	2026-05-17 07:57:50 +00:00
e3mrah	df150fdbd8	feat(ui): PR K — per-app catalog publish/unpublish toggle on AppDetail header (#1591) Founder caught on t140 bug #4: "I am supposed to mark which applications are going to be available in the catalog … I am not able to see such option from the application page". Fix: PublishToggleChip rendered in the AppDetail hero meta row. - Reads current state on mount from GET /api/catalog/apps/{slug} - Click flips via PUT /api/catalog/admin/apps/{slug}/published - Optimistic update; reverts + tooltip on backend error - data-testid="app-detail-publish-toggle" for matrix coverage Backend already shipped — SetAppPublished handler at the catalog service /catalog/admin/apps/{slug}/published. Gateway routes admin/* with auth-gating so only Sovereign Console operator can flip. No backend change needed. Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 11:55:45 +04:00
github-actions[bot]	e1f619aa77	deploy: update catalyst images to `114705c`	2026-05-17 07:51:10 +00:00
e3mrah	114705c63c	fix(marketplace): PR J — GET endpoint + UI reflects actual enabled state (#1590) Founder caught on t140 bug #5: /settings/marketplace shows "disabled" while the marketplace is actually serving (prov body had marketplaceEnabled=true). Root cause: MarketplaceSettings UI hardcoded useState(false) on mount because no GET endpoint existed to read the current value. Fix: - Backend: new GET /api/v1/sovereigns/{id}/marketplace returning {deploymentId, sovereignFQDN, enabled, brand}. Reads from the in-memory deployment record (Request.MarketplaceEnabled set at prov time + mutated by HandleSetMarketplace's commit path). - UI: MarketplaceSettings useEffect fetches on mount, sets the toggle to the actual value, hydrates the brand fields. Best-effort fetch — falls back to defaults on failure. Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 11:49:03 +04:00
github-actions[bot]	a63f3c13ab	deploy: update catalyst images to `f1ebf14`	2026-05-17 07:06:33 +00:00
e3mrah	f1ebf14cf8	fix(catalyst-api): D30 PR I — mark imported deployment as Adopted on chroot (#1589) Founder t140 bug #6: /parent-domains shows only primary, not the sme-pool domains. Chroot's deployment record has parentDomains[] populated but ListParentDomains uses h.activeDeployment() which filters to AdoptedAt!=nil. The mothership ships the record before the chroot's own handover-finalisation, so AdoptedAt is nil → activeDeployment returns nil → only synth primary row renders. Fix: HandleDeploymentImport stamps AdoptedAt at import time. The FQDN-match guard above verifies "this record IS my Sovereign's record" so the chroot is by definition the operator/owner — no separate adoption-wizard needed on chroot side. Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 11:04:38 +04:00
github-actions[bot]	473a2ba4b9	deploy: update catalyst images to `52be4d4`	2026-05-17 07:02:25 +00:00
e3mrah	52be4d4d3a	fix(catalyst-api): D16 PR H — resolveChrootClusterID multi-cluster + dashboard alias (#1587 ) * fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go) PR #1581 introduced an `itoa` helper that collided with the existing `itoa` in handler/infrastructure.go:1952. Go vet failed: internal/handler/infrastructure.go:1952:6: itoa redeclared in this block internal/handler/deployment_handover_export.go:199:6: other declaration of itoa Rename my helper to `regionSlotIndex` — more descriptive of its actual use (deriving the per-region slot suffix for the kubeconfig filename). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(catalyst-api): D16/D17 — 3 bugs caught on t138 Founder caught on t136 (now wiped) that /dashboard cluster grouping still showed 1 region and /cloud nodes showed 1 node despite earlier D16 PRs shipping. Root cause: 3 bugs in the D16 chain that surfaced on t138 fresh prov. 1. exportSecondaryKubeconfigsToChild was guarded behind the early return of exportDeploymentToChild's failed POST. The child's ingress + cert + gateway are still racing to reach reachable state in the seconds after handover fires, so the first POST gets EOF and the goroutine never fires. Fix: kick off the D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild in its own goroutine, BEFORE the deployment-record POST. 2. Both exports now retry with exponential backoff (5s → 60s) for up to 5 min total. Most handovers will succeed on attempt 2-4. Was: no retry, single shot, silent failure. 3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the auth group (rg) into the top-level router (r), alongside /api/v1/internal/deployments/import. The previous registration required an operator session that doesn't exist at handover — mothership POSTs were 401'd silently. Validation is now via safeIDPattern regex on depID + regionKey (same security model as the deployments/import companion endpoint). 4. 
HandleSovereignCloud now fans out across h.k8sCache.Clusters() instead of using only the in-cluster client. Adds Cluster field (omitempty) to sovereignNode/LB/SC/PVC so the UI can group/filter by region. Without this, /cloud?view=list&kind=nodes shows 1 node even when 3 secondary kubeconfigs are registered. Together these fix: - D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1) - /cloud?view=list&kind=nodes (3+ nodes, not 1) Refs: feedback_test_theater_3rd_violation_2026_05_17.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(catalog): D27 — fresh-seed apps default Published+Deployable Founder caught on t136: marketplace.t136/apps shows blank application grid. Root cause: catalog seed.go calls migrateAppPublished + migrateAppDeployable ONLY on the "already populated" path. On a fresh Sovereign install (empty catalog) seedAllData inserts 27 rows with zero-value bools — Published=false, Deployable=false. The marketplace storefront filters with `?published=true`, gets [], renders blank. Fix: after seedAllData also call migrateAppDeployable + migrateAppPublished + seedSystemApps. Both migrations are idempotent (skip rows already true), so re-runs are safe. Verified the bug live on t138 (eaaee1ea24184c2a): http://catalog.sme:8082/catalog/apps returns 27 apps http://catalog.sme:8082/catalog/apps?published=true returns 0 With this fix the latter returns 27. Refs: feedback_test_theater_3rd_violation_2026_05_17.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ui): D17 — exclude mother-only /app/$deploymentId routes on Sovereign Founder caught on t136: console.t136.../app/bp-alloy renders the catalog grid (AppsPage) instead of AppDetail. Three earlier PRs (#1572 + chart bumps) flipped the appRoute beforeLoad logic but the actual route-matching collision was not fixed. 
Root cause: appRoute.addChildren registers appDeploymentRoute at `/$deploymentId` (effective `/app/$deploymentId`, mother-only) BEFORE consoleLayoutRoute registers consoleAppDetailRoute at `/app/$componentId`. TanStack Router resolves equally-specific dynamic routes by declaration order — so on the Sovereign Console URL `/app/bp-alloy` matches appDeploymentRoute first and renders AppsPage with deploymentId="bp-alloy". Fix: at routeTree build time, filter appRoute children to exclude every mother-only `/$deploymentId/` route when running on Sovereign mode. DETECTED_MODE.mode is fixed per-page-load so this is a one-time check, no runtime overhead. With those routes absent, consoleAppDetailRoute is the only matcher for `/app/<componentId>` on Sovereign Console — AppDetail renders. Refs: feedback_test_theater_3rd_violation_2026_05_17.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> chore(bootstrap-kit): pin bp-catalyst-platform 1.4.147→1.4.148 Founder-flagged bug fixes from session t136/t138/t139 verify cycle shipped 3 PRs that bumped catalyst chart Chart.yaml to 1.4.148 (`d985f27c`) with new images: - catalystApi/Ui: `2ab8a0e` (PR #1583 D16 fan-out + retry + auth-bypass, PR #1585 D17 router collision) - smeTag: `964dc15` (PR #1584 D27 catalog fresh-seed Published) But bootstrap-kit/13-bp-catalyst-platform.yaml stayed pinned to 1.4.147 — every fresh provision installs the OLDER chart with the OLDER images, so the founder-flagged bugs persist. Caught on t139 (b4a7ee052d844da0) post-handover verify: chart installed = bp-catalyst-platform@1.4.147, catalog returns 0 published apps, /app/bp-alloy renders catalog grid. Bumping the pin makes fresh provs install 1.4.148 (which has all 3 PRs baked). 
Refs: feedback_test_theater_3rd_violation_2026_05_17.md feedback_overlap_provs_dont_serialize_wait.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(catalyst-api): D16 PR H — resolveChrootClusterID multi-cluster + dashboard alias Founder caught on t140 (29b7e14918178f7e) after D16 fan-out chain shipped: - /dashboard is empty (no treemap rendered) - "none of the k8s resources are streaming" Root cause: after the D16 secondary-kubeconfig export (PR #1579/#1581) landed, chroot's k8sCache went from 1 cluster (primary self-register) to 3 clusters (primary + 2 secondaries). Two cascading bugs: 1. resolveChrootClusterID had a `len(clusters) != 1` guard — it only aliased when chroot had exactly one cluster. After D16 it returned the URL deployment_id unchanged → has-cluster check failed → every chroot handler (networking, k8s_search, k8s_resource_metrics, k8s_exec, dashboard) saw "not found" → returned empty. 2. dashboard.go::GetDashboardTreemap was the one chroot handler that didn't call resolveChrootClusterID before the has-cluster check — so even with #1 fixed, the dashboard would still 404. Fix: - resolveChrootClusterID: when N>1, prefer the cluster whose id is prefixed "sovereign-" (the FactoryFromEnv self-registered primary per buildChrootClusterRef). Falls back to clusters[0] if no match. - GetDashboardTreemap: call resolveChrootClusterID before has-cluster check, matching the pattern in every other chroot handler. Refs: feedback_test_theater_3rd_violation_2026_05_17.md (don't ship D16 fan-out without verifying every handler that depends on single-cluster k8sCache assumption). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 10:59:43 +04:00
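The cluster selection rule from D16 PR H can be sketched as below. This covers only the choice among registered cluster IDs; the helper name `pickChrootCluster`, the slice-of-strings signature and the empty-cache behaviour are assumptions, and the real `resolveChrootClusterID` operates on `h.k8sCache`.

```go
package main

import (
	"fmt"
	"strings"
)

// pickChrootCluster chooses which registered cluster a chroot handler
// should alias the URL deployment ID to. Hypothetical helper.
func pickChrootCluster(clusterIDs []string, urlID string) string {
	if len(clusterIDs) == 0 {
		// Assumption: with nothing registered, leave the ID unchanged.
		return urlID
	}
	// Prefer the self-registered primary, identified by its "sovereign-" prefix.
	for _, id := range clusterIDs {
		if strings.HasPrefix(id, "sovereign-") {
			return id
		}
	}
	// Fall back to the first registered cluster.
	return clusterIDs[0]
}

func main() {
	// IDs below are made up for illustration.
	ids := []string{"region-b", "sovereign-t140", "region-c"}
	fmt.Println(pickChrootCluster(ids, "29b7e14918178f7e"))
}
```

The earlier `len(clusters) != 1` guard corresponds to returning `urlID` whenever more than one cluster was registered, which is why every chroot handler came back empty once the two secondaries appeared.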
e3mrah	8c1ccfae07	chore(bootstrap-kit): pin bp-catalyst-platform 1.4.147→1.4.148 (#1586 ) * fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go) PR #1581 introduced an `itoa` helper that collided with the existing `itoa` in handler/infrastructure.go:1952. Go vet failed: internal/handler/infrastructure.go:1952:6: itoa redeclared in this block internal/handler/deployment_handover_export.go:199:6: other declaration of itoa Rename my helper to `regionSlotIndex` — more descriptive of its actual use (deriving the per-region slot suffix for the kubeconfig filename). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(catalyst-api): D16/D17 — 3 bugs caught on t138 Founder caught on t136 (now wiped) that /dashboard cluster grouping still showed 1 region and /cloud nodes showed 1 node despite earlier D16 PRs shipping. Root cause: 3 bugs in the D16 chain that surfaced on t138 fresh prov. 1. exportSecondaryKubeconfigsToChild was guarded behind the early return of exportDeploymentToChild's failed POST. The child's ingress + cert + gateway are still racing to reach reachable state in the seconds after handover fires, so the first POST gets EOF and the goroutine never fires. Fix: kick off the D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild in its own goroutine, BEFORE the deployment-record POST. 2. Both exports now retry with exponential backoff (5s → 60s) for up to 5 min total. Most handovers will succeed on attempt 2-4. Was: no retry, single shot, silent failure. 3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the auth group (rg) into the top-level router (r), alongside /api/v1/internal/deployments/import. The previous registration required an operator session that doesn't exist at handover — mothership POSTs were 401'd silently. Validation is now via safeIDPattern regex on depID + regionKey (same security model as the deployments/import companion endpoint). 4. 
HandleSovereignCloud now fans out across h.k8sCache.Clusters() instead of using only the in-cluster client. Adds Cluster field (omitempty) to sovereignNode/LB/SC/PVC so the UI can group/filter by region. Without this, /cloud?view=list&kind=nodes shows 1 node even when 3 secondary kubeconfigs are registered. Together these fix: - D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1) - /cloud?view=list&kind=nodes (3+ nodes, not 1) Refs: feedback_test_theater_3rd_violation_2026_05_17.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(catalog): D27 — fresh-seed apps default Published+Deployable Founder caught on t136: marketplace.t136/apps shows blank application grid. Root cause: catalog seed.go calls migrateAppPublished + migrateAppDeployable ONLY on the "already populated" path. On a fresh Sovereign install (empty catalog) seedAllData inserts 27 rows with zero-value bools — Published=false, Deployable=false. The marketplace storefront filters with `?published=true`, gets [], renders blank. Fix: after seedAllData also call migrateAppDeployable + migrateAppPublished + seedSystemApps. Both migrations are idempotent (skip rows already true), so re-runs are safe. Verified the bug live on t138 (eaaee1ea24184c2a): http://catalog.sme:8082/catalog/apps returns 27 apps http://catalog.sme:8082/catalog/apps?published=true returns 0 With this fix the latter returns 27. Refs: feedback_test_theater_3rd_violation_2026_05_17.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ui): D17 — exclude mother-only /app/$deploymentId routes on Sovereign Founder caught on t136: console.t136.../app/bp-alloy renders the catalog grid (AppsPage) instead of AppDetail. Three earlier PRs (#1572 + chart bumps) flipped the appRoute beforeLoad logic but the actual route-matching collision was not fixed. 
Root cause: appRoute.addChildren registers appDeploymentRoute at `/$deploymentId` (effective `/app/$deploymentId`, mother-only) BEFORE consoleLayoutRoute registers consoleAppDetailRoute at `/app/$componentId`. TanStack Router resolves equally-specific dynamic routes by declaration order — so on the Sovereign Console URL `/app/bp-alloy` matches appDeploymentRoute first and renders AppsPage with deploymentId="bp-alloy". Fix: at routeTree build time, filter appRoute children to exclude every mother-only `/$deploymentId/` route when running on Sovereign mode. DETECTED_MODE.mode is fixed per-page-load so this is a one-time check, no runtime overhead. With those routes absent, consoleAppDetailRoute is the only matcher for `/app/<componentId>` on Sovereign Console — AppDetail renders. Refs: feedback_test_theater_3rd_violation_2026_05_17.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> chore(bootstrap-kit): pin bp-catalyst-platform 1.4.147→1.4.148 Founder-flagged bug fixes from session t136/t138/t139 verify cycle shipped 3 PRs that bumped catalyst chart Chart.yaml to 1.4.148 (`d985f27c`) with new images: - catalystApi/Ui: `2ab8a0e` (PR #1583 D16 fan-out + retry + auth-bypass, PR #1585 D17 router collision) - smeTag: `964dc15` (PR #1584 D27 catalog fresh-seed Published) But bootstrap-kit/13-bp-catalyst-platform.yaml stayed pinned to 1.4.147 — every fresh provision installs the OLDER chart with the OLDER images, so the founder-flagged bugs persist. Caught on t139 (b4a7ee052d844da0) post-handover verify: chart installed = bp-catalyst-platform@1.4.147, catalog returns 0 published apps, /app/bp-alloy renders catalog grid. Bumping the pin makes fresh provs install 1.4.148 (which has all 3 PRs baked). 
Refs: feedback_test_theater_3rd_violation_2026_05_17.md feedback_overlap_provs_dont_serialize_wait.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 10:15:16 +04:00
github-actions[bot]	b61e9afabf	deploy: update catalyst images to `2ab8a0e`	2026-05-17 05:37:01 +00:00
e3mrah	2ab8a0e653	fix(ui): D17 — exclude mother-only /app/$deploymentId routes on Sovereign (#1585 ) * fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go) PR #1581 introduced an `itoa` helper that collided with the existing `itoa` in handler/infrastructure.go:1952. Go vet failed: internal/handler/infrastructure.go:1952:6: itoa redeclared in this block internal/handler/deployment_handover_export.go:199:6: other declaration of itoa Rename my helper to `regionSlotIndex` — more descriptive of its actual use (deriving the per-region slot suffix for the kubeconfig filename). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(catalyst-api): D16/D17 — 3 bugs caught on t138 Founder caught on t136 (now wiped) that /dashboard cluster grouping still showed 1 region and /cloud nodes showed 1 node despite earlier D16 PRs shipping. Root cause: 3 bugs in the D16 chain that surfaced on t138 fresh prov. 1. exportSecondaryKubeconfigsToChild was guarded behind the early return of exportDeploymentToChild's failed POST. The child's ingress + cert + gateway are still racing to reach reachable state in the seconds after handover fires, so the first POST gets EOF and the goroutine never fires. Fix: kick off the D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild in its own goroutine, BEFORE the deployment-record POST. 2. Both exports now retry with exponential backoff (5s → 60s) for up to 5 min total. Most handovers will succeed on attempt 2-4. Was: no retry, single shot, silent failure. 3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the auth group (rg) into the top-level router (r), alongside /api/v1/internal/deployments/import. The previous registration required an operator session that doesn't exist at handover — mothership POSTs were 401'd silently. Validation is now via safeIDPattern regex on depID + regionKey (same security model as the deployments/import companion endpoint). 4. 
HandleSovereignCloud now fans out across h.k8sCache.Clusters() instead of using only the in-cluster client. Adds Cluster field (omitempty) to sovereignNode/LB/SC/PVC so the UI can group/filter by region. Without this, /cloud?view=list&kind=nodes shows 1 node even when 3 secondary kubeconfigs are registered. Together these fix: - D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1) - /cloud?view=list&kind=nodes (3+ nodes, not 1) Refs: feedback_test_theater_3rd_violation_2026_05_17.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(catalog): D27 — fresh-seed apps default Published+Deployable Founder caught on t136: marketplace.t136/apps shows blank application grid. Root cause: catalog seed.go calls migrateAppPublished + migrateAppDeployable ONLY on the "already populated" path. On a fresh Sovereign install (empty catalog) seedAllData inserts 27 rows with zero-value bools — Published=false, Deployable=false. The marketplace storefront filters with `?published=true`, gets [], renders blank. Fix: after seedAllData also call migrateAppDeployable + migrateAppPublished + seedSystemApps. Both migrations are idempotent (skip rows already true), so re-runs are safe. Verified the bug live on t138 (eaaee1ea24184c2a): http://catalog.sme:8082/catalog/apps returns 27 apps http://catalog.sme:8082/catalog/apps?published=true returns 0 With this fix the latter returns 27. Refs: feedback_test_theater_3rd_violation_2026_05_17.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(ui): D17 — exclude mother-only /app/$deploymentId routes on Sovereign Founder caught on t136: console.t136.../app/bp-alloy renders the catalog grid (AppsPage) instead of AppDetail. Three earlier PRs (#1572 + chart bumps) flipped the appRoute beforeLoad logic but the actual route-matching collision was not fixed. 
Root cause: appRoute.addChildren registers appDeploymentRoute at `/$deploymentId` (effective `/app/$deploymentId`, mother-only) BEFORE consoleLayoutRoute registers consoleAppDetailRoute at `/app/$componentId`. TanStack Router resolves equally-specific dynamic routes by declaration order — so on the Sovereign Console URL `/app/bp-alloy` matches appDeploymentRoute first and renders AppsPage with deploymentId="bp-alloy". Fix: at routeTree build time, filter appRoute children to exclude every mother-only `/$deploymentId/*` route when running on Sovereign mode. DETECTED_MODE.mode is fixed per-page-load so this is a one-time check, no runtime overhead. With those routes absent, consoleAppDetailRoute is the only matcher for `/app/<componentId>` on Sovereign Console — AppDetail renders. Refs: feedback_test_theater_3rd_violation_2026_05_17.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 09:34:01 +04:00
github-actions[bot]	d985f27c8b	deploy: update sme service images to `964dc15` + bump chart to 1.4.148	2026-05-17 05:29:35 +00:00
e3mrah	964dc15570	fix(catalog): D27 — fresh-seed apps default Published+Deployable (#1584 ) * fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go) PR #1581 introduced an `itoa` helper that collided with the existing `itoa` in handler/infrastructure.go:1952. Go vet failed: internal/handler/infrastructure.go:1952:6: itoa redeclared in this block internal/handler/deployment_handover_export.go:199:6: other declaration of itoa Rename my helper to `regionSlotIndex` — more descriptive of its actual use (deriving the per-region slot suffix for the kubeconfig filename). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(catalyst-api): D16/D17 — 3 bugs caught on t138 Founder caught on t136 (now wiped) that /dashboard cluster grouping still showed 1 region and /cloud nodes showed 1 node despite earlier D16 PRs shipping. Root cause: 3 bugs in the D16 chain that surfaced on t138 fresh prov. 1. exportSecondaryKubeconfigsToChild was guarded behind the early return of exportDeploymentToChild's failed POST. The child's ingress + cert + gateway are still racing to reach reachable state in the seconds after handover fires, so the first POST gets EOF and the goroutine never fires. Fix: kick off the D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild in its own goroutine, BEFORE the deployment-record POST. 2. Both exports now retry with exponential backoff (5s → 60s) for up to 5 min total. Most handovers will succeed on attempt 2-4. Was: no retry, single shot, silent failure. 3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the auth group (rg) into the top-level router (r), alongside /api/v1/internal/deployments/import. The previous registration required an operator session that doesn't exist at handover — mothership POSTs were 401'd silently. Validation is now via safeIDPattern regex on depID + regionKey (same security model as the deployments/import companion endpoint). 4. 
HandleSovereignCloud now fans out across h.k8sCache.Clusters() instead of using only the in-cluster client. Adds Cluster field (omitempty) to sovereignNode/LB/SC/PVC so the UI can group/filter by region. Without this, /cloud?view=list&kind=nodes shows 1 node even when 3 secondary kubeconfigs are registered. Together these fix: - D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1) - /cloud?view=list&kind=nodes (3+ nodes, not 1) Refs: feedback_test_theater_3rd_violation_2026_05_17.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(catalog): D27 — fresh-seed apps default Published+Deployable Founder caught on t136: marketplace.t136/apps shows blank application grid. Root cause: catalog seed.go calls migrateAppPublished + migrateAppDeployable ONLY on the "already populated" path. On a fresh Sovereign install (empty catalog) seedAllData inserts 27 rows with zero-value bools — Published=false, Deployable=false. The marketplace storefront filters with `?published=true`, gets [], renders blank. Fix: after seedAllData also call migrateAppDeployable + migrateAppPublished + seedSystemApps. Both migrations are idempotent (skip rows already true), so re-runs are safe. Verified the bug live on t138 (eaaee1ea24184c2a): http://catalog.sme:8082/catalog/apps returns 27 apps http://catalog.sme:8082/catalog/apps?published=true returns 0 With this fix the latter returns 27. Refs: feedback_test_theater_3rd_violation_2026_05_17.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 09:28:35 +04:00
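The D27 fix relies on the migrations being idempotent, so they can run after a fresh seed as well as on an already populated catalog. A minimal in-memory sketch of that property follows; the function names mirror the commit message, but their bodies, the `App` type, the three-row seed and `publishedOnly` are hypothetical (the real seed inserts 27 database rows).

```go
package main

import "fmt"

// App is a hypothetical catalog row; bools default to false when seeded.
type App struct {
	Slug       string
	Published  bool
	Deployable bool
}

// seedAllData stands in for the fresh-install seed.
func seedAllData() []App {
	return []App{{Slug: "app-a"}, {Slug: "app-b"}, {Slug: "app-c"}}
}

// migrateAppPublished flips unpublished rows and skips rows already true,
// returning how many rows it changed.
func migrateAppPublished(apps []App) int {
	changed := 0
	for i := range apps {
		if !apps[i].Published {
			apps[i].Published = true
			changed++
		}
	}
	return changed
}

// migrateAppDeployable applies the same rule to the Deployable flag.
func migrateAppDeployable(apps []App) int {
	changed := 0
	for i := range apps {
		if !apps[i].Deployable {
			apps[i].Deployable = true
			changed++
		}
	}
	return changed
}

// publishedOnly mimics the storefront's ?published=true filter.
func publishedOnly(apps []App) []App {
	var out []App
	for _, a := range apps {
		if a.Published {
			out = append(out, a)
		}
	}
	return out
}

func main() {
	apps := seedAllData()
	fmt.Println("before:", len(publishedOnly(apps)))
	fmt.Println("changed:", migrateAppPublished(apps), migrateAppDeployable(apps))
	fmt.Println("after:", len(publishedOnly(apps)))
	fmt.Println("re-run changed:", migrateAppPublished(apps))
}
```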
github-actions[bot]	f7ea19000e	deploy: update catalyst images to `9fc2850`	2026-05-17 05:28:28 +00:00
e3mrah	9fc2850504	fix(catalyst-api): D16/D17 — 3 bugs caught on t138 fresh prov (#1583 ) * fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go) PR #1581 introduced an `itoa` helper that collided with the existing `itoa` in handler/infrastructure.go:1952. Go vet failed: internal/handler/infrastructure.go:1952:6: itoa redeclared in this block internal/handler/deployment_handover_export.go:199:6: other declaration of itoa Rename my helper to `regionSlotIndex` — more descriptive of its actual use (deriving the per-region slot suffix for the kubeconfig filename). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(catalyst-api): D16/D17 — 3 bugs caught on t138 Founder caught on t136 (now wiped) that /dashboard cluster grouping still showed 1 region and /cloud nodes showed 1 node despite earlier D16 PRs shipping. Root cause: 3 bugs in the D16 chain that surfaced on t138 fresh prov. 1. exportSecondaryKubeconfigsToChild was guarded behind the early return of exportDeploymentToChild's failed POST. The child's ingress + cert + gateway are still racing to reach reachable state in the seconds after handover fires, so the first POST gets EOF and the goroutine never fires. Fix: kick off the D16 fan-out IMMEDIATELY at the top of exportDeploymentToChild in its own goroutine, BEFORE the deployment-record POST. 2. Both exports now retry with exponential backoff (5s → 60s) for up to 5 min total. Most handovers will succeed on attempt 2-4. Was: no retry, single shot, silent failure. 3. /api/v1/sovereign/secondary-kubeconfig route moved OUT of the auth group (rg) into the top-level router (r), alongside /api/v1/internal/deployments/import. The previous registration required an operator session that doesn't exist at handover — mothership POSTs were 401'd silently. Validation is now via safeIDPattern regex on depID + regionKey (same security model as the deployments/import companion endpoint). 4. 
HandleSovereignCloud now fans out across h.k8sCache.Clusters() instead of using only the in-cluster client. Adds Cluster field (omitempty) to sovereignNode/LB/SC/PVC so the UI can group/filter by region. Without this, /cloud?view=list&kind=nodes shows 1 node even when 3 secondary kubeconfigs are registered. Together these fix: - D16 /dashboard Layer-1=Cluster grouping (3 bubbles, not 1) - /cloud?view=list&kind=nodes (3+ nodes, not 1) Refs: feedback_test_theater_3rd_violation_2026_05_17.md Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 09:26:16 +04:00
github-actions[bot]	ccbe51e3e4	deploy: update catalyst images to `9237c1e`	2026-05-17 04:48:41 +00:00
e3mrah	9237c1e6ee	fix(handover): rename itoa→regionSlotIndex (collision with infrastructure.go) (#1582 ) PR #1581 introduced an `itoa` helper that collided with the existing `itoa` in handler/infrastructure.go:1952. Go vet failed: internal/handler/infrastructure.go:1952:6: itoa redeclared in this block internal/handler/deployment_handover_export.go:199:6: other declaration of itoa Rename my helper to `regionSlotIndex` — more descriptive of its actual use (deriving the per-region slot suffix for the kubeconfig filename). Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 08:45:49 +04:00
e3mrah	ce4ef6ba98	feat(handover): export secondary kubeconfigs to chroot at handover (D16 PR B) (#1581 ) * fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22) PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-} slot-file placeholders WITHOUT the $$ escape. tofu's templatefile() parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu expression — failing with "Extra characters after interpolation expression; Template interpolation doesn't expect a colon". Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s. The escape pattern is documented at main.tf:1029 (the same warning that caught t127 last week). $$ prefix tells tofu's templatefile to emit literal \${...} to cloud-init for Flux envsubst. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30) When an sme-pool domain's current NS records already match the expected [ns1.<primary>, ns2.<primary>] pair (because the operator already delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip step is a no-op. Skipping avoids: 1. Burning a Dynadot API credit on a flip that would be idempotent. 2. The D30 blocker — current Dynadot creds return pdm-status-401 even when the desired NS state already exists. Caught on t132 2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body parentDomains attempt. Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with a 5s timeout. False on lookup error or partial match → fall through to the original PDM pipeline so a misconfigured/partial domain still goes through the registrar API. This unblocks sme-pool entries for omani.homes (already pointing at ns1/2/3.openova.io). omani.rest / omani.trades still go through the full flip path because their NS records don't yet match expected. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(handover): D21 owner seed uses catalyst-system namespace PR #1564 created the owner UserAccess CR with .Namespace("") — the apiserver returned "could not find the requested resource" because useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per the XRD's claimNames block at platform/crossplane-claims/chart/ templates/xrds/useraccess.yaml). Pin to catalyst-system (where catalyst-api + every Catalyst-authored CR lives) and stamp the namespace on the object too. The existing ListUserAccess handler uses Namespace("") so the entry surfaces on /users without per-namespace filtering. Verified the CRD shape on t134 2026-05-17: $ kubectl api-resources --api-group=access.openova.io useraccesses access.openova.io/v1alpha1 true UserAccess ^^^^ NAMESPACED Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(handover): D21 owner seed uses tierRoleRef not wildcard app PR #1564 + #1577 created the CR shape with applications=[{app:"*",...}] but the useraccess XRD schema rejects `app: "*"` (pattern ^[a-z0-9][a-z0-9-]{0,62}$). The seed handler logged "spec.applications[0].app: Invalid value: \"*\"" on every handover. The XRD has a `tierRoleRef` field (pattern ^openova:tier-(viewer\|developer\|operator\|admin\|owner)$) that's the canonical owner-tier semantic — when set, useraccess-controller binds the named ClusterRole on the target via RoleBinding/ClusterRoleBinding. `openova:tier-owner` is shipped by EPIC-3 (#1098) slice T1's tier-clusterroles.yaml. Drop the applications[] block + use tierRoleRef = openova:tier-owner. Verified live on t135 2026-05-17 — error log showed exact pattern mismatch before this fix. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(chroot): POST /api/v1/sovereign/secondary-kubeconfig (D16 PR A) D16 multi-cluster fan-out requires the chroot's k8sCache.Factory to have all 3 regions' kubeconfigs registered so dashboard handler's per-cluster h.k8sCache.List(clusterID, ...) enumerates pods from each. Today the chroot only auto-registers its own in-cluster apiserver via FactoryFromEnv's chroot self-registration branch. Secondary kubeconfigs live on the mothership PVC + aren't replicated. This handler bridges the gap: - Accepts JSON {deploymentId, regionKey, kubeconfigYaml} - Validates ids via ^[a-z0-9][a-z0-9-]{0,62}$ pattern (defense in depth — filename composed from these) - Writes kubeconfig 0o600 to /var/lib/catalyst/kubeconfigs/<depID>-<region>.yaml (canonical FactoryFromEnv path so restart re-registers) - Calls k8sCache.AddCluster — idempotent per Factory contract PR B (next): mothership-side handover hook iterates secondary regions and POSTs each kubeconfig to the chroot. PR C (next): dashboard.go fan-out across all registered cluster IDs when group_by includes cluster/region. Per docs/INVIOLABLE-PRINCIPLES.md #10 kubeconfig bytes never enter a logged struct + are written 0o600. Memo: feedback_d16_dashboard_multi_cluster_fan_out.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(dashboard): multi-cluster fan-out when group_by=cluster\|region (D16 PR C) When group_by includes "cluster" or "region", enumerate ALL registered k8sCache clusters (primary + secondaries synced via PR #1579's POST /api/v1/sovereign/secondary-kubeconfig endpoint) and concatenate podRows from each before aggregation. Layer-1=Cluster on /dashboard now renders 3 bubbles on a 3-region Sovereign (was 1 bubble before). For group_by that ONLY contains {namespace,family,application,vcluster, sovereign} the primary clusterID's pods are sufficient and faster — no fan-out cost. 
PR B (mothership-side handover hook to POST each secondary kubeconfig) will complete the chain. Until then, secondaries don't appear in k8sCache.Clusters() so this fan-out is a no-op on existing provs — but the code is in place for when PR B lands. Memo: feedback_d16_dashboard_multi_cluster_fan_out.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(handover): export secondary kubeconfigs to chroot at handover (D16 PR B) Closes the D16 multi-cluster fan-out chain: - PR #1579 (PR A): chroot endpoint accepts kubeconfigs - PR #1580 (PR C): dashboard handler fans out across registered clusters - This PR (PR B): mothership-side hook iterates secondary regions at handover, reads each region's kubeconfig from the mothership PVC, and POSTs to the chroot's endpoint After handover-fire, exportSecondaryKubeconfigsToChild fires as a goroutine (alongside exportDeploymentToChild). Best-effort per region: a failure on region N doesn't abort N+1. The chroot's k8sCache.Factory.AddCluster runs on every POST so dashboard /api/v1/dashboard/treemap?group_by=cluster\|region now enumerates pods from all N regions and Layer-1=Cluster renders N bubbles on an N-region Sovereign. regionKeysForExport derives the filename convention `<region>-<slot>` from dep.Request.Regions[1:] (primary is auto-registered by the chroot's FactoryFromEnv self-registration so we skip index 0). Per docs/INVIOLABLE-PRINCIPLES.md #10 kubeconfig bytes never enter a logged struct + are read with stdlib os.ReadFile. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 08:22:01 +04:00
github-actions[bot]	b07e5206a1	deploy: update catalyst images to `d92f734`	2026-05-17 04:09:34 +00:00
e3mrah	d92f734374	feat(dashboard): multi-cluster fan-out when group_by=cluster\|region (D16 PR C) (#1580 ) * fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22) PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-} slot-file placeholders WITHOUT the $$ escape. tofu's templatefile() parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu expression — failing with "Extra characters after interpolation expression; Template interpolation doesn't expect a colon". Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s. The escape pattern is documented at main.tf:1029 (the same warning that caught t127 last week). $$ prefix tells tofu's templatefile to emit literal \${...} to cloud-init for Flux envsubst. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30) When an sme-pool domain's current NS records already match the expected [ns1.<primary>, ns2.<primary>] pair (because the operator already delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip step is a no-op. Skipping avoids: 1. Burning a Dynadot API credit on a flip that would be idempotent. 2. The D30 blocker — current Dynadot creds return pdm-status-401 even when the desired NS state already exists. Caught on t132 2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body parentDomains attempt. Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with a 5s timeout. False on lookup error or partial match → fall through to the original PDM pipeline so a misconfigured/partial domain still goes through the registrar API. This unblocks sme-pool entries for omani.homes (already pointing at ns1/2/3.openova.io). omani.rest / omani.trades still go through the full flip path because their NS records don't yet match expected. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(handover): D21 owner seed uses catalyst-system namespace PR #1564 created the owner UserAccess CR with .Namespace("") — the apiserver returned "could not find the requested resource" because useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per the XRD's claimNames block at platform/crossplane-claims/chart/ templates/xrds/useraccess.yaml). Pin to catalyst-system (where catalyst-api + every Catalyst-authored CR lives) and stamp the namespace on the object too. The existing ListUserAccess handler uses Namespace("") so the entry surfaces on /users without per-namespace filtering. Verified the CRD shape on t134 2026-05-17: $ kubectl api-resources --api-group=access.openova.io useraccesses access.openova.io/v1alpha1 true UserAccess ^^^^ NAMESPACED Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(handover): D21 owner seed uses tierRoleRef not wildcard app PR #1564 + #1577 created the CR shape with applications=[{app:"*",...}] but the useraccess XRD schema rejects `app: "*"` (pattern ^[a-z0-9][a-z0-9-]{0,62}$). The seed handler logged "spec.applications[0].app: Invalid value: \"*\"" on every handover. The XRD has a `tierRoleRef` field (pattern ^openova:tier-(viewer\|developer\|operator\|admin\|owner)$) that's the canonical owner-tier semantic — when set, useraccess-controller binds the named ClusterRole on the target via RoleBinding/ClusterRoleBinding. `openova:tier-owner` is shipped by EPIC-3 (#1098) slice T1's tier-clusterroles.yaml. Drop the applications[] block + use tierRoleRef = openova:tier-owner. Verified live on t135 2026-05-17 — error log showed exact pattern mismatch before this fix. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(chroot): POST /api/v1/sovereign/secondary-kubeconfig (D16 PR A) D16 multi-cluster fan-out requires the chroot's k8sCache.Factory to have all 3 regions' kubeconfigs registered so dashboard handler's per-cluster h.k8sCache.List(clusterID, ...) enumerates pods from each. Today the chroot only auto-registers its own in-cluster apiserver via FactoryFromEnv's chroot self-registration branch. Secondary kubeconfigs live on the mothership PVC + aren't replicated. This handler bridges the gap: - Accepts JSON {deploymentId, regionKey, kubeconfigYaml} - Validates ids via ^[a-z0-9][a-z0-9-]{0,62}$ pattern (defense in depth — filename composed from these) - Writes kubeconfig 0o600 to /var/lib/catalyst/kubeconfigs/<depID>-<region>.yaml (canonical FactoryFromEnv path so restart re-registers) - Calls k8sCache.AddCluster — idempotent per Factory contract PR B (next): mothership-side handover hook iterates secondary regions and POSTs each kubeconfig to the chroot. PR C (next): dashboard.go fan-out across all registered cluster IDs when group_by includes cluster/region. Per docs/INVIOLABLE-PRINCIPLES.md #10 kubeconfig bytes never enter a logged struct + are written 0o600. Memo: feedback_d16_dashboard_multi_cluster_fan_out.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(dashboard): multi-cluster fan-out when group_by=cluster\|region (D16 PR C) When group_by includes "cluster" or "region", enumerate ALL registered k8sCache clusters (primary + secondaries synced via PR #1579's POST /api/v1/sovereign/secondary-kubeconfig endpoint) and concatenate podRows from each before aggregation. Layer-1=Cluster on /dashboard now renders 3 bubbles on a 3-region Sovereign (was 1 bubble before). For group_by that ONLY contains {namespace,family,application,vcluster, sovereign} the primary clusterID's pods are sufficient and faster — no fan-out cost. 
PR B (mothership-side handover hook to POST each secondary kubeconfig) will complete the chain. Until then, secondaries don't appear in k8sCache.Clusters() so this fan-out is a no-op on existing provs — but the code is in place for when PR B lands. Memo: feedback_d16_dashboard_multi_cluster_fan_out.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 08:07:26 +04:00
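The fan-out rule in D16 PR C (query every registered cluster only when `group_by` names `cluster` or `region`, otherwise just the primary) can be sketched as below. The comma-separated `group_by` format, the cluster IDs, and the helper names `needsFanOut` and `clustersToQuery` are assumptions for illustration; the real handler is not shown in this log.

```go
package main

import (
	"fmt"
	"strings"
)

// needsFanOut reports whether any requested grouping dimension needs
// pods from every registered cluster rather than the primary alone.
func needsFanOut(groupBy string) bool {
	for _, dim := range strings.Split(groupBy, ",") {
		switch strings.TrimSpace(dim) {
		case "cluster", "region":
			return true
		}
	}
	return false
}

// clustersToQuery returns all registered cluster IDs when fan-out is
// needed, and only the primary otherwise (cheaper, per the message above).
func clustersToQuery(groupBy, primary string, registered []string) []string {
	if needsFanOut(groupBy) {
		return registered
	}
	return []string{primary}
}

func main() {
	registered := []string{"primary", "secondary-1", "secondary-2"}
	fmt.Println(clustersToQuery("cluster,namespace", "primary", registered))
	fmt.Println(clustersToQuery("namespace,application", "primary", registered))
}
```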
e3mrah	bcab6430cb	feat(chroot): POST /api/v1/sovereign/secondary-kubeconfig (D16 PR A) (#1579 ) * fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22) PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-} slot-file placeholders WITHOUT the $$ escape. tofu's templatefile() parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu expression — failing with "Extra characters after interpolation expression; Template interpolation doesn't expect a colon". Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s. The escape pattern is documented at main.tf:1029 (the same warning that caught t127 last week). $$ prefix tells tofu's templatefile to emit literal \${...} to cloud-init for Flux envsubst. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30) When an sme-pool domain's current NS records already match the expected [ns1.<primary>, ns2.<primary>] pair (because the operator already delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip step is a no-op. Skipping avoids: 1. Burning a Dynadot API credit on a flip that would be idempotent. 2. The D30 blocker — current Dynadot creds return pdm-status-401 even when the desired NS state already exists. Caught on t132 2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body parentDomains attempt. Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with a 5s timeout. False on lookup error or partial match → fall through to the original PDM pipeline so a misconfigured/partial domain still goes through the registrar API. This unblocks sme-pool entries for omani.homes (already pointing at ns1/2/3.openova.io). omani.rest / omani.trades still go through the full flip path because their NS records don't yet match expected. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(handover): D21 owner seed uses catalyst-system namespace PR #1564 created the owner UserAccess CR with .Namespace("") — the apiserver returned "could not find the requested resource" because useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per the XRD's claimNames block at platform/crossplane-claims/chart/ templates/xrds/useraccess.yaml). Pin to catalyst-system (where catalyst-api + every Catalyst-authored CR lives) and stamp the namespace on the object too. The existing ListUserAccess handler uses Namespace("") so the entry surfaces on /users without per-namespace filtering. Verified the CRD shape on t134 2026-05-17: $ kubectl api-resources --api-group=access.openova.io useraccesses access.openova.io/v1alpha1 true UserAccess ^^^^ NAMESPACED Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(handover): D21 owner seed uses tierRoleRef not wildcard app PR #1564 + #1577 created the CR shape with applications=[{app:"*",...}] but the useraccess XRD schema rejects `app: "*"` (pattern ^[a-z0-9][a-z0-9-]{0,62}$). The seed handler logged "spec.applications[0].app: Invalid value: \"*\"" on every handover. The XRD has a `tierRoleRef` field (pattern ^openova:tier-(viewer\|developer\|operator\|admin\|owner)$) that's the canonical owner-tier semantic — when set, useraccess-controller binds the named ClusterRole on the target via RoleBinding/ClusterRoleBinding. `openova:tier-owner` is shipped by EPIC-3 (#1098) slice T1's tier-clusterroles.yaml. Drop the applications[] block + use tierRoleRef = openova:tier-owner. Verified live on t135 2026-05-17 — error log showed exact pattern mismatch before this fix. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(chroot): POST /api/v1/sovereign/secondary-kubeconfig (D16 PR A) D16 multi-cluster fan-out requires the chroot's k8sCache.Factory to have all 3 regions' kubeconfigs registered so dashboard handler's per-cluster h.k8sCache.List(clusterID, ...) enumerates pods from each. Today the chroot only auto-registers its own in-cluster apiserver via FactoryFromEnv's chroot self-registration branch. Secondary kubeconfigs live on the mothership PVC + aren't replicated. This handler bridges the gap: - Accepts JSON {deploymentId, regionKey, kubeconfigYaml} - Validates ids via ^[a-z0-9][a-z0-9-]{0,62}$ pattern (defense in depth — filename composed from these) - Writes kubeconfig 0o600 to /var/lib/catalyst/kubeconfigs/<depID>-<region>.yaml (canonical FactoryFromEnv path so restart re-registers) - Calls k8sCache.AddCluster — idempotent per Factory contract PR B (next): mothership-side handover hook iterates secondary regions and POSTs each kubeconfig to the chroot. PR C (next): dashboard.go fan-out across all registered cluster IDs when group_by includes cluster/region. Per docs/INVIOLABLE-PRINCIPLES.md #10 kubeconfig bytes never enter a logged struct + are written 0o600. Memo: feedback_d16_dashboard_multi_cluster_fan_out.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 08:06:08 +04:00
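The PR A handler composes a filename from `deploymentId` and `regionKey`, which is why both are pattern-checked first. A minimal sketch of that validation and path composition follows. The regex and directory are quoted from the message above; the function name `kubeconfigPath` and the sample IDs are hypothetical.

```go
package main

import (
	"fmt"
	"path"
	"regexp"
)

// safeID is the id pattern quoted in the commit message.
var safeID = regexp.MustCompile(`^[a-z0-9][a-z0-9-]{0,62}$`)

// kubeconfigDir is the directory named in the commit message.
const kubeconfigDir = "/var/lib/catalyst/kubeconfigs"

// kubeconfigPath validates both ids before building the filename, so
// values such as "../etc" can never reach the filesystem layer.
func kubeconfigPath(depID, regionKey string) (string, error) {
	if !safeID.MatchString(depID) {
		return "", fmt.Errorf("invalid deploymentId")
	}
	if !safeID.MatchString(regionKey) {
		return "", fmt.Errorf("invalid regionKey")
	}
	return path.Join(kubeconfigDir, depID+"-"+regionKey+".yaml"), nil
}

func main() {
	p, err := kubeconfigPath("dep-123", "hel1-1") // hypothetical ids
	fmt.Println(p, err)
	_, err = kubeconfigPath("dep-123", "../etc")
	fmt.Println(err)
}
```

A caller would then persist the bytes with something like `os.WriteFile(p, data, 0o600)`, matching the file mode stated in the message.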
github-actions[bot]	6e329e27ae	deploy: update catalyst images to `4f62dd2`	2026-05-17 00:10:50 +00:00
e3mrah	4f62dd21b3	fix(handover): D21 owner seed uses tierRoleRef not wildcard app (#1578 ) * fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22) PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-} slot-file placeholders WITHOUT the $$ escape. tofu's templatefile() parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu expression — failing with "Extra characters after interpolation expression; Template interpolation doesn't expect a colon". Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s. The escape pattern is documented at main.tf:1029 (the same warning that caught t127 last week). $$ prefix tells tofu's templatefile to emit literal \${...} to cloud-init for Flux envsubst. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30) When an sme-pool domain's current NS records already match the expected [ns1.<primary>, ns2.<primary>] pair (because the operator already delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip step is a no-op. Skipping avoids: 1. Burning a Dynadot API credit on a flip that would be idempotent. 2. The D30 blocker — current Dynadot creds return pdm-status-401 even when the desired NS state already exists. Caught on t132 2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body parentDomains attempt. Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with a 5s timeout. False on lookup error or partial match → fall through to the original PDM pipeline so a misconfigured/partial domain still goes through the registrar API. This unblocks sme-pool entries for omani.homes (already pointing at ns1/2/3.openova.io). omani.rest / omani.trades still go through the full flip path because their NS records don't yet match expected. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(handover): D21 owner seed uses catalyst-system namespace PR #1564 created the owner UserAccess CR with .Namespace("") — the apiserver returned "could not find the requested resource" because useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per the XRD's claimNames block at platform/crossplane-claims/chart/ templates/xrds/useraccess.yaml). Pin to catalyst-system (where catalyst-api + every Catalyst-authored CR lives) and stamp the namespace on the object too. The existing ListUserAccess handler uses Namespace("") so the entry surfaces on /users without per-namespace filtering. Verified the CRD shape on t134 2026-05-17: $ kubectl api-resources --api-group=access.openova.io useraccesses access.openova.io/v1alpha1 true UserAccess ^^^^ NAMESPACED Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(handover): D21 owner seed uses tierRoleRef not wildcard app PR #1564 + #1577 created the CR shape with applications=[{app:"*",...}] but the useraccess XRD schema rejects `app: "*"` (pattern ^[a-z0-9][a-z0-9-]{0,62}$). The seed handler logged "spec.applications[0].app: Invalid value: \"*\"" on every handover. The XRD has a `tierRoleRef` field (pattern ^openova:tier-(viewer\|developer\|operator\|admin\|owner)$) that's the canonical owner-tier semantic — when set, useraccess-controller binds the named ClusterRole on the target via RoleBinding/ClusterRoleBinding. `openova:tier-owner` is shipped by EPIC-3 (#1098) slice T1's tier-clusterroles.yaml. Drop the applications[] block + use tierRoleRef = openova:tier-owner. Verified live on t135 2026-05-17 — error log showed exact pattern mismatch before this fix. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 04:08:45 +04:00
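The two XRD patterns quoted in this message can be checked directly to see why the old `app` value was rejected and why `openova:tier-owner` passes. This is only a sketch: the variable names are hypothetical, and the real validation is done by the apiserver against the XRD schema, not by Go code.

```go
package main

import (
	"fmt"
	"regexp"
)

// Patterns as quoted in the commit message.
var (
	appPattern         = regexp.MustCompile(`^[a-z0-9][a-z0-9-]{0,62}$`)
	tierRoleRefPattern = regexp.MustCompile(`^openova:tier-(viewer|developer|operator|admin|owner)$`)
)

func main() {
	// Neither an empty string nor a wildcard is a valid app name.
	fmt.Println(appPattern.MatchString(""))
	fmt.Println(appPattern.MatchString("*"))
	fmt.Println(appPattern.MatchString("bp-cnpg"))
	// The owner tier is one of the five accepted role references.
	fmt.Println(tierRoleRefPattern.MatchString("openova:tier-owner"))
	fmt.Println(tierRoleRefPattern.MatchString("openova:tier-root"))
}
```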
github-actions[bot]	6466f97f6c	deploy: update catalyst images to `ea30ded`	2026-05-16 23:28:04 +00:00
e3mrah	ea30ded120	fix(handover): D21 owner seed uses catalyst-system namespace (#1577 ) * fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22) PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-} slot-file placeholders WITHOUT the $$ escape. tofu's templatefile() parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu expression — failing with "Extra characters after interpolation expression; Template interpolation doesn't expect a colon". Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s. The escape pattern is documented at main.tf:1029 (the same warning that caught t127 last week). $$ prefix tells tofu's templatefile to emit literal \${...} to cloud-init for Flux envsubst. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30) When an sme-pool domain's current NS records already match the expected [ns1.<primary>, ns2.<primary>] pair (because the operator already delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip step is a no-op. Skipping avoids: 1. Burning a Dynadot API credit on a flip that would be idempotent. 2. The D30 blocker — current Dynadot creds return pdm-status-401 even when the desired NS state already exists. Caught on t132 2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body parentDomains attempt. Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with a 5s timeout. False on lookup error or partial match → fall through to the original PDM pipeline so a misconfigured/partial domain still goes through the registrar API. This unblocks sme-pool entries for omani.homes (already pointing at ns1/2/3.openova.io). omani.rest / omani.trades still go through the full flip path because their NS records don't yet match expected. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(handover): D21 owner seed uses catalyst-system namespace PR #1564 created the owner UserAccess CR with .Namespace("") — the apiserver returned "could not find the requested resource" because useraccesses.access.openova.io is NAMESPACED (Crossplane Claim per the XRD's claimNames block at platform/crossplane-claims/chart/ templates/xrds/useraccess.yaml). Pin to catalyst-system (where catalyst-api + every Catalyst-authored CR lives) and stamp the namespace on the object too. The existing ListUserAccess handler uses Namespace("") so the entry surfaces on /users without per-namespace filtering. Verified the CRD shape on t134 2026-05-17: $ kubectl api-resources --api-group=access.openova.io useraccesses access.openova.io/v1alpha1 true UserAccess ^^^^ NAMESPACED Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 03:26:06 +04:00
github-actions[bot]	18b5fa1466	deploy: update catalyst images to `33ed484`	2026-05-16 23:24:34 +00:00
e3mrah	33ed484e04	fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30) (#1576 ) * fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22) PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-} slot-file placeholders WITHOUT the $$ escape. tofu's templatefile() parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu expression — failing with "Extra characters after interpolation expression; Template interpolation doesn't expect a colon". Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s. The escape pattern is documented at main.tf:1029 (the same warning that caught t127 last week). $$ prefix tells tofu's templatefile to emit literal \${...} to cloud-init for Flux envsubst. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(parent-domains): short-circuit pdmFlipNS when NS already matches (D30) When an sme-pool domain's current NS records already match the expected [ns1.<primary>, ns2.<primary>] pair (because the operator already delegated the domain to OpenOva's PowerDNS), the PDM registrar-flip step is a no-op. Skipping avoids: 1. Burning a Dynadot API credit on a flip that would be idempotent. 2. The D30 blocker — current Dynadot creds return pdm-status-401 even when the desired NS state already exists. Caught on t132 2026-05-16 day-2 add + t134 2026-05-17 fresh-prov body parentDomains attempt. Adds nsAlreadyMatches() helper using net.DefaultResolver.LookupNS with a 5s timeout. False on lookup error or partial match → fall through to the original PDM pipeline so a misconfigured/partial domain still goes through the registrar API. This unblocks sme-pool entries for omani.homes (already pointing at ns1/2/3.openova.io). omani.rest / omani.trades still go through the full flip path because their NS records don't yet match expected. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 03:21:42 +04:00
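The D30 short-circuit can be sketched as a 5-second NS lookup followed by a set comparison. The code below is an assumption-laden illustration, not the repository's `nsAlreadyMatches`: the signature is a guess, and treating "match" as "every expected name server appears in the live answer" is an interpretation that fits the omani.homes case (ns1/2/3 live against an expected ns1/ns2 pair).

```go
package main

import (
	"context"
	"fmt"
	"net"
	"strings"
	"time"
)

// canon lower-cases a host name and drops the trailing root dot.
func canon(host string) string {
	return strings.TrimSuffix(strings.ToLower(host), ".")
}

// containsAllNS reports whether every expected name server is present
// in the live set. An empty expectation never matches.
func containsAllNS(live, want []string) bool {
	if len(want) == 0 {
		return false
	}
	seen := make(map[string]bool, len(live))
	for _, h := range live {
		seen[canon(h)] = true
	}
	for _, h := range want {
		if !seen[canon(h)] {
			return false // partial match: fall through to the full flip
		}
	}
	return true
}

// nsAlreadyMatches looks up NS records with a 5s timeout and returns
// false on any lookup error, so a failing lookup never skips the flip.
func nsAlreadyMatches(ctx context.Context, domain string, want []string) bool {
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()
	records, err := net.DefaultResolver.LookupNS(ctx, domain)
	if err != nil {
		return false
	}
	live := make([]string, 0, len(records))
	for _, r := range records {
		live = append(live, r.Host)
	}
	return containsAllNS(live, want)
}

func main() {
	// Offline demo of the comparison only; no DNS query is made here.
	live := []string{"NS1.openova.io.", "ns2.openova.io.", "ns3.openova.io."}
	fmt.Println(containsAllNS(live, []string{"ns1.openova.io", "ns2.openova.io"}))
	fmt.Println(containsAllNS(live[:1], []string{"ns1.openova.io", "ns2.openova.io"}))
}
```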
github-actions[bot]	a65a024114	deploy: update catalyst images to `c148ec6`	2026-05-16 22:33:19 +00:00
e3mrah	c148ec6a34	fix(cloudinit): escape $$\{ORG_EMAIL:-\}/$$\{ORG_NAME:-\} in comment (D22) (#1575 ) PR #1571 added a comment mentioning the $${ORG_EMAIL:-}/$${ORG_NAME:-} slot-file placeholders WITHOUT the $$ escape. tofu's templatefile() parses comments and tried to interpolate \${ORG_EMAIL:-} as a tofu expression — failing with "Extra characters after interpolation expression; Template interpolation doesn't expect a colon". Caught live on t133 fad01d84f5655004 — tofu plan failed in 30s. The escape pattern is documented at main.tf:1029 (the same warning that caught t127 last week). $$ prefix tells tofu's templatefile to emit literal \${...} to cloud-init for Flux envsubst. Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 02:31:26 +04:00
github-actions[bot]	c5f777056f	deploy: update catalyst images to `3568b72`	2026-05-16 22:20:19 +00:00
e3mrah	3568b72b5e	fix(cloud): hide non-active 0/0 chips (D15) (#1574 ) * feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22) Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment reads to populate the deployment record so Sovereign Console Settings page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL, orgName (instead of `—` placeholders). Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName, controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with empty defaults. Per-Sovereign overlays wire actual values from cloud- init substitute placeholders (mirrors regionsJson pattern). Catalyst-api Pod now reads them via valueFrom configMapKeyRef + optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap so env stays empty there — correct, mothership is signer not validator). Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP post-#1568. This PR fills the remaining 3 D22 fields when operator wires the values. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(slot-13): add D22 sovereign-side identity placeholders Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} + ${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569) → catalyst-api env → chrootEnsureDeployment populates the deployment record → Settings page renders real values instead of `—`. This PR alone is a no-op (placeholders default to empty, same as today). The cloud-init substitute lines + provisioner.go tfvars need to land in a companion PR to actually populate the values on next-prov. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22) Companion to #1567+#1568+#1569+#1570 — the cloud-init substitute block now emits ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL into the bootstrap-kit Kustomization's postBuild.substitute env, which the slot-13 placeholders (#1570) consume via ${ORG_EMAIL:-}/${ORG_NAME:-}/${GITOPS_REPO_URL:-}. Chain: provisioner.go writeTfvars → tofu vars → cloudinit templatefile substitute → Flux Kustomization postBuild → sovereign-fqdn ConfigMap keys (#1569) → catalyst-api env (#1569) → chrootEnsureDeployment populates the deployment record (#1567 + #1568 fallback). SOVEREIGN_CONTROL_PLANE_IP omitted intentionally — main.tf:691 notes the dependency cycle (hcloud_server.cp doesn't exist at cloudinit render time). Separate PR will source it via metadata-service or post-create ConfigMap patch. Next-prov (t133+) Sovereign Console Settings page now renders real ownerEmail/orgName/gitopsRepoURL instead of `—` placeholders. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(router): chroot /app/<name> only-redirect mothership-only sub-paths (D17/D17b) PR #1552 stripped the `/app` prefix on Sovereign mode to make `/app/bp-cnpg` → `/bp-cnpg`, hoping consoleAppDetailRoute would match. But consoleAppDetailRoute is registered at `/app/$componentId` under consoleLayoutRoute — no chroot route matches `/<componentId>` directly, so stripping leaves an empty render path. Playwright walkthrough on t132 2026-05-17 confirmed: /app/bp-cnpg + /app/bp-coraza both render body_len=9 (empty). Invert the logic: only redirect mothership-only sub-paths (/dashboard Fleet view, /install wizard, /sre, /sec, /blueprints) which have no Sovereign Console equivalent. For everything else (component names like `/app/bp-cnpg`, bare `/app`), let TanStack's natural most-specific-match pick consoleAppDetailRoute / consoleAppsRoute. 
Caught live on t132 via Playwright walker3.js — agent a4825c5a. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(handover): re-mint handover JWT on every GetDeployment (D0) D0 Playwright walkthrough on t132 2026-05-17 caught: handoverURL persisted at handover-fire time carries a JWT that expires per DefaultTTL (5min). Operators who click /jobs hours later get the stale token → Sovereign-side /auth/handover rejects with raw JSON {"error":"invalid token"} — no UI fallback, no /auth/handover-error, auto-redirect to /dashboard never fires. Re-mint the JWT on every GetDeployment when deployment is ready + handover-fired so the URL returned to the wizard is always freshly-signed. Best-effort: on mint failure, leave the existing URL in place so a transient signer error doesn't break polling. Helper is idempotent + locked. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(cloud): hide non-active 0/0 chips (D15) Playwright walkthrough on t132 2026-05-17 caught D15 PARTIAL: 15 chips are correct but Bucket+Volume show 0/0. Founder rule (DoD D15): "No kind chip shows 0/0 for a resource that actually exists in the cluster". Bucket+Volume genuinely don't exist on this Sovereign so showing 0/0 is noise. Hide chips with count exactly 0 unless they're the active selection (operator who navigated to an empty kind keeps context). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 02:18:24 +04:00
github-actions[bot]	44e612f39d	deploy: update catalyst images to `58dbb92`	2026-05-16 22:18:16 +00:00
e3mrah	58dbb92f4f	fix(handover): re-mint handover JWT on every GetDeployment (D0) (#1573 ) * feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22) Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment reads to populate the deployment record so Sovereign Console Settings page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL, orgName (instead of `—` placeholders). Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName, controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with empty defaults. Per-Sovereign overlays wire actual values from cloud- init substitute placeholders (mirrors regionsJson pattern). Catalyst-api Pod now reads them via valueFrom configMapKeyRef + optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap so env stays empty there — correct, mothership is signer not validator). Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP post-#1568. This PR fills the remaining 3 D22 fields when operator wires the values. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(slot-13): add D22 sovereign-side identity placeholders Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} + ${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569) → catalyst-api env → chrootEnsureDeployment populates the deployment record → Settings page renders real values instead of `—`. This PR alone is a no-op (placeholders default to empty, same as today). The cloud-init substitute lines + provisioner.go tfvars need to land in a companion PR to actually populate the values on next-prov. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22) Companion to #1567+#1568+#1569+#1570 — the cloud-init substitute block now emits ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL into the bootstrap-kit Kustomization's postBuild.substitute env, which the slot-13 placeholders (#1570) consume via ${ORG_EMAIL:-}/${ORG_NAME:-}/${GITOPS_REPO_URL:-}. Chain: provisioner.go writeTfvars → tofu vars → cloudinit templatefile substitute → Flux Kustomization postBuild → sovereign-fqdn ConfigMap keys (#1569) → catalyst-api env (#1569) → chrootEnsureDeployment populates the deployment record (#1567 + #1568 fallback). SOVEREIGN_CONTROL_PLANE_IP omitted intentionally — main.tf:691 notes the dependency cycle (hcloud_server.cp doesn't exist at cloudinit render time). Separate PR will source it via metadata-service or post-create ConfigMap patch. Next-prov (t133+) Sovereign Console Settings page now renders real ownerEmail/orgName/gitopsRepoURL instead of `—` placeholders. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(router): chroot /app/<name> only-redirect mothership-only sub-paths (D17/D17b) PR #1552 stripped the `/app` prefix on Sovereign mode to make `/app/bp-cnpg` → `/bp-cnpg`, hoping consoleAppDetailRoute would match. But consoleAppDetailRoute is registered at `/app/$componentId` under consoleLayoutRoute — no chroot route matches `/<componentId>` directly, so stripping leaves an empty render path. Playwright walkthrough on t132 2026-05-17 confirmed: /app/bp-cnpg + /app/bp-coraza both render body_len=9 (empty). Invert the logic: only redirect mothership-only sub-paths (/dashboard Fleet view, /install wizard, /sre, /sec, /blueprints) which have no Sovereign Console equivalent. For everything else (component names like `/app/bp-cnpg`, bare `/app`), let TanStack's natural most-specific-match pick consoleAppDetailRoute / consoleAppsRoute. 
Caught live on t132 via Playwright walker3.js — agent a4825c5a. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(handover): re-mint handover JWT on every GetDeployment (D0) D0 Playwright walkthrough on t132 2026-05-17 caught: handoverURL persisted at handover-fire time carries a JWT that expires per DefaultTTL (5min). Operators who click /jobs hours later get the stale token → Sovereign-side /auth/handover rejects with raw JSON {"error":"invalid token"} — no UI fallback, no /auth/handover-error, auto-redirect to /dashboard never fires. Re-mint the JWT on every GetDeployment when deployment is ready + handover-fired so the URL returned to the wizard is always freshly-signed. Best-effort: on mint failure, leave the existing URL in place so a transient signer error doesn't break polling. Helper is idempotent + locked. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 02:16:26 +04:00
github-actions[bot]	dea683f5e4	deploy: update catalyst images to `9e1e422`	2026-05-16 22:08:01 +00:00
e3mrah	9e1e4224d8	fix(router): chroot /app/<name> only-redirect mothership-only sub-paths (D17/D17b) (#1572 ) * feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22) Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment reads to populate the deployment record so Sovereign Console Settings page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL, orgName (instead of `—` placeholders). Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName, controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with empty defaults. Per-Sovereign overlays wire actual values from cloud- init substitute placeholders (mirrors regionsJson pattern). Catalyst-api Pod now reads them via valueFrom configMapKeyRef + optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap so env stays empty there — correct, mothership is signer not validator). Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP post-#1568. This PR fills the remaining 3 D22 fields when operator wires the values. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(slot-13): add D22 sovereign-side identity placeholders Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} + ${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569) → catalyst-api env → chrootEnsureDeployment populates the deployment record → Settings page renders real values instead of `—`. This PR alone is a no-op (placeholders default to empty, same as today). The cloud-init substitute lines + provisioner.go tfvars need to land in a companion PR to actually populate the values on next-prov. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22) Companion to #1567+#1568+#1569+#1570 — the cloud-init substitute block now emits ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL into the bootstrap-kit Kustomization's postBuild.substitute env, which the slot-13 placeholders (#1570) consume via ${ORG_EMAIL:-}/${ORG_NAME:-}/${GITOPS_REPO_URL:-}. Chain: provisioner.go writeTfvars → tofu vars → cloudinit templatefile substitute → Flux Kustomization postBuild → sovereign-fqdn ConfigMap keys (#1569) → catalyst-api env (#1569) → chrootEnsureDeployment populates the deployment record (#1567 + #1568 fallback). SOVEREIGN_CONTROL_PLANE_IP omitted intentionally — main.tf:691 notes the dependency cycle (hcloud_server.cp doesn't exist at cloudinit render time). Separate PR will source it via metadata-service or post-create ConfigMap patch. Next-prov (t133+) Sovereign Console Settings page now renders real ownerEmail/orgName/gitopsRepoURL instead of `—` placeholders. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix(router): chroot /app/<name> only-redirect mothership-only sub-paths (D17/D17b) PR #1552 stripped the `/app` prefix on Sovereign mode to make `/app/bp-cnpg` → `/bp-cnpg`, hoping consoleAppDetailRoute would match. But consoleAppDetailRoute is registered at `/app/$componentId` under consoleLayoutRoute — no chroot route matches `/<componentId>` directly, so stripping leaves an empty render path. Playwright walkthrough on t132 2026-05-17 confirmed: /app/bp-cnpg + /app/bp-coraza both render body_len=9 (empty). Invert the logic: only redirect mothership-only sub-paths (/dashboard Fleet view, /install wizard, /sre, /sec, /blueprints) which have no Sovereign Console equivalent. For everything else (component names like `/app/bp-cnpg`, bare `/app`), let TanStack's natural most-specific-match pick consoleAppDetailRoute / consoleAppsRoute. 
Caught live on t132 via Playwright walker3.js — agent a4825c5a. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 02:05:54 +04:00
github-actions[bot]	4cc880cafd	deploy: update catalyst images to `5793958`	2026-05-16 21:48:54 +00:00
e3mrah	57939585c0	feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22) (#1571 ) * feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22) Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment reads to populate the deployment record so Sovereign Console Settings page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL, orgName (instead of `—` placeholders). Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName, controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with empty defaults. Per-Sovereign overlays wire actual values from cloud- init substitute placeholders (mirrors regionsJson pattern). Catalyst-api Pod now reads them via valueFrom configMapKeyRef + optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap so env stays empty there — correct, mothership is signer not validator). Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP post-#1568. This PR fills the remaining 3 D22 fields when operator wires the values. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(slot-13): add D22 sovereign-side identity placeholders Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} + ${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569) → catalyst-api env → chrootEnsureDeployment populates the deployment record → Settings page renders real values instead of `—`. This PR alone is a no-op (placeholders default to empty, same as today). The cloud-init substitute lines + provisioner.go tfvars need to land in a companion PR to actually populate the values on next-prov. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(cloudinit): wire ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL substitutes (D22) Companion to #1567+#1568+#1569+#1570 — the cloud-init substitute block now emits ORG_EMAIL/ORG_NAME/GITOPS_REPO_URL into the bootstrap-kit Kustomization's postBuild.substitute env, which the slot-13 placeholders (#1570) consume via ${ORG_EMAIL:-}/${ORG_NAME:-}/${GITOPS_REPO_URL:-}. Chain: provisioner.go writeTfvars → tofu vars → cloudinit templatefile substitute → Flux Kustomization postBuild → sovereign-fqdn ConfigMap keys (#1569) → catalyst-api env (#1569) → chrootEnsureDeployment populates the deployment record (#1567 + #1568 fallback). SOVEREIGN_CONTROL_PLANE_IP omitted intentionally — main.tf:691 notes the dependency cycle (hcloud_server.cp doesn't exist at cloudinit render time). Separate PR will source it via metadata-service or post-create ConfigMap patch. Next-prov (t133+) Sovereign Console Settings page now renders real ownerEmail/orgName/gitopsRepoURL instead of `—` placeholders. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 01:47:04 +04:00
e3mrah	700d28967f	chore(slot-13): add D22 sovereign-side identity placeholders (#1570 ) * feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22) Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment reads to populate the deployment record so Sovereign Console Settings page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL, orgName (instead of `—` placeholders). Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName, controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with empty defaults. Per-Sovereign overlays wire actual values from cloud- init substitute placeholders (mirrors regionsJson pattern). Catalyst-api Pod now reads them via valueFrom configMapKeyRef + optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap so env stays empty there — correct, mothership is signer not validator). Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP post-#1568. This PR fills the remaining 3 D22 fields when operator wires the values. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(slot-13): add D22 sovereign-side identity placeholders Add ${ORG_EMAIL:-} + ${ORG_NAME:-} + ${SOVEREIGN_CONTROL_PLANE_IP:-} + ${GITOPS_REPO_URL:-} envsubst placeholders so when cloud-init wires them, the chart picks them up via sovereign-fqdn ConfigMap (PR #1569) → catalyst-api env → chrootEnsureDeployment populates the deployment record → Settings page renders real values instead of `—`. This PR alone is a no-op (placeholders default to empty, same as today). The cloud-init substitute lines + provisioner.go tfvars need to land in a companion PR to actually populate the values on next-prov. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 01:29:59 +04:00
github-actions[bot]	df193d340e	deploy: update catalyst images to `9cbcd23`	2026-05-16 21:03:01 +00:00
e3mrah	9cbcd230da	feat(chart): wire OPERATOR_EMAIL/CONTROL_PLANE_IP/GITOPS_REPO_URL/ORG_NAME (D22) (#1569 ) Companion to PR #1567 + #1568 — wire the env vars chrootEnsureDeployment reads to populate the deployment record so Sovereign Console Settings page renders real values for ownerEmail, controlPlaneIP, gitopsRepoURL, orgName (instead of `—` placeholders). Adds 4 new keys to the sovereign-fqdn ConfigMap (orgEmail, orgName, controlPlaneIP, gitopsRepoURL) sourced from .Values.sovereign.* with empty defaults. Per-Sovereign overlays wire actual values from cloud- init substitute placeholders (mirrors regionsJson pattern). Catalyst-api Pod now reads them via valueFrom configMapKeyRef + optional=true (Catalyst-Zero/contabo emits no sovereign-fqdn ConfigMap so env stays empty there — correct, mothership is signer not validator). Validated: t132 already serves region=hel1, consoleURL, loadBalancerIP post-#1568. This PR fills the remaining 3 D22 fields when operator wires the values. Co-authored-by: hatiyildiz <hatice.yildiz@openova.io> Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>	2026-05-17 01:01:00 +04:00
github-actions[bot]	0e0280bbe0	deploy: update catalyst images to `6618392`	2026-05-16 20:56:10 +00:00
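
Note on 33ed484e04 (D30): the message describes a nsAlreadyMatches() helper that calls net.DefaultResolver.LookupNS with a 5s timeout and returns false on lookup error or partial match, so the caller falls through to the full PDM pipeline. The helper's source is not shown in this log, so the following is only a minimal sketch of that behaviour. The signature, the sameNSSet and normalizeNS helpers, and the exact-set matching rule (order, case, and trailing dot ignored) are assumptions made for illustration, not the repository's code.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"sort"
	"strings"
	"time"
)

// normalizeNS lowercases a host name and strips the trailing root dot that
// DNS lookups usually return ("ns1.example.net." -> "ns1.example.net").
func normalizeNS(host string) string {
	return strings.TrimSuffix(strings.ToLower(strings.TrimSpace(host)), ".")
}

// sameNSSet (hypothetical helper) reports whether got and want hold exactly
// the same name servers, ignoring order, case, and trailing dots.
// An empty or partially matching set counts as "no match".
func sameNSSet(got, want []string) bool {
	if len(got) == 0 || len(got) != len(want) {
		return false
	}
	g := make([]string, len(got))
	w := make([]string, len(want))
	for i := range got {
		g[i] = normalizeNS(got[i])
	}
	for i := range want {
		w[i] = normalizeNS(want[i])
	}
	sort.Strings(g)
	sort.Strings(w)
	for i := range g {
		if g[i] != w[i] {
			return false
		}
	}
	return true
}

// nsAlreadyMatches (assumed signature) looks up the live NS records for
// domain with a 5s timeout. Any lookup error yields false, so the caller
// continues with the normal registrar flip instead of skipping it.
func nsAlreadyMatches(ctx context.Context, domain string, want []string) bool {
	ctx, cancel := context.WithTimeout(ctx, 5*time.Second)
	defer cancel()

	records, err := net.DefaultResolver.LookupNS(ctx, domain)
	if err != nil {
		return false
	}
	got := make([]string, 0, len(records))
	for _, r := range records {
		got = append(got, r.Host)
	}
	return sameNSSet(got, want)
}

func main() {
	// Placeholder name servers; the comparison itself needs no network access.
	want := []string{"ns1.example.net", "ns2.example.net"}
	fmt.Println(sameNSSet([]string{"NS2.example.net.", "ns1.example.net."}, want))
	fmt.Println(sameNSSet([]string{"ns1.example.net."}, want))
}
```

Keeping the comparison separate from the lookup lets the matching rule be checked without DNS access. A caller would skip the registrar flip only when nsAlreadyMatches returns true.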

2204 Commits