Feat: RunOnInfra to use openshift.io/node-selector annotation on namespace#1194
Feat: RunOnInfra to use openshift.io/node-selector annotation on namespace#1194anandrkskd wants to merge 3 commits into
Conversation
…pod infra NodeSelector with openshift.io/node-selector annotation on default namespace. OpenShift admission controller applies selector to all pods in namespace automatically. Tolerations and custom NodeSelector remain at pod level. assited-by: claude-code Signed-off-by: Anand Kumar Singh <anandrkskd@gmail.com>
Signed-off-by: Anand Kumar Singh <anandrkskd@gmail.com>
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository YAML (base), Organization UI (inherited) Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (1)
🔗 Linked repositories identifiedCodeRabbit considers these linked repositories for cross-repo context during reviews:
🚧 Files skipped from review as they are similar to previous changes (1)
📝 WalkthroughSummary by CodeRabbit
WalkthroughThe PR changes how ChangesInfra Node Selector via Namespace Annotation
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
test/openshift/e2e/ginkgo/sequential/1-071_validate_node_selectors_test.go (1)
137-153: 🎯 Functional Correctness | 🟠 Major | ⚡ Quick winAssert the fallback nodeSelector explicitly after cleanup.
ShouldNot(HaveTemplateSpecNodeSelector({"kubernetes.io/os":"linux","key1":"value1"}))only proveskey1is gone. It also passes if the controller drops the defaultkubernetes.io/os=linuxselector entirely. Please assert the final selector equals the default map instead of only asserting inequality.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/openshift/e2e/ginkgo/sequential/1-071_validate_node_selectors_test.go` around lines 137 - 153, The cleanup assertions in the node selector test are too weak because ShouldNot(HaveTemplateSpecNodeSelector(...)) still passes if the default selector is removed entirely. Update the assertions in the loop over deploymentNameList and the appControllerSS check to verify the final template spec node selector matches the expected fallback/default selector exactly, using the existing HaveTemplateSpecNodeSelector helper on the Deployment and StatefulSet fixtures.Source: Path instructions
🧹 Nitpick comments (1)
controllers/gitopsservice_controller_test.go (1)
643-650: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd a preservation case for custom
spec.nodeSelector.The PR contract says
RunOnInfrashould stop injecting the infra selector without dropping user-provided node selectors, but this test only covers the empty-selector path. A regression that clearsGitopsServiceSpec.NodeSelectorwould still pass here. Please add a case withSpec.NodeSelectorpopulated and assert that the backendDeploymentandArgoCD.Spec.NodePlacement.NodeSelectorkeep that map while the namespace annotation is still set.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@controllers/gitopsservice_controller_test.go` around lines 643 - 650, Add a preservation test for custom GitopsServiceSpec.NodeSelector in the RunOnInfra path, since the current gitopsservice_controller_test only verifies the empty-selector case. Update the relevant test setup around the GitopsService reconciliation and the ArgoCD assertions so a populated NodeSelector is provided, then verify the backend Deployment and ArgoCD.Spec.NodePlacement.NodeSelector both retain that map while the namespace annotation still reflects infra placement. Use the existing GitopsService, ArgoCD, and deployment checks to ensure user-provided selectors are not cleared.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@controllers/gitopsservice_controller.go`:
- Around line 100-106: The GitOpsService bootstrap in the controller’s setup
path only logs failures from r.Client.Create and then continues, which lets
SetupWithManager succeed even when creation actually failed. Update the
bootstrap flow around the Create call in the GitopsService controller to return
the non-AlreadyExists error to the caller, while still treating
errors.IsAlreadyExists as a non-fatal case. Make sure the setup method that
invokes this creation propagates the error so operator startup fails fast on
RBAC/API issues instead of leaving the singleton uncreated.
- Around line 1029-1045: The ensureInfraNodeSelectorAnnotation helper is
overwriting or deleting the entire openshift.io/node-selector annotation instead
of preserving any existing selector terms. Update this function so it only adds
the infra selector when runOnInfra is true and removes just that infra clause
when false, leaving any other node-selector constraints intact; use
ensureInfraNodeSelectorAnnotation and
common.InfraNodeSelectorAnnotation/common.InfraNodeSelectorAnnotationValue to
locate the logic.
In `@test/openshift/e2e/ginkgo/sequential/1-028-validate_run_on_infra_test.go`:
- Around line 144-153: The disable-path assertion in the OpenShift GitOps e2e
test is too narrow because it only checks the namespace annotation and the
cluster deployment. Update the verification in the same test block that uses
openshiftGitopsNS and clusterDepl so it asserts the full openshift-gitops
workload set, matching the enable-path coverage by checking all relevant
deployments/workloads no longer have the infra toleration after RunOnInfra is
disabled.
---
Outside diff comments:
In `@test/openshift/e2e/ginkgo/sequential/1-071_validate_node_selectors_test.go`:
- Around line 137-153: The cleanup assertions in the node selector test are too
weak because ShouldNot(HaveTemplateSpecNodeSelector(...)) still passes if the
default selector is removed entirely. Update the assertions in the loop over
deploymentNameList and the appControllerSS check to verify the final template
spec node selector matches the expected fallback/default selector exactly, using
the existing HaveTemplateSpecNodeSelector helper on the Deployment and
StatefulSet fixtures.
---
Nitpick comments:
In `@controllers/gitopsservice_controller_test.go`:
- Around line 643-650: Add a preservation test for custom
GitopsServiceSpec.NodeSelector in the RunOnInfra path, since the current
gitopsservice_controller_test only verifies the empty-selector case. Update the
relevant test setup around the GitopsService reconciliation and the ArgoCD
assertions so a populated NodeSelector is provided, then verify the backend
Deployment and ArgoCD.Spec.NodePlacement.NodeSelector both retain that map while
the namespace annotation still reflects infra placement. Use the existing
GitopsService, ArgoCD, and deployment checks to ensure user-provided selectors
are not cleared.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 7fa859f0-e8d7-4ffd-a261-4e3fd2dd3cfc
📒 Files selected for processing (9)
common/common.gocontrollers/consoleplugin.gocontrollers/consoleplugin_test.gocontrollers/gitopsservice_controller.gocontrollers/gitopsservice_controller_test.gotest/e2e/gitopsservice_test.gotest/openshift/e2e/ginkgo/fixture/namespace/fixture.gotest/openshift/e2e/ginkgo/sequential/1-028-validate_run_on_infra_test.gotest/openshift/e2e/ginkgo/sequential/1-071_validate_node_selectors_test.go
🔗 Linked repositories identified
CodeRabbit considers these linked repositories for cross-repo context during reviews:
argoproj-labs/argocd-operator(manual)
💤 Files with no reviewable changes (1)
- controllers/consoleplugin.go
| err := r.Client.Create(context.TODO(), gitopsServiceRef) | ||
| if err != nil { | ||
| reqLogger.Error(err, "Failed to create GitOps service instance") | ||
| if errors.IsAlreadyExists(err) { | ||
| reqLogger.Info("GitOps service instance already exists, skipping creation") | ||
| } else { | ||
| reqLogger.Error(err, "Failed to create GitOps service instance") | ||
| } |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Return non-AlreadyExists bootstrap errors.
Lines 100-106 only log Create failures and still let SetupWithManager succeed. If this call fails for RBAC/API reasons, the singleton GitopsService is never created, and later reconciles just exit on NotFound, leaving the operator partially initialized.
Suggested fix
err := r.Client.Create(context.TODO(), gitopsServiceRef)
if err != nil {
if errors.IsAlreadyExists(err) {
reqLogger.Info("GitOps service instance already exists, skipping creation")
} else {
reqLogger.Error(err, "Failed to create GitOps service instance")
+ return err
}
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| err := r.Client.Create(context.TODO(), gitopsServiceRef) | |
| if err != nil { | |
| reqLogger.Error(err, "Failed to create GitOps service instance") | |
| if errors.IsAlreadyExists(err) { | |
| reqLogger.Info("GitOps service instance already exists, skipping creation") | |
| } else { | |
| reqLogger.Error(err, "Failed to create GitOps service instance") | |
| } | |
| err := r.Client.Create(context.TODO(), gitopsServiceRef) | |
| if err != nil { | |
| if errors.IsAlreadyExists(err) { | |
| reqLogger.Info("GitOps service instance already exists, skipping creation") | |
| } else { | |
| reqLogger.Error(err, "Failed to create GitOps service instance") | |
| return err | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@controllers/gitopsservice_controller.go` around lines 100 - 106, The
GitOpsService bootstrap in the controller’s setup path only logs failures from
r.Client.Create and then continues, which lets SetupWithManager succeed even
when creation actually failed. Update the bootstrap flow around the Create call
in the GitopsService controller to return the non-AlreadyExists error to the
caller, while still treating errors.IsAlreadyExists as a non-fatal case. Make
sure the setup method that invokes this creation propagates the error so
operator startup fails fast on RBAC/API issues instead of leaving the singleton
uncreated.
There was a problem hiding this comment.
this change only adds more defined logs for users, previous behaviour should handle it.
| By("verifying that the openshift-gitops namespace no longer has the infra annotation") | ||
| Eventually(openshiftGitopsNS).Should(namespaceFixture.NotHaveAnnotation(common.InfraNodeSelectorAnnotation)) | ||
|
|
||
| clusterDepl = &appsv1.Deployment{ObjectMeta: metav1.ObjectMeta{Name: "cluster", Namespace: "openshift-gitops"}} | ||
| Eventually(clusterDepl).ShouldNot(deploymentFixture.HaveTemplateSpecNodeSelector(map[string]string{ | ||
| "node-role.kubernetes.io/infra": "", | ||
| "kubernetes.io/os": "linux", | ||
| })) | ||
| By("verifying that the resources in openshift-gitops no longer have run on infra tolerations set") | ||
|
|
||
| clusterDepl = &appsv1.Deployment{ObjectMeta: metav1.ObjectMeta{Name: "cluster", Namespace: "openshift-gitops"}} | ||
| Eventually(clusterDepl).ShouldNot(deploymentFixture.HaveTolerations([]corev1.Toleration{ | ||
| {Key: "infra", Effect: "NoSchedule", Value: "reserved"}}), | ||
| ) | ||
|
|
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
The disable-path check is now too narrow.
After RunOnInfra is turned off, this only verifies annotation removal and the cluster deployment. If the other openshift-gitops workloads keep the infra toleration, the test still passes. Please assert the full workload set here, the same way this test does on the enable path.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/openshift/e2e/ginkgo/sequential/1-028-validate_run_on_infra_test.go`
around lines 144 - 153, The disable-path assertion in the OpenShift GitOps e2e
test is too narrow because it only checks the namespace annotation and the
cluster deployment. Update the verification in the same test block that uses
openshiftGitopsNS and clusterDepl so it asserts the full openshift-gitops
workload set, matching the enable-path coverage by checking all relevant
deployments/workloads no longer have the infra toleration after RunOnInfra is
disabled.
Source: Path instructions
…o/infra=' Signed-off-by: Anand Kumar Singh <anandrkskd@gmail.com>
| labelUpdate, argocdNS := ensurePodSecurityLabels(argocdNS) | ||
| needsUpdate = needsUpdate || labelUpdate | ||
| annotationUpdate, argocdNS := ensureInfraNodeSelectorAnnotation(argocdNS, instance.Spec.RunOnInfra) | ||
| needsUpdate = needsUpdate || annotationUpdate |
There was a problem hiding this comment.
same logic as 266-270, can we make it as function ?
|
@anandrkskd: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What type of PR is this?
/kind enhancement
What does this PR do / why we need it:
This PR changes
RunOnInfrafield implementation it addsopenshift.io/node-selector="node-role.kubernetes.io/infra=“annotation on namespace and let openshift make sure the pods are scheduled on infra nodes. Previous implementation used to add node-role label to resources directly.Have you updated the necessary documentation?
Which issue(s) this PR fixes:
Fixes GITOPS-9926
Test acceptance criteria:
How to test changes / Special notes to the reviewer:
make install runspec.runOnInfra: trueopenshift.io/node-selector="node-role.kubernetes.io/infra=“