OCPNODE-4494: Testcase to test runc Upgrade case#31266
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: automatic mode |
|
@asahay19: This pull request references OCPNODE-4494 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Skipping CI for Draft Pull Request. |
WalkthroughAdds a disruptive Ginkgo node e2e that verifies MCO blocks an OSImageStream move rhel-9 → rhel-10 when CRI‑O default runtime is runc, with helpers to create/label/update/poll/verify/recover a dedicated MachineConfigPool and ContainerRuntimeConfig plus a companion test plan. Changesrunc RHCOS 10 upgrade guard test
Sequence Diagram(s)sequenceDiagram
participant TestRunner
participant API_Server
participant MachineConfigOperator
participant Node
participant ClusterVersion
TestRunner->>API_Server: create MachineConfigPool (rhel-9) & ContainerRuntimeConfig (runc)
API_Server->>MachineConfigOperator: notify new MCP/MC
MachineConfigOperator->>Node: render and apply ignition + runc drop-in
Node-->>MachineConfigOperator: report currentConfig and OS version (rhel-9)
TestRunner->>API_Server: patch MCP.spec.OSImageStream -> rhel-10
API_Server->>MachineConfigOperator: new desired OSImageStream (rhel-10)
MachineConfigOperator->>MachineConfigOperator: detect runc + rhel-10 -> set Degraded & RenderDegraded
MachineConfigOperator->>ClusterVersion: ensure CV remains Available (no Progressing/Degraded)
TestRunner->>API_Server: patch MCP.spec.OSImageStream -> rhel-9 (recovery)
MachineConfigOperator->>Node: reconcile to rhel-9, clear degraded
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Suggested reviewers
🚥 Pre-merge checks | ✅ 14 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (14 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: asahay19 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/extended/node/runc_upgrade_cases.go`:
- Around line 59-63: The test fails on SingleReplica (SNO) clusters because it
requires a “pure worker”; update the preflight topology check that calls
exutil.GetControlPlaneTopology (variable controlPlaneTopology) to also detect
configv1.SingleReplicaTopologyMode and call g.Skip("Skipping on single-replica
(SNO) cluster") before attempting to select a pure worker. Make the same change
in the other preflight/topology-check sites that use
exutil.GetControlPlaneTopology or perform pure-worker selection (the other
occurrences referenced in the review) so the test is skipped early on
SingleReplica clusters.
- Around line 99-107: The AfterEach currently ignores all error returns from
cleanup calls (removeNodeLabel, waitForMCPMachineCount, deleteMachineConfig,
deleteMachineConfigPool) which can leave test state dirty; update AfterEach to
capture each error into a variable and assert failure instead of swallowing it
(e.g., err := removeNodeLabel(...); Expect(err).NotTo(HaveOccurred())) for each
call that uses nodeName, oc, mcClient, runcRHCOS10GuardPool, and runcGuardMCName
(and similarly for
waitForMCPMachineCount/deleteMachineConfig/deleteMachineConfigPool) so any
cleanup failure fails the test and surfaces the underlying error.
In `@test/extended/node/runc_upgrade_cases.md`:
- Around line 51-53: Add a language tag to the fenced code block containing the
test declaration g.It("blocks upgrade of RHCOS 9 to 10 when default runtime is
runc") — replace the opening triple backticks with a language-tagged fence
(e.g., ```go) so the block reads as a Go snippet and satisfies MD040.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 02321dc4-fc08-4828-8271-c00e13720ac4
📒 Files selected for processing (2)
test/extended/node/runc_upgrade_cases.gotest/extended/node/runc_upgrade_cases.md
|
@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891 |
|
@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/de265370-633b-11f1-9c47-870e6f6dcba6-0 |
|
@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/67ece2a0-634f-11f1-8aab-94564898c66c-0 |
|
@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/b9a6bab0-63dd-11f1-828a-621d5ba8e722-0 |
|
@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c1104ff0-63dd-11f1-9f15-5905ea882eb2-0 |
07d64fb to
2db0de0
Compare
|
@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8921abf0-6407-11f1-9686-c8cdaefe3b6e-0 |
|
|
||
| // verifyClusterVersionUnaffectedByIsolatedPoolGuard checks that a render failure on an isolated | ||
| // custom MCP does not degrade the cluster-wide machine-config operator or ClusterVersion. | ||
| // The guard is pool-scoped; worker/master pools remain healthy. |
There was a problem hiding this comment.
Did you confirm it? If so we may want to do some additional propagation.
|
@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/38c35e00-64b2-11f1-8870-ba2c7e80b8b9-0 |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/extended/node/runc_upgrade_cases.go (1)
300-317: ⚡ Quick winEliminate duplicate OSImageStream Get call.
Lines 301 and 307 both call
Get(ctx, "cluster", ...)on OSImageStreams. The first result is discarded; the second is used. Capture the first result and reuse it to avoid the redundant API call.♻️ Suggested refactor
func requireOSImageStreams(ctx context.Context, mcClient *machineconfigclient.Clientset) { - _, err := mcClient.MachineconfigurationV1().OSImageStreams().Get(ctx, "cluster", metav1.GetOptions{}) + osi, err := mcClient.MachineconfigurationV1().OSImageStreams().Get(ctx, "cluster", metav1.GetOptions{}) if apierrors.IsNotFound(err) { g.Skip("OSImageStream API is not available; enable TechPreviewNoUpgrade / OSStreams on the cluster") } o.Expect(err).NotTo(o.HaveOccurred()) - osi, err := mcClient.MachineconfigurationV1().OSImageStreams().Get(ctx, "cluster", metav1.GetOptions{}) - o.Expect(err).NotTo(o.HaveOccurred()) - streamNames := make([]string, 0, len(osi.Status.AvailableStreams))🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/extended/node/runc_upgrade_cases.go` around lines 300 - 317, The function requireOSImageStreams currently calls mcClient.MachineconfigurationV1().OSImageStreams().Get twice; capture the first Get result (osi, err := mcClient.MachineconfigurationV1().OSImageStreams().Get(ctx, "cluster", metav1.GetOptions{})), use that err for the apierrors.IsNotFound check and subsequent assertions, and remove the redundant second Get call so the osi variable is reused for building streamNames and logging.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@test/extended/node/runc_upgrade_cases.go`:
- Around line 300-317: The function requireOSImageStreams currently calls
mcClient.MachineconfigurationV1().OSImageStreams().Get twice; capture the first
Get result (osi, err :=
mcClient.MachineconfigurationV1().OSImageStreams().Get(ctx, "cluster",
metav1.GetOptions{})), use that err for the apierrors.IsNotFound check and
subsequent assertions, and remove the redundant second Get call so the osi
variable is reused for building streamNames and logging.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 24c22f72-b3ec-490f-9849-a3439e591669
📒 Files selected for processing (2)
test/extended/node/runc_upgrade_cases.gotest/extended/node/runc_upgrade_cases.md
✅ Files skipped from review due to trivial changes (1)
- test/extended/node/runc_upgrade_cases.md
|
@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/1af69240-64d3-11f1-991c-fc8c3045c5fc-0 |
|
/payload-job-with-prs periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2 openshift/machine-config-operator#5891 |
|
@bitoku: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/289b6a20-657c-11f1-8b2c-d2cc517bfd33-0 |
|
/payload-job periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2 |
|
@bitoku: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/6dac7590-65b9-11f1-8109-d881c0e1b806-0 |
This PR adds an end-to-end test that validates MCO's guard blocking a RHCOS 9→10 OS stream transition when a MachineConfigPool has runc configured as the default container runtime. Also this PR contains runc_upgrade_cases.md file which contains the Test Plan with all the required meta data.
What the test does
MCO change PR: openshift/machine-config-operator#5891
Locally tested with my custom mco image against the above mco PR and it got passed:
./openshift-tests run-test "[Suite:openshift/disruptive-longrunning][sig-node][Serial][Disruptive] runc RHCOS 10 upgrade guard blocks upgrade of RHCOS 9 to 10 when default runtime is runc"Summary by CodeRabbit
Tests
Documentation