OCPNODE-4494: Testcase to test runc Upgrade case#31266

Draft
asahay19 wants to merge 1 commit into openshift:main from asahay19:4494

Conversation

asahay19 commented Jun 8, 2026

Contributor

This PR adds an end-to-end test that validates the MCO guard that blocks a RHCOS 9→10 OS stream transition when a MachineConfigPool has runc configured as the default container runtime. The PR also adds runc_upgrade_cases.md, which contains the test plan with all required metadata.

What the test does

  • Creates an isolated MachineConfigPool (runc-rhcos10-guard) pinned to spec.osImageStream: rhel-9 and a runc CRI-O drop-in MachineConfig
  • Labels one pure worker into the pool and waits for a healthy baseline rollout
  • Verifies the node is on RHCOS 9 with runc as the default runtime
  • Patches the pool's osImageStream to rhel-10
  • Asserts the guard fires: MCP Degraded=True + RenderDegraded=True with a message referencing runc and rhel-10
  • Confirms the node remains on RHCOS 9 with runc (rollout was blocked)
  • Cleans up: removes node label, waits for pool to drain to zero machines, deletes MC and MCP
  • Skips on: MicroShift, Hypershift
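
The guard assertion in the steps above can be sketched as follows. This is a minimal illustration, not the code in this PR: `poolCondition` and `guardFired` are hypothetical names, and the struct is a simplified stand-in for the MachineConfigPool status conditions the real test reads through the MCO client.

```go
package main

import (
	"fmt"
	"strings"
)

// poolCondition is a hypothetical, simplified stand-in for a
// MachineConfigPool status condition (assumption: not the real API type).
type poolCondition struct {
	Type    string
	Status  string
	Message string
}

// guardFired reports whether the conditions show the runc guard blocking the
// pool: Degraded=True and RenderDegraded=True, with the RenderDegraded
// message mentioning both "runc" and "rhel-10".
func guardFired(conds []poolCondition) bool {
	degraded, renderDegraded := false, false
	for _, c := range conds {
		if c.Status != "True" {
			continue
		}
		switch c.Type {
		case "Degraded":
			degraded = true
		case "RenderDegraded":
			renderDegraded = strings.Contains(c.Message, "runc") &&
				strings.Contains(c.Message, "rhel-10")
		}
	}
	return degraded && renderDegraded
}

func main() {
	conds := []poolCondition{
		{Type: "Degraded", Status: "True"},
		{
			Type:    "RenderDegraded",
			Status:  "True",
			Message: `MachineConfigPool runc-rhcos10-guard targets OS image stream "rhel-10" where runc is not available`,
		},
	}
	fmt.Println(guardFired(conds)) // prints true
}
```

Matching on substrings rather than the full message keeps the check tolerant of wording changes in the MCO error text, at the cost of being less strict.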

MCO change PR: openshift/machine-config-operator#5891

Tested locally with a custom MCO image built from the MCO PR above; the test passed:
./openshift-tests run-test "[Suite:openshift/disruptive-longrunning][sig-node][Serial][Disruptive] runc RHCOS 10 upgrade guard blocks upgrade of RHCOS 9 to 10 when default runtime is runc"

  ============================================
  Random Seed: 1781098173 - will randomize all specs

  Will run 1 of 1 specs
  ------------------------------
  [Suite:openshift/disruptive-longrunning][sig-node][Serial][Disruptive] runc RHCOS 10 upgrade guard blocks upgrade of RHCOS 9 to 10 when ContainerRuntimeConfig sets default runtime to runc
  github.com/openshift/origin/test/extended/node/runc_upgrade_cases.go:74
    STEP: Creating a kubernetes client @ 06/10/26 18:59:38.429
  I0610 18:59:38.430912   28909 discovery.go:214] Invalidating discovery information
  I0610 18:59:41.287958 28909 client.go:293] configPath is now "/var/folders/95/wktd6vvd57g5hgy7prd1sgmh0000gn/T/configfile3340602746"
  I0610 18:59:41.288021 28909 client.go:368] The user is now "e2e-test-runc-rhcos10-guard-l4h9r-user"
  I0610 18:59:41.288039 28909 client.go:370] Creating project "e2e-test-runc-rhcos10-guard-l4h9r"
  I0610 18:59:41.596911 28909 client.go:378] Waiting on permissions in project "e2e-test-runc-rhcos10-guard-l4h9r" ...
  I0610 18:59:42.545306 28909 client.go:407] DeploymentConfig capability is enabled, adding 'deployer' SA to the list of default SAs
  I0610 18:59:42.781600 28909 client.go:422] Waiting for ServiceAccount "default" to be provisioned...
  I0610 18:59:43.358224 28909 client.go:422] Waiting for ServiceAccount "builder" to be provisioned...
  I0610 18:59:43.927713 28909 client.go:422] Waiting for ServiceAccount "deployer" to be provisioned...
  I0610 18:59:44.500341 28909 client.go:432] Waiting for RoleBinding "system:image-pullers" to be provisioned...
  I0610 18:59:44.739219 28909 client.go:432] Waiting for RoleBinding "system:image-builders" to be provisioned...
  I0610 18:59:44.973297 28909 client.go:432] Waiting for RoleBinding "system:deployers" to be provisioned...
  I0610 18:59:46.166253 28909 client.go:469] Project "e2e-test-runc-rhcos10-guard-l4h9r" has been fully provisioned.
  I0610 18:59:46.170372 28909 framework.go:2330] [precondition-check] checking if cluster is MicroShift
  I0610 18:59:46.405904 28909 framework.go:2353] IsMicroShiftCluster: microshift-version configmap not found, not MicroShift
  I0610 18:59:47.109390 28909 runc_upgrade_cases.go:316] OSImageStream default="rhel-10" streams=[rhel-10 rhel-9]
    STEP: Labeling one worker into the custom pool @ 06/10/26 18:59:47.109
  I0610 18:59:47.819865 28909 runc_upgrade_cases.go:398] Labeled node ip-10-0-52-146.us-east-2.compute.internal with node-role.kubernetes.io/runc-rhcos10-guard
    STEP: Creating custom MachineConfigPool pinned to rhel-9 @ 06/10/26 18:59:47.819
    STEP: Creating ContainerRuntimeConfig that sets default runtime to runc for the custom pool @ 06/10/26 18:59:48.057
    STEP: Waiting for node to join the custom pool @ 06/10/26 18:59:48.296
  I0610 18:59:48.529168 28909 runc_upgrade_cases.go:429] MCP runc-rhcos10-guard waiting for machine count 1 (current 0)
  I0610 18:59:58.532087 28909 runc_upgrade_cases.go:426] MCP runc-rhcos10-guard machine count reached 1
    STEP: Waiting for pool rollout on rhel-9 with runc @ 06/10/26 18:59:58.532
  I0610 18:59:58.533212 28909 node_utils.go:522] Waiting for MCP runc-rhcos10-guard to be ready (timeout: 30m0s)...
  I0610 18:59:58.767138 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:00:08.771281 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:00:18.767127 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:00:28.770810 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:00:38.766299 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:00:48.768751 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:00:58.768087 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:01:08.769373 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:01:18.771215 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:01:28.769220 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:01:38.768343 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:01:48.767514 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:01:58.803288 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:02:08.772007 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:02:18.768793 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:02:28.770728 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:02:38.768422 28909 node_utils.go:564] MachineConfigPool runc-rhcos10-guard not ready yet: updating=true, ready=false, machines=0/1
  I0610 19:02:48.769024 28909 node_utils.go:561] MachineConfigPool runc-rhcos10-guard is ready: 1/1 machines ready
    STEP: Checking default runtime is runc on RHCOS 9 @ 06/10/26 19:02:48.769
    STEP: Upgrading RHCOS version to RHCOS 10 via osImageStream @ 06/10/26 19:02:55.544
  I0610 19:02:56.256629 28909 runc_upgrade_cases.go:512] MCP runc-rhcos10-guard waiting for runc+rhel-10 guard: degraded=false renderDegraded=false message=""
  I0610 19:03:06.256954 28909 runc_upgrade_cases.go:508] MCP runc-rhcos10-guard is degraded as expected: Failed to render configuration for pool runc-rhcos10-guard: MachineConfigPool runc-rhcos10-guard targets OS image stream "rhel-10" where runc is not available. To unblock, migrate to crun by removing any ContainerRuntimeConfig that sets defaultRuntime to runc, and removing any MachineConfig that sets default_runtime = "runc" in CRI-O configuration under /etc/crio/crio.conf.d/
    STEP: Verifying cluster upgrade is blocked via CO and CVO Upgradeable=False @ 06/10/26 19:03:06.257
  I0610 19:03:06.493276 28909 runc_upgrade_cases.go:165] waiting for ClusterOperator machine-config Upgradeable=False, current status=True
  I0610 19:03:16.969176 28909 runc_upgrade_cases.go:206] ClusterOperator machine-config reports Upgradeable=False (reason DegradedPool); ClusterVersion already Upgradeable=False on a non-upgradeable feature set
    STEP: Verifying node remains ready, not rolling out, on RHCOS 9 with runc after guard blocks rollout @ 06/10/26 19:03:16.969
  I0610 19:03:17.204793 28909 runc_upgrade_cases.go:269] Node ip-10-0-52-146.us-east-2.compute.internal is Ready and not rolling out MCO config (rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46)
    STEP: Recovering pool by setting osImageStream back to rhel-9 @ 06/10/26 19:03:23.628
  I0610 19:03:24.347021 28909 runc_upgrade_cases.go:476] MCP runc-rhcos10-guard recovery in progress: degraded=true renderDegraded=true updating=false updated=true machines=1/1
  I0610 19:03:34.346099 28909 runc_upgrade_cases.go:476] MCP runc-rhcos10-guard recovery in progress: degraded=true renderDegraded=false updating=false updated=true machines=1/1
  I0610 19:03:44.345580 28909 runc_upgrade_cases.go:474] MCP runc-rhcos10-guard recovered: 1/1 machines ready
    STEP: Verifying node remains ready, not rolling out, on RHCOS 9 with runc after recovery @ 06/10/26 19:03:44.345
  I0610 19:03:44.581161 28909 runc_upgrade_cases.go:269] Node ip-10-0-52-146.us-east-2.compute.internal is Ready and not rolling out MCO config (rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46)
  I0610 19:03:50.946796 28909 client.go:689] Deleted {user.openshift.io/v1, Resource=users  e2e-test-runc-rhcos10-guard-l4h9r-user}, err: <nil>
  I0610 19:03:51.186287 28909 client.go:689] Deleted {oauth.openshift.io/v1, Resource=oauthclients  e2e-client-e2e-test-runc-rhcos10-guard-l4h9r}, err: <nil>
  I0610 19:03:51.427324 28909 client.go:689] Deleted {oauth.openshift.io/v1, Resource=oauthaccesstokens  sha256~tWa-UUGD9Smz2iuTtKnolopZ0pQW3t2BUnfMF0Q1KHA}, err: <nil>
  I0610 19:03:52.607144 28909 runc_upgrade_cases.go:429] MCP runc-rhcos10-guard waiting for machine count 0 (current 1)
  I0610 19:04:02.609102 28909 runc_upgrade_cases.go:426] MCP runc-rhcos10-guard machine count reached 0
  I0610 19:04:02.842461 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:04:13.077751 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:04:23.074983 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:04:33.071528 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:04:43.079118 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:04:53.074058 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:05:03.077638 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:05:13.074225 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:05:23.073158 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:05:33.073196 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:05:43.074853 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:05:53.075259 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:06:03.076921 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:06:13.075494 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:06:23.074745 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:06:33.077765 28909 runc_upgrade_cases.go:236] Node ip-10-0-52-146.us-east-2.compute.internal waiting for worker rollback: current="rendered-runc-rhcos10-guard-0fc3ccf9f7cf9db328bdd5ae8b38ef46" desired="rendered-worker-89ea806348f73bebafe0707724fde161"
  I0610 19:06:43.337398 28909 node_utils.go:522] Waiting for MCP worker to be ready (timeout: 30m0s)...
  I0610 19:06:43.574108 28909 node_utils.go:561] MachineConfigPool worker is ready: 3/3 machines ready
    STEP: Destroying namespace "e2e-test-runc-rhcos10-guard-l4h9r" for this suite. @ 06/10/26 19:06:43.575
  • [425.409 seconds]
  ------------------------------

  Ran 1 of 1 Specs in 425.411 seconds
  SUCCESS! -- 1 Passed | 0 Failed | 0 Pending | 0 Skipped
[
  {
    "name": "[Suite:openshift/disruptive-longrunning][sig-node][Serial][Disruptive] runc RHCOS 10 upgrade guard blocks upgrade of RHCOS 9 to 10 when ContainerRuntimeConfig sets default runtime to runc",
    "lifecycle": "blocking",
    "duration": 425411,
    "startTime": "2026-06-10 13:29:38.401773 UTC",
    "endTime": "2026-06-10 13:36:43.813577 UTC",
    "result": "passed",

Summary by CodeRabbit

  • Tests

    • Added a disruptive serial e2e test suite validating the RHCOS 9→10 upgrade guard when the cluster’s default container runtime is set to runc. Verifies pool-scoped blocking, node OS/runtime persistence, ClusterOperator/ClusterVersion signals, and includes environment skip rules.
  • Documentation

    • Added a detailed test plan for the upgrade-guard scenario with prerequisites, skip conditions, step flow, pass/fail criteria, and cleanup guidance.

@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: automatic mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 8, 2026
openshift-ci-robot commented Jun 8, 2026

@asahay19: This pull request references OCPNODE-4494 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jun 8, 2026
openshift-ci Bot commented Jun 8, 2026

Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

coderabbitai Bot commented Jun 8, 2026

Walkthrough

Adds a disruptive Ginkgo node e2e that verifies MCO blocks an OSImageStream move rhel-9 → rhel-10 when CRI‑O default runtime is runc, with helpers to create/label/update/poll/verify/recover a dedicated MachineConfigPool and ContainerRuntimeConfig plus a companion test plan.

Changes

runc RHCOS 10 upgrade guard test

| Layer | File(s) | Summary |
|---|---|---|
| Suite setup, gating, and OSImageStream preflight | test/extended/node/runc_upgrade_cases.go | Ginkgo disruptive/serial suite, BeforeEach gating (skip MicroShift/Hypershift/SNO/topology), AfterEach ordering, and requireOSImageStreams check for rhel-9 and rhel-10. |
| MCP and ContainerRuntimeConfig creation + node labeling | test/extended/node/runc_upgrade_cases.go | Create pool-specific MachineConfigPool pinned to rhel-9, create pool-targeted ContainerRuntimeConfig with default_runtime: runc, and label a randomly chosen pure worker into the pool (idempotent/AlreadyExists-tolerant). |
| Guard detection, MCP patching and recovery polling | test/extended/node/runc_upgrade_cases.go | Patch MCP spec.OSImageStream to rhel-10, poll for Degraded+RenderDegraded (render message must reference runc and rhel-10), assert machine-config CO shows Upgradeable=False (not Degraded) and ClusterVersion remains Available, then revert and poll for MCP recovery and node rollback. |
| Node verification helpers | test/extended/node/runc_upgrade_cases.go | On-node checks: verify CRI-O default runtime drop-in contains runc, extract RHEL major version from /etc/os-release, and verify node Ready with matching MCO current/desired config (no ongoing rollout). |
| Pool OSImageStream set and deletion helpers | test/extended/node/runc_upgrade_cases.go | Helper to set MCP spec.OSImageStream and deletion helpers for CRC and MCP that treat NotFound as success. |
| Test documentation | test/extended/node/runc_upgrade_cases.md | UC-1 test plan: scope/skip conditions, step flow diagram, detailed pass/fail criteria, run command, CI lane suggestions, and helper mapping. |
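
The "extract RHEL major version from /etc/os-release" helper summarized above could look roughly like this. It is a sketch under stated assumptions, not the helper from the PR: `rhelMajorVersion` is a hypothetical name, the `RHEL_VERSION` key is an assumption about what the node's os-release exposes, and the sample content in `main` is illustrative rather than captured from a real node.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// rhelMajorVersion scans os-release style KEY=VALUE content for key and
// returns the integer before the first dot of its value.
func rhelMajorVersion(osRelease, key string) (int, error) {
	scanner := bufio.NewScanner(strings.NewReader(osRelease))
	for scanner.Scan() {
		k, v, found := strings.Cut(strings.TrimSpace(scanner.Text()), "=")
		if !found || k != key {
			continue
		}
		v = strings.Trim(v, `"'`) // os-release values may be quoted
		major, _, _ := strings.Cut(v, ".")
		n, err := strconv.Atoi(major)
		if err != nil {
			return 0, fmt.Errorf("parsing %s=%q: %w", key, v, err)
		}
		return n, nil
	}
	if err := scanner.Err(); err != nil {
		return 0, err
	}
	return 0, fmt.Errorf("%s not found in os-release content", key)
}

func main() {
	// Illustrative content only; the key name is an assumption.
	sample := "NAME=\"Red Hat Enterprise Linux CoreOS\"\nRHEL_VERSION=9.6\n"
	major, err := rhelMajorVersion(sample, "RHEL_VERSION")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(major) // prints 9
}
```

Returning an error when the key is missing, instead of defaulting to 0, makes a misread file fail the test loudly rather than pass a "still on RHCOS 9" check by accident.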

Sequence Diagram(s)

sequenceDiagram
  participant TestRunner
  participant API_Server
  participant MachineConfigOperator
  participant Node
  participant ClusterVersion
  TestRunner->>API_Server: create MachineConfigPool (rhel-9) & ContainerRuntimeConfig (runc)
  API_Server->>MachineConfigOperator: notify new MCP/MC
  MachineConfigOperator->>Node: render and apply ignition + runc drop-in
  Node-->>MachineConfigOperator: report currentConfig and OS version (rhel-9)
  TestRunner->>API_Server: patch MCP.spec.OSImageStream -> rhel-10
  API_Server->>MachineConfigOperator: new desired OSImageStream (rhel-10)
  MachineConfigOperator->>MachineConfigOperator: detect runc + rhel-10 -> set Degraded & RenderDegraded
  MachineConfigOperator->>ClusterVersion: ensure CV remains Available (no Progressing/Degraded)
  TestRunner->>API_Server: patch MCP.spec.OSImageStream -> rhel-9 (recovery)
  MachineConfigOperator->>Node: reconcile to rhel-9, clear degraded

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested reviewers

  • BhargaviGudi
  • deads2k
🚥 Pre-merge checks | ✅ 14 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 15.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (14 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly references the Jira ticket (OCPNODE-4494) and accurately describes the main addition: a test case validating the runc upgrade guard behavior during RHCOS 9→10 transitions. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All Ginkgo test titles in runc_upgrade_cases.go are stable and deterministic, containing only static descriptive strings with no dynamic identifiers, timestamps, UUIDs, or node names. |
| Test Structure And Quality | ✅ Passed | Test has single It block, proper BeforeEach/AfterEach, timeouts on all cluster operations, and meaningful assertion messages consistent with codebase patterns. |
| Microshift Test Compatibility | ✅ Passed | Test is protected from MicroShift: BeforeEach block (lines 51-59) contains exutil.IsMicroShiftCluster() check with g.Skip(), runs before all It() blocks in the Describe wrapper. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | Test is protected from SNO via explicit SingleReplicaTopologyMode check with g.Skip() in BeforeEach (line 67-69), and requires pure worker nodes unavailable on SNO. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR adds test code (Ginkgo e2e test), not deployment/operator code. Test includes topology-aware skip conditions for SNO, Hypershift, and MicroShift. No scheduling constraints defined. |
| Ote Binary Stdout Contract | ✅ Passed | File runc_upgrade_cases.go has no process-level stdout writes: no main/init/TestMain functions, uses framework.Logf, no klog/log packages, properly registers Ginkgo v2 suite via var _. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | Test uses cluster-internal Kubernetes APIs only. No hardcoded IPv4 addresses, no IPv4-only patterns, no external connectivity requirements. All node inspection via local file reads. |
| No-Weak-Crypto | ✅ Passed | No weak cryptography (MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB), custom crypto implementations, or unsafe secret comparisons detected in the added test files. |
| Container-Privileges | ✅ Passed | Files added contain no Kubernetes manifests or container specs with privileged: true, hostPID/hostNetwork/hostIPC, SYS_ADMIN capability, or allowPrivilegeEscalation: true. |
| No-Sensitive-Data-In-Logs | ✅ Passed | No sensitive data (passwords, tokens, API keys, PII, or credentials) exposed in logs. Test logs only cluster status, node names, version info, and configuration values. |

openshift-ci Bot commented Jun 8, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: asahay19
Once this PR has been reviewed and has the lgtm label, please assign cpmeadors for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai Bot left a comment

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/extended/node/runc_upgrade_cases.go`:
- Around line 59-63: The test fails on SingleReplica (SNO) clusters because it
requires a “pure worker”; update the preflight topology check that calls
exutil.GetControlPlaneTopology (variable controlPlaneTopology) to also detect
configv1.SingleReplicaTopologyMode and call g.Skip("Skipping on single-replica
(SNO) cluster") before attempting to select a pure worker. Make the same change
in the other preflight/topology-check sites that use
exutil.GetControlPlaneTopology or perform pure-worker selection (the other
occurrences referenced in the review) so the test is skipped early on
SingleReplica clusters.
- Around line 99-107: The AfterEach currently ignores all error returns from
cleanup calls (removeNodeLabel, waitForMCPMachineCount, deleteMachineConfig,
deleteMachineConfigPool) which can leave test state dirty; update AfterEach to
capture each error into a variable and assert failure instead of swallowing it
(e.g., err := removeNodeLabel(...); Expect(err).NotTo(HaveOccurred())) for each
call that uses nodeName, oc, mcClient, runcRHCOS10GuardPool, and runcGuardMCName
(and similarly for
waitForMCPMachineCount/deleteMachineConfig/deleteMachineConfigPool) so any
cleanup failure fails the test and surfaces the underlying error.

In `@test/extended/node/runc_upgrade_cases.md`:
- Around line 51-53: Add a language tag to the fenced code block containing the
test declaration g.It("blocks upgrade of RHCOS 9 to 10 when default runtime is
runc") — replace the opening triple backticks with a language-tagged fence
(e.g., ```go) so the block reads as a Go snippet and satisfies MD040.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 02321dc4-fc08-4828-8271-c00e13720ac4

📥 Commits

Reviewing files that changed from the base of the PR and between c0f50ac and 0c371a9.

📒 Files selected for processing (2)
  • test/extended/node/runc_upgrade_cases.go
  • test/extended/node/runc_upgrade_cases.md

Comment thread test/extended/node/runc_upgrade_cases.go
Comment thread test/extended/node/runc_upgrade_cases.go
Comment thread test/extended/node/runc_upgrade_cases.md Outdated
openshift-ci Bot commented Jun 8, 2026

Contributor

@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

openshift-ci Bot commented Jun 8, 2026

Contributor

@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-ci-5.0-e2e-gcp-ovn

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/de265370-633b-11f1-9c47-870e6f6dcba6-0

openshift-ci Bot commented Jun 8, 2026

Contributor

@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/67ece2a0-634f-11f1-8aab-94564898c66c-0

Comment thread test/extended/node/runc_upgrade_cases.go
Comment thread test/extended/node/runc_upgrade_cases.go
@openshift-ci openshift-ci Bot added the ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review label Jun 9, 2026
openshift-ci Bot commented Jun 9, 2026

Contributor

@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-1of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/b9a6bab0-63dd-11f1-828a-621d5ba8e722-0

openshift-ci Bot commented Jun 9, 2026

Contributor

@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/c1104ff0-63dd-11f1-9f15-5905ea882eb2-0

@asahay19 asahay19 force-pushed the 4494 branch 2 times, most recently from 07d64fb to 2db0de0 Compare June 9, 2026 09:14
Comment thread test/extended/node/runc_upgrade_cases.go Outdated
Comment thread test/extended/node/runc_upgrade_cases.go Outdated
Comment thread test/extended/node/runc_upgrade_cases.go Outdated
Comment thread test/extended/node/runc_upgrade_cases.go Outdated
Comment thread test/extended/node/runc_upgrade_cases.go
openshift-ci Bot commented Jun 9, 2026

Contributor

@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/8921abf0-6407-11f1-9686-c8cdaefe3b6e-0


// verifyClusterVersionUnaffectedByIsolatedPoolGuard checks that a render failure on an isolated
// custom MCP does not degrade the cluster-wide machine-config operator or ClusterVersion.
// The guard is pool-scoped; worker/master pools remain healthy.

Contributor

Did you confirm it? If so we may want to do some additional propagation.
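
A minimal sketch of how the pool-scoped claim in the quoted comment might be confirmed, assuming the check reduces to comparing two condition lists. The `Condition` type and the `hasCondition` and `guardIsPoolScoped` helpers are hypothetical stand-ins, not the PR's code or the MCO API types; only the condition names `Degraded` and `RenderDegraded` come from the PR description.

```go
package main

import "fmt"

// Condition is a hypothetical, simplified stand-in for a status condition
// reported on a MachineConfigPool or on the cluster-wide operator.
type Condition struct {
	Type   string
	Status string
}

// hasCondition reports whether conds contains condType with the given status.
func hasCondition(conds []Condition, condType, status string) bool {
	for _, c := range conds {
		if c.Type == condType && c.Status == status {
			return true
		}
	}
	return false
}

// guardIsPoolScoped returns true when the isolated pool reports
// RenderDegraded=True while the cluster-wide operator is not Degraded=True.
func guardIsPoolScoped(pool, operator []Condition) bool {
	return hasCondition(pool, "RenderDegraded", "True") &&
		!hasCondition(operator, "Degraded", "True")
}

func main() {
	// Illustrative condition sets only; real values would be read from the cluster.
	pool := []Condition{
		{Type: "Degraded", Status: "True"},
		{Type: "RenderDegraded", Status: "True"},
	}
	operator := []Condition{
		{Type: "Available", Status: "True"},
		{Type: "Degraded", Status: "False"},
	}
	fmt.Println("guard is pool-scoped:", guardIsPoolScoped(pool, operator))
}
```

In a real test the two slices would be populated from the custom pool's status and the machine-config ClusterOperator status, and the comparison would likely be repeated over a short window so that delayed propagation is not missed.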

@openshift-ci

openshift-ci Bot commented Jun 10, 2026

Contributor

@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/38c35e00-64b2-11f1-8870-ba2c7e80b8b9-0

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
test/extended/node/runc_upgrade_cases.go (1)

300-317: ⚡ Quick win

Eliminate duplicate OSImageStream Get call.

Lines 301 and 307 both call Get(ctx, "cluster", ...) on OSImageStreams. The first result is discarded; the second is used. Capture the first result and reuse it to avoid the redundant API call.

♻️ Suggested refactor
 func requireOSImageStreams(ctx context.Context, mcClient *machineconfigclient.Clientset) {
-	_, err := mcClient.MachineconfigurationV1().OSImageStreams().Get(ctx, "cluster", metav1.GetOptions{})
+	osi, err := mcClient.MachineconfigurationV1().OSImageStreams().Get(ctx, "cluster", metav1.GetOptions{})
 	if apierrors.IsNotFound(err) {
 		g.Skip("OSImageStream API is not available; enable TechPreviewNoUpgrade / OSStreams on the cluster")
 	}
 	o.Expect(err).NotTo(o.HaveOccurred())
 
-	osi, err := mcClient.MachineconfigurationV1().OSImageStreams().Get(ctx, "cluster", metav1.GetOptions{})
-	o.Expect(err).NotTo(o.HaveOccurred())
-
 	streamNames := make([]string, 0, len(osi.Status.AvailableStreams))
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/extended/node/runc_upgrade_cases.go` around lines 300 - 317, The
function requireOSImageStreams currently calls
mcClient.MachineconfigurationV1().OSImageStreams().Get twice; capture the first
Get result (osi, err :=
mcClient.MachineconfigurationV1().OSImageStreams().Get(ctx, "cluster",
metav1.GetOptions{})), use that err for the apierrors.IsNotFound check and
subsequent assertions, and remove the redundant second Get call so the osi
variable is reused for building streamNames and logging.
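
The single-call pattern described above can be sketched in a self-contained form. This is an illustration under stated assumptions, not the PR's code: `streamGetter`, `countingGetter`, `errNotFound`, and `availableStreams` are hypothetical names that stand in for the OSImageStreams client, a test double, the API not-found error, and `requireOSImageStreams` respectively.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical stand-in for an API "not found" error.
var errNotFound = errors.New("not found")

// streamGetter is a hypothetical, narrowed view of the OSImageStreams client.
type streamGetter interface {
	Get(name string) ([]string, error)
}

// countingGetter is a test double that records how many times Get is called.
type countingGetter struct {
	calls   int
	streams []string
	err     error
}

func (c *countingGetter) Get(name string) ([]string, error) {
	c.calls++
	return c.streams, c.err
}

// availableStreams issues exactly one Get and reuses its result for both the
// not-found check and the returned stream list.
func availableStreams(g streamGetter) (streams []string, skip bool, err error) {
	streams, err = g.Get("cluster")
	if errors.Is(err, errNotFound) {
		// The caller would skip the test here instead of failing it.
		return nil, true, nil
	}
	if err != nil {
		return nil, false, err
	}
	return streams, false, nil
}

func main() {
	g := &countingGetter{streams: []string{"rhel-9", "rhel-10"}}
	streams, skip, err := availableStreams(g)
	fmt.Println(streams, skip, err, "calls:", g.calls)
}
```

Counting calls in the double makes the point of the refactor checkable: the function reaches the API once, whether the resource exists or not.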

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 24c22f72-b3ec-490f-9849-a3439e591669

📥 Commits

Reviewing files that changed from the base of the PR and between f6fc984 and c79f168.

📒 Files selected for processing (2)
  • test/extended/node/runc_upgrade_cases.go
  • test/extended/node/runc_upgrade_cases.md
✅ Files skipped from review due to trivial changes (1)
  • test/extended/node/runc_upgrade_cases.md

@openshift-ci

openshift-ci Bot commented Jun 10, 2026

Contributor

@bitoku: This PR was included in a payload test run from openshift/machine-config-operator#5891
trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/1af69240-64d3-11f1-991c-fc8c3045c5fc-0

@bitoku

bitoku commented Jun 11, 2026

Contributor

/payload-job-with-prs periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2 openshift/machine-config-operator#5891

@openshift-ci

openshift-ci Bot commented Jun 11, 2026

Contributor

@bitoku: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/289b6a20-657c-11f1-8b2c-d2cc517bfd33-0

@bitoku

bitoku commented Jun 11, 2026

Contributor

/payload-job periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2

@openshift-ci

openshift-ci Bot commented Jun 11, 2026

Contributor

@bitoku: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command

  • periodic-ci-openshift-release-main-nightly-5.0-e2e-aws-disruptive-longrunning-techpreview-2of2

See details on https://pr-payload-tests.ci.openshift.org/runs/ci/6dac7590-65b9-11f1-8109-d881c0e1b806-0

Labels

  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • ready-for-human-review: Indicates a PR has been reviewed by automated tools and is ready for human review.

3 participants