Fix critical SSH connection leak and task deduplication false-positive by kartikbhartiya · Pull Request #1353 · openwisp/openwisp-controller

kartikbhartiya · 2026-05-14T01:53:47Z

PR: Fix critical SSH connection leak and task deduplication false-positive

Summary

This PR fixes two critical bugs in the connection module that can cause SSH connection exhaustion on managed devices and silently dropped configuration updates.

Bug 1: SSH connection leak in `DeviceConnection.update_config()`

File: openwisp_controller/connection/base/models.py

Severity: 🔴 Critical — causes resource exhaustion on managed devices

Problem

When connector_instance.update_config() raises an exception (e.g. SSH command failure, network timeout), disconnect() is never called because it was placed in the else branch of the try/except:

# BEFORE (broken)
def update_config(self):
    self.connect()
    if self.is_working:
        try:
            self.connector_instance.update_config()
        except Exception as e:
            logger.exception(e)
        else:                    # ← only runs on SUCCESS
            self.disconnect()    # ← connection leaked on failure

Every failed config push leaks an open SSH connection. Over time, this exhausts the SSH connection limit (MaxSessions / MaxStartups) on managed OpenWrt devices, causing all subsequent SSH operations to fail — including manual access by administrators.

Fix

Move disconnect() from else to finally so the connection is always cleaned up:

-            else:
+            finally:
                 self.disconnect()

Bug 2: False-positive in concurrent `update_config` task deduplication

File: openwisp_controller/connection/tasks.py

Severity: 🔴 Critical — silently drops legitimate configuration updates

Problem

The _is_update_in_progress() function checks whether another Celery task is already updating a device's config. It uses a substring match to find the device ID in the task's args:

# BEFORE (broken)
str(device_id) in task["args"]   # ← substring match on string repr!

task["args"] is a string representation like "('550e8400-e29b-41d4-a716-446655440000',)". The in operator performs a substring search, meaning:

Device "4466" would match a task running for device "550e8400-e29b-41d4-a716-446655440000" because "4466" appears as a substring
This causes _is_update_in_progress() to return True incorrectly
The config update for the wrong device is silently dropped

This is especially dangerous with UUID-based IDs where partial hex collisions are plausible at scale.

Fix

Parse the args list and compare device IDs with strict equality instead of substring matching:

+    device_id_str = str(device_id)
     ...
-                and str(device_id) in task["args"]
+                and device_id_str in [str(arg) for arg in task.get("args", [])]

Files Changed

File	Change
models.py	`else` → `finally` for `disconnect()` in `update_config()`
tasks.py	Substring match → exact element match in `_is_update_in_progress()`

Testing

Both fixes are backward-compatible and require no migration. Existing tests should continue to pass since:

The disconnect() call now runs in more scenarios (superset of previous behavior)
The task deduplication check is now more precise (fewer false positives)

coderabbitai · 2026-05-14T01:54:02Z

📝 Walkthrough

Walkthrough

This PR contains two targeted fixes to the connection management system. The first change ensures that device connections are properly disconnected even when configuration updates fail, by moving the disconnect call into a finally block. The second change improves task detection logic to correctly identify concurrent update_config tasks by comparing stringified device identifiers against stringified task arguments, rather than relying on direct containment checks.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

openwisp/openwisp-controller#1292: Both PRs modify openwisp_controller/connection/tasks.py, specifically the _is_update_in_progress duplicate-detection logic around Celery update_config tasks.

Suggested reviewers

nemesifier

Caution

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

Ignore

❌ Failed checks (1 error, 1 warning, 1 inconclusive)

Check name	Status	Explanation	Resolution
Bug Fixes	❌ Error	Regression tests for both bugs are insufficient. Bug 1 lacks any test validating disconnect is called on exception. Bug 2's tests don't reproduce the UUID substring collision scenario.	Add test mocking connector_instance.update_config raising exception and verify disconnect is called. Add test for _is_update_in_progress with UUID substring collision (one ID substring of another).
Title check	⚠️ Warning	The pull request title lacks the required [type] prefix (e.g., [fix], [feature]) as specified in the repository's title requirements.	Update the title to follow the format: [fix] Fix critical SSH connection leak and task deduplication false-positive
Description check	❓ Inconclusive	The pull request description is comprehensive and well-structured, covering the problem, fix, and testing approach, though it does not reference a specific issue number as required by the template.	Add a 'Reference to Existing Issue' section with 'Closes #' to complete the required template checklist format.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Tip

💬 Introducing Slack Agent: The best way for teams to turn conversations into code.

Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.

Generate code and open pull requests
Plan features and break down work
Investigate incidents and troubleshoot customer tickets together
Automate recurring tasks and respond to alerts with triggers
Summarize progress and report instantly

Built for teams:

Shared memory across your entire org—no repeating context
Per-thread sandboxes to safely plan and execute work
Governance built-in—scoped access, auditability, and budget controls

One agent for your entire SDLC. Right inside Slack.

👉 Get started

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

Copilot

Pull request overview

This PR addresses two bugs in the connection module: an SSH connection leak when update_config() fails, and a brittleness in the concurrent-task deduplication check.

Changes:

In AbstractDeviceConnection.update_config(), move self.disconnect() from the else branch to a finally branch so the connection is released even when the connector raises.
In _is_update_in_progress(), replace str(device_id) in task["args"] with an explicit element-wise comparison against stringified args, also defensively using task.get("args", []).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File	Description
openwisp_controller/connection/base/models.py	Switch `else` to `finally` to ensure `disconnect()` always runs after `update_config()` attempts.
openwisp_controller/connection/tasks.py	Use list membership of stringified args instead of a containment check on `task["args"]` to avoid false positives.

Notes on the tasks.py change: the existing tests in tests/test_tasks.py mock args as a list (e.g. "args": [device_id]), so the previous str(device_id) in task["args"] was already doing list-membership (not substring matching) under tests. The PR description's example of substring matching on a string-repr only applies if Celery returns args as a string in some configurations. The new code is still strictly safer and remains compatible with the existing tests, but the description slightly overstates the original bug for the in-repo behavior.

The models.py change is a clear, well-scoped fix. Both changes are minimal and backward compatible.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

openwisp-companion · 2026-05-14T01:55:56Z

Dependency Conflict in `openwisp-controller`

Hello @kartikbhartiya,
(Analysis for commit 5b0138a)

There is a dependency conflict between openwisp-controller and openwisp-ipam due to incompatible versions of django-reversion.

openwisp-controller requires django-reversion~=6.0.0.
openwisp-ipam requires django-reversion~=6.1.0.

To resolve this, please update the version specifier for django-reversion in your setup.py or requirements.txt to a version that satisfies both dependencies, or remove the specific version constraints to allow pip to find a compatible version.

kilo-code-bot · 2026-05-14T01:56:04Z

Code Review Summary

Status: 2 Issues Found | Recommendation: Address before merge

Overview

Severity	Count
CRITICAL	2
WARNING	0
SUGGESTION	0

Both code fixes are correct and important, but this PR is missing required regression tests per project policy.

Issue Details (click to expand)

CRITICAL

Missing regression test for connection leak fix
- File: openwisp_controller/connection/base/models.py
- Line: 372 (the finally: change)
- Issue: Bug fixes must include a regression test that fails without the patch and passes with it. There is no test verifying that disconnect() is called when update_config() raises an exception.
Missing regression test for substring matching fix
- File: openwisp_controller/connection/tasks.py
- Line: 30 (the list comprehension change)
- Issue: No test verifies that partial device ID strings don't cause false positives in _is_update_in_progress(). A test should demonstrate that device ID "4466" does NOT match a task with args containing "550e8400-e29b-41d4-a716-446655440000".

Code Quality Assessment

Bug 1 Fix (else → finally): ✅ Correct

Moving disconnect() from else to finally ensures connections are always cleaned up, preventing SSH connection exhaustion on managed devices.

Bug 2 Fix (substring → exact match): ✅ Correct

Changing from str(device_id) in task["args"] to exact element matching prevents false positives when device IDs share substrings (e.g., UUID partial matches).

Policy Compliance: ❌ Missing regression tests

Per the project policy, bug fixes must include regression tests.
The existing tests in test_tasks.py and test_models.py do not cover these specific failure scenarios.

Files Reviewed (2 files)

openwisp_controller/connection/base/models.py - Connection leak fix, missing regression test
openwisp_controller/connection/tasks.py - Substring matching fix, missing regression test
openwisp_controller/connection/tests/test_tasks.py - Reviewed existing test coverage
openwisp_controller/connection/tests/test_models.py - Reviewed existing test coverage

Fix these issues in Kilo Cloud

_{Reviewed by kimi-k2.5-0127 · 172,175 tokens}

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

openwisp_controller/connection/tasks.py (1)
19-34: 🧹 Nitpick | 🔵 Trivial | 💤 Low value

Consider documenting the task signature assumption.

The check device_id_str in [str(arg) for arg in task.get("args", [])] works correctly for the current single-argument task signature (update_config(self, device_id)). If the task signature ever changes to include additional arguments, this check may need refinement to compare against a specific argument position (e.g., task.get("args", [])[0]) rather than using in on the entire args list.

Adding a brief inline comment noting this assumption would improve future maintainability.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@openwisp_controller/connection/tasks.py` around lines 19 - 34, The
_is_update_in_progress function assumes the Celery task args list contains the
device_id as a positional argument (currently the only/first arg); add a concise
inline comment near the line with device_id_str in [str(arg) for arg in
task.get("args", [])] stating this assumption and advising that if the task
signature changes (e.g., extra args), the check should be updated to compare the
specific positional index (task.get("args", [])[0]) or use a named-arg
inspection instead; reference _is_update_in_progress and _TASK_NAME in the
comment so future maintainers can find and update the logic.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@openwisp_controller/connection/tasks.py`:
- Around line 19-34: The _is_update_in_progress function assumes the Celery task
args list contains the device_id as a positional argument (currently the
only/first arg); add a concise inline comment near the line with device_id_str
in [str(arg) for arg in task.get("args", [])] stating this assumption and
advising that if the task signature changes (e.g., extra args), the check should
be updated to compare the specific positional index (task.get("args", [])[0]) or
use a named-arg inspection instead; reference _is_update_in_progress and
_TASK_NAME in the comment so future maintainers can find and update the logic.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 264a444f-f8c5-4bc5-a17d-772345e25ba3

📥 Commits

Reviewing files that changed from the base of the PR and between 868f1cd and 5b0138a.

📒 Files selected for processing (2)

openwisp_controller/connection/base/models.py
openwisp_controller/connection/tasks.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)

GitHub Check: Agent
GitHub Check: Python==3.10 | django~=5.1.0
GitHub Check: Python==3.13 | django~=5.1.0
GitHub Check: Python==3.12 | django~=5.2.0
GitHub Check: Python==3.10 | django~=4.2.0
GitHub Check: Python==3.12 | django~=4.2.0
GitHub Check: Python==3.11 | django~=5.2.0
GitHub Check: Python==3.13 | django~=5.2.0
GitHub Check: Python==3.12 | django~=5.1.0
GitHub Check: Python==3.10 | django~=5.2.0
GitHub Check: Python==3.11 | django~=5.1.0
GitHub Check: Python==3.11 | django~=4.2.0
GitHub Check: Kilo Code Review

🧰 Additional context used

📓 Path-based instructions (4)

**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp}

📄 CodeRabbit inference engine (Custom checks)

**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp}: Flag potential security vulnerabilities in code
Avoid unnecessary comments or docstrings for code that is already clear
Code formatting is compact and readable. Do not add excessive blank lines, especially inside function or method bodies
Flag unused or redundant code
Ensure variables, functions, classes, and files have descriptive and consistent names
New code must handle errors properly: log errors that cannot be resolved by the user with error level, log unusual conditions with warning level, log important background actions with info level, and provide user-facing messages for errors that the user can solve autonomously

Files:

openwisp_controller/connection/tasks.py
openwisp_controller/connection/base/models.py

**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp,sql}

📄 CodeRabbit inference engine (Custom checks)

Flag obvious performance regressions, such as heavy loops, repeated I/O, or unoptimized queries

Files:

openwisp_controller/connection/tasks.py
openwisp_controller/connection/base/models.py

**/*.{js,ts,tsx,jsx,py,java,go,cs,rb,php,c,cpp,h,hpp,sh,bash,sql}

📄 CodeRabbit inference engine (Custom checks)

Cryptic or non-obvious code (regex, complex bash commands, or hard-to-read code) must include a concise comment explaining why it is needed and why the complexity is acceptable

Files:

openwisp_controller/connection/tasks.py
openwisp_controller/connection/base/models.py

**/*.{py,html}

📄 CodeRabbit inference engine (Custom checks)

For Django pull requests, ensure all user-facing strings are marked as translatable using the Django i18n framework

Files:

openwisp_controller/connection/tasks.py
openwisp_controller/connection/base/models.py

🧠 Learnings (3)

📚 Learning: 2026-01-15T15:05:49.557Z

Learnt from: DragnEmperor
Repo: openwisp/openwisp-controller PR: 1175
File: openwisp_controller/config/management/commands/clear_last_ip.py:38-42
Timestamp: 2026-01-15T15:05:49.557Z
Learning: In Django projects, when using select_related() to traverse relations (for example, select_related("organization__config_settings")), the traversed relation must not be deferred. If you also use .only() in the same query, include the relation name or FK field (e.g., "organization" or "organization_id") in the .only() list to avoid the error "Field X cannot be both deferred and traversed using select_related at the same time." Apply this guideline to Django code in openwisp_controller/config/management/commands/clear_last_ip.py and similar modules by ensuring any select_related with an accompanying only() includes the related field names to prevent deferred/traversed conflicts.

Applied to files:

openwisp_controller/connection/tasks.py
openwisp_controller/connection/base/models.py

📚 Learning: 2026-02-17T19:13:10.088Z

Learnt from: nemesifier
Repo: openwisp/openwisp-controller PR: 1175
File: openwisp_controller/config/whois/commands.py:0-0
Timestamp: 2026-02-17T19:13:10.088Z
Learning: In reviews for the openwisp/openwisp-controller repository, do not propose changes based on Ruff warnings. The project does not use Ruff as its linter; ignore Ruff-related suggestions and follow the repository’s established linting and configuration rules. This guidance applies to all Python files under the openwisp_controller directory.

Applied to files:

openwisp_controller/connection/tasks.py
openwisp_controller/connection/base/models.py

📚 Learning: 2026-01-15T15:07:17.354Z

Learnt from: DragnEmperor
Repo: openwisp/openwisp-controller PR: 1175
File: openwisp_controller/geo/estimated_location/tests/tests.py:172-175
Timestamp: 2026-01-15T15:07:17.354Z
Learning: In this repository, flake8 enforces E501 (line too long) via setup.cfg (max-line-length = 88) while ruff ignores E501 via ruff.toml. Therefore, use '# noqa: E501' on lines that intentionally exceed 88 characters to satisfy flake8 without affecting ruff checks. This applies to Python files across the project (any .py) and is relevant for tests as well. Use sparingly and only where breaking lines is not feasible without hurting readability or functionality.

Applied to files:

openwisp_controller/connection/tasks.py
openwisp_controller/connection/base/models.py

🔇 Additional comments (2)

openwisp_controller/connection/base/models.py (1)

365-373: LGTM!

openwisp_controller/connection/tasks.py (1)

23-23: LGTM!

Also applies to: 30-30

Fix critical SSH connection leak and task deduplication false-positive

5b0138a

Copilot AI review requested due to automatic review settings May 14, 2026 01:53

Copilot started reviewing on behalf of kartikbhartiya May 14, 2026 01:54 View session

Copilot AI reviewed May 14, 2026

View reviewed changes

coderabbitai Bot reviewed May 14, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fix critical SSH connection leak and task deduplication false-positive#1353

Fix critical SSH connection leak and task deduplication false-positive#1353
kartikbhartiya wants to merge 1 commit into
openwisp:masterfrom
kartikbhartiya:master

kartikbhartiya commented May 14, 2026

Uh oh!

coderabbitai Bot commented May 14, 2026 •

edited

Loading

Walkthrough

Estimated code review effort

Possibly related PRs

Suggested reviewers

Pre-merge checks failed

Uh oh!

Copilot AI left a comment

Uh oh!

openwisp-companion Bot commented May 14, 2026

Uh oh!

kilo-code-bot Bot commented May 14, 2026 •

edited

Loading

CRITICAL

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

kartikbhartiya commented May 14, 2026

PR: Fix critical SSH connection leak and task deduplication false-positive

Summary

Bug 1: SSH connection leak in DeviceConnection.update_config()

Problem

Fix

Bug 2: False-positive in concurrent update_config task deduplication

Problem

Fix

Files Changed

Testing

Uh oh!

coderabbitai Bot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Estimated code review effort

Possibly related PRs

Suggested reviewers

Pre-merge checks failed

❌ Failed checks (1 error, 1 warning, 1 inconclusive)

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

openwisp-companion Bot commented May 14, 2026

Dependency Conflict in openwisp-controller

Uh oh!

kilo-code-bot Bot commented May 14, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Summary

Overview

CRITICAL

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Bug 1: SSH connection leak in `DeviceConnection.update_config()`

Bug 2: False-positive in concurrent `update_config` task deduplication

coderabbitai Bot commented May 14, 2026 •

edited

Loading

Dependency Conflict in `openwisp-controller`

kilo-code-bot Bot commented May 14, 2026 •

edited

Loading