Skip to content

refactor(tools): add offset/limit pagination to read_file tools#12471

Open
Bijit-Mondal wants to merge 4 commits into
continuedev:mainfrom
Bijit-Mondal:feat/paginated-read-file-tool
Open

refactor(tools): add offset/limit pagination to read_file tools#12471
Bijit-Mondal wants to merge 4 commits into
continuedev:mainfrom
Bijit-Mondal:feat/paginated-read-file-tool

Conversation

@Bijit-Mondal
Copy link
Copy Markdown

@Bijit-Mondal Bijit-Mondal commented May 22, 2026

Description

Adds offset and limit parameters to the read_file tool so the LLM can paginate through large files instead of hitting a hard failure. Replaces the old throw-on-large-file approach with bounded range reads that are O(1) in memory regardless of file size.

  • Core (read_file): switched from ide.readFile() (full file load) to ide.readRangeInFile() — only the requested line window is fetched from the IDE layer, never the full file.
  • CLI (read_file): replaced fs.readFileSync with fs.createReadStream + readline so lines are processed one at a time and the stream is destroyed the moment the byte cap is hit.
  • Both implementations apply a 50 KB hard byte cap with per-line truncation at 2000 chars, and return a pagination hint (Use offset=N to continue reading.) when more content is available.

Also removes throwIfFileExceedsHalfOfContext from readFileRangeImpl — the range fetch is already bounded by definition so the check wasredundant.

AI Code Review

  • Team members only: AI review runs automatically when PR is opened or marked ready for review
  • Team members can also trigger a review by commenting @continue-review

Checklist

  • I've read the contributing guide
  • The relevant docs, if any, have been updated or created
  • The relevant tests, if any, have been updated or created

Screen recording or screenshot

2nd one is what the PR does

export-1779467617979.1.1.mp4

Tests

N/A


Summary by cubic

Adds offset/limit pagination to the read_file and read_currently_open_file tools so large files are read in predictable, line-numbered windows with a next-offset hint. Core uses bounded range reads; the CLI streams lines with a ~50 KB cap to keep memory O(output).

  • New Features

    • Optional offset (1-based) and limit on both tools; defaults offset=1, limit=2000 with a MIN_LIMIT=200 clamp.
    • Core: readFile uses ide.readRangeInFile and the N+1 sentinel to detect EOF; readCurrentlyOpenFile slices the editor buffer.
    • Output: line numbers, 50 KB cap, 2000-char per-line truncation, and “Use offset=N to continue”; tool descriptions/examples updated.
  • Bug Fixes

    • Reliable EOF detection with N+1 sentinel; no false “more” on exact-boundary reads.
    • Safe pagination: clamp offset ≥ 1, enforce limit ≥ 200; stop dividing limit across parallel tool calls—only the byte cap is divided—to prevent zero-length windows and infinite loops.
    • Removed context-size throw by dropping throwIfFileExceedsHalfOfContext from range paths and related tests.

Written for commit d3fadba. Summary will update on new commits. Review in cubic

- Replace full-file loading with bounded range reads to reduce memoryusage. The read_file tool now accepts optional offset (1-based line) and limit parameters and returns a pagination hint when more content is available, so the LLM can continue reading without hitting contextlimits.

  - core: use ide.readRangeInFile() — only the requested line window is fetched, never the full file
  - cli: replace fs.readFileSync with fs.createReadStream + readline for true O(1) memory regardless of file size
  - apply 50 KB hard byte cap and 2000-char per-line truncation as secondary guards on output size
  - remove throwIfFileExceedsHalfOfContext from readFileRangeImpl — the range fetch is already bounded by definition
  - add getOptionalNumberArg utility to parseArgs
@Bijit-Mondal Bijit-Mondal requested a review from a team as a code owner May 22, 2026 12:36
@dosubot dosubot Bot added the size:XL This PR changes 500-999 lines, ignoring generated files. label May 22, 2026
@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 22, 2026

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@Bijit-Mondal
Copy link
Copy Markdown
Author

I have read the CLA Document and I hereby sign the CLA

@Bijit-Mondal
Copy link
Copy Markdown
Author

@sestinj Please Review

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 8 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="extensions/cli/src/tools/readFile.ts">

<violation number="1" location="extensions/cli/src/tools/readFile.ts:52">
P2: `readline` buffers complete lines internally before emitting the `'line'` event. For single-line files (e.g., minified JS without newlines), the entire file is loaded into memory as `rawLine` before truncation to `MAX_LINE_LENGTH` is applied. This violates the documented O(output) memory guarantee and can cause memory spikes proportional to file size.</violation>
</file>

Reply with feedback, questions, or to request a fix.

Fix all with cubic | Re-trigger cubic

Comment thread extensions/cli/src/tools/readFile.ts Outdated
Comment thread extensions/cli/src/tools/readFile.ts
Clamp offset to ≥1 to avoid broken line numbering and non-advancing pagination, clamp limit to ≥1 to prevent immediate stream termination, and clamp effectiveLimit to ≥1 after parallel division to avoid 0-value limits that trigger infinite loops. Handle invalid input values like 0 or negative offset/limit.
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 1 file (changes from recent commits).

Reply with feedback, questions, or to request a fix.

Fix all with cubic | Re-trigger cubic

Comment thread extensions/cli/src/tools/readFile.ts Outdated
  - Clamp offset to ≥ 1 to preserve 1-based line numbering and ensure nextOffset always advances
  - Clamp limit to ≥ 1 to prevent a zero effectiveLimit that caused the stream to return linesRead=0 and loop infinitely on the same offset
  - Stop dividing limit by parallelCount: limit is a per-call value, not a shared budget; only effectiveMaxBytes (the internal context cap) is divided, avoiding both quota under-delivery and the effectiveLimit=0 collapse when limit < parallelCount
@Bijit-Mondal
Copy link
Copy Markdown
Author

Issues - #12432
and constant fallback to grep everytime the file is in big size

Replace the ambiguous `>= limit` heuristic with the N+1 sentinel pattern for reliable EOF detection in both core and CLI readFile implementations. When the returned line count exceeds the requested limit, there is unambiguously more content — no false positives on exact-boundary reads.

Also introduce MIN_LIMIT (200) to clamp caller-supplied limits, preventing excessive pagination from very small limit values while ensuring meaningful chunks are always returned.

  Changes:
  - Request limit+1 lines from the IDE/stream; trim sentinel before output
  - Replace globalLineCount-based `more` heuristic with outputLines.length > limit
  - Remove buggy `cut = false` assignment in the line-limit early-stop branch
  - Add MIN_LIMIT = 200 constant, applied in both core and CLI
@Bijit-Mondal
Copy link
Copy Markdown
Author

@sestinj added video of the changes also

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

2 issues found across 2 files (changes from recent commits).

Tip: Review your code locally with the cubic CLI to iterate faster.

Fix all with cubic | Re-trigger cubic

Comment thread core/tools/implementations/readFile.ts
Comment thread extensions/cli/src/tools/readFile.ts
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size:XL This PR changes 500-999 lines, ignoring generated files.

Projects

Status: Todo

Development

Successfully merging this pull request may close these issues.

1 participant