Fast-path cell drawing + position-based tokenizer#39
Merged
Conversation
Round-3 profile (after D-43/D-44) showed .draw_cell_text() at the call site accumulating 73% of core_paginate's total time. The dominant costs inside it were viewport push/pop overhead (~20%) and per-cell text-width measurement and grid.text drawing. Four changes layered on top of the round-2 baseline: 1. Add a fast path in .draw_cell_text(): when text fits the column (needed <= col_width_in), skip the clip viewport entirely and draw directly in the parent viewport. Most numeric/short-categorical cells take this path. The clip viewport is still pushed when text might bleed. 2. Drop the two convertUnit calls in the slow path that re-measured the just-pushed clip viewport's npc=1 dimensions. Those values are the literal clip_w and row_h we constructed it with. 3. Pre-format each column's cell strings once via .fmt_cell_vec(data[[cs$col]][rows], na_str) before the row loop, replacing the per-cell .fmt_cell() call. 4. Rework .tokenize_for_wrap() to track (cur_start, cur_end) positions and emit text via substr(s, cur_start, cur_end), instead of accumulating characters into cur_buf and pasting on flush. One C call per token replaces n element accumulations + paste. Measured speedup (medians; baseline = round-2 HEAD at 7c9484c): core_paginate 397 ms -> 270 ms (~32% faster) core_small 146 ms -> 111 ms (~24% faster) wrap_demos 2.97 s -> 2.80 s (~6% faster, 5-run avg) core_wrap / figure_multi: unchanged (their bottlenecks lie elsewhere) Documented as D-45 in design/DECISIONS.md. Full devtools::test() green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Profile-driven follow-up to D-44. The per-page clip-width cache
introduced there eliminated intra-page duplication but every page of
a multi-page table built a fresh env, so the same cell text (numeric
formats, visit labels, repeating categoricals) got re-measured on
each page.
Build one list of cache envs (one per column in resolved_cols) at the
top of tfl_table_to_pagelist() and thread it into every
build_table_grob() call. All page-grobs hold a reference to the same
list; drawDetails reads x$clip_width_caches[x$col_group_idx] when the
slot is present and falls back to per-page envs otherwise.
Measured (medians, n=30):
iris (150 rows, 5 pages) 264 ms -> 258 ms (~2%)
500-row synth (~17 pages, repeating
visit/flag categoricals) 1.56 s -> 1.33 s (~15% faster)
Short single-page tables: no change. The gain scales with page count
and inter-page duplication, which is exactly the realistic
multi-page-clinical-listing case.
Documented as D-46 in design/DECISIONS.md. New focused benchmark at
examples/bench_focused.R. Full devtools::test() green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Profile-driven follow-up after D-46. Pagination was constructing two textGrobs per unique cell string -- one in .measure_max_string_width() (Pass 1, widths) and another in measure_row_heights_tbl()'s inline .memo_str_height() (Pass 2, heights). Each construction re-ran grid's validGP / set.gpar chain, the dominant per-call cost. Add a single (gp_key, string) -> list(w, h) cache that lives for one .tfl_table_to_pagelist_default() call. Pass 1, Pass 2, and the header-height pass all share it. When a new entry is built, the textGrob is consulted once and BOTH dimensions are cached, so whichever pass needs the other dimension later gets it for free. The gp_key namespace matches the one measure_row_heights_tbl() was already using internally (e.g. "data_row_lh1.2", "header_row_lh1.2"), so width-then-height on the same (category, string) hits the cache. Multiple PDF scratch devices opened during pagination all share the same (pg_width, pg_height) and font setup, so device-deterministic metrics make sharing the cache across them safe. The cache does NOT cross into the render-device drawing phase -- that boundary still goes through .draw_cell_text()'s separate re-measurement (D-44/D-46 territory). Measured speedup (n=30 iterations, min-of-mins across 3 independent runs): iris5p (150 rows, 5 pages) 329 ms -> 255 ms (~22% ↓) big_df (500 rows, ~17 pages) 1.45 s -> 1.36 s (~6% ↓) iris5p is the targeted scenario for pagination-heavy work; big_df is dominated by drawing and gains less. Documented as D-47 in design/DECISIONS.md. Full devtools::test() green. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The two existing scenarios (iris5p, big_df) cover pagination and drawing-heavy paths but don't exercise: - the wrap module's per-cell measurement loop on long narrative text (wrap_heavy: 200 rows x 5 cols, three hyphenated narrative columns), - preview-mode device dispatch (preview_iris: iris with pdf(NULL) open and preview = c(1, 2, 3)), or - the non-tfl_table per-page overhead (figure_multi: 10 ggplot pages). Capturing all five lets follow-up commits gate themselves against the same scenario set rather than hand-picking favorable ones each round. Baseline medians at b196ca3 (n = 30 iters each): iris5p 231 ms big_df 1.16 s wrap_heavy 1.94 s preview_iris 155 ms figure_multi 601 ms Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 0 of the single-device + cache-through-drawing refactor. Captures bench::mark medians for the five scenarios in bench_focused.R and the top-self-time writetfl source lines from profile_writetfl.R --quick at baseline b196ca3. Key signals: - table_draw.R#606 (the .measure_text_width_in re-measure inside .draw_cell_text) accumulates 8.9% of total time in core_paginate. This is the D-47 boundary line and the Phase 3 target. - pdf() and dev.off() do not appear in any scenario's top 20 self-time list; combined cost <3% of total. Phase 1/2 (single device + scratch-pdf elimination) is therefore primarily a maintainability change rather than a perf change. - core_wrap shows table_utils.R#250 (.measure_max_string_width) at 33% of total time -- already in D-47's reach; Phase 3.5 cache unification will not move this much but reduces the cognitive load of "which cache do I pass" for future maintainers. Decision matrix lists which phases proceed on perf grounds vs. maintainability grounds. Order of execution: 3.5 -> 3 -> 1 -> 2. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-off Two small changes preparing for the cache-through-drawing extension: 1. `.measure_text_dims_in()` accepts NULL as the gp_key argument. When NULL, the cache key is just the string -- appropriate for callers that own a gp-pinned cache and don't need to disambiguate gp categories. Existing callers always pass a non-NULL gp_key and are unaffected. 2. `.measure_text_width_in()` gains a comment explaining why it stays as its own function rather than delegating to .measure_text_dims_in(). The wrap module re-measures candidate strings inside a hot greedy loop; an early prototype that routed through .measure_text_dims_in() for a unified (w, h) cache cost ~20-30% on wrap_heavy, big_df, and preview_iris due to the extra function call and list-wrapping per cache hit. Both helpers stay textGrob-based and share the same construction strategy; the cache shapes diverge because the inner loop cannot afford the indirection. No external behavior change. All tests green; bench_focused.R within run-to-run variance of baseline. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 3a of the single-device + cache-through-drawing refactor (D-48).
Adds the cache plumbing only; no draw-phase consumer yet, so this commit
is a pure pass-through with no observable behaviour change.
Changes:
- `tfl_table_to_pagelist()`, `.tfl_table_to_pagelist_default()`, and
`.tfl_table_to_pagelist_sub_tfl()` accept an optional `text_dim_cache`
argument. When NULL (the default), `.tfl_table_to_pagelist_default()`
allocates a fresh env locally -- equivalent to the pre-D-48 behaviour.
When supplied, the same env is reused so its entries survive past
pagination.
- `export_tfl.tfl_table()` allocates `pagination_cache`, passes it
through, then attaches a "drawing cache" to every `tfl_table_grob` in
the returned pagelist:
* PDF mode (preview = FALSE): drawing_cache = pagination_cache, so the
same entries populated during pagination are visible to the draw
phase. Reuses D-47's deterministic-font-metrics argument: pagination
measured under a PDF scratch device with the same (pg_width,
pg_height) the final `pdf(file)` will use, so values are
authoritative for the render device.
* Preview mode: drawing_cache = a fresh empty env. The cache attached
to grobs is empty, so drawDetails will fall through to per-cell
re-measurement on the user's render device. Preserves today's
preview behaviour exactly -- preview pagination decisions and drawing
are byte-for-byte identical to the pre-D-48 path.
- `sub_tfl` recursion forwards the same cache env so identical
(gp_key, string) pairs across sub-tables share entries instead of
being thrown away per group.
A follow-up commit (Phase 3b) consumes `x$text_dim_cache` in
`.draw_cell_text()` to skip the per-cell width re-measurement.
Tests green. Benchmarks within run-to-run variance of baseline (the
cache attaches but nothing reads it yet, so any perf delta would be
attributable only to the env-allocation overhead, which is O(1) per
export_tfl call).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 3b of D-48. The cross-phase cache plumbing landed in the
previous commit; this commit makes the drawing phase consume it.
Changes in R/table_draw.R:
- `.draw_cell_text()` gains two optional args, `text_dim_cache` and
`gp_key`. Width lookup order:
1. text_dim_cache[paste0(gp_key, "\x01", text)]$w (cross-phase)
2. width_cache[text] (D-46 per-column)
3. fresh .measure_text_width_in() (slowest)
The first level is read-only; it never writes back. Misses fall
through to the existing D-46 path so wrap-produced strings (which
pagination never saw in their wrapped form) still get per-column
caching for repeats across rows / pages.
- `drawDetails.tfl_table_grob()` extracts `x$text_dim_cache` once
(hoisted above the header block so both header and data-row paths
use it), and computes `gp_key_by_col` matching pagination's
("data_row_lh<lh>", "group_col_lh<lh>") namespace from
table_utils.R:99,250.
- `.draw_header_row()` accepts the same cache argument and uses the
matching "header_row_lh<lh>" gp_key. Header cells are typically a
handful per page; the win is small but the contract is identical to
data rows.
Profile signal (core_paginate, n=20 reps, Rprof @ 0.01s):
Before (b196ca3):
table_draw.R#606 .measure_text_width_in 0.71% self / 8.87% total
After (this commit):
table_draw.R#622 .measure_text_width_in 0.51% self / 0.51% total
The 8% drop in .measure_text_width_in total time is partially offset
by the new cache-lookup overhead (paste0 + exists/get on
text_dim_cache). Net wall-clock change is within run-to-run variance
on this hardware (~5-10%); a more stable system will show clearer
gains.
Preview mode behaviour is preserved exactly: export_tfl.tfl_table()
attaches an empty env in preview mode, so every text_dim_cache lookup
misses and the function falls through to width_cache / fresh
measurement on the user's render device.
Tests green.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1a of D-48. Infrastructure only; nothing in R/ wires these in
yet, so this commit is a pure addition with no behaviour change.
`.open_metric_device(file, pg_width, pg_height, preview, envir)`:
- Normal mode (preview = FALSE): opens grDevices::pdf(file). This is
the final output PDF. Pagination measurements and the per-page draw
loop will both use it once Phase 1b wires the helper in.
- Preview mode: opens grDevices::pdf(NULL). Pagination runs against
this transient device so PDF font metrics match normal mode and
pagination decisions stay identical to the pre-D-48 path. The
caller invokes `.close_metric_device()` after pagination so the
user's pre-existing device is restored for the drawing phase.
Safety net: the helper registers an `on.exit({...; dev.off()}, add =
TRUE)` handler on the CALLER's frame via `do.call(..., envir = envir)`.
If the caller errors out mid-pagination or mid-drawing, the on.exit
fires during stack unwind and closes the device. Without this,
interrupted runs would leak the device and eventually exhaust the
per-session limit (~64 open devices).
`.close_metric_device(md)` is idempotent: a second call (or a call
when the device is no longer current) is a no-op. This matters
because preview-mode callers close explicitly AND the helper's
on.exit fires on caller return -- both must be safe.
Four new tests in test-export_tfl.R cover:
1. Normal mode opens pdf(file) and on.exit closes it.
2. Preview mode opens pdf(NULL) and on.exit closes it.
3. on.exit fires (device closes) even when the caller errors out.
4. .close_metric_device() is idempotent under repeat calls.
All tests green. No production code path uses the helpers yet;
behavioural wiring lands in Phase 1b.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1b of D-48. Moves the pdf(file) open from `.export_tfl_pages()` up into `export_tfl.tfl_table()`, so it happens BEFORE pagination. With Phase 3 already shipped, pagination measurements and the per-page draw loop now share the same device. Changes: - `export_tfl.tfl_table()` calls `.open_metric_device()` first thing. The helper registers on.exit on this frame, so the device closes even if pagination or drawing errors out. - Preview-mode safety: after pagination completes, the dispatcher explicitly calls `.close_metric_device(md)` to release the transient pdf(NULL). The on.exit handler is idempotent and no-ops on the closed device. - `.export_tfl_pages()` gains a `pdf_already_open = FALSE` parameter. When TRUE (passed by `export_tfl.tfl_table`), it skips its own pdf(file) open and on.exit close -- the caller owns those. Default FALSE preserves behaviour for the other S3 dispatchers (`export_tfl.default`, `export_tfl.list`); they migrate in Phase 1c. Measured speedup on `bench_focused.R` (n = 30): Baseline (b196ca3) -> after Phase 1b iris5p 231 ms -> 199 ms (-14%) big_df 1.16 s -> 1.08 s (-7%) wrap_heavy 1.94 s -> 1.88 s (-3%) preview_iris within run-to-run variance figure_multi within run-to-run variance (path unchanged) Tests green. All existing test_thats in test-export_tfl.R now run with the device opened by `.open_metric_device()` instead of by `.export_tfl_pages()` -- their assertions don't change. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1c of D-48. Applies the Phase-1b pattern to every other
export_tfl method so the metric device opens once per call, before
the `*_to_pagelist()` step, and `.export_tfl_pages()` no longer
double-opens.
S3 methods updated:
export_tfl.default (R/export_tfl.R)
export_tfl.list (R/export_tfl.R) -- gt / rtables / flextable /
table1 / generic dispatcher
export_tfl.ggtibble (R/ggtibble.R)
export_tfl.gt_tbl (R/gt.R)
export_tfl.VTableTree (R/rtables.R)
export_tfl.flextable (R/flextable.R)
export_tfl.table1 (R/table1.R)
Each gains:
md <- .open_metric_device(file, pg_width, pg_height, preview)
... existing pagelist construction ...
if (!isFALSE(preview)) .close_metric_device(md)
.export_tfl_pages(..., pdf_already_open = TRUE)
After this commit, EVERY entry to `export_tfl()` opens its render
device exactly once via `.open_metric_device()`. Pagination helpers
that today open their own scratch devices still nest inside the
outer device; Phase 2 closes those internal opens.
Tests green. Bench `bench_focused.R` shows ~10-15% wall-clock drops
on tfl_table scenarios vs baseline b196ca3:
iris5p 231 ms -> 200 ms (~13% faster)
big_df 1.16 s -> 1.05 s ( ~9% faster)
wrap_heavy 1.94 s -> 1.83 s ( ~6% faster)
`figure_multi` is unchanged by design -- ggplot pages don't use the
text_dim_cache. `preview_iris` is unchanged within variance.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1d of D-48. After Phase 1b/1c, every S3 dispatcher opens a metric device before calling `tfl_table_to_pagelist()`, which in turn calls `compute_table_content_area()`. The scratch pdf(NULL) opened inside `compute_table_content_area()` was an unnecessary second device nesting over the already-open metric device. Changes: - Remove the `grDevices::pdf(NULL, ...)` open and matching `on.exit(dev.off())`. - Keep the outer-viewport push, but move its `popViewport()` into an `on.exit()` so an error mid-measurement does not leave the viewport on the stack. - Update the roxygen contract: this function now REQUIRES an active graphics device with matching page dimensions (the metric device the caller opened via `.open_metric_device()`). Tests green. No new perf signal -- pdf(NULL)/dev.off wasn't a hot spot in the baseline profile, so the gain is purely from removing boilerplate and one device-nesting layer. Total wall-clock unchanged within variance. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2a of D-48. These two helpers run inside the pagination pipeline, which is now invoked under the metric device opened by `.open_metric_device()` upstream in the S3 dispatcher. Their scratch PDF opens were redundant nesting; remove them and rely on the upstream device. Changes: - `.run_pagination_iter()` (R/table_pagelist.R): drop the pdf(scratch_file_rh) open and matching on.exit dev.off / unlink. Keep the outer-viewport push; pop it on exit so an error inside paginate_rows() does not leave it on the stack. - `.resolve_natural_widths()` (R/table_columns.R): same treatment. Drop the pdf(scratch_file) and the explicit close-and-unlink block before relative-weight resolution; keep the outer-viewport push/pop pattern. Roxygen contracts implicitly tightened: both helpers now require an active graphics device with matching page dimensions. The caller (`.tfl_table_to_pagelist_default()`) guarantees this after Phase 1d. Tests: 583 passing, 0 failed. Bench within run-to-run variance. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2b of D-48. Three helpers in R/wrap.R opened their own scratch PDF for measurement: .compute_col_min_widths() (R/wrap.R) .compute_wrapped_widths() (R/wrap.R) .height_balance_widths() (R/wrap.R) All three are called from the tfl_table pipeline, which now runs under the metric device opened by `.open_metric_device()` in the S3 dispatcher. Drop their pdf(scratch_file) / dev.off() / unlink() boilerplate and keep only the outer-viewport push/pop pattern, with on.exit popping the viewport on error. For `.height_balance_widths()` the comment about "Closing happens via on.exit so it runs even under tryCatch failure inside the search loop" still applies to the viewport: on.exit(popViewport()) fires under both success and error paths, including the `error = function(e) original` branch. Tests: 583 passing, 0 failed. Bench within run-to-run variance. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2c of D-48. The last two non-export_tfl callers of
grDevices::pdf() in R/.
- `.gt_grob_height()` (R/gt.R): convertHeight() now resolves against
the metric device opened by `export_tfl.gt_tbl()` (and the
gt_tbl path inside `export_tfl.list()`). The function shrinks
to a one-liner; the pg_width/pg_height parameters become
advisory (we keep them in the signature to avoid breaking the
one external caller in `gt_to_pagelist()`).
- `.rtables_lpp_cpp()` (R/rtables.R): the "M" character width
measurement uses the metric device's font metrics instead of a
hardcoded 10x10 inch scratch. The font-context viewport is
still pushed (so the gpar takes effect) and popped on exit.
After this commit, `grep -n "grDevices::pdf" R/` shows only THREE
locations, all in R/export_tfl.R:
* line 310: `pdf(file)` inside `.open_metric_device()` normal mode
* line 312: `pdf(NULL)` inside `.open_metric_device()` preview mode
* line 380: the legacy fallback inside `.export_tfl_pages()` when
`pdf_already_open = FALSE` -- dead code under the
current S3 dispatcher set but kept defensive in case
an external caller invokes `.export_tfl_pages()`
directly.
Tests: 583 passing, 0 failed.
Bench `bench_focused.R` (n = 30) vs baseline b196ca3:
iris5p 231 ms -> 201 ms (~13% faster)
big_df 1.16 s -> 987 ms (~15% faster)
wrap_heavy 1.94 s -> 1.87 s ( ~4% faster)
preview_iris within variance
figure_multi 601 ms -> 579 ms ( ~4% faster)
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2d of D-48. After Phases 1/2 every internal caller of
.measure_text_dims_in() runs inside the metric device opened by
.open_metric_device(). A future regression that forgets to open one
would produce confusing downstream errors -- convertWidth() /
convertHeight() against the null device returns 0 or NA depending on
grid version, propagating wrong sizes through pagination instead of
failing.
This commit adds an early `grDevices::dev.cur() == 1L` check just
before the textGrob construction (i.e. on the slow path; cache hits
return earlier and don't pay the check cost). On a null-device call
it now fails fast with:
Internal: .measure_text_dims_in() requires an active graphics
device. This is a bug in writetfl -- the caller should be invoked
under `.open_metric_device()`.
Verified manually:
> writetfl:::.measure_text_dims_in("hello", grid::gpar())
Error: Internal: .measure_text_dims_in() requires an active ...
Tests: 583 passing, 0 failed. The guard never fires under the
package's S3 dispatchers because every entry point opens a metric
device first.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 4 of D-48. Four new tests in test-export_tfl.R covering the invariants that drive the single-device + cache-through-drawing design. Net: 583 -> 587 passing tests. 1. `export_tfl.tfl_table opens exactly one PDF device per call` Wraps `grDevices::pdf` with a trace counter stored in a globalenv sentinel (so the tracer body can resolve it across grDevices' evaluation frame), runs export_tfl on a tfl_table, and asserts the counter is exactly 1. Locks in the Phase 1/2 invariant that the dispatcher opens the device and no helper opens its own. 2. `PDF font metrics are identical between pdf(NULL) and pdf(tempfile)` Measures the same string under both PDF device variants and asserts the widths/heights match to 1e-9 inches. D-48 explicitly relies on this -- without it, pagination measurements taken on the temp device used in preview mode wouldn't be authoritative for normal mode. D-47 made the same claim "empirically verified"; this commit makes the verification part of the test suite. 3. `.measure_text_dims_in fails fast without an active device` Confirms the Phase 2d safety guard fires with the expected "requires an active graphics device" message. 4. `tfl_table grob carries the cross-phase cache only when normal mode` Calls `tfl_table_to_pagelist()` directly with a pagination cache and verifies the returned grobs can carry it; cache entries are shaped `list(w, h)`. Preview-mode-equivalent verifies the empty env attached during preview is observably empty. Tests use the directly-invoked `tfl_table_to_pagelist` path rather than tracing `export_tfl_page` because `trace()` on the exported S3 generic has scoping issues under devtools::test() that produced false negatives. All 587 tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 5 of D-48. Captures the wall-clock comparison between b196ca3 (pre-D-48 baseline) and HEAD (post-D-48), measured on the same machine in the same session so system load is comparable. Reports min-of-mins across 3 independent runs as the stable "best-case time" estimator. Headline: .measure_text_width_in inside .draw_cell_text drops from 8.87% total to 0.5% total in core_paginate -- the line that motivated the whole refactor. Wall-clock min-of-mins: iris5p 213 ms -> 200 ms (-6%) pagination-heavy big_df 1.25 s -> 1.10 s (-12%) drawing-heavy, 17 pages wrap_heavy 2.24 s -> 1.94 s (-13%) wrap + draw preview_iris 194 ms -> 166 ms (-14%) preview path -- scratch-pdf elim figure_multi 619 ms -> 605 ms ( -2%) ggplot path, within variance preview_iris improving by 14% was somewhat unexpected -- the text_dim_cache is empty by design in preview mode, so the cache extension delivers no benefit there. The win is purely from eliminating the 4 scratch PDF opens in the pagination pipeline, each costing ~5 ms. Phase notes section lists what each phase delivered and per-file log of scratch-pdf removals. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 6 of D-48. Captures the decision and its rationale across the
four design docs that CLAUDE.md says must stay in sync with code
changes.
- design/DECISIONS.md: full D-48 entry with motivation (Phase 0
profile signal), the eight scratch-pdf sites eliminated, the
cross-phase cache plumbing, rejected alternatives (cache
unification regressed wrap_heavy by 30%; user's device for
pagination in preview would break the pagination-decision-
invariance contract), and the measured wall-clock impact.
- design/DESIGN.md: replaces "Why does tfl_table_to_pagelist() open
a scratch PDF device?" with a new "Why a single device per
export_tfl() call? (D-48)" section. Adds "Why thread
text_dim_cache from pagination to the drawing phase? (D-48)".
- design/ARCHITECTURE.md: updates the function-hierarchy diagram to
show .open_metric_device() / .close_metric_device() in every S3
method, updates .gt_grob_height note ("metric device" instead of
"scratch device"), updates compute_table_content_area note
(outer_vp only, no scratch device). Adds the
`pdf_already_open = TRUE` note on .export_tfl_pages.
- CLAUDE.md: rewrites the "Device lifecycle" section to describe
the D-48 pattern; lists which internal helpers REQUIRE an active
device.
No NEWS.md update -- the package is still at 0.0.0.9000 and
NEWS.md does not yet exist (per CLAUDE.md, NEWS.md updates kick in
at version > 0.0.1).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
devtools::document() pickup of the new `text_dim_cache` parameter on tfl_table_to_pagelist(), the updated D-48 contracts on .measure_text_dims_in()/compute_table_content_area()/.gt_grob_height(), and a few related re-renders. No source change beyond what previous commits already landed. DESCRIPTION: roxygen2 7.3.3 -> 8.0.0 migration auto-applied (moved the version tag from `RoxygenNote:` to `Config/roxygen2/version:`). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Round-3 profile (after D-43/D-44) showed .draw_cell_text() at the call site accumulating 73% of core_paginate's total time. The dominant costs inside it were viewport push/pop overhead (~20%) and per-cell text-width measurement and grid.text drawing.
Four changes layered on top of the round-2 baseline:
Add a fast path in .draw_cell_text(): when text fits the column (needed <= col_width_in), skip the clip viewport entirely and draw directly in the parent viewport. Most numeric/short-categorical cells take this path. The clip viewport is still pushed when text might bleed.
Drop the two convertUnit calls in the slow path that re-measured the just-pushed clip viewport's npc=1 dimensions. Those values are the literal clip_w and row_h we constructed it with.
Pre-format each column's cell strings once via .fmt_cell_vec(data[[cs$col]][rows], na_str) before the row loop, replacing the per-cell .fmt_cell() call.
Rework .tokenize_for_wrap() to track (cur_start, cur_end) positions and emit text via substr(s, cur_start, cur_end), instead of accumulating characters into cur_buf and pasting on flush. One C call per token replaces n element accumulations + paste.
Measured speedup (medians; baseline = round-2 HEAD at 7c9484c):
core_paginate 397 ms -> 270 ms (~32% faster)
core_small 146 ms -> 111 ms (~24% faster)
wrap_demos 2.97 s -> 2.80 s (~6% faster, 5-run avg)
core_wrap / figure_multi: unchanged (their bottlenecks lie elsewhere)
Documented as D-45 in design/DECISIONS.md. Full devtools::test() green.