Skip to content

Fast-path cell drawing + position-based tokenizer#39

Merged
billdenney merged 20 commits into
mainfrom
claude/perf-round2
May 13, 2026
Merged

Fast-path cell drawing + position-based tokenizer#39
billdenney merged 20 commits into
mainfrom
claude/perf-round2

Conversation

@billdenney
Copy link
Copy Markdown
Member

Round-3 profile (after D-43/D-44) showed .draw_cell_text() at the call site accumulating 73% of core_paginate's total time. The dominant costs inside it were viewport push/pop overhead (~20%) and per-cell text-width measurement and grid.text drawing.

Four changes layered on top of the round-2 baseline:

  1. Add a fast path in .draw_cell_text(): when text fits the column (needed <= col_width_in), skip the clip viewport entirely and draw directly in the parent viewport. Most numeric/short-categorical cells take this path. The clip viewport is still pushed when text might bleed.

  2. Drop the two convertUnit calls in the slow path that re-measured the just-pushed clip viewport's npc=1 dimensions. Those values are the literal clip_w and row_h we constructed it with.

  3. Pre-format each column's cell strings once via .fmt_cell_vec(data[[cs$col]][rows], na_str) before the row loop, replacing the per-cell .fmt_cell() call.

  4. Rework .tokenize_for_wrap() to track (cur_start, cur_end) positions and emit text via substr(s, cur_start, cur_end), instead of accumulating characters into cur_buf and pasting on flush. One C call per token replaces n element accumulations + paste.

Measured speedup (medians; baseline = round-2 HEAD at 7c9484c):
core_paginate 397 ms -> 270 ms (~32% faster)
core_small 146 ms -> 111 ms (~24% faster)
wrap_demos 2.97 s -> 2.80 s (~6% faster, 5-run avg)
core_wrap / figure_multi: unchanged (their bottlenecks lie elsewhere)

Documented as D-45 in design/DECISIONS.md. Full devtools::test() green.

billdenney and others added 20 commits May 12, 2026 08:58
Round-3 profile (after D-43/D-44) showed .draw_cell_text() at the call
site accumulating 73% of core_paginate's total time.  The dominant
costs inside it were viewport push/pop overhead (~20%) and per-cell
text-width measurement and grid.text drawing.

Four changes layered on top of the round-2 baseline:

1. Add a fast path in .draw_cell_text(): when text fits the column
   (needed <= col_width_in), skip the clip viewport entirely and draw
   directly in the parent viewport.  Most numeric/short-categorical
   cells take this path.  The clip viewport is still pushed when text
   might bleed.

2. Drop the two convertUnit calls in the slow path that re-measured the
   just-pushed clip viewport's npc=1 dimensions.  Those values are the
   literal clip_w and row_h we constructed it with.

3. Pre-format each column's cell strings once via
   .fmt_cell_vec(data[[cs$col]][rows], na_str) before the row loop,
   replacing the per-cell .fmt_cell() call.

4. Rework .tokenize_for_wrap() to track (cur_start, cur_end) positions
   and emit text via substr(s, cur_start, cur_end), instead of
   accumulating characters into cur_buf and pasting on flush.  One C
   call per token replaces n element accumulations + paste.

Measured speedup (medians; baseline = round-2 HEAD at 7c9484c):
  core_paginate    397 ms -> 270 ms   (~32% faster)
  core_small       146 ms -> 111 ms   (~24% faster)
  wrap_demos       2.97 s -> 2.80 s   (~6% faster, 5-run avg)
  core_wrap / figure_multi: unchanged (their bottlenecks lie elsewhere)

Documented as D-45 in design/DECISIONS.md.  Full devtools::test() green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Profile-driven follow-up to D-44.  The per-page clip-width cache
introduced there eliminated intra-page duplication but every page of
a multi-page table built a fresh env, so the same cell text (numeric
formats, visit labels, repeating categoricals) got re-measured on
each page.

Build one list of cache envs (one per column in resolved_cols) at the
top of tfl_table_to_pagelist() and thread it into every
build_table_grob() call.  All page-grobs hold a reference to the same
list; drawDetails reads x$clip_width_caches[x$col_group_idx] when the
slot is present and falls back to per-page envs otherwise.

Measured (medians, n=30):
  iris (150 rows, 5 pages)              264 ms -> 258 ms  (~2%)
  500-row synth (~17 pages, repeating
    visit/flag categoricals)            1.56 s -> 1.33 s  (~15% faster)

Short single-page tables: no change.  The gain scales with page count
and inter-page duplication, which is exactly the realistic
multi-page-clinical-listing case.

Documented as D-46 in design/DECISIONS.md.  New focused benchmark at
examples/bench_focused.R.  Full devtools::test() green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Profile-driven follow-up after D-46.  Pagination was constructing two
textGrobs per unique cell string -- one in .measure_max_string_width()
(Pass 1, widths) and another in measure_row_heights_tbl()'s inline
.memo_str_height() (Pass 2, heights).  Each construction re-ran grid's
validGP / set.gpar chain, the dominant per-call cost.

Add a single (gp_key, string) -> list(w, h) cache that lives for one
.tfl_table_to_pagelist_default() call.  Pass 1, Pass 2, and the
header-height pass all share it.  When a new entry is built, the
textGrob is consulted once and BOTH dimensions are cached, so whichever
pass needs the other dimension later gets it for free.

The gp_key namespace matches the one measure_row_heights_tbl() was
already using internally (e.g. "data_row_lh1.2", "header_row_lh1.2"),
so width-then-height on the same (category, string) hits the cache.

Multiple PDF scratch devices opened during pagination all share the
same (pg_width, pg_height) and font setup, so device-deterministic
metrics make sharing the cache across them safe.  The cache does NOT
cross into the render-device drawing phase -- that boundary still goes
through .draw_cell_text()'s separate re-measurement (D-44/D-46
territory).

Measured speedup (n=30 iterations, min-of-mins across 3 independent
runs):
  iris5p   (150 rows, 5 pages)              329 ms -> 255 ms  (~22% ↓)
  big_df   (500 rows, ~17 pages)            1.45 s -> 1.36 s  (~6%  ↓)

iris5p is the targeted scenario for pagination-heavy work; big_df is
dominated by drawing and gains less.

Documented as D-47 in design/DECISIONS.md.  Full devtools::test() green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The two existing scenarios (iris5p, big_df) cover pagination and
drawing-heavy paths but don't exercise:

- the wrap module's per-cell measurement loop on long narrative text
  (wrap_heavy: 200 rows x 5 cols, three hyphenated narrative columns),
- preview-mode device dispatch (preview_iris: iris with pdf(NULL) open
  and preview = c(1, 2, 3)), or
- the non-tfl_table per-page overhead (figure_multi: 10 ggplot pages).

Capturing all five lets follow-up commits gate themselves against the
same scenario set rather than hand-picking favorable ones each round.

Baseline medians at b196ca3 (n = 30 iters each):
  iris5p         231 ms
  big_df         1.16 s
  wrap_heavy     1.94 s
  preview_iris   155 ms
  figure_multi   601 ms

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 0 of the single-device + cache-through-drawing refactor.
Captures bench::mark medians for the five scenarios in
bench_focused.R and the top-self-time writetfl source lines from
profile_writetfl.R --quick at baseline b196ca3.

Key signals:

- table_draw.R#606 (the .measure_text_width_in re-measure inside
  .draw_cell_text) accumulates 8.9% of total time in core_paginate.
  This is the D-47 boundary line and the Phase 3 target.
- pdf() and dev.off() do not appear in any scenario's top 20
  self-time list; combined cost <3% of total.  Phase 1/2 (single
  device + scratch-pdf elimination) is therefore primarily a
  maintainability change rather than a perf change.
- core_wrap shows table_utils.R#250 (.measure_max_string_width) at
  33% of total time -- already in D-47's reach; Phase 3.5 cache
  unification will not move this much but reduces the cognitive
  load of "which cache do I pass" for future maintainers.

Decision matrix lists which phases proceed on perf grounds vs.
maintainability grounds.  Order of execution: 3.5 -> 3 -> 1 -> 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-off

Two small changes preparing for the cache-through-drawing extension:

1. `.measure_text_dims_in()` accepts NULL as the gp_key argument.  When
   NULL, the cache key is just the string -- appropriate for callers
   that own a gp-pinned cache and don't need to disambiguate gp
   categories.  Existing callers always pass a non-NULL gp_key and are
   unaffected.

2. `.measure_text_width_in()` gains a comment explaining why it stays
   as its own function rather than delegating to .measure_text_dims_in().
   The wrap module re-measures candidate strings inside a hot greedy
   loop; an early prototype that routed through .measure_text_dims_in()
   for a unified (w, h) cache cost ~20-30% on wrap_heavy, big_df, and
   preview_iris due to the extra function call and list-wrapping per
   cache hit.  Both helpers stay textGrob-based and share the same
   construction strategy; the cache shapes diverge because the inner
   loop cannot afford the indirection.

No external behavior change.  All tests green; bench_focused.R within
run-to-run variance of baseline.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 3a of the single-device + cache-through-drawing refactor (D-48).
Adds the cache plumbing only; no draw-phase consumer yet, so this commit
is a pure pass-through with no observable behaviour change.

Changes:

- `tfl_table_to_pagelist()`, `.tfl_table_to_pagelist_default()`, and
  `.tfl_table_to_pagelist_sub_tfl()` accept an optional `text_dim_cache`
  argument.  When NULL (the default), `.tfl_table_to_pagelist_default()`
  allocates a fresh env locally -- equivalent to the pre-D-48 behaviour.
  When supplied, the same env is reused so its entries survive past
  pagination.

- `export_tfl.tfl_table()` allocates `pagination_cache`, passes it
  through, then attaches a "drawing cache" to every `tfl_table_grob` in
  the returned pagelist:

  * PDF mode (preview = FALSE): drawing_cache = pagination_cache, so the
    same entries populated during pagination are visible to the draw
    phase.  Reuses D-47's deterministic-font-metrics argument: pagination
    measured under a PDF scratch device with the same (pg_width,
    pg_height) the final `pdf(file)` will use, so values are
    authoritative for the render device.
  * Preview mode: drawing_cache = a fresh empty env.  The cache attached
    to grobs is empty, so drawDetails will fall through to per-cell
    re-measurement on the user's render device.  Preserves today's
    preview behaviour exactly -- preview pagination decisions and drawing
    are byte-for-byte identical to the pre-D-48 path.

- `sub_tfl` recursion forwards the same cache env so identical
  (gp_key, string) pairs across sub-tables share entries instead of
  being thrown away per group.

A follow-up commit (Phase 3b) consumes `x$text_dim_cache` in
`.draw_cell_text()` to skip the per-cell width re-measurement.

Tests green.  Benchmarks within run-to-run variance of baseline (the
cache attaches but nothing reads it yet, so any perf delta would be
attributable only to the env-allocation overhead, which is O(1) per
export_tfl call).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 3b of D-48.  The cross-phase cache plumbing landed in the
previous commit; this commit makes the drawing phase consume it.

Changes in R/table_draw.R:

- `.draw_cell_text()` gains two optional args, `text_dim_cache` and
  `gp_key`.  Width lookup order:

    1. text_dim_cache[paste0(gp_key, "\x01", text)]$w  (cross-phase)
    2. width_cache[text]                                (D-46 per-column)
    3. fresh .measure_text_width_in()                   (slowest)

  The first level is read-only; it never writes back.  Misses fall
  through to the existing D-46 path so wrap-produced strings (which
  pagination never saw in their wrapped form) still get per-column
  caching for repeats across rows / pages.

- `drawDetails.tfl_table_grob()` extracts `x$text_dim_cache` once
  (hoisted above the header block so both header and data-row paths
  use it), and computes `gp_key_by_col` matching pagination's
  ("data_row_lh<lh>", "group_col_lh<lh>") namespace from
  table_utils.R:99,250.

- `.draw_header_row()` accepts the same cache argument and uses the
  matching "header_row_lh<lh>" gp_key.  Header cells are typically a
  handful per page; the win is small but the contract is identical to
  data rows.

Profile signal (core_paginate, n=20 reps, Rprof @ 0.01s):

  Before (b196ca3):
    table_draw.R#606  .measure_text_width_in   0.71% self / 8.87% total

  After (this commit):
    table_draw.R#622  .measure_text_width_in   0.51% self / 0.51% total

The 8% drop in .measure_text_width_in total time is partially offset
by the new cache-lookup overhead (paste0 + exists/get on
text_dim_cache).  Net wall-clock change is within run-to-run variance
on this hardware (~5-10%); a more stable system will show clearer
gains.

Preview mode behaviour is preserved exactly: export_tfl.tfl_table()
attaches an empty env in preview mode, so every text_dim_cache lookup
misses and the function falls through to width_cache / fresh
measurement on the user's render device.

Tests green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1a of D-48.  Infrastructure only; nothing in R/ wires these in
yet, so this commit is a pure addition with no behaviour change.

`.open_metric_device(file, pg_width, pg_height, preview, envir)`:

- Normal mode (preview = FALSE): opens grDevices::pdf(file).  This is
  the final output PDF.  Pagination measurements and the per-page draw
  loop will both use it once Phase 1b wires the helper in.
- Preview mode: opens grDevices::pdf(NULL).  Pagination runs against
  this transient device so PDF font metrics match normal mode and
  pagination decisions stay identical to the pre-D-48 path.  The
  caller invokes `.close_metric_device()` after pagination so the
  user's pre-existing device is restored for the drawing phase.

Safety net: the helper registers an `on.exit({...; dev.off()}, add =
TRUE)` handler on the CALLER's frame via `do.call(..., envir = envir)`.
If the caller errors out mid-pagination or mid-drawing, the on.exit
fires during stack unwind and closes the device.  Without this,
interrupted runs would leak the device and eventually exhaust the
per-session limit (~64 open devices).

`.close_metric_device(md)` is idempotent: a second call (or a call
when the device is no longer current) is a no-op.  This matters
because preview-mode callers close explicitly AND the helper's
on.exit fires on caller return -- both must be safe.

Four new tests in test-export_tfl.R cover:

1. Normal mode opens pdf(file) and on.exit closes it.
2. Preview mode opens pdf(NULL) and on.exit closes it.
3. on.exit fires (device closes) even when the caller errors out.
4. .close_metric_device() is idempotent under repeat calls.

All tests green.  No production code path uses the helpers yet;
behavioural wiring lands in Phase 1b.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1b of D-48.  Moves the pdf(file) open from `.export_tfl_pages()`
up into `export_tfl.tfl_table()`, so it happens BEFORE pagination.
With Phase 3 already shipped, pagination measurements and the
per-page draw loop now share the same device.

Changes:

- `export_tfl.tfl_table()` calls `.open_metric_device()` first thing.
  The helper registers on.exit on this frame, so the device closes
  even if pagination or drawing errors out.

- Preview-mode safety: after pagination completes, the dispatcher
  explicitly calls `.close_metric_device(md)` to release the
  transient pdf(NULL).  The on.exit handler is idempotent and no-ops
  on the closed device.

- `.export_tfl_pages()` gains a `pdf_already_open = FALSE` parameter.
  When TRUE (passed by `export_tfl.tfl_table`), it skips its own
  pdf(file) open and on.exit close -- the caller owns those.  Default
  FALSE preserves behaviour for the other S3 dispatchers
  (`export_tfl.default`, `export_tfl.list`); they migrate in
  Phase 1c.

Measured speedup on `bench_focused.R` (n = 30):

  Baseline (b196ca3) -> after Phase 1b
  iris5p         231 ms -> 199 ms  (-14%)
  big_df         1.16 s -> 1.08 s  (-7%)
  wrap_heavy     1.94 s -> 1.88 s  (-3%)
  preview_iris   within run-to-run variance
  figure_multi   within run-to-run variance  (path unchanged)

Tests green.  All existing test_thats in test-export_tfl.R now run
with the device opened by `.open_metric_device()` instead of by
`.export_tfl_pages()` -- their assertions don't change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1c of D-48.  Applies the Phase-1b pattern to every other
export_tfl method so the metric device opens once per call, before
the `*_to_pagelist()` step, and `.export_tfl_pages()` no longer
double-opens.

S3 methods updated:

  export_tfl.default  (R/export_tfl.R)
  export_tfl.list     (R/export_tfl.R) -- gt / rtables / flextable /
                                          table1 / generic dispatcher
  export_tfl.ggtibble (R/ggtibble.R)
  export_tfl.gt_tbl   (R/gt.R)
  export_tfl.VTableTree (R/rtables.R)
  export_tfl.flextable  (R/flextable.R)
  export_tfl.table1   (R/table1.R)

Each gains:

  md <- .open_metric_device(file, pg_width, pg_height, preview)
  ... existing pagelist construction ...
  if (!isFALSE(preview)) .close_metric_device(md)
  .export_tfl_pages(..., pdf_already_open = TRUE)

After this commit, EVERY entry to `export_tfl()` opens its render
device exactly once via `.open_metric_device()`.  Pagination helpers
that today open their own scratch devices still nest inside the
outer device; Phase 2 closes those internal opens.

Tests green.  Bench `bench_focused.R` shows ~10-15% wall-clock drops
on tfl_table scenarios vs baseline b196ca3:

  iris5p     231 ms -> 200 ms  (~13% faster)
  big_df     1.16 s -> 1.05 s  ( ~9% faster)
  wrap_heavy 1.94 s -> 1.83 s  ( ~6% faster)

`figure_multi` is unchanged by design -- ggplot pages don't use the
text_dim_cache.  `preview_iris` is unchanged within variance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 1d of D-48.  After Phase 1b/1c, every S3 dispatcher opens a
metric device before calling `tfl_table_to_pagelist()`, which in turn
calls `compute_table_content_area()`.  The scratch pdf(NULL) opened
inside `compute_table_content_area()` was an unnecessary second
device nesting over the already-open metric device.

Changes:

- Remove the `grDevices::pdf(NULL, ...)` open and matching
  `on.exit(dev.off())`.
- Keep the outer-viewport push, but move its `popViewport()` into an
  `on.exit()` so an error mid-measurement does not leave the
  viewport on the stack.
- Update the roxygen contract: this function now REQUIRES an active
  graphics device with matching page dimensions (the metric device
  the caller opened via `.open_metric_device()`).

Tests green.  No new perf signal -- pdf(NULL)/dev.off wasn't a hot
spot in the baseline profile, so the gain is purely from removing
boilerplate and one device-nesting layer.  Total wall-clock unchanged
within variance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2a of D-48.  These two helpers run inside the pagination
pipeline, which is now invoked under the metric device opened by
`.open_metric_device()` upstream in the S3 dispatcher.  Their scratch
PDF opens were redundant nesting; remove them and rely on the
upstream device.

Changes:

- `.run_pagination_iter()` (R/table_pagelist.R): drop the
  pdf(scratch_file_rh) open and matching on.exit dev.off / unlink.
  Keep the outer-viewport push; pop it on exit so an error inside
  paginate_rows() does not leave it on the stack.
- `.resolve_natural_widths()` (R/table_columns.R): same treatment.
  Drop the pdf(scratch_file) and the explicit close-and-unlink
  block before relative-weight resolution; keep the outer-viewport
  push/pop pattern.

Roxygen contracts implicitly tightened: both helpers now require
an active graphics device with matching page dimensions.  The
caller (`.tfl_table_to_pagelist_default()`) guarantees this after
Phase 1d.

Tests: 583 passing, 0 failed.  Bench within run-to-run variance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2b of D-48.

Three helpers in R/wrap.R opened their own scratch PDF for
measurement:

  .compute_col_min_widths()    (R/wrap.R)
  .compute_wrapped_widths()    (R/wrap.R)
  .height_balance_widths()     (R/wrap.R)

All three are called from the tfl_table pipeline, which now runs
under the metric device opened by `.open_metric_device()` in the S3
dispatcher.  Drop their pdf(scratch_file) / dev.off() / unlink()
boilerplate and keep only the outer-viewport push/pop pattern, with
on.exit popping the viewport on error.

For `.height_balance_widths()` the comment about "Closing happens via
on.exit so it runs even under tryCatch failure inside the search
loop" still applies to the viewport: on.exit(popViewport()) fires
under both success and error paths, including the `error = function(e)
original` branch.

Tests: 583 passing, 0 failed.  Bench within run-to-run variance.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2c of D-48.  The last two non-export_tfl callers of
grDevices::pdf() in R/.

- `.gt_grob_height()` (R/gt.R): convertHeight() now resolves against
  the metric device opened by `export_tfl.gt_tbl()` (and the
  gt_tbl path inside `export_tfl.list()`).  The function shrinks
  to a one-liner; the pg_width/pg_height parameters become
  advisory (we keep them in the signature to avoid breaking the
  one external caller in `gt_to_pagelist()`).

- `.rtables_lpp_cpp()` (R/rtables.R): the "M" character width
  measurement uses the metric device's font metrics instead of a
  hardcoded 10x10 inch scratch.  The font-context viewport is
  still pushed (so the gpar takes effect) and popped on exit.

After this commit, `grep -n "grDevices::pdf" R/` shows only THREE
locations, all in R/export_tfl.R:

  * line 310: `pdf(file)` inside `.open_metric_device()` normal mode
  * line 312: `pdf(NULL)` inside `.open_metric_device()` preview mode
  * line 380: the legacy fallback inside `.export_tfl_pages()` when
              `pdf_already_open = FALSE` -- dead code under the
              current S3 dispatcher set but kept defensive in case
              an external caller invokes `.export_tfl_pages()`
              directly.

Tests: 583 passing, 0 failed.

Bench `bench_focused.R` (n = 30) vs baseline b196ca3:

  iris5p       231 ms -> 201 ms  (~13% faster)
  big_df       1.16 s -> 987 ms  (~15% faster)
  wrap_heavy   1.94 s -> 1.87 s  ( ~4% faster)
  preview_iris within variance
  figure_multi 601 ms -> 579 ms  ( ~4% faster)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 2d of D-48.  After Phases 1/2 every internal caller of
.measure_text_dims_in() runs inside the metric device opened by
.open_metric_device().  A future regression that forgets to open one
would produce confusing downstream errors -- convertWidth() /
convertHeight() against the null device returns 0 or NA depending on
grid version, propagating wrong sizes through pagination instead of
failing.

This commit adds an early `grDevices::dev.cur() == 1L` check just
before the textGrob construction (i.e. on the slow path; cache hits
return earlier and don't pay the check cost).  On a null-device call
it now fails fast with:

  Internal: .measure_text_dims_in() requires an active graphics
  device.  This is a bug in writetfl -- the caller should be invoked
  under `.open_metric_device()`.

Verified manually:
  > writetfl:::.measure_text_dims_in("hello", grid::gpar())
  Error: Internal: .measure_text_dims_in() requires an active ...

Tests: 583 passing, 0 failed.  The guard never fires under the
package's S3 dispatchers because every entry point opens a metric
device first.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 4 of D-48.  Four new tests in test-export_tfl.R covering the
invariants that drive the single-device + cache-through-drawing
design.  Net: 583 -> 587 passing tests.

1. `export_tfl.tfl_table opens exactly one PDF device per call`

   Wraps `grDevices::pdf` with a trace counter stored in a globalenv
   sentinel (so the tracer body can resolve it across grDevices'
   evaluation frame), runs export_tfl on a tfl_table, and asserts
   the counter is exactly 1.  Locks in the Phase 1/2 invariant
   that the dispatcher opens the device and no helper opens its
   own.

2. `PDF font metrics are identical between pdf(NULL) and pdf(tempfile)`

   Measures the same string under both PDF device variants and
   asserts the widths/heights match to 1e-9 inches.  D-48 explicitly
   relies on this -- without it, pagination measurements taken on
   the temp device used in preview mode wouldn't be authoritative
   for normal mode.  D-47 made the same claim "empirically
   verified"; this commit makes the verification part of the test
   suite.

3. `.measure_text_dims_in fails fast without an active device`

   Confirms the Phase 2d safety guard fires with the expected
   "requires an active graphics device" message.

4. `tfl_table grob carries the cross-phase cache only when normal
   mode`

   Calls `tfl_table_to_pagelist()` directly with a pagination cache
   and verifies the returned grobs can carry it; cache entries are
   shaped `list(w, h)`.  Preview-mode-equivalent verifies the empty
   env attached during preview is observably empty.

Tests use the directly-invoked `tfl_table_to_pagelist` path rather
than tracing `export_tfl_page` because `trace()` on the exported S3
generic has scoping issues under devtools::test() that produced
false negatives.

All 587 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 5 of D-48.  Captures the wall-clock comparison between
b196ca3 (pre-D-48 baseline) and HEAD (post-D-48), measured on the
same machine in the same session so system load is comparable.
Reports min-of-mins across 3 independent runs as the stable
"best-case time" estimator.

Headline: .measure_text_width_in inside .draw_cell_text drops from
8.87% total to 0.5% total in core_paginate -- the line that
motivated the whole refactor.

Wall-clock min-of-mins:

  iris5p       213 ms -> 200 ms  (-6%)   pagination-heavy
  big_df       1.25 s -> 1.10 s  (-12%)  drawing-heavy, 17 pages
  wrap_heavy   2.24 s -> 1.94 s  (-13%)  wrap + draw
  preview_iris 194 ms -> 166 ms  (-14%)  preview path -- scratch-pdf elim
  figure_multi 619 ms -> 605 ms  ( -2%)  ggplot path, within variance

preview_iris improving by 14% was somewhat unexpected -- the
text_dim_cache is empty by design in preview mode, so the cache
extension delivers no benefit there.  The win is purely from
eliminating the 4 scratch PDF opens in the pagination pipeline,
each costing ~5 ms.

Phase notes section lists what each phase delivered and per-file
log of scratch-pdf removals.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase 6 of D-48.  Captures the decision and its rationale across the
four design docs that CLAUDE.md says must stay in sync with code
changes.

- design/DECISIONS.md: full D-48 entry with motivation (Phase 0
  profile signal), the eight scratch-pdf sites eliminated, the
  cross-phase cache plumbing, rejected alternatives (cache
  unification regressed wrap_heavy by 30%; user's device for
  pagination in preview would break the pagination-decision-
  invariance contract), and the measured wall-clock impact.

- design/DESIGN.md: replaces "Why does tfl_table_to_pagelist() open
  a scratch PDF device?" with a new "Why a single device per
  export_tfl() call? (D-48)" section.  Adds "Why thread
  text_dim_cache from pagination to the drawing phase? (D-48)".

- design/ARCHITECTURE.md: updates the function-hierarchy diagram to
  show .open_metric_device() / .close_metric_device() in every S3
  method, updates .gt_grob_height note ("metric device" instead of
  "scratch device"), updates compute_table_content_area note
  (outer_vp only, no scratch device).  Adds the
  `pdf_already_open = TRUE` note on .export_tfl_pages.

- CLAUDE.md: rewrites the "Device lifecycle" section to describe
  the D-48 pattern; lists which internal helpers REQUIRE an active
  device.

No NEWS.md update -- the package is still at 0.0.0.9000 and
NEWS.md does not yet exist (per CLAUDE.md, NEWS.md updates kick in
at version > 0.0.1).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
devtools::document() pickup of the new `text_dim_cache` parameter on
tfl_table_to_pagelist(), the updated D-48 contracts on
.measure_text_dims_in()/compute_table_content_area()/.gt_grob_height(),
and a few related re-renders.  No source change beyond what previous
commits already landed.

DESCRIPTION: roxygen2 7.3.3 -> 8.0.0 migration auto-applied (moved
the version tag from `RoxygenNote:` to `Config/roxygen2/version:`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@billdenney billdenney merged commit 7bc16b5 into main May 13, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant