Filter unreachable ICU data (-7.46 MB) + drop dead CRASH_WITH_INFO strings (-550 KB)#211
Conversation
…usables/unames) Bun has zero ucnv_/utrans_/usprep_/uspoof_ consumers in the linked binary: TextCodecICU registration is removed in TextEncodingRegistry.cpp and UCONFIG_NO_LEGACY_CONVERSION=1 is already set. The corresponding ~7.4 MB of icudt75l.dat (190 .cnv, translit/, rbnf/, *.spp, *.cfu, cnvalias.icu, unames.icu) is unreachable dead weight. After building bin/icupkg, filter data/in/icudt75l.dat in place and rebuild the data target. --auto_toc_prefix is required to avoid icupkg's 100 KB string-store overflow when the output basename differs from the input. Measured: libicudata.a 30,730,236 -> 23,268,172 bytes.
…latforms that ignore them
On Linux/Windows the WTFCrashWithInfo body is { CRASH(); } and never reads
the file/function arguments. Passing nullptr instead of __FILE__ /
WTF_PRETTY_FUNCTION at the ~17k call sites drops ~550 KB of unreferenced
.rodata strings from release builds. Darwin/PlayStation keep the
register-stuffing path. DFG_CRASH and friends that print their arguments
take WTF_PRETTY_FUNCTION directly and are unaffected.
WalkthroughAdds an ICU data-filtering step to multiple build scripts and Dockerfiles that removes converters, translit, rbnf, stringprep, confusables, and related data from icudt*.dat and rebuilds ICU; also adds a conditional CRASH_WITH_INFO macro variant for a specific Bun JSC build configuration. ChangesICU Data Filtering Across Build Systems
Assertion Macro Configuration
🚥 Pre-merge checks | ✅ 3 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Review rate limit: 0/5 reviews remaining, refill in 59 minutes and 47 seconds. Comment |
The previous placement caused VA_OPT_SUPPORTED to be left undefined on the platforms taking the Bun branch (Linux/Windows), which broke JavaScriptCore/tools/Integrity.h's IA_LOG macro on MSVC. Move the override inside the block so PP_THIRD_ARG/VA_OPT_SUPPORTED are always defined.
…achable azure.archive.ubuntu.com has been timing out from GitHub-hosted runners, the same way archive.ubuntu.com did when the original sed swap was added. Try Azure first (still faster when reachable) and fall back to the canonical mirror if apt-get update fails.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@Dockerfile`:
- Around line 36-38: The RUN fallback that retries apt-get update currently
invokes "apt-get install -y" which pulls recommended packages; update the
fallback command so the apt-get install uses "--no-install-recommends" to avoid
extra recommended packages and image bloat; locate the RUN block that contains
the retry logic (the apt-get update fallback and subsequent "apt-get install -y"
invocation) and add the "--no-install-recommends" flag to that apt-get install
call.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 9f51ff27-6be4-4ed7-8d14-d6a20a7ba190
📒 Files selected for processing (2)
DockerfileSource/WTF/wtf/Assertions.h
| RUN ( apt-get update || \ | ||
| ( sed -i 's|http://azure.archive.ubuntu.com/ubuntu|http://archive.ubuntu.com/ubuntu|g' /etc/apt/sources.list && apt-get update ) \ | ||
| ) && apt-get install -y \ |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial | ⚡ Quick win
Add --no-install-recommends to this fallback install step.
This stage already lists the packages it needs explicitly. Pulling recommended packages here adds avoidable download volume and builder image bloat on the exact path you're hardening against mirror timeouts.
Proposed diff
-RUN ( apt-get update || \
- ( sed -i 's|http://azure.archive.ubuntu.com/ubuntu|http://archive.ubuntu.com/ubuntu|g' /etc/apt/sources.list && apt-get update ) \
- ) && apt-get install -y \
+RUN ( apt-get update || \
+ ( sed -i 's|http://azure.archive.ubuntu.com/ubuntu|http://archive.ubuntu.com/ubuntu|g' /etc/apt/sources.list && apt-get update ) \
+ ) && apt-get install -y --no-install-recommends \🧰 Tools
🪛 Hadolint (2.14.0)
[info] 36-36: Avoid additional packages by specifying --no-install-recommends
(DL3015)
[warning] 36-36: Pin versions in apt get install. Instead of apt-get install <package> use apt-get install <package>=<version>
(DL3008)
🪛 Trivy (0.69.3)
[error] 36-50: 'apt-get' missing '--no-install-recommends'
'--no-install-recommends' flag is missed: '( apt-get update || ( sed -i 's|http://azure.archive.ubuntu.com/ubuntu|http://archive.ubuntu.com/ubuntu|g' /etc/apt/sources.list && apt-get update ) ) && apt-get install -y wget curl git python3 python3-pip ninja-build software-properties-common apt-transport-https ca-certificates gnupg lsb-release && rm -rf /var/lib/apt/lists/*'
Rule: DS-0029
(IaC/Dockerfile)
[warning] 37-38: 'RUN cd ...' to change directory
RUN should not be used to change directory: 'mkdir -p /tmp/clang && cd /tmp/clang && git clone --depth 1 -b ${CLANG_RELEASE} https://llvm.googlesource.com/llvm-project'. Use 'WORKDIR' statement instead.
Rule: DS-0013
(IaC/Dockerfile)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@Dockerfile` around lines 36 - 38, The RUN fallback that retries apt-get
update currently invokes "apt-get install -y" which pulls recommended packages;
update the fallback command so the apt-get install uses
"--no-install-recommends" to avoid extra recommended packages and image bloat;
locate the RUN block that contains the retry logic (the apt-get update fallback
and subsequent "apt-get install -y" invocation) and add the
"--no-install-recommends" flag to that apt-get install call.
Preview Builds
|
Two binary-size reductions for Bun, both zero observable change.
ICU data filter (−7,462,064 bytes, validated)
Bun has zero ICU-converter consumers in the linked binary:
TextEncodingRegistry.cppremovedTextCodecICUregistration,UCONFIG_NO_LEGACY_CONVERSION=1is already set, andnm bun-profile | grep -E 'ucnv_|utrans_|usprep_|uspoof_'returns 0 matches. The corresponding ~7.4 MB oficudt75l.datis unreachable:*.cnv(all converters)translit/rbnf/unames.icu*.spp(stringprep — IDNA usesuts46.nrm)*.cfu(confusables)cnvalias.icuAfter building
bin/icupkg, filterdata/in/icudt75l.datin place and rebuild the data target.--auto_toc_prefixavoids icupkg's 100 KB string-store overflow when output basename differs.Validated end-to-end with a real ICU 75 build:
libicudata.a30,730,236 → 23,268,172 bytes; 0.cnv/translit//rbnf/items remain in the resulting.dat.CRASH_WITH_INFO strings (~−550 KB)
On Linux/Windows,
WTFCrashWithInfo(int, const char*, const char*)and allWTFCrashWithInfoImploverloads are{ CRASH(); }— the file/function arguments are never read. ~17,000 call sites pass__FILE__/__PRETTY_FUNCTION__, leaving ~3,100 pretty-function strings (489 KB) + ~630 file strings (38 KB) of unreferenced.rodata.Under
USE(BUN_JSC_ADDITIONS) && !ASSERT_ENABLEDand only on the platforms where the body ignores the args, defineCRASH_WITH_INFOto passnullptr, nullptr. Darwin/PlayStation keep the register-stuffing path.DFG_CRASHand friends that print their arguments takeWTF_PRETTY_FUNCTIONdirectly and are unaffected.__LINE__is kept so call sites stay distinct for ICF.Companion to oven-sh/bun#30098.