fix: support multi-segment ELF layouts in DT_STRTAB resolution#46
Open
f-pound wants to merge 1 commit into
Open
fix: support multi-segment ELF layouts in DT_STRTAB resolution#46f-pound wants to merge 1 commit into
f-pound wants to merge 1 commit into
Conversation
PROBLEM
-------
build_dynsym_data() crashes with SIGSEGV (EXC_BAD_ACCESS) on ELF
shared objects that use 3+ PT_LOAD segments where .dynstr resides
in a separate RW segment rather than the text segment.
Reproduced against PySide2/Shiboken2 .so files from Binary Ninja,
but affects any ELF linked with a layout that places .dynstr
outside the text segment.
ROOT CAUSE
----------
load_dynamic_segment_data() resolves DT_STRTAB using:
obj->dynstr = &obj->mem[entry.value - elf_text_base(obj)];
This assumes DT_STRTAB always lives in the text segment. For
3-segment layouts the VMA-to-file-offset mapping differs per
segment:
LOAD elfmaster#1: vaddr=0x000000 offset=0x000000 (text, r-x)
LOAD elfmaster#2: vaddr=0x240a88 offset=0x040a88 (data, rw-)
LOAD elfmaster#3: vaddr=0x243000 offset=0x054000 (dynstr, rw-)
DT_STRTAB=0x2433c0 yields offset 0x2433c0 (past 355KB file end)
instead of correct offset 0x543c0.
Additionally, build_dynsym_data() has no bounds check on st_name
before indexing into dynstr, unlike build_symtab_data() which does.
FIX
---
1. Use elf_address_pointer() to resolve DT_STRTAB virtual address
to file offset via PT_LOAD segment walk. This handles any number
of LOAD segments correctly. Legacy text-base math kept as
fallback for edge cases.
2. Add st_name < dynseg.dynstr.size bounds check in
build_dynsym_data() for both 32-bit and 64-bit paths, matching
the existing pattern in build_symtab_data(). Out-of-bounds
indices resolve to "invalid_name_index" instead of crashing.
3. Relax DT_SYMTAB and DT_STRTAB anomaly checks to accept
addresses in any LOAD segment (via elf_address_pointer != NULL)
rather than only the text segment.
NOTE: The existing sanity_check_st_name() stub (marked XXX TODO)
remains a no-op; the bounds check in build_dynsym_data effectively
covers the dynstr case.
TESTING
-------
Verified against the crashing binary (libshiboken2.abi3.so.5.14,
497 dynamic symbols) -- all symbols now resolve correctly.
Owner
|
I have run into this. Thanks I'll take a peek at the patch |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
build_dynsym_data()crashes with SIGSEGV on ELF shared objects that use 3+ PT_LOAD segments where.dynstrresides in a separate RW segment rather than the text segment.Reproduction
Any ELF .so where
.dynstrresides in a separate RW LOAD segment triggers the crash. Example: PySide2/Shiboken2 binaries from Binary Ninja's Python distribution.Root Cause
load_dynamic_segment_data()resolves DT_STRTAB using:This assumes DT_STRTAB always lives in the text segment. For 3-segment layouts the VMA-to-file-offset mapping differs per segment:
DT_STRTAB=0x2433c0yields offset0x2433c0(past 355KB file end) instead of correct offset0x543c0.Additionally,
build_dynsym_data()has no bounds check onst_namebefore indexing intodynstr, unlikebuild_symtab_data()which does.Fix
Use
elf_address_pointer()to resolve DT_STRTAB virtual address to file offset via PT_LOAD segment walk. Handles any number of LOAD segments. Legacy text-base math kept as fallback.Add
st_name < dynseg.dynstr.sizebounds check inbuild_dynsym_data()for both 32/64-bit paths, matching the existing pattern inbuild_symtab_data().Relax DT_SYMTAB/DT_STRTAB anomaly checks to accept addresses in any LOAD segment via
elf_address_pointer() != NULL.Testing
Verified against the crashing binary (
libshiboken2.abi3.so.5.14, 497 dynamic symbols) — all symbols now resolve correctly.Patch by AstroSec LLC — discovered during recursive filesystem scanning.