fix(cluster): consolidate stderr suppression into _suppress_output()#1564
Open
OmerFDaskin wants to merge 898 commits into
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…bs#1143)
python -m graphify.serve graph.json --transport http --port 8080 serves the same MCP tools over the Streamable HTTP transport (spec 2025-03-26) so a single shared process can serve the graph for a whole team.
- _build_server() refactors server registration into a shared factory (stdio behavior is byte-for-byte unchanged; all 52 existing tests pass)
- _ApiKeyMiddleware: raw ASGI (not BaseHTTPMiddleware) preserves SSE streaming; constant-time compare; RFC-6750 case-insensitive Bearer; blank-key normalized to no-auth
- DNS-rebinding protection via TransportSecuritySettings; wildcard binds disable it and print an exposure warning when no api-key is set
- session_idle_timeout reaps idle stateful sessions (default 3600s) so a long-running shared server does not leak memory on client disconnect
- Dockerfile + .dockerignore for containerized team deployment
- 16 new tests via in-process ASGI test client (importorskip-guarded)
- stdio remains the default; no change for existing setups
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
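The API-key rules listed for _ApiKeyMiddleware (constant-time compare, case-insensitive Bearer scheme, blank key treated as no-auth) can be sketched roughly as below. This is a minimal illustration, not the middleware itself: `bearer_token_matches` is a hypothetical helper name, and the real class works on raw ASGI scopes rather than a plain header string.

```python
import hmac
from typing import Optional


def bearer_token_matches(header_value: Optional[str], expected_key: str) -> bool:
    """Hypothetical helper: check an Authorization header against an API key."""
    # A blank configured key is normalized to "no auth required".
    if not expected_key.strip():
        return True
    if not header_value:
        return False
    scheme, _, token = header_value.partition(" ")
    # The scheme name is compared case-insensitively ("bearer" == "Bearer").
    if scheme.lower() != "bearer":
        return False
    # hmac.compare_digest runs in time independent of where the strings differ.
    return hmac.compare_digest(token.strip().encode(), expected_key.encode())
```

Comparing with `==` would return as soon as the first differing byte is found, which is the timing leak the constant-time compare is meant to avoid.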
…1155) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…s#1159 Graphify-Labs#1107 Graphify-Labs#1103 (graph quality + new features)
Graphify-Labs#1118: prune stale AST nodes on full re-extraction (Graphify-Labs#1116). Stamps every AST-extracted node with _origin="ast" in extract(). On a full rebuild _rebuild_code drops any AST-marked node absent from the fresh output even when its source file survives, fixing stale symbols. Backward-compat: marker-less nodes from pre-1118 graphs survive one cycle then self-heal.
Graphify-Labs#1110: stop reading images and PDFs as garbage in headless extract. Images route through per-backend vision payloads (base64/data-URI/bytes for claude/openai/bedrock); non-vision backends get _strip_pixels for graceful degradation. PDFs reuse pypdf. 5MB cap, 20-image chunk limit.
Graphify-Labs#1159: Salesforce Apex extractor (.cls, .trigger). Regex-based extractor: classes, interfaces, enums, methods, triggers, SOQL/DML edges. No new dependency. Dispatched as .cls and .trigger.
Graphify-Labs#1107: Azure OpenAI Service backend (--backend azure). Uses AzureOpenAI SDK client (from existing openai package). Auto-detects when AZURE_OPENAI_API_KEY + AZURE_OPENAI_ENDPOINT both set. Uses max_completion_tokens (not deprecated max_tokens).
Graphify-Labs#1103: live PostgreSQL introspection (--postgres DSN). graphify extract --postgres "postgresql://..." introspects tables, views, routines, and FK relations via information_schema (SERIALIZABLE READ ONLY). Credentials sanitized on error. New graphify[postgres] extra (psycopg3).
Union-resolved llm.py conflict: Azure functions + bedrock images= param. Fixed test_image_vision.py mock to accept timeout= kwarg (our Graphify-Labs#1112).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…hify-Labs#1160) Graphify-Labs#1154: scope numpy>=2.0 constraint to python_version>='3.13' only. numpy 1.26.4 ships no cp313 wheel so uv sync falls back to a source build requiring a C compiler. The marker avoids forcing numpy 2.x on 3.10-3.12 users who have working 1.x environments. Graphify-Labs#1160: codex platform skill now installs to .codex/skills/graphify/ instead of .agents/skills/graphify/. The hook already wrote to .codex/ so the skill destination was inconsistent. Propagates automatically through install/uninstall (both read _PLATFORM_CONFIG dynamically). Updated all codex-specific test assertions. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…(hooks, sensitive filter, score_nodes)
Graphify-Labs#1170: replace nohup with cross-platform Python detach in git hooks. Git for Windows MSYS has no nohup so post-commit/post-checkout hooks silently failed. Now uses subprocess.Popen with DETACHED_PROCESS | CREATE_NEW_PROCESS_GROUP on Windows, start_new_session=True on POSIX. Quoting-safe (argv list). Fixes Graphify-Labs#1161.
Graphify-Labs#1169: fix _is_sensitive false positives on topic-mentioning filenames. token-economics-of-recall.md and password-policy-discussion.md were silently dropped as secrets. Generic keywords (token/secret/password) now only fire when the keyword ends the filename stem or the stem is ≤2 words. Specific patterns (.env/.pem/id_rsa etc.) remain unconditional.
Graphify-Labs#1165: fix multi-word endpoint resolution in _score_nodes. graphify path "AuthService" "UserRepo" never fired the exact-match bonus because per-token comparison never equalled the full label. Now joins normalized tokens and compares against the full label and its tokenized form. O(1) per node, affects query_graph and shortest_path uniformly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
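The nohup replacement described for Graphify-Labs#1170 could look something like the following. It is a sketch under assumptions: `detach_kwargs` and `spawn_detached` are hypothetical names, and the hook's actual argv and redirection choices are not shown in the commit message.

```python
import subprocess
import sys


def detach_kwargs(platform: str = sys.platform) -> dict:
    """Hypothetical helper: Popen keyword arguments that detach the child."""
    if platform == "win32":
        # These constants exist on subprocess only on Windows, so fall back
        # to their documented numeric values elsewhere.
        detached = getattr(subprocess, "DETACHED_PROCESS", 0x00000008)
        new_group = getattr(subprocess, "CREATE_NEW_PROCESS_GROUP", 0x00000200)
        return {"creationflags": detached | new_group}
    # POSIX: run the child in a new session so it outlives the hook.
    return {"start_new_session": True}


def spawn_detached(argv: list) -> subprocess.Popen:
    """Start argv in the background; an argv list avoids shell quoting issues."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        **detach_kwargs(),
    )
```

Passing the command as a list rather than a shell string is what makes the call quoting-safe on both platforms.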
…st drift (Graphify-Labs#1174 Graphify-Labs#1173 Graphify-Labs#1172 Graphify-Labs#1163)
Graphify-Labs#1174: affected.py load_graph now forces directed=True before node_link_graph, matching the identical fix in serve.py and __main__.py. Undirected graphs (directed:false in graph.json) were causing in_edges to fall back to a direction-blind scan, missing true callers and reporting false positives. Regression test added.
Graphify-Labs#1173: post-commit and post-checkout hook bodies now read graphify-out/.graphify_root before calling _rebuild_code, falling back to Path('.') if absent. A scoped build (graphify src/) no longer gets silently expanded to the full repo on the next commit. Tests added.
Graphify-Labs#1172: Step 9 cleanup split into rm -f for fixed files and find -maxdepth 1 -delete for the chunk glob. Under fish/zsh an unmatched glob aborts the entire rm -f line, leaving temp files on disk. Fixed in the three skillgen source fragments and regenerated.
Graphify-Labs#1163: detect_incremental type guard on stored mtime. If the manifest contains a dict-valued mtime (schema drift from older versions), coerce to None rather than propagating a non-numeric into comparisons. Regression test added.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds FalkorDB as a sibling option to the existing Neo4j sink, selected via `graphify export falkordb [--push redis://localhost:6379]`.
- New push_to_falkordb() in graphify/export.py mirrors push_to_neo4j; FalkorDB is OpenCypher-compatible so the MERGE/SET upsert queries are identical.
- export falkordb subcommand wired in graphify/__main__.py (cypher.txt when no --push, direct push otherwise). Auth is optional; target graph defaults to "graphify".
- falkordb optional extra in pyproject.toml (and in the all extra).
- Tests: CLI cypher generation (CI-safe) + real-FalkorDB integration tests that skip when no instance is reachable.
- README extras table + command reference and CHANGELOG updated.
…on --update (Graphify-Labs#1178) Three-part fix: dedup.py: Pass 1 exact-merge now skips nodes with an empty source_file. Previously all no-source_file nodes with the same label landed in one bucket and were merged, destroying distinct symbols (third-party deps, standalone functions) that happened to share a short name. update.md (skillgen + all 13 host variants): the --update merge now passes both deleted AND changed files to prune_sources, mirroring what watch._rebuild_code already does correctly. Old nodes for re-extracted files are pruned before fresh AST is inserted — no fuzzy reconciliation needed, no cross-file collapse possible. export.py: anti-shrink guard message now names fuzzy dedup as a possible cause (not only "missing chunk files"), and advises a full rebuild as the safe recovery path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds graphify skill installation for CodeBuddy (https://www.codebuddy.ai/). CodeBuddy uses the same agent+hook mechanism as Claude Code. - graphify codebuddy install — writes ~/.codebuddy/skills/graphify/SKILL.md and a CODEBUDDY.md always-on section - graphify codebuddy uninstall — removes both cleanly - graphify install --platform codebuddy — same as above - Registers Bash + Read|Glob PreToolUse hooks in .codebuddy/settings.json - Full install/uninstall roundtrip tests (35 tests) Co-authored-by: studyzy <studyzy@gmail.com> Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…x README hook description
graphify codebuddy install was writing CODEBUDDY.md and settings.json
but not copying the SKILL.md. Added _copy_skill_file("codebuddy") call
to match the --platform codebuddy path. README hook description updated
from "Glob and Grep" to "Bash search and file reads" to match actual
hook matchers (Bash + Read|Glob).
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…Labs#1180) The Agent Skills spec only defines name, description, license, compatibility, metadata, and allowed-tools as valid frontmatter fields. The trigger: /graphify line was non-spec, silently ignored by spec- following hosts, and flagged by agentskills validate CI checks. - gen.py: removed trigger emission from _render_frontmatter; added _is_trigger_line() helper for roundtrip allow-list - fragments/core/aider.md: removed hardcoded trigger: /graphify - platforms.toml: removed trigger doc comment and trigger="" entries - test_skillgen.py: replaced trigger-assertion tests with a single test asserting no host has trigger: in frontmatter - Regenerated all 125 skill artifacts Routing intent is preserved: the description field already contains "treated as a graphify query first" and "graphify-out/ exists". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds `graphify-mcp` as a named console script pointing to `graphify.serve:_main`, making the MCP stdio server directly invocable as a first-class CLI command from uv tool / pipx installs. MCP client configs can now use `"command": "graphify-mcp"` instead of `python -m graphify.serve`. Co-authored-by: jr2804 <jr2804@users.noreply.github.com>
Adds support for the XML-based `.slnx` solution format (VS 2022 17.13+ replacement for `.sln`). Extracts project references as `contains` edges and build dependencies as `imports` edges. XXE-protected XML parsing with size cap. Wired into `_DISPATCH` and `CODE_EXTENSIONS`. 6 new tests passing. Co-authored-by: bakgaard <bakgaard@users.noreply.github.com>
…/ URI Makes the FalkorDB option a first-class sibling of Neo4j in the agent skill, not just the export CLI: - --falkordb / --falkordb-push shorthands documented in core.md + the shared exports.md reference, so they render into all modular platform skills and read exactly like --neo4j / --neo4j-push. (The aider/devin monoliths are diff-frozen vs v8 by skillgen's roundtrip guard, so they are left untouched.) - README command reference switched to the /graphify ./raw --falkordb-push form. - Documented URI scheme is now falkordb://localhost:6379; the scheme is only informational (host/port are parsed out), so redis:// or a bare host:port remain equivalent. Regenerated skill artifacts + expected/ snapshots.
The no-push 'graphify export falkordb' path advertised 'redis-cli -x GRAPH.QUERY graphify < cypher.txt', but FalkorDB rejects that with 'query with more than one statement is not supported' - cypher.txt is a multi-statement Neo4j script. The individual statements ARE valid OpenCypher (verified by loading them one at a time), only bulk script import is unsupported. Message + skill docs now say so and point to --push (the verified load path).
…rdb) The --push/--user/--password export flags feed both the neo4j and falkordb dispatch branches, so the neo4j_ prefix was misleading - a neo4j_password that reads FALKORDB_PASSWORD made no sense. Renamed to push_uri/push_user/ push_password, and the password env lookup now reads the backend-specific var (FALKORDB_PASSWORD for falkordb, NEO4J_PASSWORD otherwise) instead of OR-ing both.
…Graphify-Labs#1197) Adds extra_body parameter support for custom/OpenAI-compat providers so users can pass provider-specific params (e.g. thinking budget for Claude via Bedrock compat). Adds multi-batch label_communities for 16k-context models — batches multiple community descriptions into a single LLM call instead of one per community. Partial batch failures are handled gracefully. Co-authored-by: EirikWolf <EirikWolf@users.noreply.github.com>
…Labs#1195) Guards _norm, _norm_label, and _strip_diacritics against None node labels that cause TypeError in unicodedata.normalize(). Fixes Graphify-Labs#1194. Consistent with existing security.py:270 precedent. Co-authored-by: freiit <freiit@users.noreply.github.com>
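The failure mode is that `unicodedata.normalize()` raises TypeError when handed None. A minimal sketch of such a guard follows; `strip_diacritics` here is an illustrative stand-in for the project's `_strip_diacritics`, and returning an empty string for None is an assumption, since the commit message does not say what the guarded functions return.

```python
import unicodedata


def strip_diacritics(label):
    """Illustrative stand-in: remove combining marks, tolerating None labels."""
    if label is None:
        # Assumption: a missing label is treated as the empty string.
        return ""
    decomposed = unicodedata.normalize("NFKD", str(label))
    # Drop combining characters (accents) left behind by decomposition.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```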
…edup prefix merge
- analyze.py: pass length_bound=max_cycle_length to nx.simple_cycles() so networkx prunes during enumeration instead of post-filtering; drops report generation from never-returns to ~0.1s on dense graphs (Graphify-Labs#1196)
- llm.py: replace hardcoded min(40+16*n,4096) label_communities token budget with _resolve_max_tokens(min(64+24*n,8192)): 24 tok/community covers 5-word JSON entries; 8192 cap fits 16k-context models; env var now honoured (Graphify-Labs#1200)
- dedup.py: add prefix-extension guard in Pass 2 and _llm_tiebreak: skip merge when one normalised label is a strict prefix of the other (getActiveSession / getActiveSessions, parseConfig / parseConfigFile). Option (a) rejected: dropping the >=12 early-out from _short_label_blocked breaks test_typo_merged (Graphify-Labs#1201)
- tests/test_dedup.py: two new regression tests verifying prefix guard fires for extension pairs and does not fire for same-length typo pairs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
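Two of the changes above reduce to small pure functions, sketched here with hypothetical names. `default_label_budget` covers only the inner `min(64+24*n,8192)` expression; how `_resolve_max_tokens` applies the environment override is not described in the commit, so it is left out. Lower-casing stands in for whatever normalisation dedup.py actually applies.

```python
def default_label_budget(n_communities: int) -> int:
    """New default token budget: 64 base + 24 per community, capped at 8192."""
    return min(64 + 24 * n_communities, 8192)


def old_label_budget(n_communities: int) -> int:
    """Previous hardcoded budget, kept for comparison."""
    return min(40 + 16 * n_communities, 4096)


def is_prefix_extension(label_a: str, label_b: str) -> bool:
    """True when one normalised label is a strict prefix of the other."""
    a, b = label_a.lower(), label_b.lower()  # assumed normalisation
    return a != b and (a.startswith(b) or b.startswith(a))
```

Same-length typo pairs such as `recieve` / `receive` are not prefix extensions, so a guard of this shape leaves them eligible for merging.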
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
All 27 tree-sitter-* deps were unversioned in pyproject.toml. Users installing via 'pip install graphifyy' (the README's primary install path) bypass uv.lock entirely and resolve whatever tree-sitter-* versions PyPI happens to serve. A breaking minor bump in any grammar package can land in user installs without notice. Add explicit lower bounds (matching uv.lock) and upper bounds one minor above (or one major above for 0.x packages with frequent breaks). Ranges chosen to allow patch updates without re-pinning while blocking incompatible major/minor jumps.
1. graphify merge-chunks dumped the entire node list to the terminal instead of the node count. __main__.py concatenated merged['nodes'] (a list of dicts) into an f-string where it clearly meant len(merged['nodes']); the other two values in the same line use len() correctly.
2. global_graph._load_manifest silently returned a fresh empty manifest on any JSON parse error. That is reachable through normal interrupted writes (the manifest is rewritten in full on every global_add / global_remove, with no fsync or atomic rename), and the failure mode is total data loss: every tracked repo disappears from ~/.graphify/global-manifest.json on the next read. Back the corrupt file up to <path>.corrupt.<unix_ts> and print to stderr before returning the empty default. Users can then recover manually and the failure is visible rather than silent.
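The back-up-then-default behaviour from item 2 can be sketched as follows. The function name, the `{"repos": {}}` default shape, and the choice to copy rather than rename the corrupt file are all assumptions made for illustration; only the `<path>.corrupt.<unix_ts>` naming and the stderr notice come from the commit message.

```python
import json
import shutil
import sys
import time
from pathlib import Path


def load_manifest(path: Path) -> dict:
    """Load a JSON manifest; preserve a corrupt file before returning a default."""
    if not path.exists():
        return {"repos": {}}  # hypothetical empty-manifest shape
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except json.JSONDecodeError as exc:
        # Keep the unreadable file so the user can recover entries by hand.
        backup = path.with_name(f"{path.name}.corrupt.{int(time.time())}")
        shutil.copy2(path, backup)
        print(f"manifest {path} is corrupt ({exc}); backed up to {backup}",
              file=sys.stderr)
        return {"repos": {}}
```

This addresses only the read side; the interrupted-write cause mentioned above (no fsync or atomic rename) would need a separate change to the writer.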
bandit, pip-audit, and safety are already declared in the dev dependency group but nothing in CI invokes them, so a new HIGH-severity finding or a newly-disclosed CVE in a pinned dep can land without anyone noticing until the next manual audit. Add a security-scan job that runs bandit (-ll, HIGH-severity only) and pip-audit (--strict) on every push and PR. Marked continue-on-error so this doesn't block PRs on pre-existing findings -- a follow-up should do the cleanup pass and flip the flag. safety intentionally omitted: it requires a free-tier API key for the new commercial backend, which is a setup burden for forks. pip-audit covers the same ground using the PyPI JSON advisory feed and OSV.
- security.py: replace global socket.getaddrinfo monkey-patch with per-connection _SSRFGuardedHTTPConnection/HTTPSConnection subclasses (thread-safe, closes TOCTOU)
- security.py: add GRAPHIFY_MAX_GRAPH_BYTES env var override for 512MB cap (MB/GB suffix supported); improve cap error message to cite the env var
- llm.py: wrap untrusted source files in XML delimiters with sha256 fingerprint; neutralise jailbreak sentinel tokens to mitigate prompt injection
- dedup.py: skip code nodes in label-based dedup passes; code symbols now deduplicated by ID only, preventing distinct same-named symbols from merging
- extract.py: cross-file calls resolution now consults import evidence before bailing on ambiguous callee names; emits EXTRACTED edges when named import is unambiguous
- analyze.py: extend _BUILTIN_NOISE_LABELS with stdlib types and modules
- __main__.py: CLAUDE.md template uses MANDATORY language for graphify-first rule; PreToolUse hook message hardened to imperative; graphify export html auto-falls back to community-aggregation view when graph.json exceeds size cap
- tests/test_pg_introspect.py: add importorskip guard for tree_sitter_sql
Closes Graphify-Labs#1211, Graphify-Labs#1210, Graphify-Labs#1205, Graphify-Labs#1219, Graphify-Labs#1227; resolves discussion Graphify-Labs#1019
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
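A size override with MB/GB suffixes, as described for GRAPHIFY_MAX_GRAPH_BYTES, might be parsed along these lines. The helper name is hypothetical, and two details are assumptions not stated in the commit: suffixes are read as binary multiples (1 MB = 1024² bytes), and an unparseable value falls back to the 512MB default.

```python
import os
import re

DEFAULT_MAX_GRAPH_BYTES = 512 * 1024 * 1024
_UNITS = {"": 1, "MB": 1024 ** 2, "GB": 1024 ** 3}  # assumed binary multiples


def resolve_max_graph_bytes(env=None) -> int:
    """Hypothetical parser for the GRAPHIFY_MAX_GRAPH_BYTES override."""
    env = os.environ if env is None else env
    raw = env.get("GRAPHIFY_MAX_GRAPH_BYTES", "").strip()
    if not raw:
        return DEFAULT_MAX_GRAPH_BYTES
    match = re.fullmatch(r"(\d+)\s*(MB|GB)?", raw, re.IGNORECASE)
    if match is None:
        # Assumption: ignore malformed values rather than raising.
        return DEFAULT_MAX_GRAPH_BYTES
    unit = (match.group(2) or "").upper()
    return int(match.group(1)) * _UNITS[unit]
```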
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Patch over 0.9.0: completes the node-ID work (fully closes Graphify-Labs#1504 via injective salt Graphify-Labs#1522), stops origin_file leaking into graph.json (Graphify-Labs#1516), extends cross-file stub disambiguation to the six dedicated extractors (Graphify-Labs#1515), Java type-param skip (Graphify-Labs#1518) + record component refs (Graphify-Labs#1519), prunes a deleted import's edge on update (Graphify-Labs#1521), and retries rate-limited (429) requests instead of dropping chunks (Graphify-Labs#1523). All non-breaking. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
These were committed before .gitignore included the .DS_Store rule, so gitignore never removed them from tracking. Untrack them (they remain on local disk, just leave git) — the existing .gitignore rule keeps them out going forward. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Graphify-Labs#1499) Resolve Ruby `obj.method()` calls by the inferred type of the receiver instead of by globally-unique method name. `p = Processor.new; p.run` now emits a `calls` edge to `Processor#run` and survives name collisions with unrelated `Worker#run` definitions, where the old name-based match either resolved by luck or dropped the edge as ambiguous. Introduces graphify/resolver_registry.py, a behavior-identical formalization of the existing tail-of-extract() language resolution passes (Swift Graphify-Labs#1356, Python Graphify-Labs#1446 become registered entries), and graphify/ruby_resolution.py, its first new consumer. Receiver type is inferred only from unambiguous local `var = ClassName.new` bindings; ambiguous or unknown receivers resolve to nothing (no false positives). Note: Ruby member calls are now excluded from name-based cross-file resolution and resolved by inferred type only. This is an intentional precision-over-recall change scoped to Ruby: a cross-file `var.method` whose receiver type cannot be proven from a local `X.new` binding no longer resolves by name-luck (it produces no edge rather than a possibly wrong one), matching the project's confidence model. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
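The receiver-type rule (infer only from an unambiguous local `var = ClassName.new` binding, otherwise emit nothing) can be illustrated with a deliberately simplified sketch. It uses a regular expression over source text, which is a swapped-in technique for illustration only and not how graphify/ruby_resolution.py is described as working; both function names are hypothetical.

```python
import re

# Matches statement-initial bindings such as "p = Processor.new".
_BINDING = re.compile(
    r"^\s*([a-z_]\w*)\s*=\s*([A-Z]\w*(?:::[A-Z]\w*)*)\.new\b", re.MULTILINE
)


def infer_receiver_types(source: str) -> dict:
    """Map local variables to a class only when every binding agrees."""
    seen, ambiguous = {}, set()
    for var, cls in _BINDING.findall(source):
        if var in seen and seen[var] != cls:
            ambiguous.add(var)  # rebound to a different class: give up
        seen.setdefault(var, cls)
    return {v: c for v, c in seen.items() if v not in ambiguous}


def resolve_member_call(var, method, types, defined_methods):
    """Return 'Class#method' for a proven receiver type, else None (no edge)."""
    cls = types.get(var)
    target = f"{cls}#{method}" if cls else None
    return target if target in defined_methods else None
```

With both `Processor#run` and `Worker#run` defined, `p = Processor.new; p.run` resolves to `Processor#run` only, and a variable bound to two different classes resolves to nothing.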
…phify-Labs#1442) _call_llm (used by the dedup LLM tiebreaker) built its Anthropic and OpenAI-compatible clients with max_retries but no timeout, so requests on this path silently ignored GRAPHIFY_API_TIMEOUT — unlike the primary extraction paths (_call_openai_compat / _call_claude) which already pass both. Add timeout=_resolve_api_timeout() to both constructors. The PR branch self-neutralized: a v8 merge resolved the conflict in favor of the max_retries-bearing line and dropped the original one-line fix, so it is re-applied here on top of current v8 with max_retries preserved. Adds regression coverage for both _call_llm branches, which were previously untested. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…le (Graphify-Labs#1502) Two cross-platform fixes salvaged from Graphify-Labs#1502: - to_graphml: nx.write_graphml raises ValueError on None attribute values, so a node/edge carrying a null field crashed the export. Coerce None -> "" for node and edge attributes before writing. - save-result: add --answer-file as an alternative to --answer so long or multiline answers can be passed via a file instead of a fragile inline shell arg (notably Windows/PowerShell quoting). Exactly one of --answer / --answer-file is required. The rest of Graphify-Labs#1502 (a version downgrade and a hand-edited generated skill-windows.md that fails skillgen --check, plus duplicated windows-scripts) is left for rework on the PR. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
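The GraphML part of this fix amounts to replacing None attribute values with empty strings before the writer sees them. A minimal sketch on plain dictionaries, with a hypothetical helper name (the real code applies this to node and edge attributes of the graph inside to_graphml):

```python
def coerce_none_attrs(attrs: dict) -> dict:
    """Return a copy of attrs with None values replaced by empty strings."""
    return {key: ("" if value is None else value) for key, value in attrs.items()}
```

Falsy but valid values such as `0` or `False` are left untouched; only None is rewritten.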
…#1530) Generated install/skill guidance told agents to invoke a literal `skill` tool with `skill: "graphify"`, which is host-specific and not valid in every environment. The always-on AGENTS fragment, packaged artifact, expected snapshot, and _skill_registration() output now use host-generic wording: "use the installed graphify skill or instructions". Also decodes skillgen git blob reads as UTF-8 for Windows and replaces stale English code-block examples in the translated READMEs. The always-on roundtrip guard deliberately freezes the v8 baseline, so an intentional wording change would otherwise fail it. Rather than only patching the pytest mirror (which left the blocking CLI guard --always-on-roundtrip red, as the original PR did), this adds an explicit, reviewable ALWAYS_ON_SANCTIONED_EDITS registry: the guard applies the approved old->new substitution to the baseline before the byte-for-byte compare, so this exact sentence is allowed while any other drift still fails. CLI guard and pytest test now agree and CI passes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…bot) Resolve two Dependabot alerts in transitive deps: - msgpack 1.1.2 -> 1.2.1 (HIGH, GHSA-6v7p-g79w-8964): out-of-bounds read / crash on Unpacker reuse after a caught error. Pulled only via cachecontrol -> pip-audit (dev group), so not in the published wheel's closure, but a fix is available so we take it. - pydantic-settings 2.14.1 -> 2.14.2 (MEDIUM, GHSA-4xgf-cpjx-pc3j): NestedSecretsSettingsSource follows symlinks outside secrets_dir. Pulled via mcp (the [mcp]/[all] extra); graphify does not use the affected secrets-dir source, but the fix is free. Lockfile-only; both are transitive. Full suite green (2537 passed), MCP/serve tests pass on the bumped versions. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
`safety` was declared in the dev group but never invoked — the CI security-scan job only runs bandit and pip-audit, and pip-audit already provides the same dependency-CVE scanning. Its only practical effect was pulling in nltk, which carries an unpatched HIGH path-traversal advisory (GHSA-p4gq-832x-fm9v) with no fix available. Removing safety drops nltk (and safety-schemas/typer/tenacity/tomlkit) from the lockfile entirely, closing the alert with no loss of coverage. Updated the stale CI comment that referenced safety. Full suite green (2537 passed); pip-audit and bandit unaffected. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…fallbacks (Graphify-Labs#1529, Graphify-Labs#1531) Graphify-Labs#1529 (regression from the 0.9.0 full-repo-relative node-ID migration): relative JS/TS imports resolve to repo-relative paths and ride the extract() id-remap to canonical node IDs, but tsconfig path-alias and workspace-package imports resolve to ABSOLUTE paths (their bases are .resolve()'d), so the import-target ID baked in the on-disk prefix and never matched the repo-relative definition node — the edge was dropped at build (common on Next.js/SvelteKit `@/`-alias codebases). The id-remap post-pass now also registers the absolute-resolved form of each input path (file-level edges) and both the input-form and absolute-form symbol prefixes (named-import edges), so alias/workspace import targets remap to the canonical ID. Verified the built graph has no orphan nodes or dangling edges. Graphify-Labs#1531: tsconfig `paths` values are ordered fallback lists (tsc tries each target until one resolves), but only targets[0] was kept. The alias map now stores all targets in order, and a single _resolve_tsconfig_alias helper (replacing six duplicated inline loops) returns the first target whose candidate exists on disk, falling back to the first candidate when none exist (no false edge). Wildcards, baseUrl, and array `extends` are preserved. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
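The Graphify-Labs#1531 fallback rule (first target that exists on disk wins, otherwise the first candidate) is small enough to sketch directly. `first_existing_target` is a hypothetical name, not the `_resolve_tsconfig_alias` helper itself, and the injectable `exists` predicate is added here only so the behaviour can be exercised without touching the filesystem.

```python
from pathlib import Path
from typing import Callable, Optional, Sequence


def first_existing_target(
    candidates: Sequence[Path],
    exists: Callable[[Path], bool] = Path.exists,
) -> Optional[Path]:
    """Pick the first candidate that exists, mirroring tsc's ordered fallback."""
    for candidate in candidates:
        if exists(candidate):
            return candidate
    # Nothing exists on disk: fall back to the first declared target.
    return candidates[0] if candidates else None
```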
…aphify-Labs#1527) The AST cache is version-swept but the semantic/LLM cache had no pruning, so it grew unbounded: it is content-hash-keyed, so every content change or file deletion leaves a permanent orphan entry (reporter saw 152 entries for 124 live docs). This matters for the committed-cache workflow where the semantic cache is published for warm CI rebuilds. Adds prune_semantic_cache(root, live_hashes) and wires it into the end of the extract path, sweeping cache/semantic/*.json entries whose hash is not in the live set. The live set is computed from the FULL detected document set (not the incremental changed-subset, which would delete valid entries), using the same file_hash recipe save_semantic_cache uses. Best-effort (unlink guarded), only touches cache/semantic/ (.tmp and cache/ast/** untouched), and keeps the semantic cache unversioned so releases never re-bill LLM extraction. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
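A sweep of this kind could be sketched as below. It assumes each cache entry is stored as `<hash>.json` directly under `cache/semantic/` beneath `root`, which the commit message implies but does not spell out; the return value (count of removed entries) is also an illustrative addition.

```python
from pathlib import Path


def prune_semantic_cache(root: Path, live_hashes: set) -> int:
    """Delete cache/semantic/*.json entries whose hash is not live."""
    removed = 0
    for entry in (root / "cache" / "semantic").glob("*.json"):
        if entry.stem in live_hashes:
            continue
        try:
            entry.unlink()  # best-effort: a failed delete is not fatal
            removed += 1
        except OSError:
            pass
    return removed
```

The caller has to pass hashes for the full detected document set; passing only the incrementally changed subset would delete valid entries, as the commit message points out.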
…orts map (Graphify-Labs#1308) Workspace imports with subpath exports (e.g. `import { x } from "@scope/pkg/browser"`) now resolve through the package's `exports` map instead of falling back to a bare path. Supports string values, condition objects, nested conditions, and single-`*` wildcard patterns (`"./*": "./src/*.js"`), falling back to the existing bare-path/index resolution when there is no exports map or no match. Adapted from Graphify-Labs#1541, taking only the exports-map resolver and not that PR's competing import-node-ID normalization (current v8 already resolves the node-ID mismatch via the Graphify-Labs#1529 id-remap post-pass, and the PR's _file_stem approach regressed the relative-input alias case). Two hardening changes over the original: - `default` is consulted LAST in the condition priority (it is Node's catch-all), so a matching `import`/`module`/`svelte` condition wins. - Export targets that escape the package directory are rejected (`_contained_in_package`), so a malicious `exports` value like `"./x": "../../../etc/..."` cannot resolve to a file outside the package. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
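The containment hardening can be illustrated with a short path check. This is a sketch of the idea behind `_contained_in_package`, not its actual code; the function name and signature below are assumptions.

```python
from pathlib import Path


def contained_in_package(package_dir: Path, export_target: str) -> bool:
    """True when export_target resolves to a path inside package_dir."""
    base = package_dir.resolve()
    # resolve() collapses ".." segments, so an escaping target ends up
    # outside base even if it is written relative to the package.
    candidate = (base / export_target).resolve()
    return candidate == base or base in candidate.parents
```

An `exports` value such as `"../../../etc/..."` therefore fails the check, while `"./src/index.js"` passes.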
…c/init refs (Graphify-Labs#1475)
Three residual ObjC extractor bugs from the Graphify-Labs#1475 thread, each reproduced against the real tree-sitter-objc grammar:
1. NS_ASSUME_NONNULL_BEGIN before @interface made the parser fail to emit a class_interface node at all (the whole interface was swallowed into ERROR nodes), so headers using the macro produced no class node. Blank the two argument-less annotation macros to equal-length spaces before parsing (offset-preserving; macro-free files are byte-identical). The reporter's "@Class breaks it" hypothesis was wrong; only the macro does.
2. Quoted `#import "X.h"` edges dangled once a `.h`/`.m` pair existed: the target used the bare stem, which the post-pass canonicalizes and then _disambiguate_colliding_node_ids salts apart by path, so the import target no longer matched. Resolve the include to a real file (mirroring _import_c), and repoint imports/imports_from edges to the header variant in _disambiguate_colliding_node_ids, taking precedence over the same-source-file salt so a `.m` importing its own `.h` resolves to the header instead of self-looping. Also repairs the equivalent latent C-include dangling bug.
3. `[[Foo alloc] init]` produced no edge: walk_calls only reconstructed selectors and skipped the receiver. Emit a `references` edge from the allocating method to the class, resolved via the unique-class stub guard (ensure_named_node + _rewire_unique_stub_nodes) so unknown/ambiguous names produce no false edge. The calls-to-init edge is deliberately deferred (init selectors are ambiguous across classes).
Reported by JabberYQ with a precise repro and test repo. Adds regression tests incl. a self-loop guard on the import edges. Still open on Graphify-Labs#1475: dot-syntax property accesses (Bug 5) and @selector target-action (Bug 6b).
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
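The offset-preserving blanking from bug 1 can be sketched as a byte-level substitution. The commit names only NS_ASSUME_NONNULL_BEGIN; treating NS_ASSUME_NONNULL_END as the second macro is an assumption made here, and `blank_annotation_macros` is a hypothetical name.

```python
import re

# Assumption: the second argument-less macro is NS_ASSUME_NONNULL_END.
_ANNOTATION_MACROS = re.compile(rb"\bNS_ASSUME_NONNULL_(?:BEGIN|END)\b")


def blank_annotation_macros(source: bytes) -> bytes:
    """Replace each macro with the same number of spaces before parsing."""
    return _ANNOTATION_MACROS.sub(lambda m: b" " * len(m.group(0)), source)
```

Because every replacement has the same length as the text it replaces, byte offsets and line numbers reported by the parser still point at the right place in the original file.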
…ERRED (Graphify-Labs#1533) A type-qualified Swift call (`Type.staticMethod()`, `Singleton.shared.method()`) names the receiver type explicitly in source, so the resolved edge is an exact reference — now emitted as EXTRACTED (1.0), matching the Python qualified-class-method pass (_resolve_python_member_calls). Instance calls whose receiver type comes from local inference (`obj.method()`) stay INFERRED (0.8). Resolution and the single-definition god-node guard are unchanged. This addresses the actionable part of Graphify-Labs#1533's "static calls" report: the edge was always produced (graphify models calls as method->method), it was just under-confident. Updated the confidence test to assert the instance/type-qualified split. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Covers Graphify-Labs#1499 (Ruby type-aware resolution), Graphify-Labs#1308/Graphify-Labs#1541 (workspace exports map), Graphify-Labs#1529 (alias/workspace import-edge regression), Graphify-Labs#1531 (tsconfig paths fallbacks), Graphify-Labs#1527 (semantic cache pruning), Graphify-Labs#1475 (three ObjC fixes), Graphify-Labs#1533 (Swift static-call confidence), Graphify-Labs#1442 (secondary LLM timeout), Graphify-Labs#1502 (GraphML null coercion + save-result --answer-file), Graphify-Labs#1530 (host-generic skill wording), and the Dependabot dep bumps. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Ruby type-aware member-call resolution and workspace exports-map resolution, the Graphify-Labs#1529 alias/workspace import-edge regression fix, tsconfig paths fallbacks, semantic-cache pruning, three ObjC extractor fixes, Swift static-call confidence, the secondary LLM timeout, GraphML null coercion, host-generic install wording, and Dependabot dep bumps. See CHANGELOG. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Extends the tsconfig path-alias resolver (Graphify-Labs#1531) with single-`*` wildcard capture and substitution: a pattern like `@app/*` or `@*/interfaces` captures the matched segment and substitutes it into each target in declared order, honoring baseUrl and tsc's longest-prefix / exact-wins specificity rules, and preserving Graphify-Labs#1531's first-existing-target-wins fallback (no false edge when nothing resolves). Builds on the _resolve_tsconfig_alias helper rather than reintroducing inline loops; multi-star patterns remain out of scope. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
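Single-`*` capture and substitution can be sketched with two small functions. The names are hypothetical, and the sketch covers only matching one pattern against one import specifier; the specificity ordering (longest-prefix, exact-wins), baseUrl handling, and the on-disk fallback are left out.

```python
from typing import Optional


def capture_wildcard(pattern: str, specifier: str) -> Optional[str]:
    """Return the text matched by the single '*' in pattern, or None."""
    if "*" not in pattern:
        return "" if specifier == pattern else None
    prefix, _, suffix = pattern.partition("*")
    long_enough = len(specifier) >= len(prefix) + len(suffix)
    if long_enough and specifier.startswith(prefix) and specifier.endswith(suffix):
        return specifier[len(prefix):len(specifier) - len(suffix)]
    return None


def substitute_wildcard(target: str, captured: str) -> str:
    """Insert the captured segment into a target such as 'src/app/*'."""
    return target.replace("*", captured, 1)
```

For example, `@app/*` against `@app/utils/x` captures `utils/x`, and `@*/interfaces` against `@core/interfaces` captures `core`.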
…raphify-Labs#1552) `export * as ns from './mod'` now creates a real symbol node for the namespace binding `ns`, registers it as a named export (so a downstream `import { ns }` resolves to it), and emits a file-level `re_exports` edge to the target module. The binding is treated as a single opaque symbol — `ns.member` accesses are deliberately NOT expanded into per-member name-matching, avoiding the over-linking that would fan false edges. Includes re-export cycle and deep-chain recursion guards. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… types (Graphify-Labs#1316) A member call through a constructor-injected dependency (`constructor(private db: Database)` ... `this.db.query()`) now produces a calls edge to the field type's method. The field->type map is captured from constructor parameter-properties, and resolution reuses the existing single-definition god-node guard (like the Swift/Python/Ruby member-call resolvers): the edge is emitted only when the field's type name resolves to exactly one class definition that owns the method, so an ambiguous or unknown/untyped field produces no edge — no global name-match fan-out. Edges are EXTRACTED (the type is explicit from the annotation). TS/JS-only and additive; scope is constructor parameter-property injection. Adds the decisive regression tests the implementation needed: two classes defining the same method name where the injected field is typed to one of them (must resolve to that one only), and an ambiguous type-name case (must emit no edge). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…raphify-Labs#1475, Graphify-Labs#1543) `self.product.name` dot-syntax now emits an `accesses` edge and `@selector(method)` emits a `calls` edge, both resolved only to an unambiguous in-scope definition (a sibling method of the same class for dot-syntax; exactly one method by exact selector name for @selector) so no false-edge fan-out occurs when multiple classes share a name. Hardened over the original PR: resolution now matches the method node id EXACTLY (a method id is _make_id(container, name)) rather than by `endswith` suffix. The substring match would mis-resolve `self.name` to a sibling `-surname` (false positive) and, when a substring-colliding sibling existed, suppress the correct edge (false negative); exact matching fixes both. Adds substring-collision regression tests (`-name`/`-surname`, `-doThing`/`-reallyDoThing`). Completes the Graphify-Labs#1475 ObjC follow-ups (Bug 5 dot-syntax accesses, Bug 6b @selector target-action). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
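The difference between suffix matching and exact-id matching can be illustrated with a small sketch. The `make_id` function below is a hypothetical stand-in for `_make_id` (the real id format is not shown in this PR), and both resolver functions are written only for this example.

```python
from typing import Optional, Set


def make_id(container: str, name: str) -> str:
    """Hypothetical stand-in for _make_id(container, name)."""
    return f"{container}_{name}".lower()


def resolve_by_suffix(method_ids: Set[str], name: str) -> Optional[str]:
    """The fragile approach: match any method id that ends with the name."""
    hits = [m for m in method_ids if m.endswith(name.lower())]
    return hits[0] if len(hits) == 1 else None


def resolve_exact(method_ids: Set[str], container: str, name: str) -> Optional[str]:
    """The hardened approach: rebuild the expected id and require equality."""
    target = make_id(container, name)
    return target if target in method_ids else None


only_surname = {make_id("Person", "surname")}
both = {make_id("Person", "name"), make_id("Person", "surname")}

# False positive: `self.name` resolves to `-surname` under suffix matching.
false_positive = resolve_by_suffix(only_surname, "name")
# False negative: two suffix hits, so the correct edge is suppressed.
false_negative = resolve_by_suffix(both, "name")
```

With exact matching, `resolve_exact(only_surname, "Person", "name")` finds nothing and `resolve_exact(both, "Person", "name")` finds only the `name` method.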
…aph sidecar (Graphify-Labs#1441) Projects the verdicts `graphify reflect` already distills (preferred / tentative / contested, exponential time-decayed) into a derived experiential layer the read surfaces consume, so accumulated agent experience actually shows up where you look — without polluting the structural graph. Design (grounded in agent-memory + provenance literature; a redesign of the Graphify-Labs#1542 approach): - SIDECAR, not graph.json stamping. `reflect` writes `.graphify_learning.json` next to graph.json (an additional output, so the git hooks produce it automatically). graph.json stays purely structural; nothing leaks into GraphML; no graph.json churn. Mirrors the named-graph / event-sourcing separation of durable truth from a derived layer. - Reuses the existing reflect aggregate (its `_decay` is the recency-weighted exponential model; `_finalize_sources` the classification) — no new scoring. - PROVENANCE: each verdict carries the source questions/dates that produced it (cap 5, most-recent first). - STALENESS: each verdict stores the node's file fingerprint; on read, a changed source file flags the verdict stale ("code changed since — re-verify") rather than presenting a confident lesson on rewritten code. - CONTESTED surfaced distinctly (useful N / dead-end M), not averaged away. - DEAD-ENDS stay QUERY-SCOPED — never a node-level status; they appear only in the report as question -> nodes. - Read surfaces (explain / query+MCP / GRAPH_REPORT / graph.html) merge the overlay at read time, sanitized; un-annotated graphs are byte-identical. Deferred (logged): letting verdicts influence query/seed traversal — the recommender feedback-loop / Matthew-effect risk means that needs propensity correction + exploration, not naive biasing. Builds on the idea in Graphify-Labs#1441/Graphify-Labs#1542 (thanks @TPAteeq). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… paths The overlay fingerprint resolved a node's source_file against graph_path.parent (the graphify-out/ dir), but source_file is stored relative to the PROJECT root — so graphify-out/auth.py never existed and _is_stale flagged EVERY verdict "code changed since — re-verify" the moment it was written. (The original staleness test used an absolute source_file, which masked it.) Fix: resolve the file by trying the likely roots in order (.graphify_root marker, graphify-out's parent, graph.json's own dir, cwd) and use the first that exists — the same search at write and read — and fingerprint file CONTENT only (sha256 of bytes, no path mixed in) so the hash is root-independent and a committed sidecar stays valid across checkouts. Drops the brittle directory-name-based root guess. Adds a regression test with a relative source_file under the graphify-out layout (stale=False right after reflect, True after an edit). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
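A rough sketch of the fix described above: search the candidate roots in order, then hash file content only. The function names and the choice to treat an unresolvable file as stale are assumptions of this example, not the actual `_is_stale` implementation.

```python
import hashlib
from pathlib import Path
from typing import Iterable, Optional


def find_source(source_file: str, roots: Iterable[Path]) -> Optional[Path]:
    """Return the first existing candidate, trying the roots in order."""
    for root in roots:
        candidate = Path(root) / source_file
        if candidate.is_file():
            return candidate
    return None


def fingerprint(path: Path) -> str:
    """Hash the file bytes only, so the digest does not depend on the root."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def is_stale(stored: str, source_file: str, roots: Iterable[Path]) -> bool:
    """Compare the stored digest with the current content.

    Assumption: a file that cannot be found under any root counts as stale.
    """
    path = find_source(source_file, roots)
    if path is None:
        return True
    return fingerprint(path) != stored
```

Because the same `find_source` search runs at write time and at read time, and no path is mixed into the digest, a sidecar committed from one checkout should stay valid in another as long as the file content is unchanged.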
README: document that `reflect --graph` writes the .graphify_learning.json overlay and that explain/query surface a Lesson hint (with the code-changed staleness flag). CHANGELOG: add an Unreleased section for the post-0.9.2 work — the work-memory overlay (Graphify-Labs#1441/Graphify-Labs#1542), this.field.method() injected-field resolution (Graphify-Labs#1316), TS wildcard path aliases (Graphify-Labs#1544), JS namespace re-exports (Graphify-Labs#1552), and the ObjC dot-syntax/@selector edges (Graphify-Labs#1475/Graphify-Labs#1543). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…raphify-Labs#1558)

Refines the staleness file resolution (00e00a0) by folding in the two genuine merits of @TPAteeq's parallel fix (Graphify-Labs#1558), which independently and correctly diagnosed the same root-mismatch bug:

- Layout-ordered candidates: try the layout-appropriate root FIRST (the graphify-out parent for the standard layout, graph.json's own dir for a flat layout) before the other. The prior order tried the grandparent first unconditionally, which in a flat layout (graph.json at the project root) could fingerprint a same-named file one directory up. Existence checking is kept on top, so a defeated name heuristic or a stale .graphify_root marker still falls through to the real file.
- Adds @TPAteeq's .graphify_root-marker-driven regression test, plus a flat-layout test that pins the ordering (editing the real file flips stale; editing the same-named decoy one dir up does not).

Co-Authored-By: tpateeq <mohammedateequddin399@gmail.com>
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Graphify-Labs#1561) A hyperedge's member list is canonically keyed `nodes`, but producers (LLM/subagent drift, externally-supplied graph.json) sometimes emit `members` or `node_ids` — graphify only read `nodes`, so those hyperedges silently lost their members, and semantic_cleanup's prune dropped them entirely. Normalize the member key to `nodes` at one ingest chokepoint in build_from_json (and in semantic_cleanup, which runs pre-build), deduping and warning, so every downstream consumer sees the canonical key. Mirrors the existing from/to edge-endpoint aliasing. Reported by @askalot-io. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
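The normalization described above might look roughly like this. It is a sketch under assumptions: the function name `normalize_hyperedge` and the warning text are invented for the example, and only the key names `nodes`, `members`, and `node_ids` come from the commit message.

```python
import warnings
from typing import Any, Dict

# Alias keys that drifted producers sometimes emit instead of `nodes`.
MEMBER_ALIASES = ("members", "node_ids")


def normalize_hyperedge(edge: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the hyperedge with its member list under `nodes`."""
    out = dict(edge)
    merged = list(out.get("nodes") or [])
    for alias in MEMBER_ALIASES:
        if alias in out:
            warnings.warn(f"hyperedge uses '{alias}'; normalizing to 'nodes'")
            merged.extend(out.pop(alias) or [])
    # dict.fromkeys dedupes while preserving first-seen order.
    out["nodes"] = list(dict.fromkeys(merged))
    return out
```

Doing this once at ingest means downstream consumers never need to know the aliases exist.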
…ph (Graphify-Labs#1553) The cross-file call resolver bailed (Graphify-Labs#543/Graphify-Labs#1219 god-node guard) whenever a bare callee name had 2+ definitions without unique import evidence — so a single same-named test mock (or any same-named symbol) dropped the real `calls` edge, erasing the call graph wherever a mock existed (the reporter saw a 76-stub Pester suite wipe everything). Replace the blunt bail with a smarter guard: when a name is ambiguous and import evidence doesn't resolve it, apply tie-breakers — non-test preference (a shared, segment-aware _is_test_path classifier) then path proximity — and emit an INFERRED edge ONLY if exactly one candidate survives, else keep bailing. A real def + a test mock resolves to the real def; two genuine non-test defs still bail (god-node guard intact, no fan-out). Wired into both the extract.py pass and the symbol_resolution.py copy via the shared classifier. Reported by @Schweinehund. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
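The tie-breaker order can be sketched as below. This is an illustration only: the segment list, the filename heuristics, and the definition of proximity as shared leading directories are assumptions of the example, and the real `_is_test_path` classifier may differ.

```python
from pathlib import PurePosixPath
from typing import List, Optional

# Assumed directory names that mark test code (matched per path segment).
TEST_SEGMENTS = {"test", "tests", "__tests__", "spec", "mocks", "__mocks__"}


def is_test_path(path: str) -> bool:
    """Segment-aware check: `contest/` must not match `test`."""
    p = PurePosixPath(path)
    if any(part.lower() in TEST_SEGMENTS for part in p.parts[:-1]):
        return True
    stem = p.stem.lower()
    return stem.startswith("test_") or stem.endswith(("_test", ".test", ".tests", ".spec"))


def shared_depth(a: str, b: str) -> int:
    """Count the leading directory components two paths have in common."""
    depth = 0
    for x, y in zip(PurePosixPath(a).parent.parts, PurePosixPath(b).parent.parts):
        if x != y:
            break
        depth += 1
    return depth


def pick_definition(caller: str, candidates: List[str]) -> Optional[str]:
    """Return the single surviving candidate, or None to keep bailing."""
    if not candidates:
        return None
    # Tie-breaker 1: prefer non-test definitions.
    pool = [c for c in candidates if not is_test_path(c)] or list(candidates)
    if len(pool) == 1:
        return pool[0]
    # Tie-breaker 2: path proximity, accepted only if strictly one is closest.
    depths = [shared_depth(caller, c) for c in pool]
    closest = [c for c, d in zip(pool, depths) if d == max(depths)]
    return closest[0] if len(closest) == 1 else None
```

Returning `None` whenever more than one candidate survives is what preserves the god-node guard: the sketch never guesses between equally plausible definitions.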
… routing (Graphify-Labs#1556, Graphify-Labs#1547) A class declared in a header (Foo.h/@interface) and defined in its impl (Foo.cpp/Foo.m/@implementation) fragmented into two nodes: _file_stem drops the extension so Foo.h and Foo.cpp share a node id, which _disambiguate_colliding_node_ids then split apart by path — and the two "defs" tripped every resolver's single-definition god-node guard, cascading into missing .h<->.m/.cpp linkage and cross-file/cross-language edges. - Routing: a `.h` using `#import` now routes to extract_objc (Graphify-Labs#1556 bridging headers — extract_c drops `#import` as a preproc_call), and a `.h` with C++-only signals (class/namespace/template/::/access-specifiers) routes to extract_cpp (Graphify-Labs#1547 — the C grammar has no class_specifier, so a C++ header previously yielded a junk node and lost every method). ObjC sniff keeps priority; a plain C header still routes to extract_c. - Merge: a new _merge_decl_def_classes post-pass collapses the header/impl id-collision onto the header (declaration) variant, modeled on _merge_swift_extensions, gated so it fires ONLY for a clean sibling header/impl pair (same dir, same base stem, exactly one header) — two same-named classes in different directories have different stems and never collide, so they are never merged (god-node guard verified). C++ method definitions retain their `Foo::` qualifier so a `Foo::bar` def keys onto the header declaration (one method node, not two); free functions keep their bare-name ids. Result: one canonical class node per .h/.m or .h/.cpp pair with methods unified, which unblocks the existing member-call resolvers (verified Swift->ObjC calls and Swift `extension` folding now resolve). Strict improvement over v8 (which produced junk/fragmented nodes here, verified). Still open as follow-ups: cross-file C++ #include edge resolution and a C++/ObjC cross-file member-call resolver (a pre-existing gap, not a regression). 
Reported by @JabberYQ (Graphify-Labs#1556) and @c0dezer019 (Graphify-Labs#1547). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…aphify-Labs#1547, Graphify-Labs#1556) Connects paired classes across files: Main.cpp's `Foo f; f.bar()` now resolves to Foo::bar, and ObjC `Foo *f = [[Foo alloc] init]; [f doThing]` to Foo's doThing — the "connect with other classes" goal of Graphify-Labs#1547/Graphify-Labs#1556. Design grounded in prior-art research (ctags qualified-name matching, Doxygen's name-keyed false-edge failure modes, PAIGE's receiver-type approach, Clang USR): resolve by RECEIVER TYPE, never bare name, and skip when the type can't be inferred rather than guess (a false call edge / god-node is worse than a missing one). Mirrors the existing Swift/Python/Ruby/TS member-call resolvers. - C++ extractor now captures the member-call receiver (field_expression / qualified_identifier / pointer access) and builds a per-file type table from local declarations (`Foo f;`, `Foo* f;`, `Foo *f = ...;`); emits raw_calls. - ObjC extractor emits raw_calls for message sends with the receiver + selector and a type table from `Foo *f = ...;` locals (existing in-file selector / alloc-init / dot-syntax / @selector matching preserved). - New _resolve_cpp_member_calls / _resolve_objc_member_calls, registered for their suffixes. Receiver tiers: `Foo::bar()` / capitalized ObjC receiver and this/self/super (enclosing class) -> EXTRACTED; local-var-typed -> INFERRED. Single-definition god-node guard (skip unless exactly one type def matches); the just-shipped decl/def class merge makes a paired class one def so the guard resolves it. Verified: a.run() -> A::run only (not a same-named B::run); an uninferable receiver with run() in two classes emits zero edges (no fan-out); ObjC [f doThing] -> Foo only. - build.py: the cross-language INFERRED-call prune treated .h/.cpp/.m as different families and dropped header/impl interop calls; unified the C family (.c .h .cc .cpp .hpp .cxx .hh .hxx .cu .cuh .metal .m .mm) so a .cpp/.m call to a .h-declared method survives. 
Still open (tracked on Graphify-Labs#1547/Graphify-Labs#1556): the file-level `#include` edge can stay uncanonicalized when the project root isn't symlink-resolved (the extract() id-remap `continue`s on a /var-vs-/private/var mismatch) — the class connection above is robust to it; include-reachability candidate narrowing and ObjC dynamic-dispatch/id-typed receivers also deferred (expected low ObjC recall, per the research). Reported by @c0dezer019 (Graphify-Labs#1547) and @JabberYQ (Graphify-Labs#1556). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
_suppress_output() documented that it suppressed both stdout and stderr but only redirected stdout. Stderr was handled by a manual sys.stderr swap in the caller, which is less safe (no guarantee of restoration on exception before the try/finally). Use contextlib.ExitStack + redirect_stderr so both streams are handled by the context manager and the caller is simplified. Removes the now-unused sys import. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
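A minimal sketch of the pattern this commit describes, assuming the output is discarded to `os.devnull` (the actual sink used by `_suppress_output()` is not shown here, and the public name `suppress_output` is chosen for the example):

```python
import contextlib
import os


@contextlib.contextmanager
def suppress_output():
    """Silence both stdout and stderr for the duration of the block."""
    with contextlib.ExitStack() as stack:
        # Assumption: output is thrown away rather than captured.
        devnull = stack.enter_context(open(os.devnull, "w"))
        stack.enter_context(contextlib.redirect_stdout(devnull))
        stack.enter_context(contextlib.redirect_stderr(devnull))
        # ExitStack unwinds in reverse order, restoring both streams
        # and closing the file even if the body raises.
        yield
```

Because restoration happens in the context manager's own exit path, the caller can drop its manual `sys.stderr` swap and simply write `with suppress_output(): ...`.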
`_suppress_output()` was documented as suppressing both stdout and stderr, but it only redirected stdout. Stderr was handled by a manual `sys.stderr` swap in the caller, with no guarantee of restoration if an exception occurred before the `finally`.
With `contextlib.ExitStack` + `redirect_stderr`, both streams are now managed cleanly by the context manager, the caller is simplified, and the now-unused `sys` import is removed.