Skip to content

Apps refactor#390

Merged
krokicki merged 40 commits into
mainfrom
app-refactor
Jun 20, 2026
Merged

Apps refactor#390
krokicki merged 40 commits into
mainfrom
app-refactor

Conversation

@krokicki

@krokicki krokicki commented Jun 18, 2026

Copy link
Copy Markdown
Member

Refactor: split apps/core.py into cohesive modules

Why

fileglancer/apps/core.py had grown to 1701 lines, mixing nine unrelated
concerns behind a flat list of module-level functions. It was hard to navigate,
hard to test in isolation, and a magnet for further growth.

This PR splits it into four single-purpose modules and deletes core.py
(clean break). It is a pure reorganization — no behavior change.

New structure

graph TD
    subgraph consumers["External consumers"]
        server["server.py"]
        worker["user_worker.py"]
        tests["tests/"]
    end

    init["apps/__init__.py<br/><i>re-exports public API</i>"]

    subgraph pkg["fileglancer/apps/"]
        manifest["manifest.py<br/><i>worker dispatch + git repo<br/>cache + manifest discovery/loading</i>"]
        command["command.py<br/><i>requirement checks + path/param<br/>validation + build_command</i>"]
        jobfiles["jobfiles.py<br/><i>work-dir path construction<br/>+ job file access</i>"]
        jobs["jobs.py<br/><i>poll loop + job<br/>submission/cancel</i>"]
        adapters["adapters.py / pixi.py / nextflow.py<br/><i>manifest adapters (unchanged)</i>"]
    end

    server --> init
    worker --> init
    tests --> init

    init --> manifest
    init --> command
    init --> jobfiles
    init --> jobs

    jobs --> manifest
    jobs --> command
    jobs --> jobfiles

    manifest --> adapters

    classDef leaf fill:#e8f4ea,stroke:#3a7d44;
    classDef orchestrator fill:#e7eefc,stroke:#2f5fb0;
    classDef facade fill:#fdf3e3,stroke:#c08a2d;
    class manifest,command,jobfiles leaf;
    class jobs orchestrator;
    class init facade;
Loading

The module graph is a DAG: jobs is the only module with sibling
dependencies; manifest, command, and jobfiles are independent leaves.
No circular-import risk.

Module responsibilities

Module Responsibility ~lines
manifest.py worker dispatch (_dispatch/set_worker_exec), git repo cache, manifest discovery & DB-cache loading 430
command.py requirement-check generation, path/parameter validation, build_command 390
jobs.py background poll loop, job submission & cancellation 700
jobfiles.py work-dir path construction, job file access 200

Intra-package edges

flowchart LR
    jobs -->|"_dispatch, _ensure_repo_cache,<br/>get_or_load_manifest"| manifest
    jobs -->|"build_command, merge_requirements,<br/>build_requirements_check"| command
    jobs -->|"_build_work_dir"| jobfiles
    manifest -->|"try_adapt, MANIFEST_ADAPTERS"| adapters
Loading

External dependencies (outside the package): all four touch database;
manifest/jobs/jobfiles touch settings; command touches filestore;
jobs touches cluster_api.

Backward compatibility

  • apps/__init__.py re-exports the same public names, so package-level access
    (fileglancer.apps.X) is unchanged — server.py needed no edits.
  • user_worker.py inline imports were repointed to the new modules.
  • Test imports and patch(...) targets were repointed to the module that now
    owns each name (e.g. apps.core._dispatchapps.jobs._dispatch,
    apps.core.fetch_app_manifestapps.manifest.fetch_app_manifest).

Verification

  • Full backend suite: 382 passed.
  • import fileglancer.apps / server / user_worker is clean (no circular imports).
  • grep -rn "apps\.core" returns nothing — the clean break is complete.

@StephanPreibisch @JaneliaSciComp/fileglancer

krokicki and others added 30 commits June 5, 2026 11:14
Users can publish their apps to a catalog browsed by everyone, and add
listings to their own collection. Adding from the catalog creates an
independent UserApp; the listing stores only metadata (no manifest
snapshot) so there is no staleness or drift to manage. The Apps page's
new "Browse Catalog" and "Add from URL" buttons replace the old single
"Add App" entry point.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Trash icon on the app card and Delete button in the info dialog both
open a confirmation dialog before calling the remove mutation, matching
the Stop Service confirmation pattern in JobDetail.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replace the standalone /catalog route and the separate Catalog/Jobs
navbar entries with a tab bar on /apps that switches between My Apps,
App Catalog, and Jobs. The Apps navbar entry now carries the active-job
badge that used to live on the Jobs link.

Tabs are a small NavLink-based component (not Material Tailwind's Tabs)
since the latter is built around in-memory Tabs.Panel content rather
than URL routing; NavLink gets us automatic active state from the URL
and proper back/forward behavior.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The server-side verify_requirements() check inspected the backend's PATH,
but jobs run on the compute node as the user with a different environment.
This produced false negatives (tool present on the cluster but missing on
the backend blocked submission) and false positives (passed on the backend,
failed at runtime).

Add build_requirements_check(), which generates a bash snippet from the same
requirement parser and tool registry. It runs inside the job after PATH/conda/
env setup, checks tool existence and version constraints (sort -V), aggregates
all failures to stderr, and exits non-zero. Failures surface as a FAILED job
with the message in stderr.log, shown by the existing JobDetail UI.

submit_job no longer hard-fails on the server; the check is embedded in the
job script before pre_run/command.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The mutation hooks expose a single shared isPending flag, so passing it to
every ListingCard caused all add/unshare buttons to animate when any one was
in flight. Scope each card's pending state to the listing being acted on by
matching the mutation's in-flight variables (listing_id).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Remove the inline GitHub URL link from catalog cards and add an info button
that opens a ListingInfoDialog, mirroring the "my apps" AppInfoDialog. The
dialog shows the URL, branch, and description in the same table layout, plus
a "Shared by" row, and surfaces Add/Unshare actions.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a checkbox to the catalog that filters out listings already in the user's
apps, reusing the existing myAppKeys set. Updates the empty state to explain
when all shared apps are already installed.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Codex <codex@openai.com>
The loading state swapped the label for a larger spinner plus loading text,
which grew the button in both dimensions. Now the label stays in normal flow
(hidden in place while loading) and a smaller spinner is overlaid, so the
button keeps a constant size and its label stays vertically centered.

loadingText is now used as the spinner's accessible label. Spinner gains an
optional sizeClasses prop (default unchanged) and skips empty text.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Codex <codex@openai.com>
Share/unshare is now only available in the app info dialog. The card keeps
its info, launch, and remove actions; the "Shared" badge still indicates
shared apps.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Codex <codex@openai.com>
The trash can icon meant "remove from my apps" on the Apps page but
"unshare" in the catalog. Use users-slash (FaUsersSlash) for unshare and
users (FaUsers) for share, in the catalog card/info dialog and the app info
dialog, to distinguish sharing actions from removal.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Codex <codex@openai.com>
Co-authored-by: Codex <codex@openai.com>
Co-authored-by: Codex <codex@openai.com>
- Share/unshare no longer close the app info or catalog info dialogs.
- Fold the share form into AppInfoDialog as an inline view instead of a
  separate stacked dialog, so the dialog stays open throughout sharing
  (stacked Material Tailwind dialogs dismissed the one underneath). Removes
  the now-unused ShareAppDialog.
- Rename the app's "Delete" action to "Remove".
- Add tooltips to every button in both info dialogs and keep the two dialogs
  in sync (layout, button styling, wording).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Codex <codex@openai.com>
Apps are versioned on their own terms (package.json, pixi.toml, etc.), so a
manifest-level version is meaningless. Drop it from the AppManifest model and
TS type, the pixi adapter, the info dialog and launch form, and the docs.
Legacy manifests that still include `version:` are accepted but the field is
ignored (Pydantic extra='ignore').

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Codex <codex@openai.com>
Co-authored-by: Codex <codex@openai.com>
Co-authored-by: Codex <codex@openai.com>
Co-authored-by: Codex <codex@openai.com>
Requirement checking moved from submit-time on the server to job runtime
in the execution environment, but the Job Execution and extra_paths
sections still described the old server-side behavior.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
submit_job stopped calling verify_requirements when requirement checking
moved to job runtime (build_requirements_check); the function was left
exported and exercised only by tests. Remove it, its sole helper
(_augmented_path), the now-unused shutil/subprocess/packaging imports,
and the orphaned test class. The conda-binary and version-comparison
cases it covered are exercised by the build_requirements_check tests.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
_find_manifests_in_repo raised on the first adapter conversion failure,
so a single adapter's error could mask a later adapter that would have
handled the repo. Collect all adapter errors, return as soon as any
adapter succeeds (logging the rest), and only raise an aggregated error
when no adapter produced a manifest. Add tests covering the
one-fails-one-succeeds, all-fail-aggregated, and none-handle paths.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
model.py and apps/core.py each defined near-identical requirement-spec
patterns that had to be kept in sync by hand. Make model._REQUIREMENT_PATTERN
the single source of truth (groups: tool, operator, version) and import it
into core. This also lets build_requirements_check read the operator and
version straight from the match, dropping the separate _REQ_OP_PATTERN
split, and tidies merge_requirements to match each spec once.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
published_at comes back from the API as a naive UTC datetime (no tz
marker), but the catalog cards/dialog parsed it with a raw
new Date(...).toLocaleDateString(), which interprets it as local time
and can show the wrong day near midnight. Use the shared formatDateString
helper (already used for job/ticket/link dates), which normalizes naive
strings to UTC before formatting.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The FgButton loading rework stopped rendering loadingText as visible text
and instead exposes it as the spinner overlay's aria-label (role="status"),
keeping the button size constant. The Loading and LoadingWithHref stories
still asserted the text with getByText, so their play functions threw and
Chromatic reported 2 component errors. Query by role/accessible name to
match the current a11y contract.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
krokicki and others added 10 commits June 16, 2026 22:23
TestBuildRequirementsCheck executes the generated requirement-check
snippet through `bash -c` and depends on POSIX tools (grep -oE, sort -V,
arrays) plus chmod-based executable shims. That environment isn't
available/consistent on Windows, so all 12 cases failed on the
windows-latest CI runner. The snippet only ever runs on Linux compute
nodes, so skip the class on win32 (matching test_worker/test_filestore).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add the ability to download all job parameters (parameters, environment,
and cluster settings) as a JSON file and to upload such a file to populate
the launch form. A file may target any subset of the three tabs.

- AppLaunchForm: "Export params" and "Upload params file" buttons next to
  Submit; partial files applied field-by-field
- JobDetail: "Download params" button on /apps/jobs/:id to download the
  parameters used for a completed job
- Add AppLaunchParamsFile type + parseAppLaunchParamsFile validator and a
  shared downloadTextFile util (reused by JobDetail log downloads)

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The generated runtime requirements-check snippet always included all four
bash helper functions (__fg_check_tool, __fg_extract_version, __fg_ver_le,
__fg_check_version) even when the checks never called them. Split the
preamble into per-helper blocks and emit only those the generated checks
reference: version-only helpers are dropped for name-only requirements, and
none are emitted when every requirement is invalid.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Opening a job fired the script, stdout, and stderr file-content queries on
mount, making the first load slow even though the default Parameters tab
needs none of them. Gate each file query on its tab being active via a new
enabled flag on useJobFileQuery, so file content is fetched only when its
tab is viewed (and cached thereafter).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Opening a job was slow because GET /api/jobs/{id} dispatched to the per-user
worker to compute file paths, which globbed the work dir for the script and
stat'd the logs over NFS on every read. The script filename is already
returned by cluster-api at submit time and was simply discarded.

- Add a script_path column to jobs (migration f3a9c41d76e2) and persist the
  value cluster-api returns from submit
- get_job_file_paths now derives paths from work_dir + stored script_path with
  no filesystem access, inferring existence from job state (script once
  submitted, logs once started, service_url only while running)
- read_job_file reads the stored script_path, falling back to a glob for
  legacy jobs
- get_job resolves file info in-process, removing the worker roundtrip; drop
  the now-unused get_job_file_paths worker action

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The job detail endpoint resolved browse links by realpath'ing every file
share mount, three times per request (once per file). On a cold server those
realpaths trigger NFS automounts, making the first job load take ~6s,
synchronously on the async handler.

All of a job's files live under one work dir, so its browse-link base is
resolvable once at submit time in the user-context worker (where mounts are
warm). Persist it and build per-file browse links from the DB by string
concatenation, with no filesystem access on read.

- Add work_dir_fsp_name / work_dir_subpath columns (migration b1c4e7f29a83)
- Worker submit resolves and returns the work dir's browse-link base;
  submit_job persists it
- get_job_file_paths builds browse links from the stored base (drop the
  per-read realpath and the FSP fetch); remove _resolve_browse_path

Legacy jobs (no stored base) show file paths without browse links; their
content tabs still work via the glob fallback.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Keep stored job paths as raw strings when building metadata so POSIX cluster paths are not rewritten with local OS separators on Windows.

Co-authored-by: Codex <codex@openai.com>
core.py had grown to 1701 lines mixing nine concerns behind a flat list
of module-level functions. Split it into four single-purpose modules and
delete core.py (clean break):

- manifest.py  — worker dispatch + git repo cache + manifest discovery/loading
- command.py   — requirement checks + path/param validation + build_command
- jobs.py      — poll loop + job submission/cancel
- jobfiles.py  — work-dir path construction + job file access

Dependencies form a DAG (jobs imports the other three; the rest are
leaves), so there's no circular-import risk. __init__.py re-exports the
same public API, so server.py is unchanged. Repointed user_worker.py
imports, a model.py comment, and test imports/patch targets to the new
modules. Pure reorganization — no behavior change.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Commit 589ba27 ("Fix job file paths on Windows") modified the
job-file-path logic in the old core.py. That logic now lives in
jobfiles.py after the split, so re-apply it there: keep stored job
paths as raw strings (via _stored_work_dir_path / _join_stored_path /
_stored_path_basename) so POSIX cluster paths are not rewritten with
local OS separators on Windows.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@krokicki krokicki changed the title App refactor Apps refactor Jun 18, 2026
Base automatically changed from app-catalog to main June 20, 2026 13:37
@krokicki krokicki merged commit 63ab01c into main Jun 20, 2026
5 checks passed
@krokicki krokicki deleted the app-refactor branch June 20, 2026 13:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant