Skip to content

feat(metrics): config-driven bounded request labels (adds service to oneshot_pipeline.duration)#545

Open
staging-devin-ai-integration[bot] wants to merge 7 commits into
mainfrom
devin/1780157746-oneshot-service-label
Open

feat(metrics): config-driven bounded request labels (adds service to oneshot_pipeline.duration)#545
staging-devin-ai-integration[bot] wants to merge 7 commits into
mainfrom
devin/1780157746-oneshot-service-label

Conversation

@staging-devin-ai-integration
Copy link
Copy Markdown
Contributor

@staging-devin-ai-integration staging-devin-ai-integration Bot commented May 30, 2026

Summary

  • Adds a general, operator-configurable facility for attaching bounded labels to request metrics, sourced from trusted request headers — instead of hardcoding service names in the Rust binary. A new dimension/value is a skit.toml edit, not a recompile.
  • Config lives at [server.metrics] as a list of request_labels, each { name, header, allowed[], fallback }. The header value is trimmed + lowercased and matched against allowed; anything outside it (or a missing header) collapses to fallback (default "other"), so clients can pick a bucket but never create new series.
  • A reusable resolver (apps/skit/src/metrics_labels.rs) is applied at two consumers to prove it's general: the shared http.server.* middleware (resolves once, stashes in request extensions, skipped when no labels are configured) and oneshot_pipeline.duration (reuses the stashed labels at every record site).
  • Opt-in by default: request_labels is empty out of the box, so no metric gains a label and http.server.* keeps its schema on upgrade. The service / X-StreamKit-Service / {tts,stt} mapping ships as a commented example in samples/skit.toml; the gateway sets the header per endpoint (sibling effort). Declaring any request_labels sets the full list (figment doesn't merge sequences).
  • Hardened config load: invalid/empty names, empty allowlist entries, and names that collide with a built-in key after Prometheus sanitization (http_methodhttp.method) or each other are rejected; fallback is normalized like matched values; built-in reserved keys have one source of truth shared by the emit sites and validation; normalization (and a best-effort validate warning) also run in create_app_state so embedded/MCP/test paths can't silently drift.

Review & Validation

  • Cardinality bounded by config: value emitted only if non-empty and normalized-in-allowed, else fallback (empty allowed ⇒ always fallback).
  • Reserved/duplicate/invalid names and empty allowlist entries fail config load (MetricsConfig::validate).
  • Default is empty/opt-in (metrics_default_is_empty_opt_in); samples/skit.toml parses with the example commented out (sample_config_test).
  • With the sample service block uncommented, oneshot_pipeline.duration and http.server.* gain service ∈ {tts,stt,other}.
  • just lint-skit + cargo test -p streamkit-server (metrics_labels + config + sample_config) pass.

Notes

  • Scope: Rust under apps/skit/ plus the commented example in samples/skit.toml (added with the requester's OK). Dashboard and the gateway header-setting are owned by sibling sessions.
  • Generated config-reference regen (docs/) is owned by the docs sibling and tracked in docs: regenerate config reference for [server.metrics].request_labels #554.
  • role_extractor keeps its own header normalization (auth domain); left uncoupled from the metrics helper deliberately.

Link to Devin session: https://staging.itsdev.in/sessions/b5750eb705c84d388f3af4f5b6d6940b
Requested by: @streamer45


Devin Review

Status Commit
🕐 Outdated 3e84014 (HEAD is 0f08945)

Run Devin Review

Open in Devin Review (Staging)

Source a bounded `service` label ({tts,stt,other}) from the trusted X-StreamKit-Service request header so TTS, STT, and other oneshot requests are distinguishable without leaking arbitrary user-submitted pipeline names into metric cardinality.

Signed-off-by: streamkit-devin <devin@streamkit.dev>
@staging-devin-ai-integration
Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment, CI, and merge conflict monitoring

Copy link
Copy Markdown
Contributor Author

@staging-devin-ai-integration staging-devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 2 potential issues.

Open in Devin Review (Staging)
Debug

Playground

Comment thread apps/skit/src/server/oneshot.rs Outdated
}
self.recorded = true;
let labels = [KeyValue::new("status", status)];
let labels = [KeyValue::new("status", status), KeyValue::new("service", self.service)];
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: Existing aggregate PromQL should continue to work with the added label

The change adds a second label to oneshot_pipeline.duration observations at all metric recording sites in this handler. I did not flag this as a bug because the existing dashboard queries aggregate away all non-grouped labels: the error-rate query uses sum(rate(oneshot_pipeline_duration_count{status="error"}[5m])) and the latency query uses sum(rate(oneshot_pipeline_duration_bucket{status="ok"}[5m])) by (le) in samples/grafana-dashboard.json:164 and samples/grafana-dashboard.json:279, so adding service will increase series cardinality but should not change those aggregate panels' values.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Comment thread apps/skit/src/server/oneshot.rs Outdated
Comment on lines +52 to +57
fn classify_service(header: Option<&str>) -> &'static str {
match header.map(|h| h.trim().to_ascii_lowercase()).as_deref() {
Some("tts") => "tts",
Some("stt") => "stt",
_ => "other",
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: Service header is intentionally low-cardinality and advisory

The new header classifier lowercases and trims the value, allows only tts and stt, and maps missing, invalid UTF-8, empty, or unknown values to other. That means clients can choose the service bucket for observability, but cannot create unbounded metric cardinality. I did not treat the unauthenticated/advisory nature of X-StreamKit-Service as a security bug because the label is only used for metrics and is not fed into authorization or pipeline behavior.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

@codecov
Copy link
Copy Markdown

codecov Bot commented May 30, 2026

Codecov Report

❌ Patch coverage is 95.63492% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.04%. Comparing base (6abe3ba) to head (0f08945).

Files with missing lines Patch % Lines
apps/skit/src/server/mod.rs 77.27% 5 Missing ⚠️
apps/skit/src/server/oneshot.rs 83.33% 5 Missing ⚠️
apps/skit/src/config.rs 99.28% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #545      +/-   ##
==========================================
+ Coverage   79.96%   80.04%   +0.07%     
==========================================
  Files         234      235       +1     
  Lines       68061    68300     +239     
  Branches     1846     1933      +87     
==========================================
+ Hits        54428    54670     +242     
+ Misses      13627    13624       -3     
  Partials        6        6              
Flag Coverage Δ
backend 79.82% <95.63%> (+0.08%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Components Coverage Δ
core 85.37% <ø> (ø)
engine 83.03% <ø> (ø)
api 89.96% <ø> (ø)
nodes 76.89% <ø> (+0.04%) ⬆️
server 80.47% <95.63%> (+0.14%) ⬆️
plugin-native 83.70% <ø> (ø)
plugin-wasm 92.20% <ø> (ø)
ui-services 84.69% <ø> (ø)
ui-components 60.49% <ø> (ø)
Files with missing lines Coverage Δ
apps/skit/src/metrics_labels.rs 100.00% <100.00%> (ø)
apps/skit/src/config.rs 97.88% <99.28%> (+0.34%) ⬆️
apps/skit/src/server/mod.rs 80.46% <77.27%> (-0.15%) ⬇️
apps/skit/src/server/oneshot.rs 88.59% <83.33%> (-0.04%) ⬇️

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Replace the hardcoded {tts,stt,other} allowlist with a general, operator-configurable [server.metrics.request_labels] facility: each label is sourced from a trusted request header, bounded by a configured allowlist, with a fallback (default "other"). A reusable resolver applies it to both oneshot_pipeline.duration and the shared http.server.* request metrics, so new dimensions need only a config edit, not a recompile. Defaults preserve the {tts,stt,other} service label.

Signed-off-by: streamkit-devin <devin@streamkit.dev>
@staging-devin-ai-integration staging-devin-ai-integration Bot changed the title feat(oneshot): add bounded service label to oneshot_pipeline.duration feat(metrics): config-driven bounded request labels (adds service to oneshot_pipeline.duration) May 30, 2026
streamer45 and others added 3 commits May 31, 2026 10:53
…esolution

Address review on request-metric labels:
- Reject configured label names that collide with built-in metric keys (status, http.method, http.route, http.status_code) or duplicate each other, at config load (duplicate Prometheus label keys break scrape).
- Pre-normalize allowlist entries once at config load so the per-request hot path only normalizes the incoming header.
- Resolve labels once in metrics_middleware and stash them in request extensions; the oneshot handler reuses them (falls back to resolving when the layer is absent, e.g. unit tests).
- Record oneshot_pipeline.duration on the routing-task-aborted error arm for consistent error-path coverage.

Signed-off-by: streamkit-devin <devin@streamkit.dev>
- reject empty/invalid label names and empty allowlist entries
- compare label names by Prometheus-sanitized key so http_method collides
  with the built-in http.method
- normalize fallback like matched values; share one normalize() helper
- centralize built-in reserved keys (single source for emit sites + validation)
- normalize metrics config in create_app_state so non-load AppState paths
  (tests, MCP/embedded) match the allowlist
- skip resolve + extension insert when request_labels is empty
- correct default_request_labels doc: declaring labels replaces the list

Signed-off-by: streamkit-devin <devin@streamkit.dev>
load() hard-rejects, but embedded/MCP/test callers build Config directly;
validate there too and log a warning so a bad label can't slip in silently.

Signed-off-by: streamkit-devin <devin@streamkit.dev>
Comment thread apps/skit/src/config.rs Outdated
Comment on lines +238 to +248
fn default_request_labels() -> Vec<RequestLabelConfig> {
// Ships the `service` dimension oneshot dashboards build against. Declaring
// any `request_labels` in config REPLACES this list wholesale (figment does
// not merge sequences), so re-list `service` to keep it alongside new ones.
vec![RequestLabelConfig {
name: "service".to_string(),
header: "X-StreamKit-Service".to_string(),
allowed: vec!["tts".to_string(), "stt".to_string()],
fallback: default_label_fallback(),
}]
}
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about defaults given these are used in a demo setup. They could serve as good commented out example in our skit.toml file though

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed — done in 0f08945. The binary no longer ships an opinionated default: request_labels is now empty by default (opt-in), so no metric gains a label and http.server.* keeps its existing schema on upgrade. The service / X-StreamKit-Service / {tts,stt} mapping now lives as a commented-out example in samples/skit.toml (under a [server.metrics] block) for operators to uncomment. Note this means the oneshot service dashboard contract now depends on shipping that config rather than being on by default.

Per review: don't bake a demo-oriented service label into the binary. Default
to no request labels and document the service/X-StreamKit-Service/{tts,stt}
mapping as a commented example in samples/skit.toml. Avoids a silent
http.server.* schema change on upgrade for operators who never opt in.

Signed-off-by: streamkit-devin <devin@streamkit.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants