docs: local observability stack sample, guide, and skill#552

Open
staging-devin-ai-integration[bot] wants to merge 7 commits into devin/1780157741-speech-observability from devin/1780214715-observability-local-stack

staging-devin-ai-integration (bot) commented May 31, 2026

Summary

  • Add samples/observability/ — a docker compose stack (skit + Prometheus + Grafana, optional speech gateway) that auto-provisions both dashboards and ships a traffic generator, so anyone can reproduce the speech-monitoring dashboards locally with two commands. Captures wiring that was previously tribal knowledge: skit pushes via OTLP (--web.enable-otlp-receiver) while the gateway is scraped.
  • Document the plugin / oneshot / gateway metric families in the observability guide (they power the new dashboard rows from #547, "feat(observability): plugin/ML dashboard row + standalone speech-gateway dashboard", but weren't described anywhere) plus a "run the stack locally" section.
  • Add an observability-stack agent skill capturing the validation procedure (verify metric names/labels with PromQL, expected "No data" panels) and the demo-image gotchas.
  • Add a Dockerfile to the speech-gateway example (used by the compose gateway profile; independently useful).

Stacking: this is the capstone of the speech-observability effort. It is based on #547 (dashboards) and merges in #546 (gateway /metrics) so the committed gateway profile actually exposes the gateway_* series the dashboard queries. Intended merge order: #546, #547 → this; retarget to main once those land (the #546/#547 diff then drops out).

Verified locally on v0.5.0-demo:

  • Default stack + ./generate-traffic.sh → real Ogg/Opus TTS and a real Whisper transcript; populates HTTP, engine, oneshot, and the Plugins / ML inference row.
  • --profile gateway + ./generate-traffic.sh --gateway → /metrics scraped, gateway_requests_total etc. populate the Speech Gateway row.

Findings consolidated here (each is a Known gotcha in the sample README + skill), tracked in #553:

  • latest-demo is stale (predates plugin.call.*) → pin a versioned -demo tag.
  • -demo images ship bare .so plugins but the loader wants native/<id>/ bundles → skit/entrypoint.sh reassembles them.
  • Pipeline model_paths must match the model files the image ships. The demo ships only ggml-tiny-q5_1.bin, so the sample's own pipelines use it; the gateway's built-in STT pipeline targets base.en and therefore 500s through the gateway profile until that model is present (documented).

Review & Validation

Notes

[Screenshot: Performance dashboard populated from the local sample]

Link to Devin session: https://staging.itsdev.in/sessions/2abca85782ba45fe90994b8307526aa3
Requested by: @streamer45


Devin Review

Status: 🟢 Reviewed (commit 9ea4f9a)

Instrument the /tts and /stt handlers with request count, latency, in-flight, upstream-call latency, and rejection-reason metrics, and serve them at GET /metrics (ungated by the concurrency semaphore).

Signed-off-by: streamkit-devin <devin@streamkit.dev>
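
For readers unfamiliar with the pattern in the commit above, here is a minimal Go sketch of serving /metrics outside a concurrency semaphore that gates /tts and /stt. It is not the gateway's actual code: the helper names (limit, newMux, statusFor), the placeholder handlers, the non-blocking reject behaviour, and the 503 status are all assumptions made for illustration, and a plain handler stands in for promhttp.Handler() so the example needs only the standard library.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// limit wraps next with a non-blocking concurrency semaphore: when every slot
// is taken, the request is rejected instead of queued. (Assumed behaviour; the
// real gateway may queue or use a different status code.)
func limit(sem chan struct{}, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		select {
		case sem <- struct{}{}:
			defer func() { <-sem }()
			next.ServeHTTP(w, r)
		default:
			http.Error(w, "too many requests in flight", http.StatusServiceUnavailable)
		}
	})
}

// newMux gates /tts and /stt behind the semaphore but leaves /metrics ungated,
// so scrapes still succeed while the gateway is saturated.
func newMux(sem chan struct{}) *http.ServeMux {
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintln(w, "ok") // placeholder body, not a real handler
	})
	mux := http.NewServeMux()
	mux.Handle("/tts", limit(sem, ok))
	mux.Handle("/stt", limit(sem, ok))
	mux.Handle("/metrics", ok) // stand-in for promhttp.Handler()
	return mux
}

// statusFor issues a GET against the mux and returns the response code.
func statusFor(mux http.Handler, path string) int {
	rec := httptest.NewRecorder()
	mux.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, path, nil))
	return rec.Code
}

func main() {
	sem := make(chan struct{}, 1)
	mux := newMux(sem)
	sem <- struct{}{} // simulate saturation: the only slot is already taken
	fmt.Println(statusFor(mux, "/tts"), statusFor(mux, "/metrics")) // prints "503 200"
}
```

The point of the design is that a scrape should never compete with speech traffic for a slot; otherwise the metrics would go missing exactly when the gateway is overloaded.
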
Lets the skit oneshot_pipeline metric split by tts/stt service.

Signed-off-by: streamkit-devin <devin@streamkit.dev>

Handlers return without writing on context.Canceled; default the metric to 499 (client closed) unless a status was written, so cancellations are not counted as successes.

Signed-off-by: streamkit-devin <devin@streamkit.dev>
- Add samples/observability/ compose stack (skit + Prometheus + Grafana,
  optional speech gateway) that auto-provisions both dashboards and ships a
  traffic generator, so the metrics/dashboards can be validated locally.
- Document the plugin/oneshot/gateway metric families and a 'run the stack
  locally' section in the observability guide.
- Add an observability-stack agent skill capturing the validation procedure
  and the demo-image gotchas.
- Add a Dockerfile to the speech-gateway example.

Signed-off-by: streamkit-devin <devin@streamkit.dev>

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

staging-devin-ai-integration (bot) left a comment

Devin Review found 5 potential issues.

Comment on lines +21 to +25
- kind: containers::webm::muxer
  params:
    channels: 1
    sample_rate: 48000
    streaming_mode: live

🟡 TTS sample emits WebM but the STT step sends it as Ogg

generate-traffic.sh writes the TTS response to a.ogg and immediately posts it to the STT pipeline as Content-Type: audio/ogg (samples/observability/generate-traffic.sh:32-41), while the STT sample begins with containers::ogg::demuxer (samples/observability/pipelines/stt-whisper.yml:5-7). The TTS pipeline here actually muxes Opus into WebM, so the STT request cannot be demuxed as Ogg and is ignored by the script via || true, leaving the intended Whisper/STT traffic absent from the observability run.

Suggested change
-  - kind: containers::webm::muxer
-    params:
-      channels: 1
-      sample_rate: 48000
-      streaming_mode: live
+  - kind: containers::ogg::muxer
+    params:
+      channels: 1
+      codec: opus

Good catch — fixed in 732b425. The TTS pipeline now muxes Ogg/Opus (containers::ogg::muxer, codec: opus) with http_output content_type: audio/ogg, matching the gateway's own TTS pipeline. Re-tested the round trip: TTS returns Ogg data, Opus audio, mono 48000 Hz and STT now returns a real transcript instead of being silently dropped, so the Whisper inference traffic actually shows up on the dashboard.
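
A mismatch like this one can be caught before the STT step by checking the container's magic bytes rather than trusting the file extension. The Go sketch below is illustrative only; sniffContainer is a hypothetical helper and not part of generate-traffic.sh. It relies on two well-known signatures: Ogg pages begin with the ASCII bytes "OggS", and WebM/Matroska files begin with the EBML header 0x1A 0x45 0xDF 0xA3.

```go
package main

import (
	"bytes"
	"fmt"
)

// sniffContainer guesses the container format from the leading magic bytes.
func sniffContainer(data []byte) string {
	switch {
	case bytes.HasPrefix(data, []byte("OggS")):
		return "ogg"
	case bytes.HasPrefix(data, []byte{0x1A, 0x45, 0xDF, 0xA3}):
		return "webm/matroska" // EBML header, shared by WebM and Matroska
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(sniffContainer([]byte("OggS\x00\x02")))               // prints "ogg"
	fmt.Println(sniffContainer([]byte{0x1A, 0x45, 0xDF, 0xA3, 0x9F})) // prints "webm/matroska"
}
```

Failing loudly on an unexpected container would have surfaced the problem that the script's `|| true` hid.
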

Comment on lines +10 to +12
- job_name: speech-gateway
  static_configs:
    - targets: ['gateway:8080']

🟡 Prometheus scrapes gateway metrics that the built gateway does not expose

The observability stack configures Prometheus to scrape the optional gateway at gateway:8080, which defaults to /metrics, but the gateway image built by this PR compiles the existing cmd/gateway server (examples/speech-gateway/Dockerfile:7-8) and that server only registers /stt and /tts handlers (examples/speech-gateway/cmd/gateway/main.go:128-130). When the gateway profile is enabled, Prometheus receives 404s for /metrics, so the Speech Gateway dashboard rows relying on gateway_requests_total, gateway_request_duration_seconds, and related series remain empty despite the README/skill promising gateway monitoring.

Prompt for agents
The observability compose stack assumes examples/speech-gateway exposes a Prometheus /metrics endpoint, but the current gateway binary only serves /stt and /tts. Update the gateway implementation or the compose setup so the image used by samples/observability actually provides the gateway_* metrics queried by examples/speech-gateway/grafana-dashboard.json. Relevant places: samples/observability/prometheus.yml scrapes gateway:8080, samples/observability/docker-compose.yml builds ../../examples/speech-gateway, and examples/speech-gateway/cmd/gateway/main.go currently registers only /stt and /tts. A fix should make the profile produce gateway_requests_total, gateway_request_duration_seconds, gateway_rejected_total, etc., or stop advertising/scraping those metrics until the instrumented gateway exists.

Fixed by merging #546 (devin/1780157823-speech-gateway-metrics) into this branch in 75346be, so the gateway image built by the gateway profile now serves GET /metrics via promhttp. Verified end-to-end on the committed stack: gateway_requests_total (e.g. {endpoint="tts",code="200"}) and the other gateway_* series are scraped and populate the Speech Gateway row.

Note: the gateway's built-in STT pipeline targets ggml-base.en-q5_1.bin, which the -demo image doesn't ship (it ships ggml-tiny-q5_1.bin), so STT through the gateway returns 500 until that model is present — that's one of the model-name-mismatch findings tracked in #553 and called out in the sample README. The default direct-to-skit traffic path uses the matching model and does full TTS+STT.

Comment on lines +1 to +9
---
name: observability-stack
description: >-
  Spin up StreamKit's local observability stack (skit + Prometheus + Grafana,
  optional speech gateway) and validate the Grafana dashboards end-to-end. Use
  when testing metrics/dashboards, debugging empty dashboard panels, or
  reproducing the speech-gateway monitoring setup locally.
license: MPL-2.0
---

📝 Info: License-header requirements are satisfied via REUSE annotations for frontmatter/config files

The new SKILL.md starts with YAML frontmatter instead of inline SPDX comments, but this is intentional and covered by REUSE.toml's .agents/skills/**/SKILL.md annotation. Likewise, the new .yml files are covered by the existing **/*.yml configuration-file annotation. I did not flag the missing inline SPDX headers on those files because adding them would either duplicate the configured REUSE coverage or break skill frontmatter parsing.

Comment on lines +9 to +12
scrape_configs:
  - job_name: speech-gateway
    static_configs:
      - targets: ['gateway:8080']

📝 Info: Gateway scrape target is expected to be down without the optional profile

The base docker compose up -d path starts Prometheus even though the gateway service is behind the gateway profile. That means the speech-gateway scrape target can appear down until users run docker compose --profile gateway up -d --build. Because the README explicitly describes the gateway as optional and says its dashboard row stays empty until the profile is enabled, the mere presence of an unavailable target in the base stack is expected rather than a separate bug.

Comment on lines +55 to +59
- |
  set -e
  for f in /in/*.json; do
    sed 's/$${DS_PROMETHEUS}/prometheus/g' "$$f" > "/out/$$(basename "$$f")"
  done

📝 Info: Compose interpolation escaping in dashboard-prep is intentional

The dashboard-prep command uses $${DS_PROMETHEUS} and $$f/$$(...) so Docker Compose leaves literal shell variables and command substitution for the container's /bin/sh. After Compose interpolation, the script rewrites dashboard datasource placeholders from ${DS_PROMETHEUS} to the provisioned Grafana datasource uid prometheus; this escaping matches the documented gotcha in the added README/skill and is not an accidental double-dollar typo.
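
To make the two stages concrete, the Go sketch below models them separately. It is a simplification for illustration: composeUnescape covers only Compose's "$$" to "$" escape rule (real Compose interpolation also substitutes ${VAR} references), and rewriteDatasource stands in for the sed expression the container's shell ends up running. Both function names and the sample dashboard fragment are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// composeUnescape models only the "$$" -> "$" escape rule of Compose
// interpolation; it does not perform variable substitution.
func composeUnescape(s string) string {
	return strings.ReplaceAll(s, "$$", "$")
}

// rewriteDatasource mimics the effect of the sed expression after unescaping:
// every literal ${DS_PROMETHEUS} placeholder becomes the datasource uid.
func rewriteDatasource(dashboard string) string {
	return strings.ReplaceAll(dashboard, "${DS_PROMETHEUS}", "prometheus")
}

func main() {
	// What the container's shell receives once Compose has processed the YAML.
	fmt.Println(composeUnescape(`sed 's/$${DS_PROMETHEUS}/prometheus/g' "$$f"`))
	// prints: sed 's/${DS_PROMETHEUS}/prometheus/g' "$f"

	// What the sed expression then does to a dashboard fragment.
	fmt.Println(rewriteDatasource(`{"datasource": {"uid": "${DS_PROMETHEUS}"}}`))
	// prints: {"datasource": {"uid": "prometheus"}}
}
```
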

Brings the gateway /metrics instrumentation into the branch so the sample's
optional gateway profile exposes the gateway_* series the Speech Gateway
dashboard row queries.

Signed-off-by: streamkit-devin <devin@streamkit.dev>

The TTS pipeline muxed WebM while generate-traffic.sh posts the result to the
STT pipeline as audio/ogg (demuxed by containers::ogg::demuxer), so STT traffic
was silently dropped. Mux Ogg/Opus to match, as the gateway's own TTS pipeline does.

Signed-off-by: streamkit-devin <devin@streamkit.dev>

…r round

Signed-off-by: streamkit-devin <devin@streamkit.dev>