diff --git a/.agents/skills/observability-stack/SKILL.md b/.agents/skills/observability-stack/SKILL.md
new file mode 100644
index 000000000..f02a69b0c
--- /dev/null
+++ b/.agents/skills/observability-stack/SKILL.md
@@ -0,0 +1,87 @@
+---
+name: observability-stack
+description: >-
+  Spin up StreamKit's local observability stack (skit + Prometheus + Grafana,
+  optional speech gateway) and validate the Grafana dashboards end-to-end. Use
+  when testing metrics/dashboards, debugging empty dashboard panels, or
+  reproducing the speech-gateway monitoring setup locally.
+license: MPL-2.0
+---
+
+# Observability stack (local)
+
+`samples/observability/` is a `docker compose` stack that runs skit + Prometheus
++ Grafana (and an optional speech gateway), auto-provisioning both bundled
+dashboards. Use it to validate metrics and dashboards without any cloud setup.
+
+## Run it
+
+```bash
+cd samples/observability
+docker compose up -d
+./generate-traffic.sh            # direct-to-skit TTS+STT
+# optional gateway row:
+docker compose --profile gateway up -d --build
+./generate-traffic.sh --gateway
+```
+
+Grafana: <http://localhost:3000> (anonymous admin). Prometheus:
+<http://localhost:9090>. skit: <http://localhost:4545>.
+
+## How metrics flow
+
+- **skit → Prometheus via OTLP push.** Prometheus runs with
+  `--web.enable-otlp-receiver`; skit's `SK_TELEMETRY__OTLP_ENDPOINT` points at
+  `…/api/v1/otlp/v1/metrics`. There is **no scrape job** for skit.
+- **gateway → Prometheus via scrape** of the gateway's `/metrics`.
+
+## Validate dashboards (don't just eyeball)
+
+OTLP renames dotted metrics and appends unit suffixes, so verify the metric
+names/labels the panels query actually exist before trusting a panel:
+
+```bash
+# list all metric names Prometheus knows about
+curl -s localhost:9090/api/v1/label/__name__/values | jq -r '.data[]' | sort
+# run a panel's exact PromQL and count series (0 == panel will be "No data")
+curl -s --data-urlencode 'query=' localhost:9090/api/v1/query \
+  | jq '.data.result | length'
+# inspect a metric's labels
+curl -s 'localhost:9090/api/v1/series?match[]=' | jq
+```
+
+Key name/label facts:
+
+- Plugin metrics: `plugin_call_duration_seconds_*` (unit suffix present),
+  `plugin_calls_total`; labels `plugin_kind`, `op`.
+- `oneshot_pipeline_duration_*` has **no** `_seconds` suffix (no unit set);
+  labels `status`, and `service` only when an `X-StreamKit-Service` header is
+  forwarded by a service-label-aware skit.
+- Gateway: `gateway_requests_total{endpoint,code}`,
+  `gateway_request_duration_seconds`, `gateway_rejected_total{reason}` (only
+  appears after a 413/415/502 actually occurs).
+
+## Expected "No data" (not bugs)
+
+- Plugin failure panels (`plugin_errors_total` etc.) — counters don't exist
+  until a failure happens.
+- Oneshot "by Service" panels — empty unless the skit build emits the `service`
+  label.
+- Video / MoQ / codec panels — only populate when you run those pipelines.
+
+## Gotchas (most-common causes of empty dashboards)
+
+- **`latest-demo` is stale.** Pin a versioned `-demo` tag; `latest-demo` can
+  predate metrics like `plugin.call.duration`, leaving the Plugins row empty.
+- **Demo-image plugin layout.** `-demo` images ship bare `.so` files but the
+  loader wants `plugins/native//` bundles; `skit/entrypoint.sh` reassembles
+  them. Symptom: "no plugins found" / "node kind not found in registry".
+- **Model-name mismatch.** A pipeline's `model_path` must exist in the image's
+  `models/`. The stack's `pipelines/` use the names the `-demo` image ships.
+- **Grafana datasource input.** Committed dashboards use `${DS_PROMETHEUS}`;
+  the `dashboard-prep` step rewrites it to the provisioned uid. In compose
+  command strings, escape it as `$${DS_PROMETHEUS}` so compose doesn't
+  interpolate it.
+- **Local auth.** skit needs `SK_AUTH__MODE=disabled` +
+  `SK_PERMISSIONS__ALLOW_INSECURE_NO_AUTH=true` to start unauthenticated on a
+  non-loopback bind. Local only.
diff --git a/.claude/skills/observability-stack b/.claude/skills/observability-stack
new file mode 120000
index 000000000..288055401
--- /dev/null
+++ b/.claude/skills/observability-stack
@@ -0,0 +1 @@
+../../.agents/skills/observability-stack
\ No newline at end of file
diff --git a/docs/src/content/docs/guides/observability.md b/docs/src/content/docs/guides/observability.md
index 52b44674c..0c6dddf3a 100644
--- a/docs/src/content/docs/guides/observability.md
+++ b/docs/src/content/docs/guides/observability.md
@@ -66,6 +66,44 @@ Import [`samples/grafana-dashboard.json`](https://github.com/streamer45/streamki
 
 ![Grafana Dashboard](/screenshots/grafana_dashboard.png)
 
+### What's measured
+
+Beyond HTTP and engine/node throughput, a few metric families are especially
+useful for speech and ML workloads:
+
+- **Plugin / ML inference** — native plugins emit per-call metrics labelled by
+  `plugin_kind` (e.g. `whisper`, `kokoro`) and `op`: `plugin_call_duration_seconds`
+  (histogram), `plugin_calls_total`, and `plugin_errors_total` /
+  `plugin_timeouts_total` / `plugin_panics_total`. This is where inference
+  latency and failures show up — usually the dominant cost of a speech pipeline.
+- **Oneshot pipelines** — `oneshot_pipeline_duration` (histogram) is labelled by
+  `status` (`ok`/`error`). Because every oneshot request hits the same
+  `POST /api/v1/process` endpoint, splitting TTS vs STT requires a trusted
+  `service` label (sent via the `X-StreamKit-Service` header); without it all
+  oneshot traffic collapses into one series.
+- **Speech gateway** — the [speech gateway example](https://github.com/streamer45/streamkit/tree/main/examples/speech-gateway)
+  exposes Prometheus metrics for the front door it puts in front of skit:
+  per-endpoint request rate/latency (`gateway_requests_total`,
+  `gateway_request_duration_seconds`), in-flight gauge, upstream latency, and
+  rejections by reason (`gateway_rejected_total`).
+
+### Run the full stack locally
+
+To see all of the above on the dashboards without any cloud setup, use the
+[`samples/observability`](https://github.com/streamer45/streamkit/tree/main/samples/observability)
+compose stack — it wires skit (OTLP push) + the gateway (scrape) into Prometheus
+and auto-provisions both dashboards in Grafana:
+
+```bash
+cd samples/observability
+docker compose up -d
+./generate-traffic.sh
+# Grafana: http://localhost:3000
+```
+
+See its README for the wiring details and known gotchas (demo-image tag/plugin
+layout, model-name matching, the Prometheus OTLP receiver, and local auth).
+
 ## Traces (OTLP)
 
 Tracing export is controlled by:
diff --git a/examples/speech-gateway/Dockerfile b/examples/speech-gateway/Dockerfile
new file mode 100644
index 000000000..8c86de183
--- /dev/null
+++ b/examples/speech-gateway/Dockerfile
@@ -0,0 +1,13 @@
+# SPDX-FileCopyrightText: © 2025 StreamKit Contributors
+#
+# SPDX-License-Identifier: MPL-2.0
+
+FROM golang:1.24-bookworm AS build
+WORKDIR /src
+COPY . .
+RUN CGO_ENABLED=0 go build -o /gateway ./cmd/gateway
+
+FROM gcr.io/distroless/static-debian12
+COPY --from=build /gateway /gateway
+EXPOSE 8080
+ENTRYPOINT ["/gateway"]
diff --git a/examples/speech-gateway/README.md b/examples/speech-gateway/README.md
index c86c83915..812148c4d 100644
--- a/examples/speech-gateway/README.md
+++ b/examples/speech-gateway/README.md
@@ -88,3 +88,5 @@ curl http://127.0.0.1:8080/metrics
 ### Grafana dashboard
 
 A ready-made dashboard lives at [`grafana-dashboard.json`](./grafana-dashboard.json). It is self-contained: import it and pick the Prometheus datasource scraping both the gateway and the StreamKit backend. Alongside the gateway metrics above, it includes a per-service split of the backend's `oneshot_pipeline_duration` (via the `service` label: `tts`/`stt`/`other`) and the StreamKit native-plugin inference metrics (`plugin_call_duration_seconds`, `plugin_calls_total`, …) that back the STT/TTS models.
+
+To run the gateway, Prometheus, and Grafana together locally, see [`samples/observability`](../../samples/observability).
diff --git a/samples/observability/README.md b/samples/observability/README.md
new file mode 100644
index 000000000..0c2e663b9
--- /dev/null
+++ b/samples/observability/README.md
@@ -0,0 +1,100 @@
+
+
+# Local observability stack
+
+A `docker compose` stack that runs **skit + Prometheus + Grafana** (and an
+optional **speech gateway**) so you can see StreamKit's metrics on the bundled
+Grafana dashboards locally — no cloud, no manual import.
+
+## Quick start
+
+```bash
+cd samples/observability
+docker compose up -d        # skit + Prometheus + Grafana
+./generate-traffic.sh       # drive ~20 TTS + STT requests through skit
+```
+
+Then open Grafana at <http://localhost:3000> (anonymous admin, no login). Two
+dashboards are auto-provisioned:
+
+- **StreamKit Performance Dashboard** — the repo's main dashboard
+  ([`samples/grafana-dashboard.json`](../grafana-dashboard.json)), including the
+  **Plugins / ML inference** row.
+- **StreamKit Speech Gateway Dashboard** — the gateway/oneshot dashboard
+  ([`examples/speech-gateway/grafana-dashboard.json`](../../examples/speech-gateway/grafana-dashboard.json)).
+
+| Service    | URL                     |
+| ---------- | ----------------------- |
+| Grafana    | <http://localhost:3000> |
+| Prometheus | <http://localhost:9090> |
+| skit API   | <http://localhost:4545> |
+| gateway    | <http://localhost:8080> (gateway profile only) |
+
+## How metrics get to Prometheus
+
+Two different paths, both visible on the dashboards:
+
+- **skit → Prometheus (OTLP push).** skit exports OTLP metrics to Prometheus'
+  native OTLP receiver, which is enabled with `--web.enable-otlp-receiver`.
+  Configured via `SK_TELEMETRY__OTLP_ENDPOINT` pointing at
+  `http://prometheus:9090/api/v1/otlp/v1/metrics`. This feeds the HTTP, engine,
+  oneshot, and **plugin** metrics.
+- **gateway → Prometheus (scrape).** The speech gateway exposes a classic
+  `/metrics` endpoint that Prometheus scrapes (see `prometheus.yml`). This feeds
+  the **Speech Gateway** row.
+
+## Speech Gateway row
+
+The gateway is behind a compose profile because it requires the gateway
+**metrics** instrumentation:
+
+```bash
+docker compose --profile gateway up -d --build
+./generate-traffic.sh --gateway   # route traffic through the gateway
+```
+
+Notes:
+
+- The gateway's `/metrics` endpoint and the `gateway_*` metrics require the
+  metrics-instrumented gateway. The Speech Gateway dashboard row stays empty
+  until those metrics are present and the gateway has served traffic.
+- The gateway's default STT pipeline targets a Whisper model that must exist on
+  the skit it talks to. The bundled `-demo` image ships `ggml-tiny-q5_1.bin`; if
+  the gateway points at a different model, STT through the gateway will fail
+  while TTS still works. The direct-to-skit traffic path (the default
+  `generate-traffic.sh`) avoids this by shipping its own pipelines under
+  `pipelines/`.
+
+## Known gotchas
+
+These are the sharp edges worth knowing when wiring this up yourself:
+
+- **Pin a versioned `-demo` tag.** `latest-demo` can lag behind released
+  versions and predate metrics like `plugin.call.duration`, which leaves the
+  Plugins / ML inference row empty. This stack pins `v0.5.0-demo`.
+- **Demo image plugin layout.** Current `-demo` images ship native plugins as
+  bare `.so` files under `plugins/native/`, but the loader expects directory
+  bundles (`plugins/native//` with a `plugin.yml` + the `.so`). `skit serve`
+  otherwise logs "no plugins found" and pipelines fail with "node kind not
+  found". `skit/entrypoint.sh` reassembles the expected layout at startup from
+  the in-repo manifests (mounted at `/repo-manifests`).
+- **Model names must match.** Pipelines reference model files by path; the file
+  must actually be present in the image/`models/` dir. The pipelines under
+  `pipelines/` use the model names the `-demo` image actually ships.
+- **Local auth override.** skit refuses to start unauthenticated on a
+  non-loopback bind unless you opt in. This stack sets
+  `SK_AUTH__MODE=disabled` + `SK_PERMISSIONS__ALLOW_INSECURE_NO_AUTH=true`.
+  **Local testing only** — never do this on an exposed instance.
+- **Grafana dashboard datasource.** The committed dashboards use a
+  `${DS_PROMETHEUS}` datasource input. The `dashboard-prep` step rewrites it to
+  the provisioned datasource uid so the dashboards load without a manual import.
+
+## Cleanup
+
+```bash
+docker compose --profile gateway down -v
+```
diff --git a/samples/observability/docker-compose.yml b/samples/observability/docker-compose.yml
new file mode 100644
index 000000000..305057bc9
--- /dev/null
+++ b/samples/observability/docker-compose.yml
@@ -0,0 +1,96 @@
+# Local observability stack for StreamKit: skit + Prometheus + Grafana, with an
+# optional speech gateway. See README.md for the walkthrough and known gotchas.
+#
+# Usage:
+#   docker compose up -d                    # skit + Prometheus + Grafana
+#   docker compose --profile gateway up -d  # also build & run the speech gateway
+#
+# Grafana:    http://localhost:3000 (anonymous admin, no login)
+# Prometheus: http://localhost:9090
+# skit API:   http://localhost:4545
+# gateway:    http://localhost:8080 (gateway profile only)
+
+services:
+  skit:
+    image: ghcr.io/streamer45/streamkit:v0.5.0-demo
+    # Pinned to a versioned -demo tag on purpose: `latest-demo` can lag behind
+    # and predate metrics like plugin.call.duration, leaving dashboard rows empty.
+    entrypoint: ["/entrypoint.sh"]
+    environment:
+      SK_AUTH__MODE: disabled
+      SK_PERMISSIONS__ALLOW_INSECURE_NO_AUTH: "true"
+      SK_PLUGINS__DIRECTORY: /opt/streamkit/np
+      SK_TELEMETRY__ENABLE: "true"
+      SK_TELEMETRY__OTLP_ENDPOINT: http://prometheus:9090/api/v1/otlp/v1/metrics
+    volumes:
+      - ./skit/entrypoint.sh:/entrypoint.sh:ro
+      - ../../plugins/native:/repo-manifests:ro
+    ports:
+      - "4545:4545"
+    healthcheck:
+      test: ["CMD", "curl", "-fsS", "http://localhost:4545/healthz"]
+      interval: 5s
+      timeout: 3s
+      retries: 20
+
+  prometheus:
+    image: prom/prometheus:v3.1.0
+    command:
+      - --config.file=/etc/prometheus/prometheus.yml
+      - --web.enable-otlp-receiver
+      - --storage.tsdb.path=/prometheus
+    volumes:
+      - ./prometheus.yml:/etc/prometheus/prometheus.yml:ro
+    ports:
+      - "9090:9090"
+
+  dashboard-prep:
+    image: alpine:3.21
+    # Copies the in-repo dashboards into Grafana's provisioning dir, resolving
+    # the ${DS_PROMETHEUS} template input to the provisioned datasource uid so
+    # the dashboards load without manual import.
+    command:
+      - sh
+      - -c
+      - |
+        set -e
+        for f in /in/*.json; do
+          sed 's/$${DS_PROMETHEUS}/prometheus/g' "$$f" > "/out/$$(basename "$$f")"
+        done
+        echo "prepared dashboards:"; ls -1 /out
+    volumes:
+      - ../../samples/grafana-dashboard.json:/in/streamkit.json:ro
+      - ../../examples/speech-gateway/grafana-dashboard.json:/in/speech-gateway.json:ro
+      - grafana-dashboards:/out
+
+  grafana:
+    image: grafana/grafana:11.4.0
+    environment:
+      GF_AUTH_ANONYMOUS_ENABLED: "true"
+      GF_AUTH_ANONYMOUS_ORG_ROLE: Admin
+      GF_AUTH_DISABLE_LOGIN_FORM: "true"
+      GF_SECURITY_ADMIN_PASSWORD: admin
+    volumes:
+      - ./grafana/provisioning:/etc/grafana/provisioning:ro
+      - grafana-dashboards:/var/lib/grafana/dashboards:ro
+    ports:
+      - "3000:3000"
+    depends_on:
+      - prometheus
+      - dashboard-prep
+
+  gateway:
+    profiles: ["gateway"]
+    build:
+      context: ../../examples/speech-gateway
+    environment:
+      GATEWAY_LISTEN: ":8080"
+      SKIT_URL: http://skit:4545
+    ports:
+      - "8080:8080"
+    depends_on:
+      skit:
+        condition: service_healthy
+
+volumes:
+  grafana-dashboards:
diff --git a/samples/observability/generate-traffic.sh b/samples/observability/generate-traffic.sh
new file mode 100755
index 000000000..d253d88ea
--- /dev/null
+++ b/samples/observability/generate-traffic.sh
@@ -0,0 +1,51 @@
+#!/usr/bin/env bash
+# SPDX-FileCopyrightText: © 2025 StreamKit Contributors
+#
+# SPDX-License-Identifier: MPL-2.0
+
+# Drives TTS + STT traffic so the dashboards have data. By default it calls
+# skit's oneshot /api/v1/process directly (no gateway required). Pass --gateway
+# to route through the speech gateway instead (requires the `gateway` profile),
+# which also populates the Speech Gateway dashboard row.
+set -euo pipefail
+
+ROUNDS="${ROUNDS:-20}"
+SKIT_URL="${SKIT_URL:-http://localhost:4545}"
+GATEWAY_URL="${GATEWAY_URL:-http://localhost:8080}"
+HERE="$(cd "$(dirname "$0")" && pwd)"
+MODE="direct"
+[ "${1:-}" = "--gateway" ] && MODE="gateway"
+
+tmp="$(mktemp -d)"
+trap 'rm -rf "$tmp"' EXIT
+
+echo "mode=$MODE rounds=$ROUNDS"
+for i in $(seq 1 "$ROUNDS"); do
+  text="StreamKit observability sample, round $i: the quick brown fox."
+  if [ "$MODE" = "gateway" ]; then
+    curl -fsS -o "$tmp/a.ogg" -d "$text" "$GATEWAY_URL/tts"
+    # The gateway's built-in STT pipeline targets a Whisper model the -demo
+    # image may not ship, so STT can return 5xx; surface it once, not per round.
+    code=$(curl -s -o /dev/null -w '%{http_code}' --data-binary @"$tmp/a.ogg" -H 'Content-Type: audio/ogg' "$GATEWAY_URL/stt")
+    case "$code" in
+      2*) ;;
+      *) [ -n "${stt_warned:-}" ] || { printf '\nnote: gateway STT -> HTTP %s; its built-in pipeline needs a Whisper model the demo image may not ship (see README / #553). TTS still populates the gateway row.\n' "$code"; stt_warned=1; } ;;
+    esac
+  else
+    printf '%s' "$text" > "$tmp/in.txt"
+    # X-StreamKit-Service lets a service-label-aware skit (see PR #545) split
+    # oneshot metrics by {tts,stt}; older builds simply ignore the header.
+    curl -fsS -o "$tmp/a.ogg" \
+      -H 'X-StreamKit-Service: tts' \
+      -F "config=<$HERE/pipelines/tts-kokoro.yml" \
+      -F "media=@$tmp/in.txt;type=text/plain;filename=media" \
+      "$SKIT_URL/api/v1/process"
+    curl -fsS -o /dev/null \
+      -H 'X-StreamKit-Service: stt' \
+      -F "config=<$HERE/pipelines/stt-whisper.yml" \
+      -F "media=@$tmp/a.ogg;type=audio/ogg;filename=media" \
+      "$SKIT_URL/api/v1/process" || true
+  fi
+  printf '.'
+done
+echo " done"
diff --git a/samples/observability/grafana/provisioning/dashboards/streamkit.yml b/samples/observability/grafana/provisioning/dashboards/streamkit.yml
new file mode 100644
index 000000000..721b3cba6
--- /dev/null
+++ b/samples/observability/grafana/provisioning/dashboards/streamkit.yml
@@ -0,0 +1,9 @@
+apiVersion: 1
+
+providers:
+  - name: streamkit
+    type: file
+    allowUiUpdates: true
+    options:
+      path: /var/lib/grafana/dashboards
+      foldersFromFilesStructure: false
diff --git a/samples/observability/grafana/provisioning/datasources/prometheus.yml b/samples/observability/grafana/provisioning/datasources/prometheus.yml
new file mode 100644
index 000000000..c70b848dd
--- /dev/null
+++ b/samples/observability/grafana/provisioning/datasources/prometheus.yml
@@ -0,0 +1,11 @@
+apiVersion: 1
+
+datasources:
+  - name: Prometheus
+    uid: prometheus
+    type: prometheus
+    access: proxy
+    url: http://prometheus:9090
+    isDefault: true
+    jsonData:
+      timeInterval: 5s
diff --git a/samples/observability/pipelines/stt-whisper.yml b/samples/observability/pipelines/stt-whisper.yml
new file mode 100644
index 000000000..37f6d1d26
--- /dev/null
+++ b/samples/observability/pipelines/stt-whisper.yml
@@ -0,0 +1,27 @@
+name: stt
+description: Speech-to-Text (Whisper) for the observability sample
+mode: oneshot
+steps:
+  - kind: streamkit::http_input
+  - kind: containers::ogg::demuxer
+  - kind: audio::opus::decoder
+  - kind: audio::resampler
+    params:
+      chunk_frames: 960
+      output_frame_size: 960
+      target_sample_rate: 16000
+  - kind: plugin::native::whisper
+    params:
+      model_path: models/ggml-tiny-q5_1.bin
+      language: en
+      vad_model_path: models/silero_vad.onnx
+      vad_threshold: 0.5
+      min_silence_duration_ms: 700
+      max_segment_duration_secs: 30.0
+  - kind: core::json_serialize
+    params:
+      pretty: false
+      newline_delimited: true
+  - kind: streamkit::http_output
+    params:
+      content_type: application/json
diff --git a/samples/observability/pipelines/tts-kokoro.yml b/samples/observability/pipelines/tts-kokoro.yml
new file mode 100644
index 000000000..d5f9fb188
--- /dev/null
+++ b/samples/observability/pipelines/tts-kokoro.yml
@@ -0,0 +1,28 @@
+name: tts
+description: Text-to-Speech (Kokoro) for the observability sample
+mode: oneshot
+steps:
+  - kind: streamkit::http_input
+  - kind: core::text_chunker
+    params:
+      min_length: 10
+  - kind: plugin::native::kokoro
+    params:
+      model_dir: "models/kokoro-multi-lang-v1_1"
+      speaker_id: 0
+      speed: 1.0
+      num_threads: 4
+  - kind: audio::resampler
+    params:
+      chunk_frames: 960
+      output_frame_size: 960
+      target_sample_rate: 48000
+  - kind: audio::opus::encoder
+  - kind: containers::ogg::muxer
+    params:
+      channels: 1
+      codec: opus
+      chunk_size: 32768
+  - kind: streamkit::http_output
+    params:
+      content_type: audio/ogg
diff --git a/samples/observability/prometheus.yml b/samples/observability/prometheus.yml
new file mode 100644
index 000000000..5296e6f4d
--- /dev/null
+++ b/samples/observability/prometheus.yml
@@ -0,0 +1,12 @@
+global:
+  scrape_interval: 5s
+  evaluation_interval: 5s
+
+# skit pushes metrics via OTLP to Prometheus' native OTLP receiver
+# (enabled with --web.enable-otlp-receiver in docker-compose.yml), so skit
+# does not need a scrape_config here. The speech gateway exposes a classic
+# Prometheus /metrics endpoint and is scraped below.
+scrape_configs:
+  - job_name: speech-gateway
+    static_configs:
+      - targets: ['gateway:8080']
diff --git a/samples/observability/skit/entrypoint.sh b/samples/observability/skit/entrypoint.sh
new file mode 100755
index 000000000..7f58ce5a6
--- /dev/null
+++ b/samples/observability/skit/entrypoint.sh
@@ -0,0 +1,31 @@
+#!/bin/sh
+# SPDX-FileCopyrightText: © 2025 StreamKit Contributors
+#
+# SPDX-License-Identifier: MPL-2.0
+
+# The -demo images currently ship native plugins as bare `.so` files under
+# plugins/native/, but the loader expects directory bundles
+# (plugins/native// with a plugin.yml + the .so). Without this, `skit serve`
+# logs "no plugins found" and TTS/STT pipelines fail with "node kind not found".
+# We assemble the expected layout from the repo manifests (mounted at
+# /repo-manifests) plus the .so files baked into the image, then start the
+# server. Tracked upstream; remove once the demo image ships bundles directly.
+set -e
+
+SRC=/opt/streamkit/plugins/native
+DST=/opt/streamkit/np/native
+mkdir -p "$DST"
+
+for manifest in /repo-manifests/*/plugin.yml; do
+  [ -f "$manifest" ] || continue
+  id=$(basename "$(dirname "$manifest")")
+  so=$(awk '/^entrypoint:/{print $2}' "$manifest")
+  if [ -n "$so" ] && [ -f "$SRC/$so" ]; then
+    mkdir -p "$DST/$id"
+    cp "$manifest" "$DST/$id/plugin.yml"
+    cp "$SRC/$so" "$DST/$id/$so"
+    echo "assembled plugin bundle: $id ($so)"
+  fi
+done
+
+exec skit serve