Skip to content

feat(speech-gateway): expose Prometheus metrics#546

Open
staging-devin-ai-integration[bot] wants to merge 3 commits into
mainfrom
devin/1780157823-speech-gateway-metrics
Open

feat(speech-gateway): expose Prometheus metrics#546
staging-devin-ai-integration[bot] wants to merge 3 commits into
mainfrom
devin/1780157823-speech-gateway-metrics

Conversation

@staging-devin-ai-integration
Copy link
Copy Markdown
Contributor

@staging-devin-ai-integration staging-devin-ai-integration Bot commented May 30, 2026

Summary

Part of the speech-services observability effort. This PR instruments the Go speech gateway; sibling PRs add the Grafana dashboard and docs that build against the metric contract below.

  • Add github.com/prometheus/client_golang and serve GET /metrics via promhttp. The route is registered directly on the mux, so it is not gated by the GATEWAY_MAX_CONCURRENCY semaphore (which is acquired inside the handlers) and stays scrapable when request slots are saturated.
  • Wrap /tts and /stt in a small instrument(endpoint, …) helper that records count/latency/in-flight per endpoint. A statusRecorder captures the status code while still implementing http.Flusher, so proxied responses keep streaming incrementally.
  • Rejection reasons are derived from the final status code (415→bad_content_type, 413→too_large, 502→upstream_error), keeping the existing handler logic untouched.
  • Requests where the handler returns without writing (the context.Canceled client-disconnect path) are recorded as code="499" rather than 200, so aborted requests aren't counted as successes.
  • gateway_upstream_duration_seconds is observed around the client.Do call to the skit backend /api/v1/process.
  • Forward X-StreamKit-Service: {tts|stt} on the upstream /api/v1/process request so the backend's oneshot_pipeline.duration can split by a bounded service label (consumed by sibling Rust PR feat(metrics): config-driven bounded request labels (adds service to oneshot_pipeline.duration) #545 — without this it always reports other).
  • Anchor the example .gitignore to /gateway — the unanchored gateway pattern was also ignoring the cmd/gateway/ source dir.

Frozen metric contract emitted (all carry endpointtts|stt):

  • gateway_requests_total{endpoint,method,code} — counter
  • gateway_request_duration_seconds{endpoint} — histogram (buckets 0.05,0.1,0.25,0.5,1,2.5,5,10,30)
  • gateway_inflight_requests{endpoint} — gauge
  • gateway_upstream_duration_seconds{endpoint} — histogram
  • gateway_rejected_total{endpoint,reason} — counter, reasontoo_large|bad_content_type|upstream_error

Review & Validation

Notes

  • Runtime-checked: 415 (bad CT on tts/stt), 502 (upstream down on tts) emit the expected gateway_rejected_total reasons, and /metrics exposes the full contract.
  • golangci-lint was run with the configured linters except unqueryvet (a SQL-builder plugin requiring a custom golangci-lint build; unrelated to this code) — 0 issues. There is no CI lint job wired for examples/speech-gateway.

Link to Devin session: https://staging.itsdev.in/sessions/0aff3af47014452692b73a1e0acb87a8
Requested by: @streamer45


Devin Review

Status Commit
🕐 Outdated 1fde6cf (HEAD is 458f023)

Run Devin Review

Open in Devin Review (Staging)

Instrument the /tts and /stt handlers with request count, latency, in-flight, upstream-call latency, and rejection-reason metrics, and serve them at GET /metrics (ungated by the concurrency semaphore).

Signed-off-by: streamkit-devin <devin@streamkit.dev>
@staging-devin-ai-integration
Copy link
Copy Markdown
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment, CI, and merge conflict monitoring

Lets the skit oneshot_pipeline metric split by tts/stt service.

Signed-off-by: streamkit-devin <devin@streamkit.dev>
Copy link
Copy Markdown
Contributor Author

@staging-devin-ai-integration staging-devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 3 potential issues.

Open in Devin Review (Staging)
Debug

Playground

Comment on lines +85 to +90
rec := &statusRecorder{ResponseWriter: w, code: http.StatusOK}
start := time.Now()
next(rec, r)

requestDuration.WithLabelValues(endpoint).Observe(time.Since(start).Seconds())
requestsTotal.WithLabelValues(endpoint, r.Method, strconv.Itoa(rec.code)).Inc()
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Canceled requests are counted as HTTP 200s

statusRecorder is initialized with http.StatusOK and instrument always increments gateway_requests_total using that value after the handler returns. The speech handlers intentionally return without writing any response when proxyMultipart fails with context.Canceled (examples/speech-gateway/cmd/gateway/main.go:253-255, examples/speech-gateway/cmd/gateway/main.go:324-326), so a client disconnect during the upstream call or response copy is exported as code="200" even though no successful 200 response was served. This makes the new success/error metrics under-report aborted requests as successes.

Prompt for agents
In examples/speech-gateway/cmd/gateway/metrics.go, avoid defaulting unwritten responses to 200 for metrics. The current statusRecorder starts with code=http.StatusOK, so handlers that return without writing (notably handleSTT/handleTTS on context.Canceled from proxyMultipart) are counted as successful 200 responses. Track whether any header/body was written and either use a distinct label for aborted/unwritten requests, skip the status-code counter for them, or have the handlers write an explicit status before returning when appropriate. Update tests to cover a handler that returns without writing.
Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in 458f023. instrument now records code="499" (nginx's "client closed request") when the handler returns without writing a response, instead of defaulting to 200. This covers the context.Canceled paths in handleSTT/handleTTS, so aborted requests are no longer counted as successes. Added TestInstrumentUnwrittenResponseCountedAsClientClosed to lock in the no-write behavior.

Comment on lines +131 to +135
mux.HandleFunc("/stt", instrument("stt", gw.handleSTT))
mux.HandleFunc("/tts", instrument("tts", gw.handleTTS))
// /metrics is intentionally not gated by the concurrency semaphore so it
// stays scrapable while request slots are saturated.
mux.Handle("/metrics", promhttp.Handler())
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: Instrumentation wraps only STT/TTS, leaving /metrics outside the semaphore as intended

The new route registration wraps /stt and /tts with instrument(...) but registers /metrics directly with promhttp.Handler(). This matches the documented behavior that metrics remain scrapable even when all gateway request slots are saturated, and because the semaphore is still acquired inside handleSTT/handleTTS, the metrics wrapper does not change the concurrency limit itself (examples/speech-gateway/cmd/gateway/main.go:248-249, examples/speech-gateway/cmd/gateway/main.go:279-280).

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Comment on lines +72 to +75
func (s *statusRecorder) Flush() {
if f, ok := s.ResponseWriter.(http.Flusher); ok {
f.Flush()
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

📝 Info: Flusher preservation prevents the wrapper from disabling streaming responses

The proxied response path checks whether the response writer implements http.Flusher before wrapping writes in flushWriter (examples/speech-gateway/cmd/gateway/main.go:405-420). Since statusRecorder implements Flush() by delegating to the underlying writer, wrapping STT/TTS handlers for metrics should still allow streamed upstream audio/transcription responses to flush incrementally rather than buffering until handler completion.

Open in Devin Review (Staging)

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Handlers return without writing on context.Canceled; default the metric to 499 (client closed) unless a status was written, so cancellations are not counted as successes.

Signed-off-by: streamkit-devin <devin@streamkit.dev>
staging-devin-ai-integration Bot pushed a commit that referenced this pull request May 31, 2026
Brings the gateway /metrics instrumentation into the branch so the sample's
optional gateway profile exposes the gateway_* series the Speech Gateway
dashboard row queries.

Signed-off-by: streamkit-devin <devin@streamkit.dev>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants