docs: surface hosted speech gateway demo for newcomers #541

Open
staging-devin-ai-integration[bot] wants to merge 4 commits into main from devin/1780144946-surface-speech-gateway-demo

@staging-devin-ai-integration staging-devin-ai-integration Bot commented May 30, 2026

Summary

Surface the hosted speech gateway (tts.streamkit.dev / stt.streamkit.dev) as a zero-install glimpse of StreamKit for newcomers, with a clear best-effort/no-SLA disclaimer.

  • Landing page (docs/.../index.mdx): a "Try it from your terminal" section. TTS pipes audio to ffplay; STT is a self-contained tts | stt chain (no local file, no extra tooling) so the first touch always works.
  • README + examples/speech-gateway/README.md: same TTS one-liner; STT uses an ffmpeg mic-capture one-liner (record → Opus → curl → jq) with macOS + Linux variants, so there's no dependency on a sample audio file. The example points at its existing cross-platform ./stt.sh for an interactive version.
  • Generic disclaimer everywhere: best-effort, no SLA, may be offline/rate-limited, monitored for abuse, don't send anything sensitive.
  • Drive-by: fix a stale cd examples/streamkit-cli-gateway path in the example README.

Docs-only — no code paths touched.

Review & Validation

  • One-liners run as written: TTS + the tts | stt chain verified against the live endpoint here; mic variants confirmed on macOS + Linux.

Notes

Part of a broader speech-services effort; the observability pieces (gateway Prometheus metrics, dashboard rows, oneshot service label) are tracked in separate PRs.

Link to Devin session: https://staging.itsdev.in/sessions/2abca85782ba45fe90994b8307526aa3
Requested by: @streamer45


Devin Review

Status: 🕐 Outdated
Commit: 72d14a6 (HEAD is a1ee443)

Add a 'Try it from your terminal' section (curl-based TTS/STT against
tts.streamkit.dev / stt.streamkit.dev) to the docs landing page and
README so newcomers can see what StreamKit can do without installing
anything. Both carry a generic best-effort/no-SLA disclaimer.

Also note the hosted instance in the speech-gateway example README and
fix a stale 'cd' path there (streamkit-cli-gateway -> speech-gateway).

Signed-off-by: streamkit-devin <devin@streamkit.dev>

@staging-devin-ai-integration staging-devin-ai-integration Bot left a comment

Devin Review found 2 potential issues.

Comment thread docs/src/content/docs/index.mdx Outdated
Speech to text (send any Ogg/Opus file, get back newline-delimited JSON):

```bash
curl --data-binary @sample.ogg https://stt.streamkit.dev
```

Contributor Author

🟡 STT example references a file the reader has not created

The new landing-page STT command uses --data-binary @sample.ogg, but this section is advertised as “No install required” and does not include any preceding step that creates or downloads sample.ogg; the docs’ quick-start only creates that file in a separate guide (docs/src/content/docs/getting-started/quick-start.md:104-116). A reader running the copied command from this page will have curl fail locally with “Failed to open/read local data” before it ever reaches the public demo.

Contributor Author

Valid catch for the landing page specifically — it's framed as "no install required," so @sample.ogg reads like a file the reader should already have, and curl will fail locally before reaching the demo. (The README copy is fine since it uses the in-repo @samples/audio/system/sample.ogg.)

Proposed fix for the landing page, pending the author's call:

  • Make it self-contained by chaining the two endpoints so no local file is needed:
    curl -sd 'StreamKit transcribes speech' https://tts.streamkit.dev | curl -s --data-binary @- https://stt.streamkit.dev
  • Or keep it simple with an obvious placeholder (@your-audio.ogg) and a note that any Ogg/Opus file works.

Holding off until @streamer45 confirms which they'd prefer.

Contributor Author

Fixed in c13dd5d — the landing-page STT example now pipes the TTS output straight into STT, so it's fully self-contained (no local file) and doubles as a demo of both endpoints:

curl -sd 'StreamKit transcribes speech' https://tts.streamkit.dev | curl -s --data-binary @- https://stt.streamkit.dev
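
For anyone consuming the chained command's output from a program rather than reading it in the terminal: the landing page describes the STT response as newline-delimited JSON, so a minimal Go sketch for reading it could look like the following. This is an illustration, not code from this PR. `decodeNDJSON` is a hypothetical helper, and it assumes each non-empty line is a JSON object. The field names are not documented in this thread, so events are decoded into generic maps.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"os"
)

// decodeNDJSON is a hypothetical helper (not part of StreamKit).
// It splits the input on newlines and decodes every non-empty line
// as a JSON object. The STT field names are not assumed here, so each
// event is returned as a generic map.
func decodeNDJSON(data []byte) ([]map[string]any, error) {
	var events []map[string]any
	for i, line := range bytes.Split(data, []byte("\n")) {
		line = bytes.TrimSpace(line)
		if len(line) == 0 {
			continue // tolerate blank lines and a trailing newline
		}
		var event map[string]any
		if err := json.Unmarshal(line, &event); err != nil {
			return events, fmt.Errorf("line %d: %w", i+1, err)
		}
		events = append(events, event)
	}
	return events, nil
}

func main() {
	// Read the whole response from stdin, e.g. the output of the curl chain.
	data, err := io.ReadAll(os.Stdin)
	if err != nil {
		fmt.Fprintln(os.Stderr, "read:", err)
		os.Exit(1)
	}
	events, err := decodeNDJSON(data)
	if err != nil {
		fmt.Fprintln(os.Stderr, "decode:", err)
		os.Exit(1)
	}
	for _, event := range events {
		fmt.Println(event)
	}
}
```

Piping the chained curl command into this program should print one decoded event per line. Buffering the whole response keeps the sketch short; a long-running stream would be better served by decoding incrementally.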

Comment thread README.md
Text to speech (returns Opus audio, piped straight to `ffplay`):

```bash
curl -d 'Hello from StreamKit' https://tts.streamkit.dev | ffplay -nodisp -autoexit -
```

Contributor Author

📝 Info: Hosted TTS example relies on gateway content-type fallback

The new hosted TTS examples intentionally omit -H "Content-Type: text/plain". This is still compatible with the existing gateway because curl -d sends application/x-www-form-urlencoded, and handleTTS explicitly treats that content type as text/plain before proxying (examples/speech-gateway/cmd/gateway/main.go:260-264). I therefore did not flag the missing header as a bug, although adding the explicit header would make the example less dependent on this gateway-specific fallback.
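
To make the fallback concrete, here is a minimal Go sketch of the kind of normalization described above. It is not the code in examples/speech-gateway/cmd/gateway/main.go; `normalizeTTSContentType` is a hypothetical name, and the only behavior taken from this thread is that application/x-www-form-urlencoded is treated as text/plain.

```go
package main

import (
	"fmt"
	"mime"
)

// normalizeTTSContentType is a hypothetical helper illustrating the
// fallback discussed in this review. It is NOT the gateway's actual code.
// It strips parameters such as charset and maps curl -d's default
// content type onto text/plain.
func normalizeTTSContentType(header string) string {
	mediaType, _, err := mime.ParseMediaType(header)
	if err != nil {
		// Empty or malformed header: pass it through unchanged.
		return header
	}
	if mediaType == "application/x-www-form-urlencoded" {
		return "text/plain"
	}
	return mediaType
}

func main() {
	headers := []string{
		"application/x-www-form-urlencoded", // what curl -d sends by default
		"text/plain; charset=utf-8",
		"audio/ogg",
	}
	for _, h := range headers {
		fmt.Printf("%q -> %q\n", h, normalizeTTSContentType(h))
	}
}
```

With an explicit -H "Content-Type: text/plain" on the curl command, the request would not need this branch at all, which is the trade-off noted above.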

Contributor Author

Intentional: these mirror the exact CLI commands the maintainer shared, so I kept them copy-paste-identical and relied on the gateway's documented application/x-www-form-urlencoded → text/plain fallback in handleTTS. Happy to add the explicit -H "Content-Type: text/plain" if we'd rather not depend on the fallback; leaving as-is for now to match the original.

streamkit-devin and others added 3 commits May 30, 2026 15:30
The landing page is framed as 'no install required', but the STT
example referenced a local @sample.ogg the reader never creates. Pipe
the TTS output straight into STT so the command runs as-is and also
showcases both endpoints at once.

Signed-off-by: streamkit-devin <devin@streamkit.dev>
Replace the file-based STT example (which referenced a sample.ogg the
reader doesn't have) with a self-contained ffmpeg mic-capture oneliner
that streams Opus straight to the hosted endpoint and pretty-prints the
transcript with jq.

Signed-off-by: streamkit-devin <devin@streamkit.dev>