26 changes: 26 additions & 0 deletions README.md
@@ -40,6 +40,7 @@ Define your pipeline once, run it how you need.

- [Project status](#project-status)
- [Use cases](#use-cases)
- [Try it from your terminal](#try-it-from-your-terminal)
- [Core concepts](#core-concepts)
- [Quickstart (Docker)](#quickstart-docker)
- [What's included](#whats-included)
@@ -74,6 +75,31 @@ If you try it and something feels off, please open an issue (or a small PR). For
- **Batch processing** — High-throughput file conversion or offline transcription using the Oneshot HTTP API.
- **Your idea** — Add your own node or plugin and compose it into a pipeline

## Try it from your terminal

Want a glimpse of what StreamKit can do without installing anything? These commands hit a public instance of the [speech gateway example](examples/speech-gateway/) — a thin HTTP front-end over StreamKit oneshot pipelines (Kokoro TTS and Whisper STT).

Text to speech (returns Opus audio, piped straight to `ffplay`):

```bash
curl -d 'Hello from StreamKit' https://tts.streamkit.dev | ffplay -nodisp -autoexit -
```

Review comment (Contributor, Author):

📝 Info: Hosted TTS example relies on gateway content-type fallback

The new hosted TTS examples intentionally omit `-H "Content-Type: text/plain"`. This is still compatible with the existing gateway because `curl -d` sends `application/x-www-form-urlencoded`, and `handleTTS` explicitly treats that content type as `text/plain` before proxying (`examples/speech-gateway/cmd/gateway/main.go:260-264`). I therefore did not flag the missing header as a bug, although adding the explicit header would make the example less dependent on this gateway-specific fallback.

Reply (Contributor, Author):

Intentional: these mirror the exact CLI commands the maintainer shared, so I kept them copy-paste-identical and relied on the gateway's documented `application/x-www-form-urlencoded` → `text/plain` fallback in `handleTTS`. Happy to add the explicit `-H "Content-Type: text/plain"` if we'd rather not depend on the fallback; leaving as-is for now to match the original.
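
For reference, the explicit-header variant discussed in the review thread might look like the sketch below. It is not part of the PR: `tts_say`, `TTS_URL`, and `CURL` are illustrative names, and the only behavior assumed is what the thread describes (the hosted endpoint accepts a plain-text body).

```sh
# Sketch only: tts_say, TTS_URL and CURL are illustrative names, not part of StreamKit.
# Same request as the README example, but the content type is stated explicitly
# instead of relying on the gateway's form-urlencoded fallback.
tts_say() {
  # A custom -H header replaces the Content-Type that curl -d would otherwise send.
  "${CURL:-curl}" -s -d "$1" -H 'Content-Type: text/plain' \
    "${TTS_URL:-https://tts.streamkit.dev}"
}

# Usage (needs network access and ffplay):
#   tts_say 'Hello from StreamKit' | ffplay -nodisp -autoexit -
```

Setting `CURL=echo` prints the arguments instead of sending a request, which is a cheap way to check the command line before hitting the public demo.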

Speech to text (record 5s from your mic with `ffmpeg`, get back JSON — no audio file needed):

```bash
# macOS (CoreAudio); list inputs with: ffmpeg -f avfoundation -list_devices true -i ""
ffmpeg -hide_banner -f avfoundation -i ":0" -t 5 -ac 1 -ar 48000 -c:a libopus -f ogg - | curl -s --data-binary @- -H 'Content-Type: audio/ogg' https://stt.streamkit.dev | jq

# Linux (PulseAudio/PipeWire); use "-f alsa -i default" for ALSA
ffmpeg -hide_banner -f pulse -i default -t 5 -ac 1 -ar 48000 -c:a libopus -f ogg - | curl -s --data-binary @- -H 'Content-Type: audio/ogg' https://stt.streamkit.dev | jq
```

Drop `-t 5` to record until you stop `ffmpeg` with `q`.
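
The two platform-specific commands above could be folded into one helper that picks the capture backend from the OS name. This is a rough sketch under stated assumptions: `mic_input_args`, `transcribe_mic`, and `STT_URL` are hypothetical names, and it only covers the macOS and PulseAudio/PipeWire cases shown above.

```sh
# Sketch only: mic_input_args, transcribe_mic and STT_URL are illustrative names.
# Print the ffmpeg capture arguments for an OS name (defaults to `uname -s`).
mic_input_args() {
  mic_os=${1:-$(uname -s)}
  case "$mic_os" in
    Darwin) echo '-f avfoundation -i :0' ;;   # macOS
    Linux)  echo '-f pulse -i default' ;;     # PulseAudio/PipeWire
    *)      echo "unsupported OS: $mic_os" >&2; return 1 ;;
  esac
}

# Record N seconds (default 5) from the mic and post the Ogg/Opus stream to STT.
transcribe_mic() {
  mic_args=$(mic_input_args) || return 1
  # $mic_args is left unquoted on purpose so it splits into separate arguments.
  ffmpeg -hide_banner $mic_args -t "${1:-5}" -ac 1 -ar 48000 -c:a libopus -f ogg - \
    | curl -s --data-binary @- -H 'Content-Type: audio/ogg' \
        "${STT_URL:-https://stt.streamkit.dev}"
}

# Usage (needs a microphone, ffmpeg, curl and jq):
#   transcribe_mic 5 | jq
```

Keeping the backend choice in its own function makes it easy to add the ALSA variant (`-f alsa -i default`) as another `case` branch.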

> [!NOTE]
> `tts.streamkit.dev` and `stt.streamkit.dev` are a free, best-effort public demo with no SLA — they may be slow, rate-limited, or offline at any time, and usage is monitored for abuse. Don't send anything sensitive. To run your own, see [`examples/speech-gateway/`](examples/speech-gateway/).

## Core concepts

- **Pipelines** are node graphs (DAGs) that process real-time streams and requests.
19 changes: 19 additions & 0 deletions docs/src/content/docs/index.mdx
@@ -110,6 +110,25 @@ Then open [http://localhost:4545](http://localhost:4545) to access the web UI.
> - Follow the [Quick Start](/getting-started/quick-start/) to run your first pipeline end-to-end.
> - Explore the [web UI guide](/guides/web-ui/) and [Creating Pipelines](/guides/creating-pipelines/) when you're ready to build your own graphs.

## Try it from your terminal

No install required — these hit a public instance of the [speech gateway example](https://github.com/streamer45/streamkit/tree/main/examples/speech-gateway), a thin HTTP front-end over StreamKit oneshot pipelines (Kokoro TTS and Whisper STT).

Text to speech (returns Opus audio, piped straight to `ffplay`):

```bash
curl -d 'Hello from StreamKit' https://tts.streamkit.dev | ffplay -nodisp -autoexit -
```

Speech to text — pipe the synthesized audio straight back into STT (no local file needed) and get back newline-delimited JSON:

```bash
curl -sd 'StreamKit transcribes speech' https://tts.streamkit.dev | curl -s --data-binary @- https://stt.streamkit.dev
```
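
Since the STT response is described as newline-delimited JSON, it can be consumed one record at a time. The helper below is a minimal sketch that makes no assumption about the fields in each record; `each_record` is an illustrative name, not something StreamKit ships.

```sh
# Sketch only: each_record is an illustrative helper; record contents are not assumed.
# Read newline-delimited JSON from stdin and print one numbered record per line.
each_record() {
  rec_n=0
  # The `|| [ -n "$rec" ]` keeps a final record that lacks a trailing newline.
  while IFS= read -r rec || [ -n "$rec" ]; do
    [ -n "$rec" ] || continue          # skip blank lines
    rec_n=$((rec_n + 1))
    printf 'record %d: %s\n' "$rec_n" "$rec"
  done
}

# Usage (needs network access):
#   curl -sd 'StreamKit transcribes speech' https://tts.streamkit.dev \
#     | curl -s --data-binary @- https://stt.streamkit.dev | each_record
```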

> [!NOTE]
> `tts.streamkit.dev` and `stt.streamkit.dev` are a free, best-effort public demo with no SLA — they may be slow, rate-limited, or offline at any time, and usage is monitored for abuse. Don't send anything sensitive. To run your own, see [`examples/speech-gateway`](https://github.com/streamer45/streamkit/tree/main/examples/speech-gateway).

## Execution modes

StreamKit supports two pipeline execution modes:
21 changes: 20 additions & 1 deletion examples/speech-gateway/README.md
@@ -8,6 +8,25 @@ SPDX-License-Identifier: MPL-2.0

Thin HTTP gateway that rewrites simple STT/TTS requests into the multipart oneshot format expected by a StreamKit backend.

## Hosted instance

A free, best-effort public instance runs at `https://tts.streamkit.dev` and `https://stt.streamkit.dev`, so you can try the endpoints below without running anything:

Text to speech (returns Opus audio, piped to `ffplay`):

```sh
curl -d 'Hello from StreamKit' https://tts.streamkit.dev | ffplay -nodisp -autoexit -
```

Speech to text — record from your mic with `ffmpeg`, no audio file needed (use `STT_URL=https://stt.streamkit.dev ./stt.sh` for an interactive, cross-platform version):

```sh
# macOS; on Linux use "-f pulse -i default" (PulseAudio/PipeWire) or "-f alsa -i default"
ffmpeg -hide_banner -f avfoundation -i ":0" -t 5 -ac 1 -ar 48000 -c:a libopus -f ogg - | curl -s --data-binary @- -H 'Content-Type: audio/ogg' https://stt.streamkit.dev | jq
```

There is no SLA — it may be slow, rate-limited, or offline at any time, and usage is monitored for abuse. Don't send anything sensitive. Run your own (below) to remove those limits.

## Prereqs

- StreamKit server running locally (default assumed: `http://127.0.0.1:4545`).
@@ -16,7 +35,7 @@ Thin HTTP gateway that rewrites simple STT/TTS requests into the multipart onesh
## Run the gateway

```sh
-cd examples/streamkit-cli-gateway
+cd examples/speech-gateway
go run ./cmd/gateway --listen :8080 --skit-url http://127.0.0.1:4545
```
