The dottxt package exposes two clients based on the OpenAI Python
SDK:
- DotTxt: synchronous client
- AsyncDotTxt: asynchronous client
Both clients expose the same surface:
- models.list(): list models available for the API key provided
- generate(...): main structured-output method that accepts various output types to constrain the generation and returns parsed data
- chat.completions.create(...): fully OpenAI-compatible chat completions
Both generate and chat.completions.create produce constrained text from an
input and an output type, but generate adds convenience helpers: it accepts
Python types as the output type and returns parsed data.
pip install dottxt

Requires Python 3.10+.
The client takes two arguments. Each is read from the constructor first, then from the environment.
- api_key (str | None): falls back to DOTTXT_API_KEY. Required; the constructor raises ValueError if neither is set.
- base_url (str | None): falls back to DOTTXT_BASE_URL, then to https://api.dottxt.ai/v1.
Any other keyword argument is forwarded to the underlying OpenAI SDK client.
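The fallback order above can be sketched with the standard library alone. This is an illustration of the documented behaviour, not the SDK's own code: resolve_settings is a hypothetical helper, and treating an empty string as "not set" is an assumption.

```python
import os
from typing import Mapping, Optional

DEFAULT_BASE_URL = "https://api.dottxt.ai/v1"


def resolve_settings(
    api_key: Optional[str] = None,
    base_url: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Hypothetical helper mirroring the documented fallback order."""
    env = os.environ if env is None else env
    # Constructor argument wins, then the environment variable.
    key = api_key or env.get("DOTTXT_API_KEY")
    if not key:
        raise ValueError("api_key was not passed and DOTTXT_API_KEY is not set")
    # base_url has one more fallback: the public endpoint.
    url = base_url or env.get("DOTTXT_BASE_URL") or DEFAULT_BASE_URL
    return {"api_key": key, "base_url": url}


# Key taken from the (simulated) environment, URL from the default.
print(resolve_settings(env={"DOTTXT_API_KEY": "from-env"}))
# An explicit argument takes precedence over the environment.
print(resolve_settings(api_key="explicit", env={"DOTTXT_API_KEY": "from-env"}))
```

Passing env explicitly keeps the sketch testable without touching the real process environment.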
DotTxt and AsyncDotTxt take the same arguments and expose the same
surface; the async version just returns awaitables.
from dottxt import DotTxt, AsyncDotTxt
sync_client = DotTxt()
async_client = AsyncDotTxt()

models.list() returns an OpenAI SDK SyncPage[Model] (or AsyncPage[Model] on
AsyncDotTxt). The models are under .data and each has an id property.
from dottxt import DotTxt
client = DotTxt()
page = client.models.list()
for model in page.data:
    print(model.id)
# openai/gpt-oss-20b
# Qwen/Qwen3.5-35B-A3B-FP8

Use the id values as the model= argument of generate(...) or
chat.completions.create(...). You can also set DOTTXT_MODEL to one of them
to use it as a default for the CLI.
The generate method is the main way of generating constrained text. It requires
providing a model ID, a text input and the desired response format.
Parameters:
- model (str): a model ID; it must correspond to one of the models returned by the models.list method
- input (str | list[dict]): the prompt, either a plain string or a list of OpenAI-style messages (see Input)
- response_format (Any): the output type to use to constrain the response (see Output Types)
- temperature (float | None), max_tokens (int | None), seed (int | None): optional arguments to control the generation parameters
- any other keyword is forwarded to the API endpoint. As the client is based on the OpenAI SDK, those keywords must be ones accepted by chat.completions.create
input accepts two types of values:
- a plain string, wrapped as a single user message
- a list of OpenAI-style message dicts, passed through to the API unchanged. See the OpenAI chat messages reference for the supported roles and fields.
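As a rough picture of the two accepted shapes, the sketch below shows one way the normalization could work. to_messages is a hypothetical helper written for this illustration; it is not part of the dottxt package.

```python
from typing import Union


def to_messages(value: Union[str, list]) -> list:
    """Hypothetical helper: normalize `input` into a list of chat messages."""
    if isinstance(value, str):
        # A plain string is wrapped as a single user message.
        return [{"role": "user", "content": value}]
    # A messages list is passed through unchanged.
    return value


print(to_messages("Summarize this incident."))
print(to_messages([{"role": "system", "content": "Be brief."}]))
```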
from typing import Literal
from pydantic import BaseModel
from dottxt import DotTxt
class IncidentSummary(BaseModel):
    severity: Literal["low", "medium", "high"]
    team: str

client = DotTxt()

# Plain string
client.generate(
    model="openai/gpt-oss-20b",
    input="Summarize this incident: checkout errors are blocking purchases.",
    response_format=IncidentSummary,
)

# Messages list (e.g. to add a system prompt or prior turns)
client.generate(
    model="openai/gpt-oss-20b",
    input=[
        {"role": "system", "content": "You are an incident-response assistant."},
        {"role": "user", "content": "Summarize: checkout errors are blocking purchases."},
    ],
    response_format=IncidentSummary,
)

response_format accepts numerous types that evaluate to a JSON Schema:
- a JSON Schema as a str or dict
- a Pydantic BaseModel subclass
- a TypedDict or dataclass
- an Enum class, Literal[...], Union[...] or Optional[...]
- typing containers: list[T], dict[K, V], tuple[...]
- any object exposing to_json() -> str (e.g. Genson)
generate(...) returns decoded JSON (dict, list, str, ...) for all
response formats except Pydantic, for which it returns a validated model
instance.
Raw list instances are not supported — pass list[T] or a JSON Schema object.
Errors:
- dottxt.InvalidOutputError: raised by generate(...) when the completion cannot be parsed into the requested structure. It exposes:
  - model: model identifier used for generation
  - raw_output: raw completion text returned by the model
  - finish_reason: completion finish reason when available (a "length" value is called out in the message as a likely truncation)
  - original_error: underlying json.JSONDecodeError or Pydantic ValidationError
- dottxt.schemas.InvalidSchemaError: raised when response_format cannot be normalized to a JSON Schema object.
- Transport and API errors (rate limits, auth failures, ...) propagate from the OpenAI SDK. See the OpenAI Python error classes.
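Because a "length" finish reason points to truncation, one possible reaction is to retry generate(...) with a larger max_tokens. The sketch below only computes the next budget. next_max_tokens and its ceiling default are assumptions made for this illustration, not part of the dottxt API.

```python
from typing import Optional


def next_max_tokens(
    finish_reason: Optional[str], current: int, ceiling: int = 4096
) -> Optional[int]:
    """Hypothetical helper: pick a larger token budget, or None if a retry will not help."""
    if finish_reason != "length":
        # The output was not truncated, so more tokens would not fix the parse error.
        return None
    if current >= ceiling:
        # Already at the cap chosen for this sketch.
        return None
    return min(current * 2, ceiling)


print(next_max_tokens("length", 256))
print(next_max_tokens("stop", 256))
```

In real code the finish_reason argument would come from the caught error's finish_reason attribute.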
String input with a Pydantic model:
from typing import Literal
from pydantic import BaseModel
from dottxt import DotTxt
class IncidentSummary(BaseModel):
    severity: Literal["low", "medium", "high"]
    team: str

client = DotTxt()
result = client.generate(
    model="openai/gpt-oss-20b",
    input="Summarize this incident: checkout errors are blocking purchases.",
    response_format=IncidentSummary,
)
print(result)  # IncidentSummary(severity='high', team='checkout')
print(result.model_dump())  # {'severity': 'high', 'team': 'checkout'}

Messages input with a JSON Schema dict:
from dottxt import DotTxt
schema = {
    "type": "object",
    "properties": {
        "severity": {"type": "string", "enum": ["low", "medium", "high"]},
        "team": {"type": "string", "maxLength": 32},
    },
    "required": ["severity", "team"],
    "additionalProperties": False,
}

client = DotTxt()
result = client.generate(
    model="openai/gpt-oss-20b",
    input=[
        {"role": "system", "content": "You are an incident-response assistant."},
        {"role": "user", "content": "Summarize: checkout errors are blocking purchases."},
    ],
    response_format=schema,
)
print(result)  # {'severity': 'high', 'team': 'checkout'}

AsyncDotTxt.stream(...) yields PatchEvent objects as the model fills in a
schema-constrained response. It is built on the gateway's stream: "patch"
mode, which emits RFC 6902 JSON Patch operations in schema order, so
downstream work can start the moment a field arrives, without waiting for
the closing brace.
Parameters mirror generate(...):
- model (str)
- input (str | list[dict])
- response_format (Any): any schema input accepted by generate(...)
- temperature, max_tokens, seed, timeout: optional
- extra (dict | None): extra chat-completions body fields
Each PatchEvent carries:
- event.op: the raw RFC 6902 operation ({"op": "add", "path": ..., "value": ...}). The dottxt API only ever emits add ops; dottxt.apply_add(doc, path, value) folds one into an object in place, returning the (possibly new) root.
- event.snapshot: an independent deep copy of the JSON object built up to and including this op
- event.field / event.value: field is the JSON Pointer with the leading / stripped ("intent", "steps/0", "address/city"). value contains the current field content, including empty list [] or dictionary {} values.
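To show how add ops, snapshots and field names relate, here is a stdlib-only sketch that folds RFC 6902 add operations into a document. fold_add is a hypothetical stand-in written for this illustration, not dottxt.apply_add itself, and the ops list is made-up sample data rather than captured gateway output.

```python
import copy
from typing import Any


def _unescape(token: str) -> str:
    # RFC 6901: "~1" encodes "/" and "~0" encodes "~" (decode "~1" first).
    return token.replace("~1", "/").replace("~0", "~")


def fold_add(doc: Any, path: str, value: Any) -> Any:
    """Apply one RFC 6902 "add" op and return the (possibly new) root."""
    value = copy.deepcopy(value)  # never alias the op's own value
    if path == "":
        return value  # whole-document add: the value becomes the new root
    *parents, last = [_unescape(t) for t in path.split("/")[1:]]
    target = doc
    for token in parents:
        target = target[int(token)] if isinstance(target, list) else target[token]
    if isinstance(target, list):
        if last == "-":
            target.append(value)  # "-" means "append to the array"
        else:
            target.insert(int(last), value)
    else:
        target[last] = value
    return doc


# Made-up sample ops, in the shape described above.
ops = [
    {"op": "add", "path": "", "value": {}},
    {"op": "add", "path": "/intent", "value": "billing"},
    {"op": "add", "path": "/steps", "value": []},
    {"op": "add", "path": "/steps/0", "value": "refund"},
]

doc: Any = None
for op in ops:
    doc = fold_add(doc, op["path"], op["value"])
    field = op["path"].removeprefix("/")  # pointer without its leading "/"
    snapshot = copy.deepcopy(doc)  # independent copy of the object so far
    print(repr(field), snapshot)
```

Taking a deep copy for each snapshot keeps earlier snapshots stable while later ops mutate the working document.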
import asyncio
from typing import Literal
from pydantic import BaseModel
from dottxt import AsyncDotTxt
class SupportTicket(BaseModel):
    # Field order = arrival order. Put what unblocks downstream work first.
    intent: Literal["billing", "technical", "account"]
    urgency: Literal["low", "medium", "high", "critical"]
    reply: str

# dispatch_to_queue, page_oncall and send stand for your application's own coroutines.
async def main():
    client = AsyncDotTxt()
    stream = client.stream(
        model="openai/gpt-oss-20b",
        response_format=SupportTicket,
        input="I was charged twice this month, please refund the duplicate.",
    )
    async for event in stream:
        match event.field:
            case "intent":
                asyncio.create_task(dispatch_to_queue(event.value))
            case "urgency" if event.value == "critical":
                asyncio.create_task(page_oncall())
            case "reply":
                await send(event.value)

asyncio.run(main())

The routing decision fires the moment intent arrives, typically tens of
milliseconds in, while reply continues to stream. If you need the full
object so far (e.g. to log progress or hand a partial object to another
service), use event.snapshot.
To drive your own state from the raw ops instead of the snapshot (e.g. to
mirror the object into a store of your own), fold each event.op in with
apply_add:

import asyncio
from dottxt import AsyncDotTxt, apply_add

# SupportTicket is the model defined in the previous example.
async def main():
    client = AsyncDotTxt()
    doc = {}  # the stream's first op is the root seed
    stream = client.stream(
        model="openai/gpt-oss-20b",
        response_format=SupportTicket,
        input="I was charged twice this month, please refund the duplicate.",
    )
    async for event in stream:
        doc = apply_add(doc, event.op["path"], event.op["value"])
        print(doc)

asyncio.run(main())

Errors:
- dottxt.PatchStreamError: raised when the gateway returns a non-200 status. Exposes status_code and body.
If you prefer the standard OpenAI SDK surface, you can call
chat.completions.create(...) directly. The client passes the call through
unchanged and returns the raw chat completion object; parsing and
validation are up to the caller.
For structured output, pass the wrapped OpenAI-style response_format
payload yourself:
from typing import Literal
from pydantic import BaseModel
from dottxt import DotTxt
class IncidentSummary(BaseModel):
    severity: Literal["low", "medium", "high"]
    team: str

client = DotTxt()
completion = client.chat.completions.create(
    model="openai/gpt-oss-20b",
    messages=[
        {
            "role": "user",
            "content": "Summarize this incident: checkout errors are blocking purchases.",
        }
    ],
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "incident_summary",
            "schema": IncidentSummary.model_json_schema(),
        },
    },
)
print(completion.choices[0].message.content)
# {"severity":"high","team":"checkout"}

See the OpenAI chat completions reference.
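Since chat.completions.create leaves parsing and validation to the caller, that step might look like the stdlib-only sketch below. parse_incident is a hypothetical helper for the IncidentSummary shape used in this section. With Pydantic available, validating through the model class would be the more natural route.

```python
import json

ALLOWED_SEVERITIES = {"low", "medium", "high"}


def parse_incident(content: str) -> dict:
    """Hypothetical helper: decode and check the raw message content."""
    data = json.loads(content)  # raises json.JSONDecodeError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    severity = data.get("severity")
    if not isinstance(severity, str) or severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unexpected severity: {severity!r}")
    if not isinstance(data.get("team"), str):
        raise ValueError("team must be a string")
    return data


# The content string matches the example completion shown above.
print(parse_incident('{"severity":"high","team":"checkout"}'))
```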
Both clients hold an underlying HTTP connection pool. Call client.close()
(or await client.close() on AsyncDotTxt) when you're done with it.
Runnable examples live in the examples/ directory:
- generate_pydantic.py: generate with a Pydantic model
- generate_json_schema.py: generate with a JSON Schema string
- list_models.py: list available models
- openai_chat_completions.py: use the OpenAI-compatible chat.completions.create surface
- stream_field_printer.py: minimal stream demo that prints each leaf field and value as it lands
- stream_early_routing.py: route on /intent while /reply is still streaming
- stream_hitl_approval.py: approve a proposed action mid-stream and discard the reply if the operator declines
- stream_fanout.py: fan research tasks out on each /steps/N as the planner emits them