Recreate NotebookLM's AI PPT feature and extend it into a controllable, editable, model-configurable PPT workbench that converts papers, documents, and other materials into beautiful PPT images.
The demo covers uploading doc/L9.md, entering custom requirements, generating and editing the design outline, confirming page designs, generating a 6-slide deck, editing one slide, confirming the replacement, and exporting PDF/PPTX. Model waiting time is fast-forwarded.
NotebookLM's PPT feature is closer to a one-click result generator, with limited visibility into the design process and limited per-slide control. This project turns the workflow into an understandable, editable workbench:
- Visible process: Review the deck outline and page-by-page design notes before image generation
- Per-slide control: Edit any slide independently, generate new versions, revert history, and confirm replacements
- Model control: Configure separate OpenAI-compatible models for text planning, image generation, and image editing
- Local-first config: Manage model connections through local config.yaml or WebUI local API configuration; saved projects and exported files do not include API keys
- Export-ready output: Export generated decks to PDF/PPTX for presentation or further editing
- 🎨 Per-slide image generation: Create an editable outline and page designs before converting them into PPT page images
- 🌐 PPT Workbench: Upload sources, configure model roles, preview slides, edit pages, track history, and export
- 📝 Multi-format parsing: Supports .md/.txt/.pdf/.docx/.pptx input and converts content to Markdown
- ✏️ Full-page image editing: Edit each generated slide independently, revert history, and confirm replacements
- 🔀 Three model roles: Configure prompt_model, image_model, and edit_model separately
- 🖼️ Image result compatibility: Accepts URLs, Markdown image links, data URLs, b64_json, and raw base64
- 💾 Local multi-project persistence: Save multiple PPT projects in the browser, including source content, outline, page designs, generated images, and per-slide edit history
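The image result formats listed above (URL, Markdown image link, data URL, b64_json, raw base64) could be reduced to one internal shape with a small normalizer. The following is a minimal sketch, not the project's actual code: the function name `normalize_image_result`, the `("url", ...)` / `("bytes", ...)` return convention, and the order in which formats are tried are all assumptions made for illustration.

```python
import base64
import binascii
import re

# Hypothetical helpers; the project's real parsing code may differ.
MARKDOWN_IMAGE = re.compile(r"!\[[^\]]*\]\(([^)\s]+)[^)]*\)")
DATA_URL = re.compile(r"^data:image/[\w.+-]+;base64,(.+)$", re.DOTALL)


def normalize_image_result(result):
    """Return ("url", str) or ("bytes", bytes) for a model's image result."""
    # Case 1: a dict carrying b64_json or url (assumed response item shape).
    if isinstance(result, dict):
        if result.get("b64_json"):
            return "bytes", base64.b64decode(result["b64_json"])
        if result.get("url"):
            return "url", result["url"]
        raise ValueError("dict result has neither 'b64_json' nor 'url'")

    text = result.strip()

    # Case 2: a Markdown image link; unwrap the target and keep classifying.
    match = MARKDOWN_IMAGE.search(text)
    if match:
        text = match.group(1)

    # Case 3: a data URL with a base64 payload.
    match = DATA_URL.match(text)
    if match:
        return "bytes", base64.b64decode(match.group(1))

    # Case 4: a plain http(s) URL.
    if text.startswith(("http://", "https://")):
        return "url", text

    # Case 5: raw base64; validate=True rejects characters outside the alphabet.
    try:
        return "bytes", base64.b64decode(text, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValueError("unrecognized image result format") from exc
```

Trying the most specific formats first keeps a Markdown-wrapped data URL from being mistaken for raw base64, and leaves raw base64 as the last resort because it has no distinguishing prefix.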
# Clone the project
git clone <repository-url>
cd OpenNotebookLM-AIPPT
# Configure API keys
cp config.example.yaml config.yaml
# Edit config.yaml and fill in your API keys
Option 1: WebUI Interface (Recommended)
# One-click start for both frontend and backend
./start.sh
After startup, visit:
- 🎨 Frontend: http://localhost:5173
- 📚 API Docs: http://localhost:8000/docs
Option 2: Start Frontend and Backend Separately
# Terminal 1: Start backend
./start-api.sh
# Terminal 2: Start frontend
cd web && npm install && npm run dev
Option 3: Command Line Usage
# Install dependencies
pip install -r requirements.txt
# Basic usage
python main.py -i doc/L9.md -n 5
# Generate prompts only
python main.py -i doc/L9.md -n 5 --prompt-only -o prompts.json
# Generate from prompt file
python main.py --from-prompt prompts.json
AIPPT stores project content and image assets in the current browser profile's IndexedDB, and uses localStorage for the active project id, UI preferences, and local API configuration. Saved project data includes uploaded sources, content settings, design outlines, page designs, generated images, edited versions, and image data needed for export.
Notes:
- Clearing browser site data removes local projects.
- Projects do not automatically sync across browsers or devices.
- API keys belong to local API configuration; they are not written into saved project records and are not included in exported PDF/PPTX files.
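The rule that API keys never reach saved project records or exports can be enforced by removing secret fields before anything is serialized. The sketch below illustrates that idea in Python only; the WebUI stores projects in the browser, so its real implementation is not shown here. The names `SECRET_FIELDS`, `strip_secrets`, and `serialize_project`, and the assumption that `api_key` is the only secret field, are hypothetical.

```python
import copy
import json

# Assumption: api_key is the only field that must never be persisted.
SECRET_FIELDS = {"api_key"}


def strip_secrets(value):
    """Return a copy of value with secret fields removed at any nesting depth."""
    if isinstance(value, dict):
        return {
            key: strip_secrets(item)
            for key, item in value.items()
            if key not in SECRET_FIELDS
        }
    if isinstance(value, list):
        return [strip_secrets(item) for item in value]
    return copy.deepcopy(value)


def serialize_project(project):
    """Serialize a project record for saving or export, without secrets."""
    return json.dumps(strip_secrets(project), ensure_ascii=False, sort_keys=True)
```

Because `strip_secrets` builds a new structure instead of mutating its input, the in-memory configuration keeps its keys for API calls while the serialized record does not contain them.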
- Upload Document: Drag and drop or click to upload a source file in the left panel
- Configure Models: Configure text, image generation, and image editing model roles
- Set Parameters & Requirements: Choose page count, resolution, aspect ratio, language, style, audience, and custom requirements
- Confirm Design: Generate an editable outline, confirm it, then review the generated page designs
- Generate PPT: Generate slide images after page-design confirmation and watch real-time progress
- Preview & Edit: Preview generated slides in the right panel and edit a single page when needed
- Export: Export to PDF or PPTX
The built-in demo source is doc/L9.md. This is a repository-relative path, so a fresh clone can use it directly in the WebUI or CLI examples.
OpenNotebookLM-AIPPT/
├── src/ # Core logic
├── api/ # FastAPI backend
├── web/ # React frontend
├── tests/ # Tests
├── doc/ # Input documents directory
│ └── L9.md # Default demo source
├── config.yaml # Configuration file
├── start.sh # One-click startup script
└── main.py # CLI entry point
All configuration is managed in config.yaml, including:
- API configuration (prompt_model, image_model, edit_model)
- PPT default settings (language, style, page count)
- Timeout and retry settings
See config.example.yaml for detailed configuration examples.
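Before starting a run, it can help to check that all three model roles are fully configured. The following is a minimal sketch that works on an already-parsed configuration dictionary using the api.models layout from this README's example. The function name `validate_model_config` and the assumption that all four fields (`adapter`, `model`, `base_url`, `api_key`) are required for every role are illustrative, not taken from the project.

```python
REQUIRED_ROLES = ("prompt_model", "image_model", "edit_model")
# Assumption: every role needs all four fields shown in the README example.
REQUIRED_FIELDS = ("adapter", "model", "base_url", "api_key")


def validate_model_config(config):
    """Return a list of problems in the api.models section (empty if none)."""
    models = (config.get("api") or {}).get("models") or {}
    problems = []
    for role in REQUIRED_ROLES:
        entry = models.get(role)
        if not isinstance(entry, dict):
            problems.append(f"missing role: {role}")
            continue
        for field in REQUIRED_FIELDS:
            if not entry.get(field):
                problems.append(f"{role}: missing {field}")
    return problems
```

With the third-party PyYAML package installed, the dictionary could be obtained with `yaml.safe_load` on the contents of config.yaml and passed to this function; an empty list means every role has all expected fields.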
api:
  models:
    prompt_model:
      adapter: "openai_chat"
      model: "gpt-4o"
      base_url: "https://api.openai.com/v1"
      api_key: "sk-xxx"
    image_model:
      adapter: "raw_chat_multimodal"
      model: "gpt-image-2"
      base_url: "https://api.example.com/v1"
      api_key: "sk-xxx"
    edit_model:
      adapter: "raw_chat_multimodal"
      model: "gpt-image-2"
      base_url: "https://api.example.com/v1"
      api_key: "sk-xxx"
output/ppt_20241201_123456/
├── source_material.txt # Original input material
├── prompts.json # Generated prompts
├── result.json # Generation result
├── presentation.pdf # Exported PDF
└── images/ # Slide images
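The directory name in the layout above looks like a `ppt_` prefix followed by a date and time. The sketch below shows one way such a per-run directory could be created, assuming the pattern is `ppt_YYYYMMDD_HHMMSS` in local time; that reading of the example name, and the helper names `output_dir_name` and `make_output_dir`, are assumptions rather than the project's confirmed behavior.

```python
from datetime import datetime
from pathlib import Path


def output_dir_name(now):
    """Build a run directory name, assuming the ppt_YYYYMMDD_HHMMSS pattern."""
    return f"ppt_{now:%Y%m%d_%H%M%S}"


def make_output_dir(root, now=None):
    """Create root/<run name>/images and return the run directory path."""
    now = now or datetime.now()
    run_dir = Path(root) / output_dir_name(now)
    # parents=True also creates root and the run directory itself if missing.
    (run_dir / "images").mkdir(parents=True, exist_ok=True)
    return run_dir
```

Keeping the name builder separate from the directory creation makes the naming rule easy to check without touching the filesystem.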
- Upgrade generated PPT images into structured, editable PPT content
- Support region selection for partial slide editing
- Add more provider profile templates
Apache License 2.0
