A lightweight CLI tool to convert PDF and Word documents to Markdown, and merge multiple Markdown files together.
No LLMs. No cloud APIs. Zero tokens burned. Just fast, local document conversion.
- Convert PDF to Markdown - preserves headings, lists, tables, and structure
- Convert DOCX to Markdown - maintains document formatting
- Merge Markdown files - combine multiple files with source markers and separators
- Batch processing - convert entire directories at once
- Lightweight - uses Microsoft's markitdown library, no heavy dependencies
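To make the merge feature listed above concrete, here is a minimal sketch of combining Markdown files with a source marker before each file and a separator between them. It is an illustration, not mdg's actual code: the function name `merge_markdown`, the HTML-comment marker, and the `---` separator are all assumptions, and the real output format may differ.

```python
from pathlib import Path


def merge_markdown(paths, separator="\n\n---\n\n"):
    """Join Markdown files, prefixing each with a comment naming its source."""
    sections = []
    for path in map(Path, paths):
        body = path.read_text(encoding="utf-8").strip()
        # Assumed marker format: an HTML comment, which rendered Markdown hides.
        sections.append(f"<!-- Source: {path.name} -->\n\n{body}")
    # Assumed separator: a horizontal rule between files.
    return separator.join(sections) + "\n"
```

Calling `merge_markdown(["file1.md", "file2.md"])` returns the combined text as a string, which can be printed for a preview or written to a file.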
If you've never used Terminal before, don't worry — it's simpler than it looks. Terminal is just a text-based way to give your computer instructions.
- Press Command + Space to open Spotlight Search
- Type Terminal
- Press Enter or click on Terminal.app
A window will open with a blinking cursor. That's your terminal — you're ready to type commands.
- Press the Windows key on your keyboard
- Type cmd or PowerShell
- Press Enter
A window will open with a blinking cursor. That's your terminal — you're ready to type commands.
Tip: You can also right-click the Start button and select "Windows PowerShell" or "Terminal".
- Type a command and press Enter to run it
- Copy/paste works normally (Command+C/V on Mac, Ctrl+C/V on Windows)
- To stop a command that's running, press Ctrl+C
- To clear the screen, type clear (Mac) or cls (Windows) and press Enter
That's it! You're ready to install and use mdg.
You need Python 3.10 or higher installed on your computer.
Check if you have Python:
python --version
If you see "Python 3.10" or higher, you're good. If not:
- Mac: Download from python.org or run brew install python
- Windows: Download from python.org (check "Add Python to PATH" during installation)
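The version requirement can also be checked from inside Python, which helps when several Python installations exist and it is unclear which one the python command runs. This is a small sketch; the helper name `meets_minimum` is illustrative and not part of mdg.

```python
import sys

# Minimum version stated in this README.
MINIMUM = (3, 10)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM):
    """Return True if the given interpreter version is at least `minimum`."""
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old for mdg"
    print(f"Python {current}: {status}")
```

Save it to a file and run it with the same python command you plan to use for installing mdg.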
pipx installs Python CLI tools in isolated environments and makes them globally available.
# Install pipx if you don't have it
python -m pip install --user pipx
python -m pipx ensurepath
# Restart your terminal, then install mdg
pipx install git+https://github.com/futureformed/markdowngetdown.git
# Alternative: install with pip
pip install git+https://github.com/futureformed/markdowngetdown.git
# Alternative: install from source
git clone https://github.com/futureformed/markdowngetdown.git
cd markdowngetdown
pip install .
After installing, close and reopen your terminal, then run:
mdg --help
If you see usage information, you're ready to go!
# Convert a PDF
mdg convert document.pdf
# Convert a Word document
mdg convert document.docx
# Specify output location
mdg convert document.pdf -o output.md
# Convert all documents in a folder
mdg convert ./documents/
# Merge specific files
mdg merge file1.md file2.md file3.md -o combined.md
# Merge all files in a folder
mdg merge ./chapters/ -o book.md
# Preview merged output (prints to screen)
mdg merge file1.md file2.md
Mac/Linux:
mdg convert ~/Documents/meetings/ -o ~/Documents/meetings-md/
mdg merge ~/Documents/meetings-md/ -o ~/Documents/all-meetings.md
Windows:
mdg convert C:\Users\YourName\Documents\meetings\ -o C:\Users\YourName\Documents\meetings-md\
mdg merge C:\Users\YourName\Documents\meetings-md\ -o C:\Users\YourName\Documents\all-meetings.md
mdg convert ./reports/ -o ./reports-md/
mdg merge ./reports-md/Q1.md ./reports-md/Q2.md ./reports-md/Q3.md ./reports-md/Q4.md -o annual-report.md
If your terminal cannot find the mdg command, mdg isn't in your system PATH. Try these steps:
- Close and reopen your terminal — sometimes PATH changes need a fresh terminal
- Check if pipx installed correctly:
pipx list
You should see mdg in the list
- Reinstall with pip instead:
pip install git+https://github.com/futureformed/markdowngetdown.git
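If the steps above do not help, it can be useful to see exactly where your system looks for commands. The sketch below uses Python's standard `shutil.which` to report whether mdg is reachable and, if not, lists the directories on PATH. The helper name `find_command` is illustrative and not part of mdg.

```python
import os
import shutil


def find_command(name, path=None):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name, path=path)


if __name__ == "__main__":
    location = find_command("mdg")
    if location:
        print(f"mdg found at {location}")
    else:
        print("mdg is not on PATH. Directories searched:")
        for directory in os.environ.get("PATH", "").split(os.pathsep):
            print(f"  {directory}")
```

If mdg is missing, compare the printed directories with the location pipx or pip reported during installation.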
If the python command is not recognized, Python isn't installed or isn't in your PATH:
- Mac: Run brew install python or download from python.org
- Windows: Download from python.org and make sure to check "Add Python to PATH" during installation, then restart your terminal
If you get permission errors, try adding --user to pip commands:
pip install --user git+https://github.com/futureformed/markdowngetdown.git
If PowerShell shows an error about execution policy, run this command first:
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
Then try installing again.
# If installed with pipx
pipx uninstall mdg
# If installed with pip
pip uninstall mdg
- Python 3.10 or higher
- macOS, Linux, or Windows
mdg uses Microsoft's markitdown library for document conversion. It extracts text, preserves structure (headings, lists, tables), and outputs clean Markdown. No AI or LLM is involved - it's pure text extraction.
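For readers curious what this looks like in code, here is a rough sketch of a single-file conversion built on markitdown's `MarkItDown().convert(...)` call and its `text_content` result. It is not mdg's source: the function names, the choice to write the output next to the input, and the extension check are assumptions made for illustration.

```python
from pathlib import Path

# Formats this README lists as supported.
SUPPORTED = {".pdf", ".docx"}


def default_output_path(source):
    """Assumed default: same folder and name as the source, with .md extension."""
    return Path(source).with_suffix(".md")


def convert_file(source, output=None):
    """Convert one PDF or DOCX file to Markdown using markitdown."""
    source = Path(source)
    if source.suffix.lower() not in SUPPORTED:
        raise ValueError(f"Unsupported format: {source.suffix or source.name}")
    # Imported here so the helpers above work without markitdown installed.
    from markitdown import MarkItDown

    result = MarkItDown().convert(str(source))
    output = Path(output) if output else default_output_path(source)
    output.write_text(result.text_content, encoding="utf-8")
    return output
```

Under these assumptions, `convert_file("document.pdf")` would write document.md alongside the original and return its path.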
- Scanned/image-based PDFs may not convert well (no OCR)
- Complex formatting may require manual cleanup
- Only supports PDF and DOCX formats currently
MIT
Issues and pull requests welcome at github.com/futureformed/markdowngetdown.