
Add MiniMax Cloud TTS as speech generation provider #73

Open
octo-patch wants to merge 2 commits into abus-aikorea:main from octo-patch:feature/add-minimax-tts


@octo-patch

Summary

  • Add MiniMax Cloud TTS API integration (speech-2.8-hd and speech-2.8-turbo models) as a new tab under Speech Generation
  • Follow existing three-layer architecture: core engine (abus_tts_minimax.py), Gradio bridge (gradio_tts_minimax.py), UI tab (tab_tts_minimax.py)
  • 12 preset voices with speed control; input can be a subtitle file (SRT/ASS/VTT) or plain text
  • 24 unit tests + 4 integration tests covering API payloads, error handling, voice constants, and config

Changes

| File | Description |
| --- | --- |
| app/abus_tts_minimax.py | Core MiniMax TTS class with generate_audio, request_tts, srt_to_voice, text_to_voice, infer |
| app/gradio_tts_minimax.py | Gradio bridge connecting UI events to TTS engine |
| app/tab_tts_minimax.py | Gradio UI tab with voice/model dropdowns, speed slider, subtitle upload |
| app/abus_config.py | get_minimax_api_key() and minimax_tts_available() helpers |
| app/abus_app_voice.py | Import and register MiniMax tab under Speech Generation |
| tests/test_minimax_tts.py | 24 unit tests |
| tests/test_minimax_integration.py | 4 integration tests (require MINIMAX_API_KEY) |

Configuration

Set the MINIMAX_API_KEY environment variable (or add it to a .env file) to enable MiniMax Cloud TTS.
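
For reviewers who want a picture of the two config helpers, here is a minimal sketch of how they could behave. The function names come from this PR, but the bodies are an assumption and not the code in app/abus_config.py. The optional `env` parameter is hypothetical and exists only to make the helpers easy to test. Loading a .env file into the process environment is assumed to happen elsewhere.

```python
import os
from typing import Mapping, Optional


def get_minimax_api_key(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the MiniMax API key, or None when it is unset or blank.

    `env` is a hypothetical injection point for tests; by default the
    process environment is read.
    """
    source = os.environ if env is None else env
    key = source.get("MINIMAX_API_KEY", "").strip()
    return key or None


def minimax_tts_available(env: Optional[Mapping[str, str]] = None) -> bool:
    """Report whether a usable API key is configured."""
    return get_minimax_api_key(env) is not None
```

Treating a blank value as "not configured" lets a caller skip or disable the tab when the variable is present but empty. Whether the PR does this is not stated here.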

Test plan

  • All 24 unit tests pass (pytest tests/test_minimax_tts.py)
  • Integration tests pass with valid API key
  • MiniMax tab appears under Speech Generation in the UI
  • Text-to-speech synthesis works with different voices and models
  • Subtitle file (SRT) input produces timed audio output
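
To illustrate what "timed audio output" depends on, the sketch below parses SRT cues into start and end offsets in milliseconds. It is a standalone example and not the PR's srt_to_voice implementation; `Cue` and `parse_srt` are hypothetical names introduced here.

```python
import re
from typing import List, NamedTuple


class Cue(NamedTuple):
    start_ms: int
    end_ms: int
    text: str


# Matches an SRT timing line such as "00:00:01,000 --> 00:00:02,500".
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_ms(hours: str, minutes: str, seconds: str, millis: str) -> int:
    """Convert timestamp fields to a millisecond offset."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_srt(content: str) -> List[Cue]:
    """Parse SRT text into cues; blocks without a timing line are skipped."""
    cues: List[Cue] = []
    normalized = content.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        for index, line in enumerate(lines):
            match = _TIMING.search(line)
            if match:
                fields = match.groups()
                # Everything after the timing line is the cue text.
                text = " ".join(part.strip() for part in lines[index + 1:] if part.strip())
                cues.append(Cue(_to_ms(*fields[:4]), _to_ms(*fields[4:]), text))
                break
    return cues
```

Given cues like these, a pipeline could synthesize each cue's text and place the clip at `start_ms`, padding with silence up to the next cue. That placement step is an assumption about the design and is not shown.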

PR Bot and others added 2 commits March 21, 2026 00:54
Integrate MiniMax Cloud TTS API (speech-2.8-hd and speech-2.8-turbo models)
as a new speech generation tab alongside existing Edge-TTS, F5-TTS, CosyVoice,
and Kokoro engines.

- Add MiniMax TTS core engine with 12 preset voices, subtitle and plain text
  support, silence trimming, and stereo conversion pipeline
- Add Gradio bridge and UI tab following existing three-layer architecture
- Add get_minimax_api_key() and minimax_tts_available() config helpers
- Add 24 unit tests and 4 integration tests

Co-Authored-By: Octopus <liyuan851277048@icloud.com>
- Remove invalid sample_rate/bitrate from audio_setting (API rejects them)
- Register MiniMax TTS tab in abus_app_gulliver.py
- Add dependency mocking for test portability
- Add i18n translations for MiniMax and Model keys
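
The audio_setting fix in the second commit can be pictured with a small payload builder. This is a sketch under stated assumptions: only the model names and the audio_setting key appear in this PR. The other field names (voice_setting, voice_id, speed, format), the helper name `build_tts_payload`, and the example voice id are illustrative and have not been checked against the MiniMax API or the PR's request_tts code.

```python
from typing import Any, Dict

# Model names as listed in the PR summary.
SUPPORTED_MODELS = ("speech-2.8-hd", "speech-2.8-turbo")


def build_tts_payload(
    text: str,
    voice_id: str,
    model: str = "speech-2.8-hd",
    speed: float = 1.0,
    audio_format: str = "mp3",
) -> Dict[str, Any]:
    """Build a request body for a TTS call (field names are assumptions)."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    if not text.strip():
        raise ValueError("text must not be empty")
    return {
        "model": model,
        "text": text,
        "voice_setting": {"voice_id": voice_id, "speed": speed},
        # Deliberately no sample_rate or bitrate: per the commit message,
        # the API rejects them in audio_setting.
        "audio_setting": {"format": audio_format},
    }
```

Keeping payload construction in a pure function like this makes it straightforward to assert on the request shape in unit tests without any network access.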