🚀 AI Multimodal Workspace: Ada & Gesture Control

This repository features two cutting-edge Python projects that bridge the gap between Human-Computer Interaction (HCI) and Artificial Intelligence. It includes Ada, a real-time voice assistant powered by Gemini, and a Gesture-Based Controller for media management.

🎙️ 1. Ada: Real-Time Voice Assistant

Ada is a sophisticated AI companion built using the Google Gemini 2.5 Flash Native Audio model. Unlike traditional assistants, Ada processes raw audio streams for near-instantaneous responses.

Key Features

Low-Latency Interaction: Utilizes v1beta live connection for real-time dialogue.
VAD & Interruption Handling: Integrated Voice Activity Detection allows you to interrupt the AI naturally, just like a human conversation.
Smart Context: Configured with a witty and helpful personality ("Ada") using the 'Kore' voice profile.

🖐️ 2. Gesture-Based Spotify Controller

A computer vision application that allows you to control your Spotify playback and Windows volume using hand movements. Perfect for hands-free control while working or gaming.

Control Schema

Gesture	Action
Pinch (Index + Thumb)	Dynamic Volume Adjustment (0% - 100%)
Fist (Closed Hand)	Mute Spotify and Lock Controls
German Three (3 Fingers)	Max Volume (100%) and Lock Controls
Full Palm (5 Fingers)	Unlock Controls
Thumb Up (Like)	Lock Current Volume Level

🛠️ Technical Stack

Core: Python 3.10+
AI/ML: Google GenAI SDK (Gemini 2.5 Flash), MediaPipe
Computer Vision: OpenCV
Audio Processing: PyAudio, Pycaw (Windows Core Audio), Struct, Math
Asynchronous I/O: Asyncio

⚙️ Installation & Setup

Clone the Repository:

git clone https://github.com/JustLmr/Ada
cd ai-multimodal-workspace

Install Dependencies:
```
pip install -r requirements.txt
```

Configure Environment Variables:

GEMINI_API_KEY=your_google_gemini_api_key

Run the Application:
```
python main.py
```
(Note: Ensure your microphone and camera are not being used by other applications.)

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.env		.env
Readme.md		Readme.md
hand_controller.py		hand_controller.py
main.py		main.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🚀 AI Multimodal Workspace: Ada & Gesture Control

🎙️ 1. Ada: Real-Time Voice Assistant

Key Features

🖐️ 2. Gesture-Based Spotify Controller

Control Schema

🛠️ Technical Stack

⚙️ Installation & Setup

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

🚀 AI Multimodal Workspace: Ada & Gesture Control

🎙️ 1. Ada: Real-Time Voice Assistant

Key Features

🖐️ 2. Gesture-Based Spotify Controller

Control Schema

🛠️ Technical Stack

⚙️ Installation & Setup

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages