A web scraping system that collects air fryer recipes from popular cooking websites and organizes them in a database. Built with Python, this project automates recipe collection, standardizes recipe formats, and uses AI (OpenAI) to categorize recipes.
- Automated recipe collection from major cooking websites
- Intelligent recipe categorization using AI
- Standardized recipe format and storage
- Nutritional information parsing
- Cooking time calculations
- Database storage and management
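To make the "standardized recipe format" and "cooking time calculations" items more concrete, here is a minimal sketch. It assumes that scraped pages expose prep and cook times as ISO 8601 durations (the convention used by schema.org recipe markup); the `Recipe` class, its field names, and `parse_minutes` are hypothetical and not taken from this project's code.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Matches ISO 8601 durations limited to hours and minutes, e.g. "PT1H30M" or "PT25M".
_DURATION_RE = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?$")


def parse_minutes(duration: Optional[str]) -> int:
    """Convert a duration such as 'PT1H30M' to whole minutes.

    Returns 0 for missing or unrecognized values.
    """
    if not duration:
        return 0
    match = _DURATION_RE.match(duration.strip().upper())
    if match is None:
        return 0
    hours, minutes = match.groups()
    return int(hours or 0) * 60 + int(minutes or 0)


@dataclass
class Recipe:
    """Hypothetical standardized record; field names are illustrative only."""

    title: str
    source_url: str
    ingredients: List[str] = field(default_factory=list)
    instructions: List[str] = field(default_factory=list)
    prep_time: Optional[str] = None  # raw ISO 8601 duration from the page
    cook_time: Optional[str] = None

    @property
    def total_minutes(self) -> int:
        """Prep plus cook time, in minutes."""
        return parse_minutes(self.prep_time) + parse_minutes(self.cook_time)
```

With this sketch, a recipe created with `prep_time="PT10M"` and `cook_time="PT20M"` reports a `total_minutes` of 30. Keeping the raw duration strings and deriving minutes on demand is one possible design; storing integers directly would work as well.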
- Python 3.8+
- BeautifulSoup4 for web scraping
- Selenium for dynamic content
- SQLAlchemy + MySQL for data storage
- FastAPI for API endpoints
- OpenAI for recipe categorization
- Pytest for testing
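For the SQLAlchemy + MySQL item, the database settings described in the setup steps (`DB_USERNAME`, `DB_PASSWORD`, `DB_HOST`, `DB_NAME`) could be combined into a connection URL roughly as follows. This is a sketch under assumptions: the function name `build_database_url` is hypothetical, and the `mysql+pymysql` prefix assumes the PyMySQL driver, which this README does not confirm.

```python
import os
from urllib.parse import quote_plus


def build_database_url(env=None) -> str:
    """Assemble a SQLAlchemy-style MySQL URL from DB_* settings."""
    env = os.environ if env is None else env
    user = quote_plus(env["DB_USERNAME"])
    # Escape characters such as '@' or '/' that would otherwise break the URL.
    password = quote_plus(env["DB_PASSWORD"])
    host = env.get("DB_HOST", "localhost")
    name = env["DB_NAME"]
    # Assumption: the PyMySQL driver; the project may use a different one.
    return f"mysql+pymysql://{user}:{password}@{host}/{name}"
```

The resulting string is the kind of value that can be passed to SQLAlchemy's `create_engine`. Passing a plain dictionary instead of relying on `os.environ` keeps the function easy to test.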
- Python 3.8 or higher
- MySQL Server
- Chrome WebDriver for Selenium
- Clone the repository:
  git clone https://github.com/yourusername/air-fryer-recipe-hub.git
  cd air-fryer-recipe-hub
- Create and activate virtual environment:
  python -m venv venv
  .\venv\Scripts\activate
- Install dependencies:
  pip install -r requirements.txt
- Set up environment variables:
  Create a .env file with:
  DB_USERNAME=your_username
  DB_PASSWORD=your_password
  DB_HOST=localhost
  DB_NAME=airfryer_recipes
  OPENAI_API_KEY=your_openai_key
- Initialize the database:
  python setup_db.py
- Run the scraper:
  python main.py
- AllRecipes
- EatingWell
- SeriousEats
- SimplyRecipes
- SpruceEats
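One way to route a recipe URL to the matching site from the list above is a lookup by hostname. The sketch below is illustrative only: the hostnames are assumptions about where each site lives, and `SUPPORTED_SITES` and `site_for_url` are hypothetical names, not part of this project.

```python
from typing import Optional
from urllib.parse import urlparse

# Assumed hostnames for the supported sites; verify against the real scrapers.
SUPPORTED_SITES = {
    "allrecipes.com": "AllRecipes",
    "eatingwell.com": "EatingWell",
    "seriouseats.com": "SeriousEats",
    "simplyrecipes.com": "SimplyRecipes",
    "thespruceeats.com": "SpruceEats",
}


def site_for_url(url: str) -> Optional[str]:
    """Return the supported site name for a URL, or None if unsupported."""
    host = (urlparse(url).hostname or "").lower()
    # Treat "www.example.com" and "example.com" as the same site.
    if host.startswith("www."):
        host = host[4:]
    return SUPPORTED_SITES.get(host)
```

URLs from other domains, or strings that are not URLs at all, return `None`, so the caller can skip them rather than fail.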
Run tests using pytest:
pytest
app/
├── ai/ # AI integration components
├── db/ # Database models and CRUD operations
└── scrapers/ # Web scraping implementations
    ├── _abstract.py # Abstract base classes
    ├── _utils.py # Utility functions
    └── [sites].py # Site-specific scrapers
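The split between abstract base classes and site-specific scrapers shown in the tree could look roughly like the following. This is a hedged sketch of the general pattern, not the contents of `_abstract.py`: the class names, method names, and the toy page layout are all hypothetical.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class AbstractScraper(ABC):
    """Hypothetical base class; the real _abstract.py may differ."""

    @abstractmethod
    def parse_title(self, page: str) -> str:
        """Extract the recipe title from the page text."""

    @abstractmethod
    def parse_ingredients(self, page: str) -> List[str]:
        """Extract the ingredient lines from the page text."""

    def scrape(self, page: str) -> Dict[str, object]:
        """Template method: combine site-specific parsers into one record."""
        ingredients = [i.strip() for i in self.parse_ingredients(page)]
        return {
            "title": self.parse_title(page).strip(),
            "ingredients": [i for i in ingredients if i],  # drop blank lines
        }


class ExampleSiteScraper(AbstractScraper):
    """Toy scraper for a made-up layout: title on line 1, ingredients after."""

    def parse_title(self, page: str) -> str:
        lines = page.splitlines()
        return lines[0] if lines else ""

    def parse_ingredients(self, page: str) -> List[str]:
        return page.splitlines()[1:]
```

Each site-specific module would then only implement the parsing methods, while shared cleanup stays in the base class. Instantiating `AbstractScraper` directly raises `TypeError`, which guards against a subclass that forgets to implement a parser.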
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit your changes (git commit -m 'Add some AmazingFeature')
- Push to the branch (git push origin feature/AmazingFeature)
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- Recipe websites for providing valuable content
- OpenAI for AI capabilities
- Python community for amazing libraries