Last updated: November 30, 2025
Confidence: 95%+ (based on 2025 agentic AI patterns research)
OpenHR implements a multi-agent AI architecture using 2025 agentic patterns for autonomous, collaborative task execution. Each agent is a stateless FastAPI microservice with defined roles, tools, and RAG pipelines. Agents communicate via REST or event bus (Phase 2) for orchestration.
Key Patterns Applied:
- Task-Oriented Agents: Single-responsibility for specific workflows (resume parsing, match scoring).
- Event-Driven Collaboration: Agents trigger each other via message bus for complex workflows.
- RAG Integration: All LLM-facing agents use retrieval-augmented generation with vector DB.
- Orchestrator Pattern: Central coordinator for multi-agent workflows (Temporal.io optional).
Profile Enrichment Agent
Role: Parse resumes, enrich profiles from GitHub/LinkedIn, generate bios/summaries.
Architecture:
- Input: Resume PDF/text, GitHub username, LinkedIn profile URL.
- Tools: LLM (GPT-4o/Llama 3), skill extraction (SBERT), GitHub API, resume parser (PyMuPDF).
- Workflow:
  - Parse raw resume → extract structure (experience, skills, education).
  - Normalize skills via taxonomy service.
  - Generate bio summary using RAG (user history + skill taxonomy).
  - Enrich with GitHub stats (stars, forks, contributions).
- Output: Structured profile JSON + LLM-generated bio.
- Latency Target: <5s per profile.
- Cost: $0.01-0.05 per enrichment (optimized).
Implementation: FastAPI endpoint /enrich-profile with async processing queue.
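The parse → normalize → enrich pipeline above can be sketched as plain functions. This is a minimal stand-in: the real agent would call PyMuPDF, the taxonomy service, the GitHub API, and an LLM, so every helper name here is illustrative, not the actual API.

```python
# Sketch of the enrichment workflow; external services are stubbed out.
def parse_resume(text: str) -> dict:
    # Stand-in for PDF parsing + LLM structure extraction.
    sections = {"experience": [], "skills": [], "education": []}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() in sections and value.strip():
            sections[key.strip().lower()] = [v.strip() for v in value.split(",")]
    return sections

def normalize_skills(skills: list[str], taxonomy: dict[str, str]) -> list[str]:
    # Stand-in for the skill taxonomy service (maps aliases to canonical names).
    return sorted({taxonomy.get(s.lower(), s) for s in skills})

def enrich_profile(resume_text: str, github_stats: dict, taxonomy: dict) -> dict:
    parsed = parse_resume(resume_text)
    parsed["skills"] = normalize_skills(parsed["skills"], taxonomy)
    parsed["github"] = github_stats  # stars, forks, contributions
    return parsed
```

In the service itself, enrich_profile would run inside the async processing queue behind the /enrich-profile endpoint rather than inline in the request handler.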
Match Scoring Agent
Role: Compute multi-factor match scores between profiles and startup roles.
Architecture:
- Input: Profile ID, startup role ID, user preferences.
- Tools: FAISS vector search, skill taxonomy, collaborative filtering (ALS), LLM for vision alignment.
- Workflow:
  - Skill Fit (40%): Semantic similarity via SBERT embeddings (skills vector distance).
  - Complementary Skills (20%): Detect gaps in startup role skills, score profile's unique value.
  - Vision Alignment (15%): LLM compares profile bio + startup mission (cosine similarity).
  - Work Style (10%): Questionnaire-based matching (remote preference, timezone, commitment).
  - Location/Timezone (5%): Geographic + timezone compatibility scoring.
  - Equity/Stage (10%): User preferences vs role requirements.
- Output: Match object with score breakdown (explainable).
- Latency Target: <100ms per match computation.
- Scalability: Batch processing for 100K+ profiles.
Implementation: FastAPI /compute-match with caching (Redis for recent scores).
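The weighted breakdown above maps directly to code. A minimal sketch, assuming each factor score is already normalized to [0, 1] (the weights are the ones stated in this section; the function name is illustrative):

```python
# Weights mirror the factor breakdown: skill fit 40%, complementary 20%,
# vision 15%, work style 10%, location/timezone 5%, equity/stage 10%.
WEIGHTS = {
    "skill_fit": 0.40,
    "complementary_skills": 0.20,
    "vision_alignment": 0.15,
    "work_style": 0.10,
    "location_timezone": 0.05,
    "equity_stage": 0.10,
}

def compute_match(factors: dict[str, float]) -> dict:
    """Return the overall score plus per-factor contributions (explainable)."""
    contributions = {
        name: round(WEIGHTS[name] * factors.get(name, 0.0), 4)
        for name in WEIGHTS
    }
    return {
        "score": round(sum(contributions.values()), 4),
        "breakdown": contributions,
    }
```

Returning the per-factor breakdown alongside the score is what makes the Match object explainable downstream (and cacheable in Redis keyed by profile/role pair).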
Recommendation Agent
Role: Generate personalized match recommendations using hybrid filtering.
Architecture:
- Input: User ID, filters (location, role type, stage), cold-start flag.
- Tools: Two-tower neural network, FAISS for content-based, ALS for collaborative.
- Workflow:
  - Cold Start: Content-based only (profile embeddings → FAISS search).
  - Warm Users: Hybrid (FAISS top-1000 → ALS re-rank → contextual filter).
  - Diversity: Ensure role type, industry, location diversity in top-10.
  - Explainability: Generate factor contributions for each recommendation.
- Output: Ranked list of matches with scores and explanations.
- Latency Target: <200ms end-to-end.
- Accuracy Target: 85%+ user satisfaction (A/B tested).
Implementation: FastAPI /recommend-matches with pagination support.
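The cold-start/warm-user branch can be sketched with the heavy components swapped out: FAISS becomes a brute-force cosine search and the ALS model becomes a toy collaborative re-ranker, so only the control flow is real here. All names are assumptions.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def recommend(user_emb, candidates, interactions=None, top_k=3):
    """candidates: {role_id: embedding}; interactions: {role_id: count} or None."""
    # Content-based retrieval (FAISS stand-in) always runs first.
    scored = [(cosine(user_emb, emb), rid) for rid, emb in candidates.items()]
    scored.sort(reverse=True)
    shortlist = [rid for _, rid in scored[: top_k * 10]]
    if not interactions:  # cold start: content-based only
        return shortlist[:top_k]
    # Warm user: re-rank the shortlist by collaborative signal (ALS stand-in).
    shortlist.sort(key=lambda rid: interactions.get(rid, 0), reverse=True)
    return shortlist[:top_k]
```

The key design point survives the simplification: retrieval produces a wide shortlist cheaply, and the expensive collaborative re-rank only ever touches that shortlist, which is what keeps the <200ms target feasible.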
Message Suggestion Agent
Role: Generate personalized icebreaker messages for new matches.
Architecture:
- Input: Match ID (profile + role details).
- Tools: LLM (fine-tuned for professional tone), RAG with match explanation.
- Workflow:
  - Retrieve match details and explanation.
  - Generate 3-5 message variants (professional, enthusiastic, question-based).
  - Ensure GDPR compliance (no PII in prompts).
- Output: Array of suggested messages.
- Latency Target: <2s per suggestion set.
- Conversion Impact: ~60% faster first-message composition.
Implementation: FastAPI /suggest-messages triggered on match acceptance.
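A toy sketch of the variant-generation step, with the PII guard applied before anything reaches a prompt. The redaction patterns and template wording are illustrative assumptions; a production agent would call the fine-tuned LLM with the redacted context instead of formatting strings.

```python
import re

# Illustrative PII patterns; a real GDPR pass would be more thorough.
PII_PATTERNS = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),   # email addresses
    re.compile(r"\+?\d[\d\s().-]{7,}\d"),     # phone-like digit runs
]

def redact(text: str) -> str:
    for pattern in PII_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

def suggest_messages(explanation: str) -> list[str]:
    safe = redact(explanation)  # nothing unredacted enters the prompt
    return [
        f"Hi! I noticed we match on: {safe}. Would love to connect.",
        f"Really excited about this match; {safe} stood out to me!",
        f"Quick question: how did your experience with {safe} come about?",
    ]
```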
Verification Agent
Role: Validate profiles, detect fraud, verify GitHub/LinkedIn authenticity.
Architecture:
- Input: Profile ID for verification request.
- Tools: GitHub API, LinkedIn API, email verification, LLM for anomaly detection.
- Workflow:
  - Verify email ownership.
  - Check GitHub activity (commits > threshold, recent activity).
  - Cross-validate LinkedIn connections/skills.
  - LLM flags suspicious patterns (duplicate profiles, inconsistent data).
- Output: Verification status (verified, pending, flagged) + confidence score.
- Latency Target: <10s per verification.
Implementation: Background job triggered on profile creation/update.
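A rule-based sketch of the verification checks, with API calls replaced by pre-fetched stats. The commit threshold, 90-day activity window, and confidence formula are illustrative assumptions, not production values.

```python
from datetime import datetime, timedelta, timezone

COMMIT_THRESHOLD = 20  # assumed; tune against real fraud data

def verify_profile(email_verified: bool, github_stats: dict) -> dict:
    """Map check results to a (verified | pending | flagged) status."""
    checks = {
        "email": email_verified,
        "commits": github_stats.get("commits", 0) >= COMMIT_THRESHOLD,
        "recent_activity": (
            datetime.now(timezone.utc) - github_stats["last_commit"]
            <= timedelta(days=90)
        ),
    }
    confidence = sum(checks.values()) / len(checks)
    if confidence == 1.0:
        status = "verified"
    elif confidence >= 0.5:
        status = "pending"
    else:
        status = "flagged"
    return {"status": status, "confidence": round(confidence, 2), "checks": checks}
```

Returning the individual checks alongside the status keeps the result auditable when a human reviews flagged profiles.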
Most user interactions trigger one agent (e.g., /enrich-profile → Profile Enrichment Agent).
Complex flows chain agents:
- Onboarding: Profile Enrichment → Skill Normalization → Recommendation Generation.
- Match Discovery: Recommendation Agent → Match Scoring → Message Suggestion.
- Profile Update: Verification Agent → Re-compute matches → Update recommendations.
Orchestrator: Central FastAPI service routes requests and coordinates via async tasks (Celery/RQ).
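The chaining pattern above reduces to a coordinator that awaits each agent in turn and feeds its output forward. In this sketch the agents are placeholder coroutines; in production each step would be a Celery/RQ task or a remote FastAPI call.

```python
import asyncio

# Placeholder agents; each returns the payload enriched with its result.
async def enrich_profile(payload):
    return {**payload, "profile": "enriched"}

async def normalize_skills(payload):
    return {**payload, "skills": "normalized"}

async def generate_recommendations(payload):
    return {**payload, "recommendations": ["match-1", "match-2"]}

# Mirrors the onboarding chain: Enrichment -> Normalization -> Recommendation.
ONBOARDING_CHAIN = [enrich_profile, normalize_skills, generate_recommendations]

async def orchestrate(chain, payload):
    for agent in chain:
        payload = await agent(payload)  # each agent's output feeds the next
    return payload

# Usage: asyncio.run(orchestrate(ONBOARDING_CHAIN, {"user_id": 1}))
```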
- Use Redis Streams or RabbitMQ for agent-to-agent communication.
- Events: profile_updated, new_match, message_sent trigger relevant agents.
- Decouples services for better scalability and fault tolerance.
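The fan-out behavior can be shown with an in-memory stand-in for the Redis Streams / RabbitMQ bus: each event type routes to every agent subscribed to it. Handler names and the event payload shape are illustrative.

```python
from collections import defaultdict

class EventBus:
    """Minimal in-process pub/sub; a production bus would be Redis Streams."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, event_type, handler):
        self.subscribers[event_type].append(handler)

    def publish(self, event_type, payload):
        # Fan out to every subscriber; returns results for demonstration.
        return [handler(payload) for handler in self.subscribers[event_type]]

bus = EventBus()
# profile_updated triggers both verification and match re-scoring.
bus.subscribe("profile_updated", lambda p: f"verify:{p['profile_id']}")
bus.subscribe("profile_updated", lambda p: f"rescore:{p['profile_id']}")
bus.subscribe("new_match", lambda p: f"suggest_messages:{p['match_id']}")
```

Because publishers never name their consumers, a new agent can subscribe to profile_updated without touching the Profile Enrichment service, which is the decoupling benefit noted above.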
Agent Base Class: All agents extend a common Python class with:
- Input validation (Pydantic).
- Logging with correlation IDs.
- Error handling and retries.
- Metrics collection (Prometheus).
Example Structure:

```python
class BaseAgent:
    def __init__(self, config: AgentConfig):
        self.llm = LLMClient(config.llm_model)
        self.vector_db = VectorDBClient(config.vector_endpoint)
        self.taxonomy = TaxonomyService()

    async def execute(self, input_data: dict) -> dict:
        # Validate, process, return
        pass


class ProfileEnrichmentAgent(BaseAgent):
    async def execute(self, resume_data: dict) -> ProfileData:
        # Specific implementation
        pass
```

Deployment: Each agent as separate FastAPI service in Docker/K8s.
All LLM agents use standardized RAG:
- Retrieval: Query vector DB with user query/profile embedding.
- Augmentation: Inject top-K relevant documents (skills, past matches, taxonomy).
- Generation: LLM generates response grounded in retrieved context.
- Validation: Post-process for hallucinations, PII, toxicity.
Vector Sources:
- Skill taxonomy embeddings.
- User interaction history.
- Startup mission statements.
- Industry best practices.
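The retrieve → augment → generate loop above can be sketched end to end, with the vector DB as a brute-force cosine search over in-memory embeddings and the LLM replaced by a prompt-formatting function. The bag-of-words embed() is a toy; real agents use SBERT embeddings and a FAISS index.

```python
import math

def embed(text: str) -> dict[str, int]:
    # Toy bag-of-words "embedding"; stands in for SBERT vectors.
    counts: dict[str, int] = {}
    for token in text.lower().split():
        counts[token] = counts.get(token, 0) + 1
    return counts

def cosine(a: dict, b: dict) -> float:
    dot = sum(a[t] * b.get(t, 0) for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def rag_answer(query: str, documents: list[str], top_k: int = 2) -> str:
    q = embed(query)
    # Retrieval: rank documents by similarity to the query embedding.
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    context = " | ".join(ranked[:top_k])
    # Augmentation + generation: a real agent would send this grounded
    # prompt to the LLM; here we just return it.
    return f"Answer using context: {context} || Question: {query}"
```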
Observability:
- Metrics: Latency, success rate, token usage per agent (Prometheus).
- Tracing: OpenTelemetry for end-to-end request tracing across agents.
- Logging: Structured logs with agent names, input/output hashes.
- Alerts: PagerDuty for agent failures >5% error rate.
Cost Tracking:
- Track LLM token usage per agent and user.
- Alert on cost spikes (>20% MoM).
- Cache common responses (explanations, suggestions).
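Caching common responses is the cheapest cost lever, since repeated explanation or suggestion requests skip the LLM entirely. A toy TTL cache standing in for the Redis layer (TTL value and key scheme are assumptions):

```python
import time

class ResponseCache:
    """In-memory TTL cache; production would use Redis with EXPIRE."""

    def __init__(self, ttl_seconds: float = 3600):
        self.ttl = ttl_seconds
        self.store: dict[str, tuple[float, object]] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, key, compute):
        entry = self.store.get(key)
        if entry and time.monotonic() - entry[0] < self.ttl:
            self.hits += 1          # served from cache, zero token cost
            return entry[1]
        self.misses += 1
        value = compute()           # e.g. an LLM call billed per token
        self.store[key] = (time.monotonic(), value)
        return value
```

The hit/miss counters feed directly into the per-agent cost tracking described above.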
MVP (Phase 1): 3 core agents (Profile, Match, Recommendation) as REST endpoints.
Phase 2: Add event-driven orchestration, Verification + Message Suggestion agents.
Phase 3: Multi-agent crews for complex workflows (e.g., interview prep, negotiation support).
Scaling Strategy:
- Horizontal scaling per agent type.
- Async queues for bursty workloads.
- Model distillation (fine-tune smaller models per agent).
References:
- Azure Architecture Center: AI Agent Orchestration Patterns
- Agent Design Pattern Catalogue (arXiv)
- 5 Agentic AI Design Patterns (Shakudo, 2025)
- Multi-Agent Architectures Survey (arXiv, 2024)