Conversation AI Hackathon ($10k Prize)
Prize Pool: $10,000 USD – awarded to the top solutions
Who Can Enter: Open worldwide – software engineers, AI researchers, indie developers
Submission Deadline: May 10, 2026
Format: Remote/online – submit via GitHub plus a video demo
Judging: Proxa Labs engineering and product teams
Bonus Award: An invitation to consult on the continued build-out of the platform, plus public recognition and a portfolio feature in Proxa Labs marketing
Background and Problem Statement
Proxa Echo is an AI-powered roleplay and certification platform built for pharmaceutical and life sciences sales organizations. It enables sales representatives to practice high-stakes conversations with simulated healthcare professionals (HCPs) — building confidence, clinical credibility, and compliance readiness before ever entering the field.
The core experience requires a lifelike, conversational AI avatar — one that can listen, respond in real time, and simulate a realistic face-to-face interaction. We are currently evaluating alternatives to our existing third-party avatar provider due to three critical limitations:
· Token costs at scale are prohibitive for enterprise deployment
· The API was designed for one-way video generation (marketing, training), not real-time conversational AI
· Reliance on a closed third-party platform creates unacceptable product risk — we have no control over pricing changes, API deprecations, or feature direction
We explored available open-source options and found none that meet the requirements of low-latency, conversational, avatar-based interaction. That gap is the opportunity this hackathon is designed to solve.
What We’re Looking For
We are looking for working software — not concepts or mockups — that demonstrates a real-time, conversational AI avatar capable of being integrated into a web-based application. The solution should be open-source or licensable by Proxa Labs for commercial use.
Core Requirements
· Real-time Lip Sync: Avatar speech must be synchronized to audio in real time; latency from audio input to visible lip movement should be under 300 ms for a natural conversational feel.
· Natural Language Input: The system must accept voice input from the user, transcribe it, and pass it to an LLM (e.g., Claude, GPT-4) to generate the avatar's response.
· LLM Integration: Must support the Anthropic Claude or OpenAI GPT-4 APIs to power the avatar's dialogue and personality.
· Expressive Facial Animation: The avatar should display natural head movement, eye movement, and basic emotional expression (neutral, engaged, skeptical, positive).
· Web-Based Delivery: The solution must run in a modern browser (Chrome/Safari) without requiring local software installation.
· Persona Configuration: Ability to configure the avatar's name, role, personality traits, and dialogue behavior via a system prompt or config file.
· Session State Management: The avatar should maintain conversational context across a multi-turn session (minimum 10–15 exchanges).
· Mobile Browser Support: iOS Safari and Android Chrome.
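The Natural Language Input and Session State requirements together describe a capture → transcribe → LLM → speak loop. Below is a minimal Python sketch of the session-state portion only; the class name `AvatarSession` and its methods are illustrative assumptions, not a required interface, and the actual STT/LLM/TTS calls are left out:

```python
class AvatarSession:
    """Maintains multi-turn conversational context for one roleplay session."""

    def __init__(self, system_prompt: str, max_turns: int = 15):
        self.system_prompt = system_prompt   # the HCP persona definition
        self.history: list[dict] = []        # alternating user/assistant turns
        self.max_turns = max_turns

    def add_user_turn(self, transcript: str) -> list[dict]:
        """Append the rep's transcribed utterance and build the LLM payload."""
        self.history.append({"role": "user", "content": transcript})
        # Keep only the most recent exchanges so long sessions stay in context.
        trimmed = self.history[-2 * self.max_turns:]
        return [{"role": "system", "content": self.system_prompt}, *trimmed]

    def add_avatar_turn(self, reply: str) -> None:
        """Record the avatar's spoken response so later turns can reference it."""
        self.history.append({"role": "assistant", "content": reply})
```

In this shape, each browser utterance maps to `add_user_turn` → one LLM call → `add_avatar_turn` → TTS → lip-sync rendering.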
Bonus Features (Score Higher)
Solutions that include any of the following will receive additional scoring consideration:
· Voice-to-voice with less than 500ms end-to-end latency
· Custom avatar appearance via photo upload or 3D model
· Compliance monitoring hook — ability to flag specific phrases or topics in real time alongside the conversation
· Scoring and feedback module — post-session summary of conversation quality
· Multi-language support
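The compliance-monitoring hook is essentially a real-time filter over the transcript stream. A minimal sketch, assuming a flat phrase list (the `FLAGGED_PHRASES` entries and the flag record shape below are illustrative, not a Proxa Labs specification):

```python
import re

# Illustrative phrase list; a real deployment would load these from config.
FLAGGED_PHRASES = {
    "off-label": "Potential off-label promotion",
    "guarantee": "Unsupportable efficacy claim",
    "free samples": "Requires compliance review",
}

def flag_utterance(utterance: str) -> list[dict]:
    """Return a flag record for each monitored phrase found in one utterance."""
    flags = []
    for phrase, reason in FLAGGED_PHRASES.items():
        if re.search(r"\b" + re.escape(phrase) + r"\b", utterance, re.IGNORECASE):
            flags.append({"phrase": phrase, "reason": reason})
    return flags
```

Running this per transcribed utterance lets flags surface alongside the conversation rather than only in a post-session report.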
Judging Criteria
Conversational Realism — latency, lip sync, expressiveness: 40%
Technical Completeness — all core requirements met and functional: 30%
Integration Quality — clean API / SDK that Proxa Labs can adopt: 20%
Code Quality — well-documented, maintainable, production-ready: 10%
Submission Requirements
All submissions must include:
• Public GitHub repository with full source code and an open-source or commercial-compatible license (MIT, Apache 2.0, or equivalent)
• A README with clear setup instructions, dependencies, and architecture overview
• A working demo video (3–5 minutes) showing the avatar in a live conversational session — voice in, avatar responds in real time
• A brief written document (1–2 pages) describing your technical approach, key design decisions, and any known limitations
• Documentation of any third-party APIs or models used, including licensing terms
Rules and Eligibility
• Open to individuals and teams worldwide — no geographic restrictions
• Teams may be up to 4 members; each person may only participate in one team
• Submissions must be original work created during the hackathon period
• Use of open-source libraries and pre-trained models is permitted and encouraged
• Proxa Labs retains the right to negotiate licensing of winning solutions; IP ownership remains with the submitting team unless otherwise agreed
• Proxa Labs reserves the right to disqualify submissions that do not meet core requirements or contain plagiarized code
• Prize payments will be made via wire transfer or Wise within 30 days of winner announcement
Technical Context for Builders
To help you build in the right direction, here is how the solution will integrate into Proxa Echo:
· The avatar engine will be embedded in a React-based web application
· Each session is initialized with a system prompt defining the HCP persona (name, specialty, personality, mood, product context)
· The rep’s voice input is captured in the browser, transcribed, and sent to an LLM — the avatar speaks the LLM response
· Sessions run 5–15 minutes with continuous multi-turn dialogue
· The solution must support at least 3–5 distinct avatar appearances (different HCP personas)
· Production infrastructure is AWS-based; solutions should be containerizable (Docker preferred)
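The persona fields above (name, specialty, personality, mood, product context) can be carried in a plain config object and rendered into the session's system prompt. A sketch of one possible shape — the field names and wording here are assumptions, not the Proxa Echo schema:

```python
# Hypothetical persona config; field names are illustrative.
HCP_PERSONA = {
    "name": "Dr. Elena Reyes",
    "specialty": "cardiology",
    "personality": "data-driven and skeptical",
    "mood": "pressed for time",
    "product_context": "a novel anticoagulant",
}

def build_system_prompt(persona: dict) -> str:
    """Render a persona config into the system prompt that seeds each session."""
    return (
        f"You are {persona['name']}, a {persona['specialty']} specialist. "
        f"You are {persona['personality']} and currently {persona['mood']}. "
        f"A pharmaceutical sales rep is visiting to discuss {persona['product_context']}. "
        "Stay in character and respond conversationally, 1-3 sentences per turn."
    )
```

Keeping personas as data rather than hard-coded prompts makes the required 3–5 distinct HCP appearances a matter of swapping config files.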
Suggested technology stack directions (not prescriptive):
· Talking head models: SadTalker, DiffTalk, Wav2Lip, MuseTalk
· TTS: ElevenLabs API, Coqui TTS, Edge TTS, Kokoro
· STT: Whisper (OpenAI), Deepgram, AssemblyAI
· Real-time streaming: WebRTC, WebSockets
· 3D avatars: Ready Player Me + Three.js, Avaturn
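When streaming microphone audio over WebSockets, a small binary header per chunk makes it possible to measure the 300 ms lip-sync and 500 ms voice-to-voice budgets end to end. A sketch of one possible frame layout — the header fields are an assumption, not a required protocol:

```python
import struct

# Hypothetical header: sequence number (uint32) + capture timestamp in ms (uint64).
HEADER = struct.Struct("!IQ")

def pack_frame(seq: int, capture_ms: int, pcm: bytes) -> bytes:
    """Prefix a PCM audio chunk with its sequence number and capture time."""
    return HEADER.pack(seq, capture_ms) + pcm

def unpack_frame(frame: bytes) -> tuple[int, int, bytes]:
    """Split a received frame back into (seq, capture_ms, pcm)."""
    seq, capture_ms = HEADER.unpack_from(frame)
    return seq, capture_ms, frame[HEADER.size:]
```

On the server, comparing `capture_ms` against the time the synthesized reply starts playing gives a per-turn latency figure judges can verify.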