Role overview
We’re looking for founding voice AI systems engineers to build and scale Known’s core voice systems architecture, powering our voice-led onboarding and user experiences.
This is a unique opportunity to work with a hyper-personalized dataset that combines voice transcripts, images, and structured user data to power real-time, personalized, voice-led AI conversations at scale. You’ll work directly with Chen Peng, former head of ML at Uber Eats and Faire.
What we're looking for
We’re looking for someone who obsesses over the "uncanny valley":
- 3-5 Years in ML/Systems: Proven experience deploying models to production at scale, with a focus on audio processing or real-time streaming.
- The Voice Stack: Deep familiarity with modern STT/TTS frameworks (e.g., ElevenLabs, LiveKit, VITS, and Sesame) and audio libraries like Librosa or FFmpeg.
- Agentic Conversational AI: Experience building "brain" logic for LLMs using tools like LangGraph or Haystack to manage complex, non-linear dialogue.
- Production-Hardened: You’ve optimized model inference for speed using TensorRT, ONNX, or Triton, and you’re comfortable in a Docker/Kubernetes/Cloud environment.
We’re backed by Eurie Kim and Kirsten Green at Forerunner Ventures (the investors behind Decagon, Faire, and Oura), as well as NFX and PearVC.
Compensation Range: $225K - $330K