Role overview
FriendliAI is building the world’s best AI inference platform that makes large language and multi-modal models fast, efficient, and deployable at scale. We power high-throughput, low-latency AI workloads for organizations worldwide and integrate directly with Hugging Face, giving developers instant access to over 500,000 open-source models.
We are a small, fast-moving team doing work that matters at one of the most exciting moments in the history of technology. With our world-class inference engine, we are building a platform that the AI industry can actually rely on.
What you'll work on
- Design, build, and maintain agent APIs and applications that deliver document understanding and other high-value features
- Evaluate open-source models and, where possible, integrate them to power production-ready agent features
- Develop reference agent applications to showcase workflows and accelerate customer adoption
- Collaborate with backend and infrastructure teams to integrate agents with deployment, orchestration, and monitoring systems
- Ensure APIs are robust, developer-friendly, and enterprise-ready through strong design principles and documentation
- Continuously improve the reliability, scalability, and performance of agent features in production
What we're looking for
- Experience with document understanding pipelines (e.g., OCR, RAG, summarization, structured extraction)
- Familiarity with Kubernetes or container orchestration in production
- Experience building or contributing to agent frameworks, SDKs, or CLIs
- Experience working in a startup or other fast-paced environment that demands ownership and comfort with ambiguity
- Passion for developer experience and enabling AI adoption