Role overview
**AI Engineer (Fluent in Thai and English)**
**Position Overview**
We are seeking an AI Engineer to join our Global Analytics team in London. This role is focused on the end-to-end lifecycle of production-grade AI, from training and fine-tuning specialized models to architecting high-performance inference pipelines.
The ideal candidate views AI as a rigorous engineering discipline. Beyond building models, you will be responsible for writing high-quality, maintainable Python code and ensuring that every solution—whether a voice agent or a document processor—is built for reliability, low latency, and global scale.
What you'll work on
- Model Training & Fine-Tuning: Lead the adaptation of Large Language Models (LLMs) for domain-specific tasks using parameter-efficient fine-tuning (PEFT) techniques such as LoRA and QLoRA to balance performance with resource efficiency.
- Inference Optimization: Architect and optimize inference pipelines to minimize TTFT (Time to First Token) and maximize throughput. This includes implementing quantization, caching strategies, and efficient batching.
- Production Engineering: Build and maintain real-time AI pipelines using WebSockets and Server-Sent Events (SSE), ensuring seamless low-latency delivery for voice (ASR/TTS) and text applications.
- Architecture & MLOps: Deploy and orchestrate models within containerized microservice architectures (Docker/Kubernetes), ensuring robust monitoring, security, and scalability.
- Collaborative Delivery: Work closely with Business Analysts and internal stakeholders to bridge the gap between commercial requirements and technical implementation.
What we're looking for
- Experience in the insurance or financial services sector.
- Deep knowledge of GPU architecture, CUDA, and hardware-level performance optimization.
- Familiarity with Document Intelligence frameworks (OCR, layout analysis, and multimodal extraction).
- Must be fluent in both Thai and English.