Company Description
AgentVersity is dedicated to building a community of skilled AI professionals prepared to address the challenges of the future. Our mission is to make AI education accessible, engaging, and effective, empowering individuals to develop impactful AI-driven solutions. With a network of over 8,000 AI professionals from leading companies like Google, Microsoft, and Meta, we provide the latest updates, exclusive training, and exceptional support. Join us to advance your AI expertise and be part of an innovative and dynamic organization.
Role Description
This is a full-time, on-site role for a Generative AI Engineer located in Huntersville, NC. The role involves developing, training, and optimizing generative AI models. Daily responsibilities include researching state-of-the-art AI techniques, designing machine learning algorithms, collaborating with interdisciplinary teams, and testing and deploying AI solutions. As a Generative AI Engineer, you will also help identify new AI use cases and drive innovation to meet strategic business objectives.
Responsibilities:
- Design and implement AI agents using frameworks such as LangGraph, Google ADK, or other agentic frameworks
- Build Retrieval-Augmented Generation (RAG) systems using vector databases and structured data sources
- Develop evaluation frameworks for:
  - Agent behavior and decision quality
  - RAG retrieval accuracy and answer faithfulness
- Fine-tune language models using LoRA / QLoRA for task-specific performance improvements
- Deploy and host small and medium language models using vLLM for high-throughput inference
- Optimize inference performance (latency, memory, GPU utilization)
- Integrate tools, APIs, and external systems into agent workflows
- Work with cloud infrastructure on AWS, GCP, or Azure to deploy GenAI workloads
- Collaborate with DevOps / SRE teams to ensure reliability, observability, and scalability
Required Skills and Experience:
- Strong experience building Generative AI applications using LLMs
- Hands-on experience with AI agents and agent orchestration frameworks:
  - LangGraph, Google ADK, or similar agentic frameworks
- Solid understanding of RAG architectures, including:
  - Embeddings, vector databases, chunking strategies, and retrieval evaluation
- Experience implementing evaluation pipelines for LLMs, agents, and RAG systems
- Practical experience with LoRA / QLoRA fine-tuning
- Experience hosting models using vLLM or similar inference servers
- Proficiency in Python
- Experience working in at least one cloud environment:
  - AWS, GCP, or Azure
- Familiarity with Docker and basic Kubernetes concepts
Nice to have
- Experience with multi-agent systems and tool-using agents
- Knowledge of prompt engineering and structured output generation
- Experience with GPU-based workloads and memory optimization
- Familiarity with observability tools for GenAI systems
- Prior experience deploying GenAI systems in production
What we value:
- Strong problem-solving mindset
- Ability to think in systems, not just prompts
- Focus on production-ready AI, not demos
- Clear communication and collaboration skills
Preference will be given to USC and GC holders. No H1B. OPT and H4 EAD can apply. No C2C.