Job Summary

We are looking for an experienced Data Engineer to support AI Factory initiatives by building and maintaining data infrastructure for AI, automation, and GenAI applications.

The role will focus on designing data pipelines, preparing knowledge bases, managing vector databases, and enabling enterprise data sources for AI applications such as chatbots, AI assistants, and RAG-based solutions.

The ideal candidate will have strong experience in Python, SQL, Azure data services, ETL/ELT pipelines, data modelling, and AI-ready data preparation. Experience in public sector, higher education, research, or large enterprise environments is highly preferred.

Key Responsibilities

Design, build, and maintain data pipelines to ingest, transform, and prepare data for AI applications
Develop and maintain knowledge bases and vector databases for RAG systems
Implement data quality checks, monitoring, and validation for AI data sources
Build connectors and integrations with enterprise data platforms and systems
Optimise data storage, retrieval, and performance for GenAI applications
Support semantic search, embeddings, and retrieval workflows for LLM-based applications
Document data architectures and data flows, and maintain data catalogues
Collaborate with AI, application, and business teams to understand data requirements for chatbots and AI assistants
Ensure data pipelines are scalable, reliable, and aligned with governance and compliance requirements

Required Qualifications and Skills

Minimum 5 years of experience in data engineering
Strong programming experience in Python
Strong expertise in SQL
Experience with relational and NoSQL databases
Hands-on experience with Azure data services such as Azure Data Factory, Azure Synapse, and Azure Databricks
Knowledge of ETL/ELT patterns and data pipeline orchestration
Experience with data modelling and schema design
Experience building and maintaining scalable data pipelines
Ability to work with enterprise data sources and complex integration environments
Strong documentation and communication skills

Preferred Qualifications

Experience preparing data for LLM or GenAI applications
Knowledge of Retrieval-Augmented Generation (RAG)
Experience with chunking, embeddings, semantic search, and retrieval
Experience with vector databases such as Pinecone, Weaviate, FAISS, Azure AI Search, or similar platforms
Experience with Azure OpenAI or similar GenAI platforms
Familiarity with data governance, data privacy, and compliance requirements
Experience with Google Cloud Platform data services such as BigQuery, Cloud Storage, Dataflow, and Pub/Sub
Experience in public sector, higher education, research, government, or multi-entity enterprise environments

Experience Requirement

Candidates must demonstrate recent experience, preferably within the last 18 months, in enterprise-scale data, digital, AI, or automation projects of comparable scope and complexity.

Experience in any of the following environments will be considered an advantage:

Public sector organisations
Government entities
Higher education institutions
Research organisations
State-owned enterprises
Large multi-entity organisations

Job Type: Permanent

Pay: Up to QAR20,000.00 per month

Work Location: In person

Tags & focus areas

AI Data Engineer

Data Engineer - AI Factory / GenAI Data Infrastructure
