Senior AI Architect

Altak Group · Remote, US

Posted 26 days ago

Role overview

We’re hiring a Senior AI Architect to design and guide end-to-end AI solutions, from problem framing and model selection (LLMs/classical ML) through environment architecture, security, and cost/performance. You’ll collaborate closely with a Solution Architect (who owns overall system design and enterprise fit) while you own the AI architecture: which models, which vector/search technologies, which AWS services, how to integrate with data platforms, and how to run safely and cost-effectively at scale.

What you’ll do

  • Own AI architecture for initiatives (RAG, agents, predictive models, NLP, vision): define reference architectures, target state, and patterns (batch/real-time, online/offline inference).
  • Model selection & evaluation: choose LLMs/foundation models by use case (Bedrock models—Anthropic, Mistral, Meta, Cohere, etc.; SageMaker-hosted custom; open-source). Define evals (quality, latency, safety), guardrails, and fallback strategies.
  • AWS solution design (AI/ML stack): map use cases to services such as Amazon Bedrock, SageMaker (Studio/Training/Inference/Experiments/Model Registry), S3/Lake Formation, Kendra (or alternatives), OpenSearch, Lambda/Step Functions, EKS/ECS, API Gateway, CloudWatch/CloudTrail, ECR, Secrets Manager/SSM Parameter Store, KMS, MSK/Kinesis, Glue, Athena, Redshift, PrivateLink/VPC endpoints, WAF.
  • Vector & search tech choices: evaluate and standardize options (e.g., Kendra, OpenSearch vector, MongoDB Atlas Vector, pgvector, Pinecone, Weaviate) including ingestion, schema, embeddings, filters, TTL, and ops.
  • RAG/Agentic patterns: design retrieval pipelines (chunking, hybrid search, re-ranking), prompt orchestration, tool-use/function-calling, persona/policy layers, caching, and safety filters.
  • Environment planning: define dev/test/prod topologies, network isolation, data zones, GPU/accelerator strategy, CI/CD for models (MLOps) and prompts (PromptOps), blue/green or canary rollout for models.
  • Cost & performance engineering: produce FinOps projections and guardrails (token/throughput budgeting, autoscaling, spot strategy, quantization, distillation, response caching, batch vs. real-time tradeoffs).
  • Security, privacy, and governance: design guardrails for PHI/PII; encryption (at rest/in transit), row-level/column-level controls, key management, data retention; prompt-injection/EOP mitigations; model risk documentation.
  • ML platform collaboration: work with data scientists/ML engineers on feature stores, experiment tracking, offline/online parity, A/B tests, and evaluation pipelines.
  • Operational readiness: SLOs/SLIs, tracing and telemetry for prompts & models, incident playbooks, reliability and capacity planning.
  • Partnering & enablement: co-author solution docs with Solution Architect; create reference implementations, templates, and internal standards; mentor teams.
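To make the RAG responsibilities above concrete, here is a minimal, self-contained sketch of a retrieval pipeline (chunking, scoring, prompt assembly). It uses a toy bag-of-words similarity in place of real embeddings and a managed vector store; all function names are illustrative, not part of any AWS API:

```python
import math
from collections import Counter

def chunk(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into overlapping word windows (a stand-in for token-aware chunking)."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size]) for i in range(0, max(len(words) - overlap, 1), step)]

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding'; a real system would call an embedding model instead."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks against the query and keep the top-k (the 'retrieval' in RAG)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the grounded prompt sent to the generator model."""
    ctx = "\n---\n".join(context)
    return f"Answer using only the context below.\n\nContext:\n{ctx}\n\nQuestion: {query}"
```

In production the same shape holds, but chunking is token-aware, `embed` calls a hosted model, `retrieve` queries a vector/search index (often hybrid search plus a re-ranker), and the prompt layer adds persona/policy and safety filtering.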

Required qualifications

  • 8+ years in AI/ML and data engineering combined, with 3+ years as an architect for production AI systems at enterprise scale.
  • Proven delivery of LLM-powered apps (chatbots/agents/RAG) and classical ML services (forecasting, classification, ranking) in production.
  • Deep hands-on AWS experience across Bedrock and/or SageMaker plus core data/compute/networking (S3, Lake Formation, Glue, Redshift/Athena, Lambda/Step Functions, EKS/ECS, VPC, IAM, KMS).
  • Strong grasp of vector search and retrieval design, embeddings, re-ranking, and metadata filtering; experience with at least one managed vector/search option (Kendra/OpenSearch/MongoDB Atlas Vector/Pinecone).
  • Solid MLOps practices: model registries, CI/CD, automated evaluation, feature stores, model/package versioning, rollbacks, and observability.
  • Security & compliance literacy (IAM, tokenization, KMS, network isolation, audit), with experience in regulated data (e.g., PHI/PII).
  • Ability to build cost models and optimize end-to-end latency/throughput and spend (GPU sizing, autoscaling, caching, quantization).
  • Excellent architecture documentation, stakeholder communication, and leadership with cross-functional teams.
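As a back-of-the-envelope illustration of the cost-modeling skill described above, the sketch below projects monthly LLM spend from request volume, token counts, and a cache hit rate. All prices and volumes are hypothetical placeholders, not real provider pricing:

```python
def monthly_token_cost(requests_per_day: float,
                       in_tokens: float,
                       out_tokens: float,
                       price_in_per_1k: float,
                       price_out_per_1k: float,
                       cache_hit_rate: float = 0.0,
                       days: int = 30) -> float:
    """Project monthly LLM spend; cache hits are assumed to skip the model call entirely."""
    effective = requests_per_day * (1 - cache_hit_rate)
    per_request = (in_tokens / 1000) * price_in_per_1k + (out_tokens / 1000) * price_out_per_1k
    return effective * per_request * days

# Hypothetical: 10k requests/day, 1.5k input / 500 output tokens,
# $0.003 / $0.015 per 1k tokens, 20% response-cache hit rate.
projection = monthly_token_cost(10_000, 1_500, 500, 0.003, 0.015, cache_hit_rate=0.2)
```

A model like this makes the levers in the role explicit: caching and prompt trimming cut `effective` and `in_tokens`, while distillation or a smaller model cuts the per-1k prices.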

What we're looking for

  • Experience with Azure AI (Azure OpenAI, Cognitive Search, Synapse) for hybrid/multi-cloud considerations.
  • Experience with Snowflake Cortex/Iceberg or Databricks MosaicML/Dolly ecosystems.
  • Familiarity with agent frameworks and tool-calling (e.g., LangChain, LlamaIndex) and productionizing them on AWS.
  • Hands-on with Kafka/MSK, event-driven patterns, streaming features/online inference.
  • FinOps Practitioner mindset; prior ownership of multi-million-token or GPU budgets.
  • Prior work in healthcare/financial services or other regulated industries; FedRAMP experience preferred.
  • Cloud or AI certifications (AWS ML Specialty, AWS Solutions Architect Pro, Azure Data/AI Engineer).

Job Type: Contract

Application Question(s):

  • What is your Work Authorization Status?
  • Do you have experience with Azure AI? If yes, how many years?
  • How many total years of experience do you have as an AI Architect?

Work Location: Remote

Tags & focus areas

Contract · Remote · AI · Machine Learning · Data Science · MLOps · Generative AI · Data Engineering