MLOps Engineer (W2)

TalentXM (formerly BlockTXM Inc)

Actively hiring Posted 28 days ago

Role overview

MLOps Engineer (Contract Role)

**Job Title:** MLOps Engineer

**Contract Duration:** 6–12 Months (extension possible)

**Location:** Remote (U.S.-based)

**Reports To:** DevOps & Service Delivery Leadership

**Company Overview**

Our client is a **rapidly growing technology services company** specializing in AI, Data, and Quality Engineering solutions. With a **global delivery model** spanning North America, the United Kingdom, and an offshore development center in India, our client consistently delivers cutting-edge solutions that **surpass expectations**. The company has achieved **triple-digit year-over-year growth** and earned prestigious industry accolades, including making the *Inc. 5000* list of fastest-growing companies and being recognized as **Best IT Services Company**, **Best Data Technology Company**, as well as **Partner of the Year** by Tricentis. They uphold a bold vision to be experts in AI, Data, and Quality Engineering transformations – encouraging innovative thinking and authentic relationships in all endeavors.

**Role Overview**

Our client is seeking an experienced **MLOps Engineer** to develop and operationalize a comprehensive AI cost tracking and observability framework across multiple cloud platforms. In this role, you will be instrumental in ensuring visibility into AI/ML model performance, usage, and cost metrics across Azure, Google Cloud, and Snowflake environments. You will collaborate closely with cross-functional teams (DevOps, FinOps, and others) to optimize model deployments for both performance and cost efficiency.

**Key Responsibilities:**

* **Cost & Observability Framework:** Build a common AI cost tracking and observability framework spanning Azure ML, Google Vertex AI (Gemini), and Snowflake platforms.
* **Cloud Billing Integration:** Integrate cloud billing and usage APIs (Azure ML, OpenAI, Google Vertex AI/Gemini) to aggregate and monitor AI service costs.
* **Metadata Tagging:** Develop model-level metadata tagging processes for cost attribution and trend analysis, enabling granular visibility into costs per model or project.
* **Monitoring & Alerting:** Implement and manage Datadog dashboards (or similar observability tools) with alerts for model performance issues, including latency spikes, model drift, and anomalies in predictions or usage.
* **Collaboration for Optimization:** Work closely with DevOps and FinOps teams to visualize model costs and identify optimization opportunities (e.g. rightsizing resources, adjusting usage patterns).
* **Documentation & Knowledge Transfer:** Deliver comprehensive documentation and conduct knowledge transfer sessions for internal teams at project closure, ensuring they can maintain and extend the cost tracking framework.

**Required Skills & Experience**

* **MLOps/DevOps Experience:** 5+ years of hands-on experience in MLOps, DevOps, or Cloud Engineering roles focused on AI/ML systems deployment and operations.
* **Cloud AI Platforms:** Strong experience working with **Azure ML**, **Google Vertex AI (Gemini)**, and **OpenAI** platforms/services, including deploying and managing models on these services.
* **Observability Tools:** Expertise in **Datadog** (or equivalent monitoring/observability tools) for tracking application performance, logs, and metrics.
* **Programming & Automation:** Advanced proficiency in **Python** and **SQL** for building automation scripts, data analysis, and integration of monitoring pipelines.
* **CI/CD & Monitoring Integration:** Proven experience integrating cost and performance monitoring steps into CI/CD pipelines, ensuring that model deployments are coupled with automated observability and cost checks.
* **FinOps & Cost Management:** Solid understanding of **FinOps principles**, cloud billing APIs, and strategies for cloud cost optimization in an engineering context (e.g. optimizing compute/storage for AI workloads).

**Preferred Qualifications**

* **Generative AI Frameworks:** Experience with **GenAI/Agentic AI frameworks** such as LangChain or building RAG (Retrieval-Augmented Generation) pipelines, especially in production environments.
* **Regulated Environment Experience:** Familiarity with implementing cost tracking and ML monitoring in regulated environments (e.g. ensuring compliance with **ISO**, **SOC 2**, **HITRUST**, or similar standards).
