Role overview

You will design, optimize, and productionize machine learning systems using CRC's full internal data environment. This includes tuning an existing NLP engine, developing statistical/ML models (including ordered probit models and a regression-based DFS calculator), building internal LLMs trained on CRC documents, and developing the AI analytics chatbot used by all business units.

This role requires someone who ships ML systems—not someone who just builds notebooks.

Responsibilities

Build, tune, and validate statistical models, including multi-stage regression, ordered probit, and generalized linear models, for audit automation, acuity scoring, and financial forecasting
Engineer features from structured and unstructured healthcare data (EMR, claims, revenue cycle, clinician notes)
Tune the existing CRC NLP engine for clinical note understanding, keyword extraction, concept expansion, negation detection, and sentiment scoring
Build custom clinical embeddings using HuggingFace Transformers, spaCy, and domain-tuned vector models
Develop and maintain a CRC private LLM, trained on internal knowledge bases, documentation, analytics logic, and care guidelines
Build automated pipelines for LLM evaluation, retraining, retrieval-augmented generation (RAG), and grounded QA
Architect, build, and deploy the AI Analytics Chatbot, integrating model logic, business rules, and Fabric/Databricks data sources
Integrate ML models into production services using notebooks, APIs, or batch inference jobs
Support creation of AI-generated reporting, insights summaries, and automated clinical/financial narratives
Build maintainable ML pipelines (training, validation, deployment) using Databricks, Fabric, MLflow, GitHub, and CI/CD
Implement model monitoring, drift detection, and automated retraining
Package and deploy reproducible models via APIs or scheduled Fabric/Databricks workflows
Work with data engineering to embed models into CRC applications
Partner with BI analysts to transform model outputs into dashboards
Document methodologies, assumptions, architecture, and validation processes clearly
3–6 years of hands-on machine learning engineering experience (not just DS notebooks)
Strong Python engineering background: pandas, scikit-learn, statsmodels, PyTorch or TensorFlow, transformers, spaCy
Experience building and tuning LLM and NLP pipelines end-to-end
Experience with regression, ordered probit/logit, hierarchical models, and general statistical modeling
Experience deploying ML workloads in Databricks, Azure ML, and Fabric
Strong SQL for feature engineering and model validation
Prior experience working with healthcare data (EMR, claims, RCM, CMS) preferred
Strong communication skills and the ability to explain complex ML systems to non-technical stakeholders
Proactive, self-managing engineer who can independently own ML systems end-to-end
Fluent English required

Preferred qualifications

Experience with:
Retrieval-Augmented Generation (RAG) pipelines
Vector databases (FAISS, Chroma, Pinecone, Qdrant)
Enterprise chatbot frameworks
MLflow, CI/CD, GitHub Actions, and model versioning
Power BI integration for ML outputs
FHIR/SMART on FHIR
Databricks ML Associate/Professional
Azure AI Engineer Associate
DeepLearning.AI NLP/LLM specializations

Tags & focus areas

Full-time, AI, Machine Learning, NLP, Generative AI

Machine Learning Engineer
