Role overview
- Industry experience as a Data Engineer, Machine Learning Engineer, or Data Scientist, working with data infrastructure, distributed systems, and fault-tolerant data pipelines.
- Experience deploying models and infrastructure on Kubernetes.
- Experience with infrastructure tools for provisioning, deployment, and monitoring, such as Terraform, AWS, Docker, and Datadog.
- Experience with heterogeneous data sources and data models, including MongoDB, PostgreSQL, Redis, and Neo4j.
- Ownership of problems end-to-end, with a willingness to pick up whatever context is needed.
- Comfort working as part of a fast-moving team, where perfectionism can sometimes be at odds with pragmatism.
- A desire to dig into problems across the stack, whether that means chasing networking issues, performance bottlenecks, or memory leaks, or simply reading unfamiliar code to figure out where issues might lurk.
- A strong belief that high-quality data is crucial to producing state-of-the-art machine learning systems.
What you'll work on
- Design data pipelines that handle large-scale data ingestion, including how this data is stored and processed, with robust support for filtering, pre-processing, and versioning (a minimal sketch of one such step follows this list).
- Build out data infrastructure to train large neural networks using self-supervised and contrastive learning.
- Build and refine custom data labeling services that directly influence the quality of our iris recognition engine.
- Work closely with internal stakeholders to accommodate their data usage needs.
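To make the first bullet concrete, here is a minimal, self-contained Python sketch of what a single ingestion step with filtering, pre-processing, and content-addressed versioning might look like. Everything in it (the `IngestRecord` type, the `ingest_batch` helper, the size threshold, the hash-based version key) is a hypothetical illustration for this posting, not a description of the actual stack.

```python
"""Hypothetical ingestion step: filter, pre-process, and version records.

All names and thresholds here are illustrative assumptions, not part of
any specific pipeline described in the role.
"""
import hashlib
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class IngestRecord:
    source_id: str
    payload: dict


def passes_filters(record: IngestRecord) -> bool:
    # Hypothetical quality gate: drop empty or oversized payloads.
    raw = json.dumps(record.payload)
    return 0 < len(raw) <= 1_000_000


def preprocess(record: IngestRecord) -> IngestRecord:
    # Example normalization step: lowercase all string fields.
    cleaned = {
        k: v.lower() if isinstance(v, str) else v
        for k, v in record.payload.items()
    }
    return IngestRecord(record.source_id, cleaned)


def version_key(record: IngestRecord) -> str:
    # Content-addressed version: identical payloads share a key,
    # so re-ingesting unchanged data is a no-op.
    blob = json.dumps(asdict(record), sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()[:16]


def ingest_batch(records: list[IngestRecord], out_dir: Path) -> int:
    """Filter, pre-process, and write each record once per content version."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = 0
    for rec in records:
        if not passes_filters(rec):
            continue
        rec = preprocess(rec)
        path = out_dir / f"{rec.source_id}-{version_key(rec)}.json"
        if not path.exists():  # skip content already versioned
            path.write_text(json.dumps(asdict(rec), sort_keys=True))
            written += 1
    return written


if __name__ == "__main__":
    batch = [IngestRecord("cam-01", {"label": "IRIS", "score": 0.97})]
    print(ingest_batch(batch, Path("/tmp/ingest-demo")))
```

Deriving the version key from the record's content is one common way to make re-ingestion idempotent: an unchanged payload maps to the same key and is skipped, while any edit produces a new versioned file.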
Tags & focus areas
Machine Learning, Infrastructure, Kubernetes, Docker, Terraform, AWS, Remote, Engineer