Role overview
The leading provider of vehicle lifecycle solutions, with headquarters in Chicago, enables the companies that build, insure, and replace vehicles to power the next generation of transportation. Its platform delivers advanced mobile, artificial intelligence, and car technologies. It connects a network of 350+ insurance companies, 24,000+ repair facilities, hundreds of parts suppliers, and dozens of third-party data and service providers. Its collective solutions enhance productivity and help clients deliver better experiences for end consumers.
What you'll work on
- Design and implement end-to-end document intelligence pipelines on AWS
- Develop and optimize ML models for document classification, segmentation, and field extraction
- Build scalable data processing systems handling PDFs up to 2,000 pages
- Collaborate with subject matter experts to create and refine requirements for extraction
- Own features from research through production deployment and monitoring
- Establish evaluation frameworks and quality metrics for extraction accuracy
What we're looking for
- Experience with document processing tools (AWS Textract, Azure Document Intelligence, or similar OCR/layout detection systems)
- Experience with PDF and image processing libraries (e.g., PyMuPDF, OpenCV, Pillow)
- Experience in machine learning/data science (e.g., ML algorithm selection, feature engineering, model training, hyperparameter tuning, supervised and unsupervised learning implementation, building model pipelines, using machine learning tools/libraries/frameworks)
- Experience working with AWS big data technologies (Redshift, S3, EMR, Glue, etc.)