Senior Data Engineer

GSSTech Group · Dubai, DU, AE


We are looking for an experienced Senior Data Engineer with strong expertise in PySpark, Python, and Big Data technologies to support enterprise-scale data engineering initiatives within the AACOE – Data Engineering chapter. The ideal candidate will have proven experience building scalable data pipelines, feature engineering workflows, machine learning data preparation, and modern distributed data processing architectures.

The role requires hands-on engineering capability, close stakeholder collaboration, Agile delivery experience, and expertise in high-performance Big Data environments supporting analytics and machine learning use cases.

Requirements

Key Responsibilities:

  • Gather and analyze business and technical requirements for enterprise data engineering and analytics initiatives.
  • Perform Exploratory Data Analysis (EDA) to identify data patterns, quality issues, and transformation requirements.
  • Design, develop, and optimize scalable data pipelines using PySpark, Python, and Big Data technologies.
  • Ingest, cleanse, transform, and process structured and unstructured datasets from multiple enterprise data sources.
  • Build feature engineering workflows and data transformation pipelines supporting machine learning model development.
  • Develop secure, reliable, and high-performance distributed data processing solutions.
  • Collaborate closely with Analytics Delivery Leads, Data Scientists, ML Engineers, and cross-functional teams to deliver data-driven solutions.
  • Optimize Spark workloads and implement performance tuning strategies for large-scale distributed environments.
  • Ensure data quality, governance, scalability, and operational efficiency across data platforms.
  • Participate in Agile ceremonies including sprint planning, backlog grooming, stand-ups, and retrospectives.
  • Contribute to data architecture discussions, technical documentation, and engineering best practices.
  • Troubleshoot data pipeline failures, performance bottlenecks, and production issues within enterprise environments.

Required Technical Skills:

  • Strong hands-on experience with Python and PySpark development.
  • Deep expertise in Apache Spark including Spark optimization and performance tuning techniques.
  • Strong experience with Big Data technologies and distributed processing frameworks.
  • Hands-on experience with Hadoop ecosystem technologies.
  • Strong proficiency in SQL and complex data transformation logic.
  • Experience building and maintaining machine learning data pipelines and feature engineering workflows.
  • Experience with Git and version control best practices.
  • Strong understanding of distributed systems, scalable architectures, and data processing frameworks.

Nice to Have:

  • Exposure to cloud-based data engineering platforms.
  • Experience with DevOps, CI/CD, and automated deployment workflows for data platforms.
  • Exposure to real-time streaming or event-driven data architectures.
  • Familiarity with enterprise analytics and AI/ML ecosystems.

Required Competencies:

  • Strong analytical and problem-solving skills.
  • Excellent communication and stakeholder management capability.
  • Ability to work effectively in Agile and cross-functional delivery environments.
  • Strong ownership mindset with focus on scalability, quality, and delivery excellence.
  • Ability to manage multiple priorities in fast-paced enterprise programs.
  • Strong collaboration skills with both technical and business stakeholders.

Preferred Domain Experience:

  • Enterprise Data Platforms
  • Analytics & AI/ML Engineering
  • Banking / Financial Services
  • Large-Scale Digital Transformation Programs

Education:

Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Technology, or a related field.
