Role overview
Coinbase has built the world's leading compliant cryptocurrency platform serving over 68 million accounts in more than 100 countries. Through multiple successful products and our vocal advocacy for blockchain technology, we have played a major part in mainstream awareness and adoption of cryptocurrency. We are proud to offer an entire suite of products that are helping build the cryptoeconomy and increase economic freedom around the world.
There are a few things we look for across all hires we make at Coinbase, regardless of role or team. First, we look for signals that a candidate will thrive in a culture like ours, where we default to trust, embrace feedback, disrupt ourselves, and expect sustained high performance because we play as a championship team. Second, we expect all employees to commit to our mission-focused approach to our work. Finally, we seek people with the desire and capacity to build and share expertise in the frontier technologies of crypto and blockchain, in whatever way is most relevant to their role.
The Data Platform team builds and operates systems to centralize all of Coinbase's internal and third-party data, make it easy for colleagues to transform and access that data for analytics and machine learning, and power end-user experiences. As an engineer on the team you will contribute to the full spectrum of our systems, from foundational processing and data storage, through scalable pipelines, to frameworks, tools and applications that make that data available to other teams and systems.
What you'll work on
- Build out and operate our foundational data infrastructure: storage (cloud data warehouse, S3 data lake), orchestration (Airflow), processing (Spark, Flink), streaming services (AWS Kinesis & Kafka), BI tools (Looker & Redash), a graph database, and a real-time, large-scale event aggregation store.
- Build the next iteration of our ingestion pipeline for scale, speed, and reliability. Read from a variety of upstream systems (MongoDB, Postgres, DynamoDB, MySQL, APIs), in both batch and streaming fashion, including change data capture. Make it self-service for non-engineers.
- Build and evolve the tools that empower colleagues across the company to access data and build reliable and scalable transformations. This includes UIs and simple frameworks for derived tables and dimensional modeling, APIs and caching layers for high-throughput serving, and SDKs for the orchestration of complex Spark and Flink pipelines.
- Build systems that secure and govern our data end to end: control access across multiple storage and access layers (including BI tools), track data quality, catalogue datasets and their lineage, detect duplication, audit usage and ensure correct data semantics.
What we're looking for
- Knowledge of SQL
- Experience building API layers and microservices
- Experience with AWS, and especially EMR, S3, Glue, Kinesis, IAM
- Computer Science or related engineering degree