- Lead the creation of a next-generation data platform to ingest tens of thousands of datasets, support petabyte-scale storage and compute, and deliver billions of real-time queries per month, while maintaining cost effectiveness and implementing appropriate data safeguards.
- Manage a growing organization of software engineers building the next generation of services supporting data storage and compute strategy.
- Identify opportunities to improve the performance and scale of APIs, the velocity and efficiency of data ingestion, the connectivity and linking of datasets, and the extraction of natural language and imagery sources.
- Work with senior leadership to translate platform vision and strategy into an actionable roadmap, maintain KPIs to track progress, and deliver on time and on budget.
- Collaborate with the application engineering, product management, project management, data science and market-facing teams to align the data platform with business needs.
- Define standards and practices around automation, system reliability, data architecture, process management, containerization, infrastructure-as-code, auto-scaling, data security, etc.
- Serve as a mentor for team members, an evangelist of the data platform for other engineering teams, and a translator between engineering and the business. This will include facilitating and participating in design sessions, code reviews and sprint ceremonies, as well as giving presentations on the data platform for technical and non-technical audiences.
- Investigate and resolve technical and non-technical issues, including leading and participating within incident management processes and root cause analyses.
- Contribute to technology strategy as a member of the architectural leadership team.
- B.S. in Computer Science (or equivalent)
- 3 or more years of experience managing software engineering teams
- 3 or more years of experience with big data systems and cloud architecture
- 7 or more years of experience in software engineering
- Big data architecture and systems, including distributed data processing systems (such as Spark or Dask), distributed data storage systems (such as Parquet or HDFS), low-latency data lake query architectures (such as Alluxio) and real-time streaming systems (such as Kafka)
- Data lake design strategies for metadata, ontology, governance, authorization, etc.
- Test automation for data quality, data flow, and API endpoints
- Data engineering techniques for big data, including data automation frameworks (such as Airflow or Prefect), metadata management (such as Amundsen) and process management strategies
- Infrastructure management and automation, such as Kubernetes, Terraform and Chef
- Cloud infrastructure management, ideally with experience in AWS, including both technical aspects, such as solutions architecture, and non-technical aspects, such as financial planning
- Modern practices around agile development, release management, continuous integration, system reliability, cloud architecture, authN/Z and data security
- Fundamentals of computer science and software engineering