Description
We are seeking a highly skilled and technically adept Data Engineer to join our PVA DMI team. In this role, you will design, build, and maintain scalable, fault-tolerant data pipelines and infrastructure to support data-driven insights. Leveraging native AWS technologies and Amazon’s internal tools, you will handle data ingestion, transformation, and integration, and ensure high standards of data quality through validation, cleansing, and deduplication.
You will also be responsible for architecting high-performance, scalable, and cost-efficient big data processing and storage solutions, automating infrastructure with CI/CD principles, and optimizing data models to drive efficient ETL processes. You will interface with several key services and APIs, write code, and build scalable applications to extract, process, and ingest unstructured data in streaming, batch, and asynchronous fashion.
Your role will also focus on data governance: ensuring compliance with policies and implementing access controls, encryption, retention, and audit mechanisms. Additionally, you will work on continuous improvement by automating processes and adopting the latest data engineering technologies. Collaboration with Data Scientists, Business Intelligence Engineers (BIEs), Software Development Engineers (SDEs), Product Managers (PMs), and functional managers will be key to delivering tailored data solutions.
You'll also manage diverse technologies such as Python, EMR, Spark, Iceberg, Airflow, and AWS data services such as Glue, Athena, and Redshift to build a scalable platform that supports analytics and experimentation. The ideal candidate is customer-obsessed, has a passion for advertising, can synthesize a variety of technologies, and can effectively lead complex projects.
In this position, your attention to detail and dedication to high-quality, well-documented data products will empower stakeholders to make better, data-driven decisions.
Key job responsibilities
Design and implement scalable, fault-tolerant data pipelines using AWS technologies and internal Amazon tools to extract, transform, and load data from multiple sources.
Collaborate cross-functionally with BIEs, Data Scientists, PMs, and SDEs to understand data requirements and deliver customized data solutions.
Automate infrastructure deployment with CI/CD pipelines and ensure streamlined processes for deployment and maintenance.
Ensure data quality through robust validation, cleansing, and deduplication techniques.
Implement data governance standards, including access control, encryption, data retention, deletion policies, and audit mechanisms to ensure compliance and security.
Continuously improve and optimize data pipelines and infrastructure, staying up to date with emerging technologies and implementing automation and monitoring tools.
Build a scalable and reliable data platform supporting analytics and experimentation for intuitive, self-service data products.
Write high-quality code and build scalable applications that interface with critical services and APIs to extract and process unstructured data.
Work with a range of data technologies, including Python, EMR, Spark, Iceberg, Airflow, and AWS data services such as Glue, Athena, and Redshift, to create end-to-end pipelines that consolidate data from disparate systems.
About the team
The Prime Video Advertising - Data, Measurement and Insights (“PVA-DMI”) team is a central data and analytics team within the PV-Ads org that plays a pivotal role in fueling the worldwide growth of Amazon’s video advertising business. We own the data platform, reporting, dashboards, measurement, and analytical solutions for PV Ads.
Basic Qualifications
Bachelor's degree
3+ years of data engineering experience
Experience with data modeling, warehousing and building ETL pipelines
Experience with SQL
Knowledge of professional software engineering and best practices for the full software development life cycle, including coding standards, software architectures, code reviews, source control management, continuous deployments, testing, and operational excellence
Knowledge of distributed systems as they pertain to data storage and computing
Knowledge of batch and streaming data technologies such as Kafka, Kinesis, Flink, Storm, and Beam
Experience as a data engineer or related specialty (e.g., software engineer, business intelligence engineer, data scientist) with a track record of manipulating, processing, and extracting value from large datasets
Experience in at least one modern scripting or programming language, such as Python, Java, Scala, or NodeJS
Experience with Apache Spark / Amazon EMR (Elastic MapReduce)
Preferred Qualifications
Experience with AWS technologies such as Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions
Experience with non-relational databases / data stores (object storage, document or key-value stores, graph databases, column-family databases)
Master's degree in computer science, engineering, analytics, mathematics, statistics, IT or equivalent
Experience programming with at least one modern language such as C, C#, Java, Python, Golang, PowerShell, Ruby
Experience building/operating highly available, distributed systems of data extraction, ingestion, and processing of large data sets
Strong problem-solving and analytical skills, with the ability to translate business requirements into technical solutions
Amazon is committed to a diverse and inclusive workplace. Amazon is an equal opportunity employer and does not discriminate on the basis of race, national origin, gender, gender identity, sexual orientation, protected veteran status, disability, age, or other legally protected status. For individuals with disabilities who would like to request an accommodation, please visit https://www.amazon.jobs/en/disability/us.
Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $118,900/year in our lowest geographic market up to $205,600/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits.
This position will remain posted until filled. Applicants should apply via our internal or external career site.