Vacancy expired!
- Data Engineer
- Client is inspired by a single vision: transforming patients' lives through science.
- In oncology, hematology, immunology, and cardiovascular disease, each of Client's passionate colleagues contributes to innovations that drive meaningful change.
- Client brings a human touch to every treatment it pioneers.
- Join Client and make a difference.
- The Research Engineering group at Client seeks a resilient, results-oriented Data Engineer to join its motivated and diverse team, which focuses on informatics and data enablement initiatives in Research and Early Development (R/ED).
- The candidate will play a core role in developing data ingestion pipelines, modeling data and designing storage solutions, and building APIs that allow fast, flexible access to ingested data.
- This hands-on role interfaces closely with data and computational scientists in Informatics and Predictive Sciences and with business partners in IT, and supports programs spanning both discovery and translational sciences.
- Client is seeking an individual with extensive experience integrating data and building data solutions to make data accessible and meaningful for the R/ED community.
- Explore and evaluate the latest technologies and standard methodologies in Data Engineering, and identify software solutions that can address hurdles in data enablement
- Design, implement and manage ETL data ingestion pipelines that ingest vast amounts of genomic, phenotypic, and screening data from public, internal, and partner sources
- Evaluate database and data storage solutions to find the best way to model, store, and retrieve data
- Design and build REST APIs that allow flexible and easy access to ingested data
- Collaborate with data scientist leads to determine the data enablement methods that best support interpretation of the data by downstream scientists
- Proactively communicate data ecosystem and pipeline value propositions to partnering scientific collaborators
- Collaborate with colleagues across Informatics and Predictive Sciences to make data, including raw/interim data, available to R/ED department personnel as the need arises
- Help ensure good engineering practices and code readability across data engineering
- Bachelor's degree and 4 years of experience or a Master's degree and 2 years of experience in an engineering field
- Excellent skills and deep knowledge of Python and object-oriented programming are a must, including common Python libraries such as pandas, boto3, Flask, SQLAlchemy, and psycopg2
- Excellent skills and knowledge of databases such as Postgres, Elasticsearch, Redshift, and Athena, including distributed database design, SQL vs. NoSQL, database optimizations, and database administration
- Experience with Apache Airflow a plus
- Solid understanding of AWS cloud computing services such as S3, EC2, ECS, Batch, Lambda, EMR, RDS, CloudWatch
- Understanding of the principles of HTTP and basic networking concepts
- Prior experience using REST APIs to fetch data is required; experience designing and building REST APIs is a plus
- Understanding of and experience with containers using Docker, ECS, and ECR
- Proficiency with modern software development practices such as Agile, source control, CI/CD, and project management and issue tracking with Jira
- Proficiency with Linux
- Working understanding of and experience with various filesystems and object storage, including Linux filesystems, S3, SFTP, and FTP
- Experience in a life sciences research environment a plus
- San Diego, CA
- Redwood City, CA
- Seattle, WA