Data Solutions Engineer job vacancy

Vacancy expired!

Title: Data Solutions Engineer

Location: San Diego, CA OR Redwood City, CA OR Seattle, WA

Pay Rate: $90 to $95/Hr on W2

Contract-to-hire with an initial contract of 6+ months

Job Description:

Data Engineer
Client is inspired by a single vision: transforming patients' lives through science.
In oncology, hematology, immunology, and cardiovascular disease, each of Client's passionate colleagues contributes to innovations that drive meaningful change.
Client brings a human touch to every treatment it pioneers.
Join Client and make a difference.
The Research Engineering group of Client seeks a resilient, results-oriented Data Engineer to join its motivated and diverse team focused on informatics and data enablement initiatives in Research and Early Development (R/ED).
The candidate will play a core role in developing data ingestion pipelines, modeling data and designing storage solutions, and building APIs that allow fast, flexible access to ingested data.
This hands-on role interfaces closely with data and computational scientists in Informatics and Predictive Sciences and business partners in IT and supports programs spanning both discovery and translational sciences.
Client is seeking an individual with extensive experience integrating data and building data solutions to make data accessible and meaningful for the R/ED community.

Responsibilities:

Explore and evaluate the latest technologies and standard methodologies in Data Engineering, and identify software solutions that can address hurdles in data enablement
Design, implement and manage ETL data ingestion pipelines that ingest vast amounts of genomic, phenotypic, and screening data from public, internal, and partner sources
Evaluate database and data storage solutions to find the best way to model, store, and retrieve data
Design and build REST APIs that allow flexible and easy access to ingested data
Collaborate with data scientist leads to determine the data enablement methods best suited to helping downstream scientists access and interpret the data
Proactively communicate data ecosystem and pipeline value propositions to partnering scientific collaborators
Collaborate with colleagues across Informatics and Predictive Sciences to make data, including raw/interim data, available to R/ED department personnel as the need arises
Help ensure good engineering practices and code readability across data engineering

Education and Experience:

Bachelor's degree and 4 years of experience or a Master's degree and 2 years of experience in an engineering field
Excellent skills and deep knowledge in Python and object-oriented programming are a must, including common Python libraries such as pandas, boto3, flask, sqlalchemy, and psycopg2
Excellent skills and knowledge of databases such as Postgres, Elasticsearch, Redshift, and Athena, including distributed database design, SQL vs. NoSQL, database optimizations, and database administration
Experience with Apache Airflow is a plus
Solid understanding of AWS cloud computing services such as S3, EC2, ECS, Batch, Lambda, EMR, RDS, and CloudWatch
Understanding of the principles of HTTP and basic networking concepts
Prior experience using REST APIs to fetch data is required; experience designing and building REST APIs is a plus
Understanding of and experience with containers using Docker, ECS, and ECR
Proficiency with modern software development methodologies and practices such as Agile, source control, CI/CD, and project management and issue tracking with JIRA
Proficiency with Linux
Working understanding of and experience with various filesystems and object storage, including Linux filesystems, S3, SFTP, and FTP
Experience in a life sciences research environment is a plus
