Job Details

ID #19958978
State Virginia
City Mclean
Job type Contract
Salary USD Depends on Experience
Source iTech US, Inc.
Showed 2021-09-20
Date 2021-09-17
Deadline 2021-11-15
Category Et cetera

Bigdata and Hadoop Ecosystems

McLean, Virginia 20598, USA

Vacancy expired!

Job Title:

Bigdata and Hadoop Ecosystems

Location: McLean, VA

Duration: 12 Months

Description: 5+ years with AWS S3, PySpark/EMR, Okera, Dremio, Snowflake, SQL, Python, Kafka, Sqoop, and Spark.
  • Responsible for delivery in the areas of big data engineering with AWS, Python, and Spark (PySpark), with a high-level understanding of machine learning
  • Develop scalable and reliable data solutions to move data across systems from multiple sources in real time (Kafka) as well as batch mode (Sqoop)
  • Construct data staging layers and fast real-time systems to feed BI applications and machine learning algorithms
  • Utilize expertise in technologies and tools such as Python, Spark, AWS S3, and EMR, as well as other cutting-edge tools and applications for Big Data
  • Demonstrated ability to quickly learn new tools and paradigms to deploy cutting edge solutions.
  • Develop both deployment architecture and scripts for automated system deployment in AWS
  • Create large-scale deployments using newly researched methodologies
  • Work in an Agile environment
  • Strong SQL skills to process large sets of data

Essential Skills:

Data Engineering and Research:
  • Develop methods to cleanse, manipulate, and analyze large datasets (structured and unstructured data – XMLs, JSONs, PDFs) using the Hadoop platform.
  • Lead and design the implementation of common data ingestion processes and analytical tools to proactively identify, quantify, and monitor risks.
  • Design and lead efforts to extract, transform, and summarize information from large data sets to inform management decisions using the Hadoop platform.
  • Develop techniques from statistics and machine learning to build data quality controls for predictive models on numeric, categorical, textual, geographic, and other features.
  • Develop and maintain data ingestion patterns using Python, Spark, and Hive scripts to filter/map/aggregate data.
  • Lead and work with the Credit Risk Analytical and Model team members to ensure successful implementation of models in valuation tools used by Single Family business model managers and other model domain users.

Analysis and Modelling:
  • Perform R&D and exploratory analysis using statistical techniques and machine-learning clustering methods to understand data.
  • Develop data profiling, deduplication, and matching logic for analysis.
  • Present ideas and recommendations on the best use of Hadoop and other technologies to management.
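The deduplication and matching logic mentioned above can be sketched in plain Python. This is a minimal illustration, not the team's actual pipeline: `difflib.SequenceMatcher` (standard library) stands in for whatever matching engine is really used, and the record key `name` is hypothetical.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] of how closely two strings match after normalization."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def dedupe(records: list[dict], threshold: float = 0.9) -> list[dict]:
    """Keep the first record of every near-duplicate group.

    Two records count as duplicates when their 'name' fields (a
    hypothetical matching key) are at least `threshold` similar.
    """
    kept: list[dict] = []
    for rec in records:
        if not any(similarity(rec["name"], k["name"]) >= threshold for k in kept):
            kept.append(rec)
    return kept

records = [
    {"name": "Jane Doe"},
    {"name": "jane doe "},   # near-duplicate: case/whitespace noise
    {"name": "John Smith"},
]
deduped = dedupe(records)
# Two unique records survive: "Jane Doe" and "John Smith"
```

In practice this pairwise scan is O(n²); at Hadoop scale the same idea is usually implemented with blocking keys and a distributed join (e.g. in Spark) rather than a Python loop.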

Required Education: At least a bachelor's degree (or equivalent experience) in Computer Science, Software/Electronics Engineering, Information Systems, or a closely related field is required.

