Job Details

ID #17354328
State Pennsylvania
City Plymouth Meeting
Job type Contract
Salary USD Competitive
Source Matlen Silver
Showed 2021-07-27
Date 2021-07-26
Deadline 2021-09-24
Category Et cetera

Sr. Spark / Scala Developer

Plymouth Meeting, Pennsylvania 19462, USA

Vacancy expired!

Note: This position will start off remotely but will eventually sit onsite.

Job Details:
We are looking for a Senior Spark/Scala Developer with hands-on experience working in the Hadoop ecosystem.

Roles and Responsibilities:
  • Develop end-to-end data pipelines using Spark, Hive, and Impala (a minimal pipeline sketch follows this list)
  • Write Spark jobs to fetch large data volumes from source systems
  • Understand business needs, analyze functional specifications, and map them to the design and development of Apache Spark programs and algorithms
  • Optimize Spark code, Impala queries, and Hive partitioning strategies for better scalability, reliability, and performance
  • Work with leading BI technologies such as MicroStrategy (MSTR) and Tableau over the Hadoop ecosystem through ODBC/JDBC connections
  • Work on Hive performance optimizations such as using the distributed cache for small datasets, partitioning, bucketing, and map-side joins
  • Build machine learning algorithms using Spark
  • Design and deploy scalable, enterprise-wide operations
  • Wrangle data into workable datasets, working with file formats such as Parquet, ORC, and SequenceFile, and serialization formats such as Avro
  • Build applications using Maven or SBT, integrated with continuous integration servers such as Jenkins
  • Execute Hadoop ecosystem applications through Apache Hue
  • Perform feasibility analysis for deliverables, evaluating requirements against complexity and timelines
  • Tune the performance of Impala queries
  • Document operational problems, following standards and procedures, in the issue-tracking tool JIRA
  • Install, configure, and use Hadoop components such as Spark, Spark Job Server, Spark Thrift Server, Phoenix on HBase, Flume, and Sqoop
  • Apply expertise in shell scripting, cron automation, and regular expressions
  • Coordinate development, integration, and production deployments
  • Use REST services to access HBase data for further processing in downstream systems
  • Debug issues using Hadoop and Spark log files
  • Prepare technical specifications, analyze functional specs, and develop and maintain code
  • Create mapping documents to outline data flow from source to target
  • Migrate data from legacy RDBMS systems to the Hadoop ecosystem
  • Use Cloudera Manager, an end-to-end tool for managing Hadoop operations in a Cloudera cluster
  • Create database objects such as tables, views, functions, and triggers using SQL
  • Experience with Spark and Spark SQL
  • Must have hands-on experience in Java, Spark, Scala, Akka, Hive, Maven/SBT, and Amazon S3
  • Experience in Kafka and REST services is a plus
  • Experience in Hadoop, HBase, MongoDB, or other NoSQL platforms
  • Good knowledge of Big Data querying tools such as Pig, Hive, and Impala
  • Knowledge of Sqoop and Flume preferred
  • Excellent communication skills with both technical and business audiences
  • Experience with Apache Phoenix and text search (Solr, Elasticsearch, CloudSearch)
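
As a rough illustration of the pipeline work described above, here is a minimal sketch of a Spark/Scala batch job that fetches source data, does light wrangling, and writes a partitioned Hive table for downstream Hive/Impala queries. All paths, table names, and column names are hypothetical, and details may vary by Spark version.

    import org.apache.spark.sql.SparkSession

    object CustomerPipeline {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("CustomerPipeline")
          .enableHiveSupport()   // needed to read/write Hive-managed tables
          .getOrCreate()

        // Fetch a large data volume from a source system (hypothetical path).
        val raw = spark.read.parquet("hdfs:///data/source/customers")

        // Light wrangling: drop malformed rows before publishing.
        val cleaned = raw.filter(raw("customer_id").isNotNull)

        // Partitioning by a date column lets Hive and Impala prune
        // partitions at query time.
        cleaned.write
          .mode("overwrite")
          .partitionBy("load_date")
          .format("parquet")
          .saveAsTable("analytics.customers_cleaned")

        spark.stop()
      }
    }

Partitioning on a load date is one common strategy; the right scheme depends on query patterns and data volume.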

Education: Bachelor's degree in Computer Science plus 4 years of experience, or MS in Computer Science with 2 years of experience

Required Technical Skills
  • 3+ years of strong native SQL skills
  • 3+ years of strong experience with database and data warehousing concepts and techniques. Must understand relational and dimensional modeling, star/snowflake schema design, BI, data warehouse operating environments and related technologies, ETL, MDM, and data governance practices.
  • 3+ years of experience working in Linux
  • 3+ years of experience with Spark (see the join sketch after this list)
  • 3+ years of experience with Scala
  • 1+ years of experience with Hadoop, Hive, Impala, HBase, and related technologies
  • 1+ years of strong experience with low-latency (near-real-time) systems and terabyte-scale data sets, loading and processing billions of records per day
  • 1+ years of experience with MapReduce/YARN
  • 1+ years of experience with Lambda architectures
  • 1+ years of experience with MPP, shared-nothing database systems, and NoSQL systems
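
One technique a candidate with this Spark background would be expected to know is the map-side (broadcast) join: Spark's analogue of Hive's distributed-cache join for small datasets, which avoids shuffling the large table. The sketch below uses hypothetical table and column names.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.broadcast

    object BroadcastJoinExample {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder()
          .appName("BroadcastJoinExample")
          .enableHiveSupport()
          .getOrCreate()

        val orders  = spark.table("analytics.orders")   // large fact table
        val regions = spark.table("analytics.regions")  // small dimension table

        // broadcast() ships the small side to every executor, so the join
        // runs map-side with no shuffle of the large table.
        val joined = orders.join(broadcast(regions), Seq("region_id"))

        joined.groupBy("region_name").count().show()

        spark.stop()
      }
    }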

Qualifications:
  • Ability to work in a fast-paced, team-oriented environment
  • Ability to complete the full lifecycle of software development and deliver on time
  • Ability to work with end-users to gather requirements and convert them to working documents
  • Strong interpersonal skills, including a positive, solution-oriented attitude
  • Must be passionate, flexible, and innovative in applying tools, experience, and other resources to deliver consistently against challenging, ever-changing business requirements
  • Must be able to interface with various solution/business areas to understand the requirements and prepare documentation to support development
  • Healthcare and/or reference data experience is a plus
  • A willingness and ability to travel
  • Right to work in the recruiting country.
