Title: Big Data with Python Developer
We are receiving Python front-end web app developers. During screening, please check whether they have Big Data experience using Python, as per the attached JD. If they come only from Python web app development, we do not need them. Please ask about their experience with the Big Data Hadoop ecosystem, data lakes, DWH, structured/unstructured data, creating data pipelines/data frames, data validations, querying Data ba
- Develop, customize, and manage integration tools, databases, warehouses, and analytical systems using data-related instruments/instances.
- Create and run complex SQL queries and automation scripts for operational data processing, and build out Python ETL processes.
- Test the reliability and performance of each part of a system and cooperate with the testing team
- Deploy data models into production environments. This entails providing the model with data stored in a warehouse or coming directly from sources, configuring data attributes, managing computing resources, setting up monitoring tools, etc.
- Set up tools to view data, generate reports, and create visuals.
- Monitor the overall performance and stability of the system. Adjust and adapt the automated pipeline as data/models/requirements change.
- Excellent understanding of the ETL cycle. Analyze and organize raw data, build data systems and pipelines.
- Combine raw information from different sources, explore ways to enhance data quality and reliability, interpret trends and patterns from the raw data.
- Experience using Python/PySpark and/or Scala for data engineering.
- Understanding of data types and handling of different data models.
- Good knowledge of the various phases of the SDLC (Requirement Analysis, Design, Development, and Testing) on various development and enhancement projects.
- Desirable to have experience with Spark, Flink, Kafka, Flask, Scala, PySpark for Data engineering.
- Experience with Microsoft Azure or AWS data management tools, such as Azure Data Factory, Data Lake, and Databricks, or Snowflake on AWS, is a plus.
- Experience with data visualization tools (Power BI, Tableau) is a plus.
- Understanding of descriptive and exploratory statistics, predictive modelling, evaluation metrics, decision trees, and machine learning algorithms is a plus.
- Good scripting and programming skills.