Job Details

ID #15339048
State Georgia
City Atlanta
Job type Contract
Salary USD Depends on Experience
Source Hexaware Technologies, Inc
Showed 2021-06-11
Date 2021-06-08
Deadline 2021-08-07
Category Et cetera

Site Reliability Engineer

Georgia, Atlanta, 30301 Atlanta USA

Vacancy expired!

Responsibilities:
  • Gain deep knowledge of our complex applications.
  • Serve as the primary point of contact responsible for the overall health, performance, and capacity of one or more of our technology products.
  • Be familiar with the design principles of monitoring and alerting systems.
  • Design, implement, and maintain robust monitoring and alerting to improve performance and reliability.
  • Apply experience with automation, configuration management, and developing infrastructure as code.
  • Use engineering best practices: deliver high-quality production code, utilize automated testing, and build reusable components.
  • Develop tools to improve our ability to rapidly deploy and effectively monitor custom applications in a large-scale Windows and Linux environment.
  • Work closely with development teams to ensure that platforms are designed with "operability" in mind.
  • Function well in a fast-paced, rapidly-changing environment.
  • Participate in the operations on-call rotation, triaging and addressing production issues.

Qualifications:
  • B.S. or higher in Computer Science or other technical discipline, or related practical experience.
  • Programming skills (Java and shell scripting; Python, Ruby, Perl, or C).
  • 5 or more years of experience in a large-scale Unix/Linux operations role.
  • Experience in designing, analyzing, and troubleshooting large-scale distributed systems.
  • Ability to debug production issues across services and levels of the stack.
  • Experience with one or more orchestration or deployment tools, such as Docker or Ansible.
  • Familiarity with Git or other source control systems.
  • Experience using tools to create and manage CI (continuous integration) and CD (continuous delivery) pipelines.
  • Shell or Python experience, specifically for systems automation.
  • Good experience with performance engineering tools such as Selenium, JMeter, and LoadRunner.
  • Working knowledge of the TCP/IP stack, internet routing and load balancing.
  • Experience with monitoring and alerting technologies such as New Relic, SiteScope, Netcool, Dynatrace, ExtraHop, Moogsoft, Prometheus, Sensu, Nagios, and Splunk.
  • Optional: Experience designing, implementing, and deploying Docker, Kubernetes, and Serverless (Functions or Lambdas).
  • Strong interpersonal communication skills (including listening, speaking, and writing) and ability to work well in a diverse, team-focused environment with other SREs, Engineers, Product Managers, etc.
  • Creative thinker and strong problem solver with meticulous attention to detail.
  • Highly organized, creative, motivated, and passionate about achieving results.
  • Strong experience with AWS (design, SDKs, best practices) – good to have.
  • AWS certifications – good to have.
