Job Details

ID #44702079
State Georgia
City Atlanta
Job type Contract
Salary Depends on Experience
Source ERPMark Inc
Showed 2022-08-08
Date 2022-08-05
Deadline 2022-10-04
Category Et cetera

Site Reliability Engineer with Google Cloud Platform

Georgia, Atlanta, 30377 Atlanta USA

Vacancy expired!

Job Summary:

  • A client of ours in Atlanta, GA is looking for a Site Reliability Engineer with Google Cloud Platform experience for a contract opportunity.
Description:
  • The Site Reliability Engineer will manage the end-to-end application and system stack and work with one of the leading financial services organizations in the US.
  • Site Reliability Engineering (SRE) is a discipline that combines software and systems engineering for building and running large-scale, distributed, fault-tolerant systems.
  • SRE ensures that internal and external services meet or exceed reliability and performance expectations.
  • SRE is also an engineering approach to building and running production systems: SREs engineer solutions to operational problems.
  • SREs are responsible for overall system operation and use a breadth of tools and approaches to solve a broad set of problems.
  • SRE practices include limiting time spent on operational work, conducting blameless postmortems, and proactively identifying and preventing potential outages.
Responsibilities: As a Site Reliability Engineer,
  • You will be part of the team that migrates and transforms on-prem applications and data centers to the public cloud (Google Cloud Platform).
  • You will engage in and improve the software development lifecycle from inception and design, through development, deployment, operation and refinement
  • Develop and maintain large-scale infrastructure
  • Own the build tools and CI/CD automation pipeline
  • You will influence and design infrastructure, architecture, standards and methods for large-scale systems
  • You will support services prior to production via infrastructure design, software platform development, load testing, capacity planning and launch reviews
  • You will maintain services during deployment and in production by measuring and monitoring key performance and service level indicators including availability, latency, and overall system health
  • You will automate system scalability and continually work to improve system resiliency, performance and efficiency
  • Investigate, diagnose, and resolve performance and reliability problems in a wide range of large-scale and high-throughput services
  • Collaborate with architects and application engineers to ensure applications are maintainable, scalable, and follow appropriate disaster recovery and high availability strategies
  • Contribute to the handbook, runbooks, and general documentation
  • You will remediate tasks within the corrective action plan via sustainable, preventative, and automated measures whenever possible
Qualifications & Requirements:
  • BS degree in Computer Science or related technical field, or equivalent job experience required
  • 4+ years of SRE experience in cloud environments
  • 2+ years of experience developing and/or administering software in public cloud
  • Strong working knowledge of and hands-on experience with Google Cloud Platform (GCP)
  • Experience in DevOps and CI/CD pipelines and build tools like Jenkins.
  • 2-4 years of experience in languages such as Python, Ruby, Bash, Java, Go, Perl, JavaScript and/or Node.js
  • Experience managing Infrastructure as code via tools such as Terraform or CloudFormation
  • Must have great communication skills
  • Experience operating a production environment at high scale with an emphasis on availability and latency
  • Deep knowledge of container and orchestration tools such as Docker and Kubernetes
  • Familiarity with configuration management and deployment tools such as Chef and Octopus
  • Experience in software development in one or more of the following: C, C++, Java, Go, Perl and/or Python
  • Prior experience developing and/or administering software on Windows with .NET applications
  • Strong team player with a "can do" attitude, and the flexibility to jump in wherever needed
  • Demonstrable cross-functional knowledge with systems, storage, networking, security and databases
  • System administration skills, including automation and orchestration of Linux/Windows using Chef, Puppet, Ansible, SaltStack and/or containers (Docker, Kubernetes, etc.)
  • Proficiency with continuous integration and continuous delivery tooling and practices
  • Strong analytical and troubleshooting skills
  • Ability and willingness to learn and apply new tools and technologies
Extra Points for any of the following:
  • Prior experience in developing applications in .NET technologies or Java
  • You have expertise in designing, analyzing and troubleshooting large-scale distributed systems
  • You take a system problem-solving approach, coupled with strong communication skills and a sense of ownership and drive
  • You are passionate about automation, with a desire to eliminate toil whenever possible
  • You've built software or maintained systems in a highly secure, regulated or compliant industry
  • You thrive in a DevOps culture, with experience in and a passion for working as part of a team
Mandatory Skills:
  • Kubernetes
  • AWS
  • Docker
  • DevOps
Job Type: Contract
Location: Preferred location is Atlanta, GA.

