Job Details

ID #39883275
State New York
City New York City
Job type Permanent
Salary USD $125,000 - $175,000 per annum
Source Jefferson Frank
Showed 2022-04-30
Date 2022-04-29
Deadline 2022-06-28
Category Et cetera

Senior Observability and SRE Engineer

New York City, New York, 10008, USA

Vacancy expired!

Summary:

The role requires strong knowledge of observability and site reliability engineering (SRE), experience automating and proactively monitoring DevOps platforms, and a passion for developing and architecting automation solutions. The engineer should be able to handle first-point escalation for all technical and process issues and provide technical subject-matter expertise wherever required.

Ensure proper communication and quick resolution as a crisis manager. Plan and schedule changes, coordinating with different stakeholders. Perform root cause analysis (RCA) for major incidents related to their tower. Follow the quality and security processes defined for the engagement. Perform trend analysis, identify the top few incidents, and work with the respective teams or individuals to minimize them. Handle hardware troubleshooting and vendor coordination. Prepare weekly and monthly status reports. Participate in business meetings with various stakeholders as needed. Take corrective actions based on customer satisfaction surveys. Work on service improvement programs. Provide effort estimates and reviews for new projects as needed. Train new team members. Acquire knowledge and keep the related documentation up to date.

1) Strong scripting/programming skills (coding is a must) in at least one of Bash, Python, or Go; knowing all three is a nice-to-have.
2) Understanding of Jenkins, SDLC, Agile, and DevOps.
