Job Details

ID #21196105
State New York
City New York City
Job type Permanent
Salary USD TBD TBD
Source HireNetworks
Showed 2021-10-16
Date 2021-10-15
Deadline 2021-12-13
Category Architect/engineer/CAD

Principal Site Reliability Engineer

New York City, New York 10001, USA

Vacancy expired!

Seeking a Principal SRE to contribute to a thriving digital experience software platform built on AWS and GCP! Our SaaS client's engineering teams have built data pipelines that process 10 billion events daily and applications that support powerful experimentation and collaboration workflows at scale. This is a unique opportunity to lead the engineering organization in areas of standardized automated infrastructure and service provisioning and orchestration, service-oriented architectural excellence, and forward-looking planning and execution of large technical projects.

This is a direct-hire remote opportunity open to candidates throughout the USA. Target salary range is up to $160,000 plus benefits. No visa sponsorship or contracting/subcontracting arrangements are available at this time.

Technical environment includes tools such as Kafka, Samza, HBase, MySQL, and Postgres. Systems are built and managed using TravisCI, Jenkins, Docker, Kubernetes, Terraform, and Chef. The team uses a combination of managed and self-hosted approaches.

Responsibilities of the Principal Site Reliability Engineer:

  • Define a roadmap for all engineering teams to utilize fully automated, self-service, highly scalable, cost-efficient, observable, auditable and reliable infrastructure services as standard practice
  • Drive the execution of this roadmap across the engineering organization, collaborating with SREs and senior engineers across engineering while also performing hands-on work on the most critical challenges
  • Provide expert technical guidance and ongoing engineering design review to teams planning and implementing large migrations, service-oriented architecture, broad architectural shifts, and capacity growth
  • Build a metrics-driven operational culture standardizing our practices for SLO definition and review as well as for logging, monitoring, alerting, and on-call practices
  • Make iterative improvements to blameless incident management processes, root cause analyses, outage prevention, and service recovery strategies across the engineering organization
  • Partner closely with Security, Quality, and Product teams to achieve high priority security, privacy, compliance, reliability, and business-continuity objectives on our overall roadmap
  • Propose and drive large improvements to production systems that have a significant impact on our business and engineering teams
  • Mentor and coach engineers to be curious and effective at discovering and solving technical challenges

Qualifications of the Principal Site Reliability Engineer:

  • 8-10+ years' experience demonstrating hands-on technical leadership and business impact in combining software engineering skills with systems engineering skills to solve complex automation and reliability challenges
  • Deep technical experience with various cloud providers, containerization technologies, automated deployment frameworks, orchestration frameworks, monitoring, logging, alerting, system internals, networking, databases, distributed systems, and service-oriented architecture
  • Ability to implement load, stress, performance, and reliability testing standards at scale to improve service, platform, and infrastructure resiliency
  • A drive to promote openness, diversity of opinions, and inclusive discussions at all times to evaluate a wide variety of ideas and perspectives in solving challenging problems
  • Clear decision-making skills and the ability to make good trade-offs in complex situations involving multiple opinions, needs, teams, technologies, cloud providers, and architectural settings
  • Ability to communicate effectively with stakeholders ranging from executives to junior engineers across the breadth and depth of the engineering organization
  • Ability to exemplify high accountability, integrity, and resilience, maintaining focus on both big-picture goals and the milestones needed to reach them
  • Drive to enable the engineering organization to innovate and deliver with greater speed and safety

Contact Lindsay Allan at lallan@hirenetworks.com regarding this posting. A Word resume is preferred when applying.

When looking for a job, have you ever heard the phrase "It's not about what you know, it's who you know"?

At HireNetworks, it really is all about who we know.

Whether your current contract is coming to a close, you're looking to advance your career, or you're a company on the hunt for new talent and wanting to expand, let HireNetworks put our networks to work for you.

HireNetworks is an equal opportunity employer.

Work is generally performed in an office environment in which there is only minimal exposure to unpleasant or hazardous working conditions. Must have an ability to sit for long periods throughout the day. Must be able to use a telephone or headset equipment.

The incumbent must be able to perform work at a computer terminal for 6 to 8 hours a day, function in an environment with consistent interruptions, and in rare circumstances, lift 20 lbs.

The work may be stressful at times and demand the ability to hit the key deliverables for the role.

