Vacancy expired!
- The primary responsibility of this position is to perform site reliability tasks such as operations support troubleshooting and remediation, defining and measuring SLOs, SLIs, SLAs and error budgets, toil reduction, SLO-driven dashboards, and resiliency implementation.
- The role will focus on complex issues: identifying and diagnosing them and recommending engineering solutions.
- This position will bring together E2E products and services, including integration with Platform, IoT hardware, Application, Mobile & Technology groups, so that we build compelling and differentiated services across the entire customer journey, with a focus on quality, robustness, operational excellence, instrumentation and traceability.
- Collaborate closely with Product, Development, Quality and Ops teams to ensure that designed solutions meet non-functional requirements such as availability, performance, cost, security and maintainability, and deliver speed to market and quality for Engineering departments.
- Investigate issues, recommend and test fixes, and coordinate issue resolution within technology and with external vendors.
- Evangelize site reliability engineering best practices to improve system reliability across the organization.
- Experience troubleshooting production workloads using technologies such as log aggregation systems and APM tooling; New Relic experience is a plus
- Build SLO/SLA dashboards and monitoring using tools like DataDog, New Relic or equivalent.
- Java technology development experience, ideally with Spring/Hibernate
- Analytical background in the areas of user experience, data integrity and SLAs.
- Experience with RDBMS and NoSQL technologies such as MySQL, MongoDB and Elasticsearch
- Hands-on experience designing and developing web services preferred, e.g. REST, JSON
- Strong knowledge of software engineering best practices for the full software development life cycle, including coding standards, code reviews, source control, test, build and release engineering processes, with a focus on automation and end-to-end traceability.
- Experience working with data streaming technologies; Kafka is a plus
- Working knowledge of containers using Docker or Kubernetes preferred